mongodb WARNING!
- [+] it is 100% open source 😀
- [?] currently Server Side Public License (SSPL) licensed (Wiki)
- [?] 3-letter agencies might have been involved in the initial funding of mongodb (which might or might not be a problem) (Wiki)
- [-] documentation chaos 🙁 also the cli program named “mongo” was renamed to “mongosh”, which means A LOT of howtos and documentation are still using the old name
- [-] support not reachable, and the forum does not show a lot of activity 🙁
interesting examples can be found here: https://www.mongodb.com/docs/drivers/rust/current/fundamentals/performance/
let client = Client::with_uri_str("").await?;
let data = doc! { "title": "1984", "author": "George Orwell" };
mongodb is written in C++, but most of these examples are written for rust? interesting 😀
dwaves is now also using codeberg.org instead of github (hopefully it will never be sold) for uploading open source code, hoping for good cooperation with the open source community 🙂
https://codeberg.org/dwaves/PYTHON_EXAMPLES_ONLINE/
# this test was performed inside a debian13 kvm qemu vm running on the above hardware
su - root

# install mongodb according to the latest howto (this was the howto to install mongodb in 2025-03)
apt -y install gnupg curl;
curl -fsSL https://www.mongodb.org/static/pgp/server-8.0.asc | gpg -o /usr/share/keyrings/mongodb-server-8.0.gpg --dearmor;
echo "deb [ signed-by=/usr/share/keyrings/mongodb-server-8.0.gpg ] http://repo.mongodb.org/apt/debian bookworm/mongodb-org/8.0 main" | tee /etc/apt/sources.list.d/mongodb-org-8.0.list;
apt update;
apt install -y mongodb-org;

systemctl list-unit-files --type=service|grep mongo;
# mongod.service    disabled    enabled
systemctl start mongod.service;
systemctl status mongod.service;

# get the benchmark scripts
mkdir /scripts
cd /scripts

# single threaded Lord of the Rings based benchmark
### WARNING! have sufficient disk space available!
# 100mio records take ~24GBytes

## install requirements:
apt -y install python3-tqdm python3-pymongo
git clone https://codeberg.org/dwaves/PYTHON_EXAMPLES_ONLINE.git

## usage:
cd PYTHON_EXAMPLES_ONLINE
time python3 ./benchmark_mongodb_singlethread_LOTR_10mio.py; # test with 10mio records
time python3 ./benchmark_mongodb_singlethread_LOTR_10mio.py 100_000_000; # yes, in python numbers can be written like this, test with 100mio records

# that's what it looks like if the user is running:
# terminal1: iostat -cdx 1
# terminal2: htop # newer version with I/O tab (hit tab to switch)
# terminal3: while true; do mongosh --eval "show dbs"; sleep 1; clear; done
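for orientation, here is a minimal sketch of what a single-threaded insert benchmark of this kind does. the real benchmark_mongodb_singlethread_LOTR_10mio.py from the repo above differs in details; the record generator, db/collection names and the offline list stand-in below are made up for illustration:

```python
# Minimal sketch of a single-threaded insert benchmark
# (assumption: the real script in PYTHON_EXAMPLES_ONLINE differs in details;
# record layout and names here are made up for illustration).
import time

# hypothetical LOTR-flavoured dummy records
CHARACTERS = ["Frodo", "Sam", "Gandalf", "Aragorn", "Gollum"]

def make_record(i: int) -> dict:
    return {"_id": i,
            "character": CHARACTERS[i % len(CHARACTERS)],
            "line": f"dialogue line {i}"}

def run_benchmark(insert_one, n: int) -> float:
    """Insert n records via the given insert_one callable, return inserts/sec."""
    start = time.perf_counter()
    for i in range(n):
        insert_one(make_record(i))
    elapsed = time.perf_counter() - start
    return n / elapsed if elapsed > 0 else float("inf")

# with a real server it would be wired up like this (requires python3-pymongo):
# from pymongo import MongoClient
# coll = MongoClient("mongodb://localhost:27017")["benchmark2025"]["document"]
# rate = run_benchmark(coll.insert_one, 10_000_000)

# offline demo: "insert" into a plain list instead of mongodb
fake_collection = []
rate = run_benchmark(fake_collection.append, 10_000)
print(f"{len(fake_collection)} records, {rate:.0f} inserts/sec")
```

the point of the `insert_one` callable is that the same loop can be pointed at a real pymongo collection or at an in-memory stand-in for dry runs.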
# next: multi threaded version
# answering the question: how is performance with a lot of parallel read, write, update access?
# which means: python is starting 2x 10x 100x threads that insert_one at the same time
# observation: more threads = write performance suffers, but read performance suffers even more

# some results (neofetch, ascii logo stripped):
# user@debian12DevPython
# OS: Debian GNU/Linux 12 (bookworm) x86_64
# Host: KVM/QEMU (Standard PC (Q35 + ICH9, 2009) pc-q35-7.2)
# Kernel: 6.1.0-28-amd64
# Shell: bash 5.2.15
# CPU: 10x Cores assigned to kvm vm: AMD Ryzen 5 5600G with Radeon Graphics (8) @ 3.928GHz
# Memory: 2685MiB / 7843MiB

### basic system info ###
python version: 3.11.2
kernel version: Linux 6.1.0-28-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.119-1 (2024-11-22)
mongodb version: db version v8.0.5

# run mongodb benchmark, with 100000x records, 100x threads (more threads than cores)
time python3 /scripts/benchmark_mongodb_multithread_singledatabase_v1-2.py 100000 100;
##### results: #####
performance_insert: 6934.01 per second
performance_read: 63.27 per second
performance_change: 69.29 per second

# 10x threads (thread count matches core count)
time python3 /scripts/benchmark_mongodb_multithread_singledatabase_v1-2.py 100000 10;
##### results: #####
performance_insert: 6993.00 per second
performance_read: 634.00 per second
performance_change: 675.20 per second

# 2x threads (fewer threads than cores)
time python3 /scripts/benchmark_mongodb_multithread_singledatabase_v1-2.py 100000 2;
##### results: #####
performance_insert: 9660.00 per second
performance_read: 4254.00 per second
performance_change: 4650.00 per second

# 2x threads again, best performance yet
time python3 /scripts/benchmark_mongodb_multithread_singledatabase_v1-2.py 100000 2;
##### results: #####
performance_insert: 9793.00 per second
performance_read: 4285.50 per second
performance_change: 4617.50 per second

# 1x thread: single thread slower
time python3 /scripts/benchmark_mongodb_multithread_singledatabase_v1-2.py 100000 1;
##### results: #####
performance_insert: 5852.00 per second
performance_read: 5439.00 per second
performance_change: 4974.00 per second

# second test system (neofetch, ascii logo stripped):
# root@ubuntu2204
# OS: Ubuntu 22.04.4 LTS x86_64
# Host: VMware Virtual Platform None
# Kernel: 5.15.0-133-generic
# Uptime: 7 days, 11 hours, 54 mins
# CPU: 2x cores: Intel Xeon E5-2690 v2 (2) @ 2.999GHz
# Memory: 2144MiB / 7937MiB

##### benchmark_mongodb_multithread_singledatabase_v1.py #####
### basic system info ###
python version: 3.10.12
kernel version: Linux 5.15.0-133-generic #144-Ubuntu SMP Fri Feb 7 20:47:38 UTC 2025
mongodb version: db version v6.0.15

time python3 /scripts/benchmark_mongodb_multithread_singledatabase_v1-2.py 100000 10
##### results: #####
performance_insert: 2756.60 per second
performance_read: 248.70 per second
performance_change: 251.70 per second

time python3 /scripts/benchmark_mongodb_multithread_singledatabase_v1-2.py 100000 100
##### results: #####
performance_insert: 2353.99 per second
performance_read: 22.41 per second
performance_change: 23.54 per second

time python3 /scripts/benchmark_mongodb_multithread_singledatabase_v1-2.py 100000 2
##### results: #####
performance_insert: 3479.50 per second
performance_read: 1548.50 per second
performance_change: 1610.00 per second
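the multi-threaded pattern can be sketched like this. note this is NOT the actual benchmark script: a single lock around an in-memory list stands in for whatever serialization point (server-side locking, connection pool, scheduler) makes "more threads" slower; with pymongo each worker would call collection.insert_one() instead:

```python
# Sketch of the multi-threaded benchmark pattern (assumption: the real
# benchmark_mongodb_multithread_singledatabase_v1-2.py differs in details).
# The lock simulates a contended resource: all threads serialize on it,
# which is one plausible reason why more threads can mean less throughput.
import threading
from concurrent.futures import ThreadPoolExecutor

records = []
lock = threading.Lock()

def insert_worker(start: int, count: int) -> None:
    for i in range(start, start + count):
        with lock:  # serialized critical section
            records.append({"_id": i})

def run(n_records: int, n_threads: int) -> int:
    per_thread = n_records // n_threads
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for t in range(n_threads):
            pool.submit(insert_worker, t * per_thread, per_thread)
    # leaving the "with" block waits for all workers to finish
    return len(records)

total = run(10_000, 10)
print(total)
```

timing this sketch with different thread counts shows the same shape as the results above: past a certain point, extra threads only add contention, not throughput.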
also check out single threaded version:
# single threaded version (you can start it multiple times as a background job with &)
wget https://dwaves.de/scripts/bench/benchmark_mongodb3.1.py.txt
mv benchmark_mongodb3.1.py.txt benchmark_mongodb3.1.py
time /usr/bin/python3 /scripts/benchmark_mongodb3.1.py; # run benchmark
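"start it multiple times as a job &" could look like this; here a placeholder command stands in for the real benchmark call, since that needs a running mongod:

```shell
# start several copies of a command in the background and wait for all of them
# ("sleep 1" is a stand-in for e.g. /usr/bin/python3 /scripts/benchmark_mongodb3.1.py)
for i in 1 2 3; do
  sleep 1 &
done
jobs -p | wc -l   # shows the number of background jobs still registered
wait              # block until every background job has finished
echo "all jobs done"
```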
btw: in order to drop a database from bash (one-liner):
mongosh databasename --eval "db.dropDatabase()"
virtual environment fail: even after installing pymongo with pip or pip3 inside the venv, python can not find it 🙁 (spoiler: calling /usr/bin/python3 by its absolute path bypasses the venv)
apt install python3.13-venv; # install requirements
python3 -m venv myenv; # create the venv
source myenv/bin/activate; # activate it
pip3 install pymongo; # install pip packages like robotframework
/usr/bin/python3
>>> from pymongo import MongoClient
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    from pymongo import MongoClient
ModuleNotFoundError: No module named 'pymongo'
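a likely explanation for the traceback: /usr/bin/python3 is the system interpreter and ignores the activated venv. a small demo (--without-pip is only used here to keep the demo offline; a normal venv would be created without that flag and packages installed via myenv/bin/pip):

```shell
# a venv has its own interpreter; the absolute path /usr/bin/python3 bypasses it
python3 -m venv --without-pip myenv
myenv/bin/python3 -c 'import sys; print(sys.prefix)'   # prints the myenv path
/usr/bin/python3  -c 'import sys; print(sys.prefix)'   # prints /usr (system python)
# so after "source myenv/bin/activate" run plain "python3" (or myenv/bin/python3),
# and install packages with "myenv/bin/pip install pymongo"
```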
observations:
- RAM is not even 50% used
- CPU cores are mostly not 100% in use
- low disk activity according to iotop ~ 1.5Mbytes/sec
- fewer threads = faster?
hardware upgrades: while better single-thread CPU performance would definitely help, more RAM, more CPU cores or a faster HD will not help in this scenario…
the big question: how to optimize mongodb for performance?
sharding?
currently testing this; it’s not easy to set up, especially because of the documentation chaos (sorry). this video was useful
here is a script that automatically sets up 2x shards (each a 2-member replica set); the endpoint for pymongo to connect to is: localhost:26061
############ 4th attempt: how to setup mongodb + replSet + sharding, only 2x shards! ###
# vmname: debian-13-mongodb-xfs-all-in-one
# attach a 2nd, as fast as possible, disk to the vm (vdb)

# format + mount vdb
mkdir /media/vdb;
mkfs.xfs -L vdb /dev/vdb;
mount /dev/vdb /media/vdb;

# spreading shards over 2x disks
PATH_TO_MONGO1="/media/vda/mongo1";
mkdir -p $PATH_TO_MONGO1
PATH_TO_MONGO2="/media/vdb/mongo2";
mkdir -p $PATH_TO_MONGO2;

# creating the dirs that hold all data for this test system
mkdir -v $PATH_TO_MONGO1/cfg1; # where config servers store data
mkdir -v $PATH_TO_MONGO2/cfg2
mkdir -v $PATH_TO_MONGO1/a1 $PATH_TO_MONGO1/a2
mkdir -v $PATH_TO_MONGO2/b1 $PATH_TO_MONGO2/b2; # 2x shards, each on a separate disk

# starting 1st cfg server
mongod --configsvr --dbpath $PATH_TO_MONGO1/cfg1 --port 26051 --fork --logpath /var/log/log.cfg1 --replSet cfg;
# starting 2nd cfg server, will replicate the 1st cfg server
mongod --configsvr --dbpath $PATH_TO_MONGO2/cfg2 --port 26052 --fork --logpath /var/log/log.cfg2 --replSet cfg;
mongosh --port 26051 --eval 'rs.initiate()';
mongosh --port 26051 --eval 'rs.add("localhost:26052")';

# start both members of shard replica set "a"
mongod --shardsvr --replSet a --dbpath $PATH_TO_MONGO1/a1 --port 26001 --fork --logpath /var/log/log.a1;
mongod --shardsvr --replSet a --dbpath $PATH_TO_MONGO1/a2 --port 26002 --fork --logpath /var/log/log.a2;
# start both members of shard replica set "b"
mongod --shardsvr --replSet b --dbpath $PATH_TO_MONGO2/b1 --port 26101 --fork --logpath /var/log/log.b1;
mongod --shardsvr --replSet b --dbpath $PATH_TO_MONGO2/b2 --port 26102 --fork --logpath /var/log/log.b2;

# initiate replication
mongosh --port 26001 --eval 'rs.initiate()';
mongosh --port 26001 --eval 'rs.add("localhost:26002")';
mongosh --port 26101 --eval 'rs.initiate()';
mongosh --port 26101 --eval 'rs.add("localhost:26102")';

# start the mongos router (the endpoint clients connect to)
mongos --configdb "cfg/localhost:26051,localhost:26052" --fork --logpath /var/log/log.mongos1 --port 26061

# add both shards
mongosh --port 26061 --eval 'sh.addShard("a/localhost:26001")';
mongosh --port 26061 --eval 'sh.addShard("b/localhost:26101")';
mongosh --port 26061 --eval 'sh.status()';

# enable sharding
mongosh --port 26061 --eval 'use benchmark2025';
mongosh --port 26061 --eval 'show dbs';
mongosh --port 26061 --eval 'sh.enableSharding("benchmark2025")';
mongosh --port 26061 --eval 'sh.shardCollection("benchmark2025.document",{_id:1})';
mongosh --port 26061
use config;
db.shards.find();

chown -R mongodb:mongodb $PATH_TO_MONGO1;
chown -R mongodb:mongodb $PATH_TO_MONGO2;

# startup order
# 1. config servers (mongod instances)
# 2. shards (mongod instances)
# 3. mongos routers (mongos instances)
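to keep track of what the script starts, here is the topology as a small python summary (ports taken from the script above; clients only ever talk to the mongos router, never to the shards directly):

```python
# Port map of the processes started by the sharding script above
# (ports taken from the script; only the mongos router is meant for clients).
TOPOLOGY = {
    "config servers (replSet cfg)": [26051, 26052],
    "shard a (replSet a)": [26001, 26002],
    "shard b (replSet b)": [26101, 26102],
    "mongos router": [26061],
}

# a client connects to the mongos router only:
MONGOS_URI = "mongodb://localhost:26061"
# from pymongo import MongoClient              # requires python3-pymongo
# coll = MongoClient(MONGOS_URI)["benchmark2025"]["document"]

total_mongod = sum(len(p) for name, p in TOPOLOGY.items() if "mongos" not in name)
print(f"{total_mongod} mongod processes behind 1 mongos at {MONGOS_URI}")
```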
ChatGPT?
question: how to tune mongodb for performance? (maybe even faster parallelism?)
answer: WARNING! SOME PARAMETERS MIGHT NOT EVEN EXIST (ChatGPT likes to make up imaginary parameters and settings)
From your observations, it looks like MongoDB is not resource-starved. Your system has plenty of RAM, CPU usage is moderate, and disk I/O is not maxed out. The fact that reducing the number of threads seems to improve performance suggests that thread contention, locking, or some form of internal overhead is at play. Let’s break this down and explore performance tuning.
🔍 Possible Performance Bottlenecks
- Thread contention & parallelism overhead
- If fewer threads are performing better, it might indicate lock contention or scheduling inefficiencies.
- Too many concurrent queries can overwhelm MongoDB’s internal resource management.
- Query Execution Efficiency
- Are the queries well-indexed?
- Are they using efficient execution plans? (explain("executionStats"))
- Write Concern & Read Preference
- High write concern (majority) can slow writes.
- If too many reads are hitting the primary instead of secondaries, that can be an issue.
- Connection Pooling & Network Latency
- Too many open connections can cause bottlenecks.
- Use db.serverStatus().connections to check current usage.
- Storage Engine & Data Layout
- WiredTiger caches efficiently, but inefficient indexing or large documents might cause slowdowns.
- Check wiredTiger.cache in db.serverStatus() to see cache utilization.
⚡ Performance Tuning for MongoDB
1️⃣ Parallelism & Threading
- Reduce number of worker threads if too much parallelism causes slowdowns.
- Check wiredTiger.concurrentWriteTransactions and wiredTiger.concurrentReadTransactions in db.serverStatus().
- Adjust the wt_max_concurrent_transactions parameter.
2️⃣ Index Optimization
- Run db.collection.getIndexes() to check if queries use the correct indexes.
- Use explain("executionStats") on slow queries to ensure proper index usage.
- Avoid multi-key indexes when possible, as they can slow down writes.
3️⃣ Improve Write Performance
- Batch inserts instead of single document writes.
- Use WriteConcern(0 or 1) if data durability isn’t critical.
- Turn on journaling for safer writes (storage.journal.enabled).
- Tune the commitIntervalMs setting in WiredTiger.
4️⃣ Cache & Memory Usage
- MongoDB automatically uses free RAM for caching.
- Check wiredTiger.cache["bytes currently in the cache"] to see cache usage.
- If the working set fits in RAM, increasing the cache size (storage.wiredTiger.engineConfig.cacheSizeGB) might help.
5️⃣ Storage & Disk I/O
- If using an SSD, ensure XFS filesystem (best for WiredTiger).
- Use compression like snappy to reduce I/O if CPU is underutilized.
- Disable Read-Ahead for better random access performance.
🔥 Final Thoughts
- Your bottleneck is likely thread contention rather than hardware limitations.
- Tuning parallelism (reducing excessive threading) seems to help.
- Check indexes & query execution plans for inefficiencies.
- Adjust write concern, connection pooling, and cache settings for further performance gains.
Want me to analyze a specific workload or explain() output? 🚀
tidyup
delete all mongodb databases whose names start with “benchmark”
mongosh --eval '
db.adminCommand("listDatabases").databases.forEach(function(database) {
  if (database.name.startsWith("benchmark")) {
    print("Dropping: " + database.name);
    db.getSiblingDB(database.name).dropDatabase();
  }
})'
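the same tidyup can be done from python. the name filter is shown offline on a sample list; the actual drop calls (which require pymongo and a running server) are left as comments:

```python
# The same "drop all benchmark* databases" cleanup from python
# (assumption: pymongo installed and a mongod on localhost:27017;
# here only the name filter is demonstrated, on a made-up sample list).
def benchmark_dbs(names):
    """Return the database names that the tidyup one-liner would drop."""
    return [n for n in names if n.startswith("benchmark")]

# with a live server it would look like this:
# from pymongo import MongoClient
# client = MongoClient("mongodb://localhost:27017")
# for name in benchmark_dbs(client.list_database_names()):
#     print("Dropping:", name)
#     client.drop_database(name)

# offline demo of the filter:
sample = ["admin", "benchmark2025", "benchmark_old", "config", "local"]
print(benchmark_dbs(sample))  # ['benchmark2025', 'benchmark_old']
```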
how to monitor disk usage
while true; do mongosh --eval "show dbs"; sleep 3; clear; done;
monitor harddisk activity:
GNU Linux – basic harddisk benchmarks with fio, dd and python (many small files)
mongo politics
“On March 10, 2022, MongoDB warned its users in Russia and Belarus that their data stored on the MongoDB Atlas platform will be destroyed as a result of American sanctions related to the Russo-Ukrainian War.[46]” (Wiki)
liked this article?
- only together we can create a truly free world
- plz support dwaves to keep it up & running!
- (yes the info on the internet is (mostly) free but beer is still not free (still have to work on that))
- really really hate advertisement
- contribute: whenever a solution was found, blog about it for others to find!
- talk about, recommend & link to this blog and articles
- thanks to all who contribute!
