knowledge-base/MONGODB.en.md
Stanislav Hubacek ef3c2f75b1 18.6.2026
2026-06-18 16:25:33 +02:00

🥬 MongoDB

Overview

MongoDB is one of the most widely used document-oriented NoSQL databases. It stores data as BSON (binary JSON) documents with a flexible schema. It suits applications under rapid development, where the schema changes frequently or varies between records.

Data model

  • Database → Collection → Document (JSON/BSON)
  • Document — fields with key-value, nested objects, arrays
  • Flexible schema — each document can have different fields (possible, but a consistent shape per collection is recommended)
  • ObjectId — default value of the _id primary key (12 bytes: 4-byte timestamp + 5-byte random value + 3-byte counter; older versions used machine ID + PID for the middle 5 bytes)
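
Because an ObjectId starts with a 4-byte timestamp, the creation time of a document can be recovered from its _id alone. The following is a minimal sketch in plain JavaScript; `objectIdTimestamp` is a hypothetical helper written for illustration, not a driver API.

```javascript
// Sketch: read the creation time stored in the first 4 bytes of an ObjectId.
// `objectIdTimestamp` is a hypothetical helper, not part of any MongoDB driver.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // First 8 hex characters = 4-byte big-endian Unix timestamp in seconds.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Example input: timestamp bytes 5f000000 followed by 8 zero bytes.
const created = objectIdTimestamp("5f000000" + "0".repeat(16));
console.log(created.toISOString());
```

In mongosh the built-in `ObjectId(...).getTimestamp()` provides the same information, so a helper like this is only needed when working with raw hex strings.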

Architecture

mongod (individual node)
  ├── WiredTiger storage engine (default since 3.2)
  │   ├── B-Tree storage for collections and indexes (not LSM)
  │   ├── MVCC (snapshot isolation)
  │   ├── Compression (zlib, snappy, zstd)
  │   └── Cache (WiredTiger internal cache)
  ├── Replication (replica set)
  │   ├── Primary (all writes)
  │   └── Secondary (replication, optional reads)
  └── Sharding (cluster)
      ├── mongos (router)
      ├── Config servers (metadata)
      └── Shards (replica sets)

Replica set

  • Primary node = all writes, secondary = replication (oplog)
  • Automatic failover (election among secondaries)
  • Up to 50 nodes in a replica set, max 7 voting nodes
  • Read preference: primary (default), primaryPreferred, secondary, secondaryPreferred, nearest
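
Automatic failover depends on a majority of voting members being able to elect a new primary. The sketch below works out that majority and the resulting fault tolerance for a given number of voting members; `electionMath` is a hypothetical helper for illustration only.

```javascript
// Sketch: votes needed to elect a primary, and how many voting members may fail
// before no majority remains. `electionMath` is a hypothetical helper.
function electionMath(votingMembers) {
  if (!Number.isInteger(votingMembers) || votingMembers < 1 || votingMembers > 7) {
    throw new RangeError("a replica set has between 1 and 7 voting members");
  }
  const majority = Math.floor(votingMembers / 2) + 1;
  return { majority, faultTolerance: votingMembers - majority };
}

for (const n of [3, 4, 5, 7]) {
  const { majority, faultTolerance } = electionMath(n);
  console.log(`${n} voting members: majority ${majority}, tolerates ${faultTolerance} failure(s)`);
}
```

Going from 3 to 4 voting members raises the required majority without raising the fault tolerance, which is why odd member counts are the usual choice.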

Sharding

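  • Shard key determines how documents are distributed across shards, so choosing it is the decisive design step
  • Range sharding — documents with adjacent shard key values tend to land on the same shard (good for range queries, risk of hot spots with monotonically increasing keys)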
  • Hashed sharding — even distribution (good for write throughput, bad for range queries)
  • Zoned sharding — data placed according to zones (geo-distribution, compliance)
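
The trade-off between range and hashed sharding can be illustrated with a small simulation in plain Node.js. This is a simplified sketch, not MongoDB's actual algorithm: the split points are hypothetical, MD5 with a modulo stands in for MongoDB's internal hash function, and real hashed sharding assigns ranges of hash values to chunks rather than using a modulo.

```javascript
const crypto = require("node:crypto");

const SHARDS = 4;
// Range sharding: chunk boundaries on the key itself (hypothetical split points).
const splitPoints = [250, 500, 750];

function rangeShard(key) {
  const idx = splitPoints.findIndex((p) => key < p);
  return idx === -1 ? splitPoints.length : idx; // keys above the last split go to the last shard
}

// Hashed sharding stand-in: MD5 + modulo is illustrative only.
function hashedShard(key) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return digest.readUInt32BE(0) % SHARDS;
}

// Count how many keys each shard would receive.
function distribute(shardFn, keys) {
  const counts = new Array(SHARDS).fill(0);
  for (const k of keys) counts[shardFn(k)] += 1;
  return counts;
}

// Monotonically increasing keys, e.g. counters or timestamps.
const newKeys = Array.from({ length: 1000 }, (_, i) => 1000 + i);
console.log("range :", distribute(rangeShard, newKeys));
console.log("hashed:", distribute(hashedShard, newKeys));
```

With range sharding every new key falls past the last split point, so a single shard absorbs all inserts (the hot spot). The hashed variant spreads the same keys across all shards, at the cost of scattering range queries.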

Index types

| Type | Description |
| --- | --- |
| Single field | Standard B-Tree index |
| Compound | Multiple fields in index (order matters) |
| Multikey | Index on array field, each value indexed separately |
| Text | Full-text search |
| Geospatial (2d, 2dsphere) | Geo queries (near, within, intersect) |
| Hashed | For hashed sharding |
| TTL | Automatic document deletion after expiration |
| Wildcard | Index on unknown/irregular fields |
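
As a rough illustration, the index types above map to `createIndex` key patterns and options as follows. The `products` collection and all field names are hypothetical. The specs are kept as plain data so the snippet also runs outside mongosh, where it creates nothing.

```javascript
// Hypothetical `products` collection and field names, for illustration only.
const indexSpecs = [
  { keys: { sku: 1 }, options: {} },                     // single field
  { keys: { category: 1, price: -1 }, options: {} },     // compound (field order matters)
  { keys: { tags: 1 }, options: {} },                    // multikey when `tags` holds arrays
  { keys: { description: "text" }, options: {} },        // text
  { keys: { location: "2dsphere" }, options: {} },       // geospatial
  { keys: { customer_id: "hashed" }, options: {} },      // hashed
  { keys: { createdAt: 1 }, options: { expireAfterSeconds: 3600 } }, // TTL on a date field
  { keys: { "attributes.$**": 1 }, options: {} },        // wildcard
];

// Only touches a database when run inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  for (const { keys, options } of indexSpecs) {
    db.products.createIndex(keys, options);
  }
}
```

A multikey index needs no special syntax: MongoDB marks an ordinary index as multikey once an indexed field contains an array.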

Aggregation pipeline

The aggregation pipeline is MongoDB's framework for transforming data through a sequence of stages, each consuming the output of the previous one:

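// Top 10 customers by total amount of shipped orders
db.orders.aggregate([
  { $match: { status: "shipped" } },                               // keep shipped orders only
  { $group: { _id: "$customer_id", total: { $sum: "$amount" } } }, // sum amounts per customer
  { $sort: { total: -1 } },                                        // largest total first
  { $limit: 10 }                                                   // first 10 results
])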
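
To make the stage semantics concrete, the same pipeline can be emulated over a plain in-memory array. This is only a sketch of what each stage does, not how the server executes it. The sample documents and the `topCustomers` helper are hypothetical.

```javascript
// Hypothetical sample documents shaped like the `orders` collection above.
const orders = [
  { customer_id: "a", status: "shipped", amount: 40 },
  { customer_id: "b", status: "shipped", amount: 25 },
  { customer_id: "a", status: "shipped", amount: 10 },
  { customer_id: "c", status: "pending", amount: 99 },
];

function topCustomers(docs, limit) {
  // $match: keep only shipped orders
  const shipped = docs.filter((d) => d.status === "shipped");
  // $group: sum `amount` per customer_id
  const totals = new Map();
  for (const d of shipped) {
    totals.set(d.customer_id, (totals.get(d.customer_id) || 0) + d.amount);
  }
  // $sort (descending by total) followed by $limit
  return [...totals]
    .map(([_id, total]) => ({ _id, total }))
    .sort((x, y) => y.total - x.total)
    .slice(0, limit);
}

console.log(topCustomers(orders, 10));
```

Placing `$match` first matters in the real pipeline as well: it reduces the number of documents the later stages have to process and lets the server use an index on `status`.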

Recommendations — where MongoDB is better

| Area | MongoDB | Competition | Why MongoDB |
| --- | --- | --- | --- |
| Flexible schema | Schema-less, changes without migration | PostgreSQL (ALTER TABLE + migration) | Rapid development, MVP, frequent model changes |
| JSON / documents | Native BSON, nested objects | PostgreSQL (jsonb, with its own operators rather than MongoDB-style $ operators) | Simpler object mapping from code |
| Horizontal scaling | Native sharding (mongos + config servers) | MySQL (Vitess, external) | Built in, no external tooling required |
| Geo-distribution | Zoned sharding, replica set per region | Cassandra (AP model, different philosophy) | CP from CAP, consistency + distribution |
| Aggregation | Aggregation pipeline, $lookup (left outer join) | PostgreSQL (SQL JOINs, more powerful) | Useful for denormalized data |
| Development speed | ODM (Mongoose), natural JSON | SQL (schema first, migrations) | Fast time-to-market |

When to use MongoDB

  • Rapid development / MVP — schema evolves frequently, no migrations
  • Catalog data — products with varying attributes (e-commerce, marketplace)
  • Content management — diverse content (blog, CMS, headless CMS)
  • Real-time analytics — aggregations, dashboards, event data
  • IoT / sensor data — diverse message structures
  • Mobile applications — JSON documents naturally map to API responses

When to use something else

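  • Financial transactions → PostgreSQL (mature ACID transactions, referential integrity; MongoDB supports multi-document transactions since 4.0 but has no foreign keys)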
  • Complex reports / JOINs → PostgreSQL or ClickHouse
  • Relationship data (friends, follows) → Neo4j (graph DB)
  • High-throughput writes → Cassandra (AP model, masterless, so no single-primary write bottleneck)
  • Small data, single server → SQLite (simpler, no daemon)

MongoDB licensing

MongoDB changed its license in 2018 from GNU AGPL v3 to SSPL (Server Side Public License):

| Variant | License | Price | Conditions |
| --- | --- | --- | --- |
| MongoDB Community | SSPL | Free | SSPL: if you offer MongoDB as a managed service, you must release the entire stack (incl. orchestration, monitoring) as open source. Internal use without restrictions |
| MongoDB Enterprise Advanced | Commercial | ~$10,000/server/year (Atlas: pay-per-use) | Enterprise features (LDAP, Kerberos, auditing, encryption), 24/7 support |
| MongoDB Atlas | Managed | Pay-per-use (~$0.10-5.00/hour depending on instance) | Fully managed, multi-cloud, auto-scaling, backup, monitoring |

Impact: the effect is similar to the source-available licensing Redis adopted in 2024. Self-hosted internal use is unrestricted, but cloud providers (AWS, Azure) cannot offer MongoDB as a managed service without a commercial agreement. Alternative: FerretDB (an open-source proxy that implements the MongoDB wire protocol on top of PostgreSQL).

Sources

References, books, and standards: sources/databases/sources.en.md

Last revision: 2026-06-03