Fossil/knowledge-base

Fork 0

Files

Stanislav Hubacek ef3c2f75b1 18.6.2026

2026-06-18 16:25:33 +02:00

15 KiB

Raw Permalink Blame History

🗄️ Database Architecture

Database Classification

Relational (SQL)

DB	License	Use Case	Details
PostgreSQL	Open source	Universal, geospatial, analytics, AI	POSTGRESQL.en.md
MySQL / MariaDB	Open source	Web, LAMP stack, e-commerce	MYSQL.en.md
Microsoft SQL Server	Proprietary	Enterprise .NET, Windows ecosystem	—
Oracle DB	Proprietary	Enterprise, finance, mainframe, RAC cluster	ORACLE.en.md
Amazon Aurora	Managed	MySQL/PostgreSQL compatible, cloud-native	—

NoSQL

Type	DB	Use Case	Details
Document	MongoDB, Couchbase	JSON data, flexible schema	MONGODB.en.md
Key-Value / Cache	Redis, Memcached, DynamoDB	Cache, session store, real-time	REDIS.en.md
Wide-column	Cassandra, ScyllaDB	Time-series, IoT, big data	CASSANDRA.en.md
Vector	Pinecone, Qdrant, Milvus, pgvector	Embeddings, RAG, semantic search	VECTOR-DBS.en.md
Graph	Neo4j, Dgraph	Relationships, recommendations, social graphs	—

Storage Engines

Common concepts across databases: DATABASE-ENGINES.en.md

Transaction Isolation Levels

Level	Dirty Read	Non-repeatable Read	Phantom Read	Serialization Anomaly
Read Uncommitted	Yes (possible)	Yes	Yes	Yes
Read Committed	No (prevented)	Yes	Yes	Yes
Repeatable Read	No	No	No (PostgreSQL: No)	Yes
Serializable	No	No	No	No

Anomalies:

Dirty Read — reading data from an uncommitted transaction (data may be rolled back)
Non-repeatable Read — same query returns different data (another transaction updated the row in the meantime)
Phantom Read — same query returns new rows (another transaction inserted data matching the condition)
Serialization Anomaly — the result of transactions is not equivalent to any serial order

PostgreSQL vs MySQL Differences

PostgreSQL: Read Uncommitted behaves like Read Committed. Repeatable Read = Snapshot Isolation (also prevents phantom reads). Serializable = SSI.
MySQL InnoDB: Repeatable Read uses next-key locking (prevents phantom reads).

CAP Theorem

In a distributed system, only 2 out of 3 are possible: Consistency, Availability, Partition tolerance.

In practice: P is always required, we choose between CP (consistency) and AP (availability).

PACELC Extension

PACELC extends CAP with behavior under normal conditions (no partition):

Partition → Availability vs Consistency
Else (no partition) → Latency vs Consistency

DB	Partition Choice	Else Choice
Cassandra	AP (availability)	LC (low latency, eventual consistency)
DynamoDB (default)	AP	LC
MongoDB	CP (primary)	LC
PostgreSQL (single)	CP	CC
CockroachDB	CP	CC

Quorum Details

R (read quorum) + W (write quorum) > N (replication factor)
Typical: N=3, R=2, W=2 (tolerates 1 node down)
Sloppy quorum — when a node is unavailable, data is temporarily stored on another node
Hinted handoff — temporary write to another node with a hint, data is transferred upon recovery

Replication

Type	Description	Latency
Synchronous	Write confirmed only after replication to all nodes	High, but consistent
Asynchronous	Write confirmed immediately, replication in the background	Low, possible data loss
Semi-synchronous	Confirmation from majority of nodes	Compromise

Topologies

Leader-Follower (Master-Slave) — reads from replicas
Leader-Leader (Multi-master) — writes to multiple nodes
Quorum-based — R + W > N (Cassandra, DynamoDB)

Sharding

Data distribution across nodes based on a shard key.

                     ┌─────────┐
                     │  Proxy  │
                     │  Router │
                     └────┬────┘
               ┌──────────┼──────────┐
          ┌────▼───┐ ┌───▼────┐ ┌───▼────┐
          │Shard A │ │Shard B │ │Shard C │
          │ 0-100  │ │101-200 │ │201-300 │
          └────────┘ └────────┘ └────────┘

Methods

Method	Description	Advantage	Disadvantage
Hash-based	`shard_id = hash(key) % N`	Even distribution	Loss of range queries
Range-based	Data by range (A-M, N-Z)	Preserves ordering	Hot spots
Consistent hashing	Hash ring, vnodes	Min. rebalancing when number of shards changes	More complex

Routing

Proxy-based — application goes to proxy, which routes (Vitess, ProxySQL, mongos)
Client-side — application knows which shard to target
DNS-based — each shard has its own endpoint

Data Consistency Patterns

Pattern	Description	Example
Strong consistency	After a write, every read sees the latest data	Single DB, Raft, Spanner
Eventual consistency	After a write, data propagates over time	DNS, DynamoDB (default), Cassandra
Read-after-write	The author always sees their own write (others are eventual)	Social networks, comments
Causal consistency	Causally dependent operations are seen in the correct order	COPS, Orbe, MongoDB (causal clusters)
Monotonic reads	You do not see older data after seeing newer data	Cassandra (MONOTONIC_READ consistency)
Monotonic writes	Writes from a single client are in order	Queue-based, single leader

Data Migration

Schema Migration

V1__initial_schema.sql
V2__add_users_table.sql
V3__add_email_index.sql
V4__add_orders_table.sql

Zero-Downtime Migration

Expand — add new column/table (application tolerates both states)
Migrate — backfill data, update application to new schema
Contract — remove old column/table

Tools

Tool	Language	Strategy	Zero-Downtime	Rollback
Flyway	Java (multi-lang CLI)	Versioned SQL	Limited (additive only)	`undo` (limited, enterprise)
Liquibase	Java (multi-lang CLI)	Changesets (XML/YAML/JSON/SQL)	Yes (changeset design)	`rollback <count>`
Alembic	Python	Auto-generation, versioned	Yes (branching)	`downgrade`
Prisma Migrate	TypeScript	Declarative schema → diff	Yes (shadow DB)	`migrate diff`
gh-ost	Go	Triggerless online DDL (MySQL)	Yes (binlog stream)	No (progressive)
pgroll	Go	Online schema migration (PG)	Yes (views, multiple versions)	Yes (immediate)

SQL Antipatterns

Based on More SQL Antipatterns (Karwin, 2026) — 14 new antipatterns:

Language Antipatterns

Antipattern	Problem	Solution
Fear of JOINs	Manual pairing in application instead of JOIN	Use JOIN correctly
Relational Division	Finding sets in WHERE	Relational division (subquery with GROUP BY/HAVING)
Pagination via OFFSET	OFFSET is O(n) — the larger the offset, the slower	Keyset pagination (WHERE id > last_seen)
Non-Sargable queries	Functions on columns in WHERE (`WHERE YEAR(date) = 2026`)	Rewrite as range condition

Optimization Antipatterns

Antipattern	Problem	Solution
Premature denormalization	Denormalization without reason	Measure, then optimize
JSON overuse	JSON as a universal solution	Use JSON only for genuinely flexible data
Cacheless transactions	Relying on query cache (removed in MySQL 8)	Application-level caching

Application Antipatterns

Antipattern	Problem	Solution
Polling	Regularly querying for changes	LISTEN/NOTIFY, Kafka, Change Data Capture
Transaction encapsulation	Each model manages its own transaction	Unit of Work pattern
Fear of deadlocks	Trying to prevent all deadlocks	Mitigation, not prevention
Data hoarding	Storing everything forever	Data retention policies, archiving

Mini-Antipatterns

LIMIT without ORDER BY — nondeterministic results
NATURAL JOIN — fragile, implicit join condition
N+1 queries — query in a loop instead of JOIN/batch
Redundant indexes — duplicate/overlapping indexes unnecessarily slow writes

Designing Data-Intensive Applications (2nd Edition)

Kleppmann, Riccomini (2026) — substantially revised edition.

What's New Compared to 1st Edition

Area	What's New
Cloud-native	Storage = object store (S3, Blob), not local disk. Separation of control/data/compute plane
AI workloads	Vector indexes, DataFrames as a data model, batch processing for training data
Local-first software	DuckDB, PGlite, SQLite — databases running on laptop/edge, sync when connected
Formal methods	Randomized testing, formal verification (important for AI-generated code)
Legal & ethics	GDPR, ethics of predictive analytics, bias, algorithmic accountability
Streaming → SQL views	Materialize, incremental view maintenance — streaming as SQL

Key Principles (unchanged)

Reliability, Scalability, Maintainability — the three pillars of good data systems.

Apache Iceberg Lakehouse

Based on Architecting an Apache Iceberg Lakehouse (Merced, 2026):

What is a Data Lakehouse

An architecture combining the flexibility and low cost of a data lake (object storage) with the performance and governance of a data warehouse. Apache Iceberg is an open source table format.

Iceberg Metadata Architecture

Table metadata (.metadata.json)
    └── Snapshot manifest list
            └── Manifests (file-level stats)
                    └── Data files (Parquet/ORC/Avro)

Key Features

Feature	Description
ACID transactions	Safe concurrent read/write
Schema evolution	Add/drop/rename columns without rewrite
Time travel	Query historical snapshots
Partition evolution	Change partition strategy without data rewrite
Hidden partitioning	Automatic partition filters (user does not need to specify)
Multi-engine	Spark, Flink, Trino, Dremio, Snowflake over the same data

For a broader overview of the Big Data ecosystem (HDFS, Spark, Flink, Trino, Delta Lake, Hudi) see BIG-DATA.en.md.

When to Use Iceberg

Multi-tool access to the same governed data
ACID on lake data
Streaming + batch in a single table
Reducing duplication (one canonical copy instead of ETL to warehouse)

Best Practices

Connection pooling — PgBouncer, RDS Proxy, ProxySQL
Indexing based on query patterns — do not have unnecessary indexes
Read replicas for reporting and analytics
Backup & recovery — point-in-time recovery (PITR), regular tests
Query monitoring — slow query log, pg_stat_statements, performance_schema
Encryption at rest & in transit
Migrations in CI/CD — part of the pipeline, not manual
Choose DB based on workload — no single universal DB (polyglot persistence)

Database License Model Comparison

DB	License	Price (self-hosted)	Price (managed cloud)	Vendor lock-in	Note
PostgreSQL	PostgreSQL license (MIT-like)	$0	~$0.10-1.00/hr (RDS, CloudSQL, Aurora)	Low	Fully open source, no restrictions
MySQL	GPL v2 / Commercial (Oracle)	$0 (GPL) / ~$2,000/server/year (commercial)	~$0.10-1.00/hr (RDS, PlanetScale)	Medium (Oracle owned)	GPL = need to release application? (depends on distribution)
MariaDB	GPL v2 / Business Source	$0 (GPL)	~$0.10-1.00/hr (SkySQL)	Low	Fully compatible MySQL fork, no Oracle influence
Oracle SE2	Proprietary (per core)	~$17,500/core + 22% support/year	~$1-5/hr (RDS, OCI)	High	Core factor 0.5 (EPYC/Xeon), max 16 threads
Oracle EE	Proprietary (per core + options)	~$47,500/core + options + 22% support	~$2-30/hr (OCI, RDS)	High	Options double the price (RAC, partitioning, compression)
SQL Server Standard	Proprietary (per core + CAL)	~$1,000/core + $200/CAL	~$0.20-1.00/hr (Azure SQL)	Medium	Windows Server license required additionally
SQL Server Enterprise	Proprietary (per core + CAL)	~$7,000/core + $200/CAL	~$1-5/hr (Azure SQL)	Medium	AlwaysOn, partitioning, in-memory OLTP
MongoDB	SSPL (Community) / Commercial (Enterprise)	$0 (Community) / ~$10k/server/year (Enterprise)	~$0.10-5.00/hr (Atlas)	Medium	SSPL restricts managed cloud services
Redis	RSALv2 + SSPL (7.4+) / BSD (Valkey)	$0 (Valkey)	~$0.10-1.00/hr (ElastiCache, Memorystore → Valkey)	Low (Valkey)	Redis 7.4+ license change → Valkey fork
Cassandra	Apache 2.0	$0	~$0.10-1.00/hr (Keyspaces, Amazon Managed)	Low	Fully open source, no restrictions
ScyllaDB	Apache 2.0 (OSS) / Enterprise	$0 (OSS) / Enterprise subscription	~$0.50-3.00/hr (ScyllaDB Cloud)	Low (OSS)	Enterprise: monitoring, security, support
CockroachDB	BSL (Business Source License) / Enterprise	$0 (core) / Enterprise subscription	~$0.50-3.00/hr (CockroachDB Cloud)	Medium	BSL: converts to MIT after 3 years. Enterprise: multi-region, backup

Key Recommendations:

Lowest TCO: PostgreSQL (no license, broadest cloud support)
Highest vendor lock-in: Oracle (PL/SQL, proprietary options, expensive migration)
License risk: Redis (license change) → use Valkey for new projects
Cloud-native licensing: MongoDB Atlas, CockroachDB Cloud, ScyllaDB Cloud — pay-per-use, no license management

Resources

Links, books and standards: sources/databases/sources.en.md

Book	Authors	ISBN	Key Takeaway
Database Internals	Alex Petrov	978-1492040346	In-depth explanation of storage engines (B-Tree, LSM-Tree, WAL, MVCC), distributed systems
Designing Data-Intensive Applications (2nd ed.)	Kleppmann, Riccomini	—	Cloud-native, AI, local-first, formal methods
High Performance MySQL (4th ed.)	Schwartz, Zaitsev, Tkachenko	978-1492075292	MySQL architecture, schema/index optimization
Expert Oracle Architecture (3rd ed.)	Kyte, Kuhn	978-1484249602	Oracle architecture, RAC, Data Guard, tuning
AI-Ready PostgreSQL 18	Kumar, Linster	—	PostgreSQL as a unified platform for AI
More SQL Antipatterns	Bill Karwin (2026)	—	14 antipatterns, keyset pagination
Vector Databases	Borwankar (2026)	—	Embeddings, vector indexes, RAG
Architecting an Apache Iceberg Lakehouse	Merced (2026)	—	Lakehouse architecture, Iceberg metadata

15 KiB Raw Permalink Blame History