Compare commits

..

3 Commits

Author SHA1 Message Date
Stanislav Hubacek
3fa11ef0f6 comiiit 2026-06-11 15:27:28 +02:00
Stanislav Hubacek
95d1839f05 First batch 2026-06-11 15:27:28 +02:00
Stanislav Hubacek
c6fa0bff6a commit 2026-06-11 15:27:28 +02:00
76 changed files with 16557 additions and 0 deletions

BIN
.DS_Store vendored Normal file

Binary file not shown.


@@ -0,0 +1,95 @@
---
description: >
  Designs a small data center / demo cluster for virtualization based on the knowledge in the KB.
  Goes through the relevant KB files (DATACENTERS, HYPERVISORS, STORAGE, SERVER-CONFIG, CONNECTIVITY,
  NETWORKING, CLOUD) and produces a complete design covering the hardware build, network topology,
  disk subsystem, connectivity, and budget. Writes the output to case-studies/<nazev>/README.md.
mode: subagent
permission:
  edit: allow
  read: allow
  glob: allow
  grep: allow
  bash: allow
  webfetch: allow
  websearch: allow
---
You are the **DC Designer Agent**: you design small demo/production data centers for virtualization.
## Input
The user provides the parameters:
- Number of hosts (e.g. 2-3 small, 3-6 medium)
- Purpose (demo, development, production)
- Preferred hypervisor (VMware, Proxmox, Hyper-V, Nutanix AHV)
- Budget constraints (low-cost, mid-range, enterprise)
- Additional requirements (HA, FT, GPU, NVMe, FC SAN, …)
## Workflow
1. **Requirements analysis**: identify the key parameters and the variant that fits the budget / size
2. **KB research**: load the relevant KB files:
   - `DATACENTERS.md`: rack, power, cooling, layout, cabling
   - `HYPERVISORS.md`: hypervisor selection, variants A/B/C/D, licensing
   - `SERVER-CONFIG.md`: concrete hardware configurations per variant
   - `STORAGE.md`: storage (local vs SAN vs HCI), vendor comparison
   - `CONNECTIVITY.md`: NIC, switching, cabling (Ethernet, FC)
   - `NETWORKING.md`: network layout, VLAN, segmentation
   - `CLOUD.md`: hybrid cloud options, offload
   - `HARDWARE.md` / `SERVER-HW.md`: CPU, RAM, GPU, cooling
3. **Design synthesis**: assemble a consistent design covering:
   - Server build (CPU/RAM/disk/NIC/model)
   - Storage variant (Local RAID, vSAN, Ceph, FC SAN)
   - Network (switches, topology, cabling)
   - Rack layout (dimensions, positions, cooling, UPS)
   - Hypervisor + licensing
   - Budget estimate (indicative prices)
   - Topology diagram (text/ASCII/Mermaid)
4. **Write-up**: create `case-studies/<nazev>/README.md`
5. **Summary**: at the end, list the key decisions and trade-offs
## Rules
- Always draw on the KB; do not state information that is not backed by sources
- If the KB does not contain enough data for a specific decision, say so explicitly
- Justify every decision: why this particular component, and what the alternatives are
- Give prices as indicative order-of-magnitude estimates (e.g. "~$15 000-$25 000")
- Write in Czech, factually, and in a structured way
- At the end, add a section "Použité zdroje z KB" (KB sources used) with links to the specific files
- Add the footer `*Poslední revize: YYYY-MM-DD*` to the output file
- If the case-studies directory does not exist yet, create it
## Variants by size
### Variant "Mini" (2-3 hosts, demo/learning)
- 2-3× single-socket server (AMD EPYC 4124 / Intel Xeon E-2400)
- 128-256 GB RAM
- Local NVMe + HDD
- 1× 10GbE L2 switch
- Hypervisor: Proxmox VE (free) or VMware vSphere Foundation
- UPS 1500 VA
### Variant "Medium" (3-6 hosts, development/test)
- 3-4× dual-socket (AMD EPYC 9254 / Intel Xeon 6526Y)
- 512 GB - 1 TB RAM
- HCI: vSAN or Ceph (3+ nodes mandatory)
- 2× 25GbE ToR switch
- Hypervisor: VMware VCF or Nutanix AHV
- UPS 3000 VA + ATS
### Variant "Enterprise Light" (4-8 hosts, production)
- 4-6× dual-socket (AMD EPYC 9454 / Intel Xeon 6548Y)
- 1-2 TB RAM
- FC SAN: 2× controller + JBOD (all-flash)
- 2× 25GbE ToR + 2× 32Gb FC switch
- Hypervisor: VMware VCF s FC SAN
- 2× UPS 3000 VA + service bypass
## Usage examples
User: "design a small demo DC for 3 hosts, Proxmox, low-cost"
→ You go through the KB, produce a design in the Mini variant, and write it to case-studies/proxmox-demo/README.md
User: "case study for a VMware cluster with 4 hosts and a SAN"
→ You work out the Enterprise Light variant and write it to case-studies/vmware-san-cluster/README.md


@@ -0,0 +1,68 @@
# kb-index — Knowledge Base Index Agent
Maintains the central index (`README.md` / `README.en.md`).
## Responsibilities
1. **Scan all KB files**: goes through all `.md` and `.en.md` files (except README and .opencode/)
2. **Extract cross-references**: looks for Markdown links `[text](file.md)` between KB files
3. **Update cross-reference matrix**: updates the table in README.md and README.en.md
4. **Validate links**: checks that every internal link points to an existing file
5. **Detect orphans**: finds files that are not referenced anywhere
6. **Add new files**: adds new files to the navigation table
## Trigger
Run after:
- A new file is added to the KB
- A new section with cross-references is added
- A bulk change (translation, restructuring)
- A manual request: "update the index"
## Workflow
### 1. Scan files
Use glob to find all `*.md` and `*.en.md` files in the KB root (not in .opencode/, not README).
### 2. Extract metadata
For each file:
- Read the first 5 lines (for the title and description)
- Find all links `[text](path/to/file.md)` to other KB files
### 3. Classify files
| Category | Indicator |
|----------|-----------|
| Main topic | Root `.md` / `.en.md` file that is not a detailed DB file |
| Detailed DB | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VEKTOROVE-DB / VECTOR-DBS |
| DB concepts | DATABAZOVE-ENGINY / DATABASE-ENGINES |
| Legacy index | HARDWARE, INFRASTRUCTURE |
| Case study | case-studies/*/README.md |
| Template | templates/ADR |
| Sources | sources/*/sources.md |
### 4. Update README.md
Update the sections:
- **Navigace — Czech**: table of all `.md` files
- **Navigation — English**: table of all `.en.md` files
- **Cross-Reference Matrix**: table of references between files
- **Case Studies**: list of case-studies/README.md
- **Doporučená literatura** (recommended reading): books from the README
- **Zdroje / Sources**: table of the sources files
- **Datum poslední aktualizace** (date of last update)
### 5. Validate
Check:
- Every internal link in every file → the target exists
- Every file (except legacy indexes) is listed in the README navigation
- Report: "X valid links, Y broken, Z orphan files"
### 6. Report
When finished, return a summary:
- Number of files scanned
- Number of cross-references found
- Broken links (if any)
- Orphan files (if any)
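The link extraction, validation, and orphan detection steps could be prototyped roughly as follows. This is a minimal sketch, not the agent's actual implementation: the names `extract_links` and `audit` are hypothetical, only root-level `*.md` files are scanned, and a file counts as referenced only when another root-level file links to it.

```python
import re
from pathlib import Path

# Matches [text](target.md) and captures the target without any #anchor.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def extract_links(markdown: str) -> list[str]:
    """Return relative .md targets referenced in one Markdown document."""
    return [target for target in LINK_RE.findall(markdown) if "://" not in target]


def audit(kb_root: Path) -> dict[str, list[str]]:
    """Report broken internal links and root-level files nobody links to."""
    pages = sorted(kb_root.glob("*.md"))  # also matches *.en.md
    candidates = {p.name for p in pages if not p.name.startswith("README")}
    broken: list[str] = []
    referenced: set[str] = set()
    for page in pages:
        for target in extract_links(page.read_text(encoding="utf-8")):
            resolved = page.parent / target
            if not resolved.exists():
                broken.append(f"{page.name} -> {target}")
            elif resolved.parent == page.parent and resolved.name != page.name:
                # Self-links do not save a file from being an orphan.
                referenced.add(resolved.name)
    return {"broken": sorted(broken), "orphans": sorted(candidates - referenced)}
```

Calling `audit(Path("."))` from the KB root would return the two lists that feed the "Y broken, Z orphan" part of the report.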


@@ -0,0 +1,43 @@
---
description: >
  Processes [todo] items in the knowledge base. Looks in the sources/<area>/sources.md files for items with
  the status [todo], researches the topic, works the new findings into the relevant KB file, and changes the
  status to [done]. Run it with a specific request, e.g. "process all todo items in sources/cicd/". Use it to
  extend the knowledge base with new topics from unprocessed sources.
mode: subagent
permission:
  edit: allow
  read: allow
  bash: allow
  webfetch: allow
  websearch: allow
---
You are the **KB Research Agent**: your task is to systematically process `[todo]` items in the knowledge base.
## Workflow
1. **Analysis**: go through `sources/<area>/sources.md` in the given area and identify every line marked `[todo]`
2. **Research**: for each todo item:
   - If it has a URL, fetch the content (webfetch)
   - If it is a book / standard, look up current information (websearch)
   - Extract the key concepts, definitions, and best practices
3. **Incorporation**: extend the relevant `.md` file in the KB root with the new findings
4. **Source update**: change `[todo]` to `[done]`
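Steps 1 and 4 are plain text operations and could be sketched as below. The exact row format of `sources.md` is not shown here, so this assumes only that the status appears literally as `[todo]` somewhere on the line; `find_todos` and `mark_done` are hypothetical helper names.

```python
def find_todos(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, stripped line) for every line containing [todo]."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if "[todo]" in line
    ]


def mark_done(text: str, line_no: int) -> str:
    """Flip the first [todo] on the given line to [done], leaving other lines untouched."""
    lines = text.splitlines(keepends=True)
    lines[line_no - 1] = lines[line_no - 1].replace("[todo]", "[done]", 1)
    return "".join(lines)
```

Working by line number keeps already finished `[done]` rows untouched, which matches the rule that existing content is only extended, never removed.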
## Rules
- Do not remove existing content; only add to it and extend it
- Keep the format consistent (tables, lists, headings)
- Write in Czech, factually, without subjective opinions
- Give every new concept a short description
- If you come across a `[done]` item, leave it alone
- At the end, produce a summary: what was processed and what was added
## Usage examples
User: "process all todo items in sources/cicd/"
→ You go through sources/cicd/sources.md, process every [todo] entry, and extend CICD.md
User: "process the [todo] item about the CAP theorem"
→ You find the specific todo about the CAP theorem (in sources/databases/), research it, and extend DATABASES.md


@@ -0,0 +1,89 @@
---
description: >
  Checks the consistency, quality, and currency of the whole knowledge base. Goes through all .md files and
  verifies formatting (tables, headings, lists), cross-references between files, duplicate content,
  outdated information, and consistency with the sources in sources/. Run it with e.g. "review the whole KB"
  or "check the consistency of CICD.md".
mode: subagent
permission:
  edit: allow
  read: allow
  webfetch: allow
  websearch: allow
---
You are the **KB Reviewer**: your task is to audit the quality of the knowledge base.
## Review areas
### 1. Formatting and consistency
- [ ] All files share the same heading structure (they start with `#`, sections use `##`)
- [ ] Tables use a consistent format (alignment, `|---|` separators)
- [ ] Code blocks use ``` with a language tag
- [ ] Lists are indented consistently
- [ ] Diagrams (ASCII / Mermaid) are readable
### 2. Cross-references
- [ ] Topics that overlap between files link to each other
  - E.g. "monitoring in CICD" → link to MONITORING.md
  - E.g. "cloud networking" → link between CLOUD.md and NETWORKING.md
- [ ] README.md lists all current files
- [ ] Every `.md` file in the root is mentioned in README.md
### 3. Duplicates
- [ ] The same concept is not explained in several places with differing information
- [ ] If a concept is repeated, it is consistent (same numbers, definitions, recommendations)
### 4. Currency
- [ ] Tool versions match the current stable releases (verify on the web)
- [ ] EOL technologies are marked or removed
- [ ] No "coming soon" claims: if something has not happened, mark it as outdated
- [ ] Licenses and prices (where given) are current
### 5. Consistency with sources
- [ ] Every fact in the KB should have a traceable source in `sources/`
- [ ] If a source in `sources/` is marked `[done]`, the corresponding content should be in the KB
- [ ] If `sources/` contains a source for a topic that is missing from the KB, flag it
### 6. Spelling and style
- [ ] No typos
- [ ] Consistent terminology (do not use both "VM" and "virtuální stroj" (virtual machine) in one file)
- [ ] Anglicisms are used where they make sense (explained on first use)
## Report
At the end, generate a clear report:
```markdown
## Review report — YYYY-MM-DD
### Problems (must be fixed)
- [file.md:line] description of the problem
### Recommendations
- [file.md] description
### Status
- ✅ Formatting check: OK / N problems
- ✅ Cross-references: OK / N missing
- ✅ Duplicates: OK / N found
- ✅ Currency: OK / N outdated
- ✅ Consistency with sources: OK / N discrepancies
```
## Usage examples
User: "review the whole KB"
→ You go through all files and print a complete report
User: "check the consistency of NETWORKING.md"
→ You focus on that one file and check it from every angle
User: "find duplicates between CLOUD.md and INFRASTRUCTURE.md"
→ You compare the specific files


@@ -0,0 +1,53 @@
---
description: >
  Finds new sources (books, articles, documentation, tools, standards, videos, certifications)
  for extending the knowledge base. Searches the web, identifies material relevant to the given area,
  and adds it as [todo] to the matching sources/<area>/sources.md. Use it to keep enriching the
  knowledge base with current sources. Run it with e.g. "find new sources for cloud architecture".
mode: subagent
permission:
  edit: allow
  read: allow
  webfetch: allow
  websearch: allow
---
You are the **KB Source Scout**: your task is to actively look for new, high-quality sources for the knowledge base.
## Workflow
1. **State analysis**: read `sources/<area>/sources.md` for the given area and find out what is already documented
2. **Research**: use websearch to find new sources:
   - Official documentation and whitepapers
   - Books (ISBN, author, edition)
   - High-quality articles and blog posts
   - Tools and frameworks
   - Standards and RFCs
   - Video courses and (conference) talks
   - Certifications
3. **Deduplication**: check that the source is not already in sources.md
4. **Addition**: add the new sources to the matching `sources/<area>/sources.md` with the tag `[todo]`
## Quality criteria
- **Official documentation**: prefer primary sources (vendor docs, RFCs, standards)
- **Books**: prefer editions from the last 3 years; for classics (such as TCP/IP Illustrated) an older edition is fine
- **Articles**: prefer recognized authorities in the field (Brendan Gregg, Martin Kleppmann, Kelsey Hightower, ...)
- **Tools**: active community, current version, open source is a bonus
- Avoid: clearly outdated material (>5 years, out of touch with the field), clickbait, untrustworthy sources
## Entry format
For each new source, add a row to the table in the matching sources.md.
Keep the format consistent with the existing entries in the file.
## Usage examples
User: "find new sources for cloud architecture"
→ You search the web, find books, articles, and whitepapers on cloud architecture from 2025/2026, and add them to sources/cloud/sources.md as [todo]
User: "scout infra"
→ You search for new infrastructure sources (hypervisors, DC, storage, hardware) and add them to sources/infrastructure/sources.md
User: "find what is new in observability over the last year"
→ You focus on monitoring/observability, look for new tools, articles, and versions, and add them to sources/monitoring/sources.md

25
.opencode/opencode.json Normal file

@@ -0,0 +1,25 @@
{
  "$schema": "https://opencode.ai/config.json",
  "agent": {
    "kb-research": {
      "description": "Processes [todo] items in the knowledge base: research and incorporation of new topics",
      "mode": "subagent"
    },
    "kb-source-scout": {
      "description": "Finds new sources (books, articles, documentation) and adds them to sources/ as [todo]",
      "mode": "subagent"
    },
    "kb-reviewer": {
      "description": "Audits consistency, currency, cross-references, duplicates, and formatting of the whole KB",
      "mode": "subagent"
    },
    "dc-designer": {
      "description": "Designs a small DC / demo cluster for virtualization based on the KB and writes a case study to case-studies/",
      "mode": "subagent"
    },
    "kb-index": {
      "description": "Maintains the central README.md index: scans files, extracts cross-references, validates links, adds new files",
      "mode": "subagent"
    }
  }
}

135
CASSANDRA.en.md Normal file

@@ -0,0 +1,135 @@
# 🐘 Cassandra & ScyllaDB
## Overview
Apache Cassandra is a distributed wide-column NoSQL database designed for high availability and linear scalability with no single point of failure. It is inspired by the Amazon Dynamo paper (2007) and Google Bigtable. ScyllaDB is a C++ reimplementation that is compatible with the Cassandra protocol and targets lower latency and higher throughput.
## Architecture (Dynamo-inspired)
### Consistent hashing
Data is partitioned across a hash ring; each node is responsible for a token range:
```text
0 ─── node A ─── hash(key1)
90 ─── node B ─── hash(key2)
180 ─── node C ─── hash(key3)
270 ─── node D ─── hash(key4)
```
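- Adding/removing a node remaps only about K/N keys (K = total keys, N = number of nodes); this is a property of consistent hashing itself
- **Virtual nodes** (vnodes): each physical node owns many tokens on the ring (~100-200), which spreads data and rebalancing load more evenly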
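A hash ring with virtual nodes can be sketched in a few lines. This is an illustration only: it uses MD5 as a stand-in hash (Cassandra's default partitioner is Murmur3-based), and the `HashRing` class, its method names, and the choice of 128 vnodes are assumptions made for the example.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string onto the ring using the first 8 bytes of its MD5 digest."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, vnodes: int = 128) -> None:
        self.vnodes = vnodes
        self._ring: list[tuple[int, str]] = []  # sorted (token, node) pairs

    def add(self, node: str) -> None:
        # Each physical node is placed on the ring at `vnodes` pseudo-random positions.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (token(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(t, n) for t, n in self._ring if n != node]

    def owner(self, key: str) -> str:
        # The owner is the first vnode clockwise from the key's token (wrapping around).
        idx = bisect.bisect_left(self._ring, (token(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

With four nodes on the ring, adding a fifth should move roughly one fifth of the keys, and every moved key lands on the new node; all other keys keep their owner.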
### Quorum (N, R, W)
- N = replication factor (typically 3)
- R = read quorum (typically 2)
- W = write quorum (typically 2)
- Condition: R + W > N (every read quorum then overlaps every write quorum in at least one replica, so a read sees the latest acknowledged write)
- **Sloppy quorum** — when a node is unavailable, data is temporarily stored on another
- **Hinted handoff** — temporary write with hint, data transferred upon recovery
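The R + W > N condition is a pigeonhole argument, and it can be checked by brute force for small clusters. The sketch below enumerates every possible read and write replica set; `quorums_overlap` is a hypothetical helper written for this illustration.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read set of size r shares a replica with every write set of size w."""
    replicas = range(n)
    return all(
        set(read) & set(write)
        for read in combinations(replicas, r)
        for write in combinations(replicas, w)
    )
```

For N = 3, `quorums_overlap(3, 2, 2)` holds, while `quorums_overlap(3, 1, 1)` does not: a read at one replica can miss a write that was acknowledged by a different one.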
### Gossip protocol
Decentralized dissemination of membership information — each node periodically communicates with 1-3 random nodes. No central point of failure.
### Vector clocks
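In Dynamo, vector clocks capture the causality of object versions: on conflict (for example after a partition heals), both versions are returned and the application merges them. Cassandra itself does not use vector clocks; it resolves conflicts per cell with write timestamps (last write wins).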
### Merkle trees
Anti-entropy — hash tree for detecting divergence between replicas. Fast detection of which data ranges differ.
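The range comparison can be illustrated with a small binary hash tree. This is a simplified sketch, not Cassandra's repair code: each leaf stands for the serialized content of one token range, the number of leaves is assumed to be a power of two, and `build` and `diff` are hypothetical names.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build(leaves: list[bytes]) -> list[list[bytes]]:
    """Return tree levels from leaf hashes (level 0) up to the root (last level)."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def diff(a: list[list[bytes]], b: list[list[bytes]]) -> list[int]:
    """Return indices of differing leaf ranges, descending only into mismatched subtrees."""
    mismatched = [0] if a[-1] != b[-1] else []
    for level in range(len(a) - 2, -1, -1):
        mismatched = [
            child
            for parent in mismatched
            for child in (2 * parent, 2 * parent + 1)
            if a[level][child] != b[level][child]
        ]
    return mismatched
```

Two replicas with identical data compare equal at the root and stop there; a single divergent range is found by following one path down the tree instead of comparing every range.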
### Write path
```text
Client → Coordinator → [1. Write to commit log (disk)]
                       [2. Write to memtable (RAM)]
                       [3. Acknowledge client]
                     → [4. Flush memtable → SSTable (periodically)]
                     → [5. Compaction (merge SSTables)]
```
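The steps above can be mimicked with a toy log-structured store. It is a sketch under strong simplifications: everything lives in memory, the flush runs inline rather than in the background, tombstones and timestamps are ignored, and the `TinyLSM` class and its `memtable_limit` parameter are invented for the example.

```python
from typing import Optional


class TinyLSM:
    def __init__(self, memtable_limit: int = 2) -> None:
        self.commit_log: list[tuple[str, str]] = []  # stands in for the on-disk append-only log
        self.memtable: dict[str, str] = {}
        self.sstables: list[dict[str, str]] = []     # oldest first, immutable once flushed
        self.memtable_limit = memtable_limit

    def write(self, key: str, value: str) -> str:
        self.commit_log.append((key, value))  # 1. durability first
        self.memtable[key] = value            # 2. then the in-memory table
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 4. inline here; asynchronous in a real system
        return "ack"                          # 3. acknowledge the client

    def flush(self) -> None:
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs to be replayed

    def compact(self) -> None:
        merged: dict[str, str] = {}
        for table in self.sstables:  # 5. newer tables overwrite older values
            merged.update(table)
        self.sstables = [dict(sorted(merged.items()))]

    def read(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable wins
            if key in table:
                return table[key]
        return None
```

Writes never modify an existing SSTable; an updated key simply appears again in a newer table, and compaction later collapses the duplicates.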
### Read path
```text
Client → Coordinator → [1. Check memtable (RAM)]
                       [2. Check row cache (if enabled)]
                       [3. Check bloom filter of each SSTable]
                       [4. Check key cache / partition index]
                       [5. Read from SSTable (disk), merge with memtable data]
                       [6. Repair if stale (read repair)]
```
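The bloom filter is what lets a read skip SSTables that cannot contain the key. A minimal version is sketched below; the bit-array size, the number of hash functions, and the use of SHA-256 are arbitrary choices for this example rather than Cassandra's actual parameters.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, size_bits: int = 1024, hashes: int = 3) -> None:
        self.size = size_bits
        self.hashes = hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str) -> Iterator[int]:
        # Derive `hashes` bit positions by salting the key with a seed.
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))
```

A negative answer is always correct, so the SSTable can be skipped without touching disk; a positive answer may be a false positive and still requires the actual lookup.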
## Cassandra vs ScyllaDB
| Feature | Cassandra | ScyllaDB |
|---------|-----------|----------|
| **Language** | Java (JVM) | C++ (Seastar framework) |
| **Architecture** | Staged thread pools on the JVM | Shared-nothing, shard-per-core |
| **Latency** | 5-20 ms (typical) | 1-3 ms (typical) |
| **Throughput** | Good | 5-10× higher on same HW |
| **GC pauses** | Yes (JVM) | No (no GC) |
| **NUMA** | OS-dependent | Native NUMA aware |
| **Workload** | Standard | High-throughput, real-time |
| **Price** | Open source | Open source + Enterprise |
## Data model
- **Keyspace** = namespace (analogy to DB)
- **Table** = column definition (not schema-less)
- **Partition key** = hash key for ring distribution
- **Clustering columns** = ordering within a partition
- **Primary key** = Partition key + Clustering columns
## Recommendations — where Cassandra is better
| Area | Cassandra | Competition | Why Cassandra |
|------|-----------|-------------|---------------|
| **Write throughput** | Linear scaling, no master bottleneck | PostgreSQL (master writes) | Every node writes, no single point of failure |
| **Availability** | AP from CAP — always writable | MongoDB (CP, primary down = read-only) | "Always-writeable" philosophy |
| **Multi-DC** | Native, per-DC replication | CockroachDB (complex) | Simple configuration, latency-tolerant |
| **Time-series** | Wide-row model, TTL, compaction | InfluxDB (specialized) | Can combine with other workloads |
| **IoT / sensor data** | Linear scaling, no master | MongoDB (sharding complex) | Predictable performance under growth |
| **Geographic distribution** | Native multi-DC, hinted handoff | Spanner (vendor lock-in) | Open source, no dependencies |
### When to use Cassandra / ScyllaDB
- **IoT / sensor data ingest** — millions of writes/s, no data loss
- **Time-series at massive scale** — metrics, logs, event data
- **User activity history** — write-heavy workloads
- **Multi-DC applications** — data available in every location
- **Recommendation systems** — wide-row model for "what user has seen"
- **Message / event store** — high-throughput append with TTL
### When to use something else
- **Relations, JOINs, transactions** → PostgreSQL (Cassandra has no JOINs, limited transactions)
- **Full-text search** → Elasticsearch
- **Aggregation / OLAP** → ClickHouse (Cassandra is not an analytical DB)
- **Small data (< 100 GB)** → PostgreSQL (Cassandra overhead not worth it)
- **Frequent reads by secondary keys** → DynamoDB (global secondary indexes, GSI); Cassandra has limited secondary indexes
### ScyllaDB specific
ScyllaDB is advantageous when:
- You need 5-10× higher throughput on the same HW
- You have latency-sensitive workload (real-time scoring, ad-tech)
- You want to eliminate JVM/GC issues
- You need predictable performance (P99 < 5 ms)
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Paper / Book | Authors | Description |
|--------------|---------|-------------|
| Dynamo: Amazon's Highly Available Key-value Store (SOSP 2007) | DeCandia et al. | Foundational paper for Cassandra architecture |
| Cassandra: The Definitive Guide (3rd ed.) | E. Hewitt | Comprehensive guide to deployment and operations |
*Last revision: 2026-06-03*

135
CASSANDRA.md Normal file

@@ -0,0 +1,135 @@
# 🐘 Cassandra & ScyllaDB
## Overview
Apache Cassandra is a distributed wide-column NoSQL database designed for high availability and linear scalability with no single point of failure. It is inspired by the Amazon Dynamo paper (2007) and Google Bigtable. ScyllaDB is a C++ reimplementation that is compatible with the Cassandra protocol and targets lower latency and higher throughput.
## Architecture (Dynamo-inspired)
### Consistent hashing
Data is partitioned across a hash ring; each node is responsible for a token range:
```text
0 ─── node A ─── hash(key1)
90 ─── node B ─── hash(key2)
180 ─── node C ─── hash(key3)
270 ─── node D ─── hash(key4)
```
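- Adding/removing a node remaps only about K/N keys (K = total keys, N = number of nodes); this is a property of consistent hashing itself
- **Virtual nodes** (vnodes): each physical node owns many tokens on the ring (~100-200), which spreads data and rebalancing load more evenly
### Quorum (N, R, W)
- N = replication factor (typically 3)
- R = read quorum (typically 2)
- W = write quorum (typically 2)
- Condition: R + W > N (every read quorum then overlaps every write quorum in at least one replica, so a read sees the latest acknowledged write)
- **Sloppy quorum**: when a node is unavailable, data is temporarily stored on another node
- **Hinted handoff**: a temporary write with a hint; the data is transferred once the node recovers
### Gossip protocol
Decentralized dissemination of membership information: each node periodically communicates with 1-3 random nodes. No central point of failure.
### Vector clocks
In Dynamo, vector clocks capture the causality of object versions: on conflict (for example after a partition heals), both versions are returned and the application merges them. Cassandra itself does not use vector clocks; it resolves conflicts per cell with write timestamps (last write wins).
### Merkle trees
Anti-entropy: a hash tree for detecting divergence between replicas. It quickly identifies which data ranges differ.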
### Write path
```text
Client → Coordinator → [1. Write to commit log (disk)]
                       [2. Write to memtable (RAM)]
                       [3. Acknowledge client]
                     → [4. Flush memtable → SSTable (periodically)]
                     → [5. Compaction (merge SSTables)]
```
### Read path
```text
Client → Coordinator → [1. Check memtable (RAM)]
                       [2. Check row cache (if enabled)]
                       [3. Check bloom filter of each SSTable]
                       [4. Check key cache / partition index]
                       [5. Read from SSTable (disk), merge with memtable data]
                       [6. Repair if stale (read repair)]
```
## Cassandra vs ScyllaDB
| Feature | Cassandra | ScyllaDB |
|---------|-----------|----------|
| **Language** | Java (JVM) | C++ (Seastar framework) |
| **Architecture** | Staged thread pools on the JVM | Shared-nothing, shard-per-core |
| **Latency** | 5-20 ms (typical) | 1-3 ms (typical) |
| **Throughput** | Good | 5-10× higher on same HW |
| **GC pauses** | Yes (JVM) | No (no GC) |
| **NUMA** | OS-dependent | Native NUMA aware |
| **Workload** | Standard | High-throughput, real-time |
| **Price** | Open source | Open source + Enterprise |
## Data model
- **Keyspace** = namespace (analogy to DB)
- **Table** = column definition (not schema-less)
- **Partition key** = hash key for ring distribution
- **Clustering columns** = ordering within a partition
- **Primary key** = Partition key + Clustering columns
## Recommendations: where Cassandra is better
| Area | Cassandra | Competition | Why Cassandra |
|------|-----------|-------------|---------------|
| **Write throughput** | Linear scaling, no master bottleneck | PostgreSQL (master writes) | Every node writes, no single point of failure |
| **Availability** | AP from CAP, always writable | MongoDB (CP, primary down = read-only) | "Always-writeable" philosophy |
| **Multi-DC** | Native, per-DC replication | CockroachDB (complex) | Simple configuration, latency-tolerant |
| **Time-series** | Wide-row model, TTL, compaction | InfluxDB (specialized) | Can combine with other workloads |
| **IoT / sensor data** | Linear scaling, no master | MongoDB (sharding complex) | Predictable performance under growth |
| **Geographic distribution** | Native multi-DC, hinted handoff | Spanner (vendor lock-in) | Open source, no dependencies |
### When to use Cassandra / ScyllaDB
- **IoT / sensor data ingest**: millions of writes/s, no data loss
- **Time-series at massive scale**: metrics, logs, event data
- **User activity history**: write-heavy workloads
- **Multi-DC applications**: data available in every location
- **Recommendation systems**: wide-row model for "what the user has seen"
- **Message / event store**: high-throughput append with TTL
### When to use something else
- **Relations, JOINs, transactions** → PostgreSQL (Cassandra has no JOINs, limited transactions)
- **Full-text search** → Elasticsearch
- **Aggregation / OLAP** → ClickHouse (Cassandra is not an analytical DB)
- **Small data (< 100 GB)** → PostgreSQL (Cassandra overhead not worth it)
- **Frequent reads by secondary keys** → DynamoDB (global secondary indexes, GSI); Cassandra has limited secondary indexes
### ScyllaDB specific
ScyllaDB is advantageous when:
- You need 5-10× higher throughput on the same HW
- You have a latency-sensitive workload (real-time scoring, ad-tech)
- You want to eliminate JVM/GC issues
- You need predictable performance (P99 < 5 ms)
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Paper / Book | Authors | Description |
|--------------|---------|-------------|
| Dynamo: Amazon's Highly Available Key-value Store (SOSP 2007) | DeCandia et al. | Foundational paper for Cassandra architecture |
| Cassandra: The Definitive Guide (3rd ed.) | E. Hewitt | Comprehensive guide to deployment and operations |
*Last revision: 2026-06-03*

679
CICD.en.md Normal file

@@ -0,0 +1,679 @@
# 🔄 CI/CD and DevOps
## CI/CD Pipeline
```
Code Commit → Build → Test → Package → Deploy to Staging → Integration Tests → Deploy to Production
```
### Detailed Pipeline Stages
```
1. Checkout ──→ 2. Lint ──→ 3. Test ──→ 4. Build ──→ 5. Scan ──→ 6. Publish ──→ 7. Deploy
                   │           │                        │
                ESLint/     Unit/Integ/              SAST/SCA/
                Prettier    e2e tests                Container scan
```
| Stage | Tools | What Happens |
|-------|-------|--------------|
| **Checkout** | git clone, fetch | Retrieve code from repository, including submodules |
| **Lint** | ESLint, Prettier, RuboCop, golangci-lint | Static code analysis, formatting |
| **Test (unit)** | Jest, pytest, JUnit | Fast tests (ms to s), no dependencies |
| **Test (integration)** | Testcontainers, Docker Compose | Tests with DB, message queue, external services |
| **Test (e2e)** | Playwright, Cypress, Selenium | Full-stack tests in the browser |
| **Build** | Docker build, go build, npm run build, Maven | Compilation, artifact assembly |
| **Scan (SAST)** | Semgrep, SonarQube, CodeQL | Static security analysis |
| **Scan (DAST)** | OWASP ZAP, Burp Suite | Dynamic analysis (running application) |
| **Scan (SCA)** | Dependabot, Snyk, Trivy | Dependency and CVE analysis |
| **Publish** | Docker push, npm publish, Maven deploy | Upload artifact to registry |
| **Deploy** | ArgoCD, Terraform, Helm, kubectl | Deploy to target environment |
### Continuous Integration (CI)
- Automatic build and tests on every commit
- Fast feedback loop (< 10 min)
- Linting, type checking, unit tests, security scan (SAST)
### Continuous Delivery (CD)
- Automatic deployment to staging / test environments
- Manual approval for production (optional)
- Smoke tests after deployment
### Continuous Deployment
- Fully automatic deployment to production
- Requires high confidence in tests and monitoring
- Feature flags for risk management
## GitHub Actions Detail
### Workflow Syntax
```yaml
name: CI Pipeline
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
env:
  NODE_VERSION: "22"
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
      - run: npm ci
      - run: npm run lint
  test:
    runs-on: ubuntu-latest
    needs: lint
    strategy:
      matrix:
        node-version: [22, 24]
    steps:
      - uses: actions/checkout@v4
      # Install the Node.js version selected by the matrix, then the dependencies
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - name: Run tests
        run: npm test
```
### Matrix Builds
- Run the same jobs with different parameters (OS, language version, architecture)
- `strategy.matrix`: parameter combinations (Cartesian product)
- `strategy.fail-fast`: cancel the remaining in-progress and queued matrix jobs as soon as one fails (enabled by default)
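The Cartesian-product expansion can be reproduced with `itertools.product`. This sketch covers only the basic expansion, not the `include`/`exclude` keys; `expand_matrix` is a hypothetical helper and the `os` axis is added purely for illustration.

```python
from itertools import product


def expand_matrix(matrix: dict[str, list]) -> list[dict]:
    """Return one parameter dict per job, covering every combination of axis values."""
    axes = list(matrix)
    return [dict(zip(axes, combo)) for combo in product(*(matrix[a] for a in axes))]
```

Two operating systems and two Node.js versions, for example `expand_matrix({"os": ["ubuntu-latest", "windows-latest"], "node-version": [22, 24]})`, expand into four jobs, so every added axis multiplies the job count.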
### Reusable Workflows
```yaml
# .github/workflows/deploy.yml (called)
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
    secrets:
      cloud_role:
        required: true
# Caller workflow (separate file)
jobs:
  deploy:
    uses: ./.github/workflows/deploy.yml
    with:
      environment: staging
    secrets:
      cloud_role: ${{ secrets.STAGING_ROLE }}
```
### Composite Actions
- Custom actions without needing a separate repository
- Combination of `run`, `uses`, `shell` steps
- Use case: standardize lint/test/build across repositories
### Self-hosted Runners
- Own infrastructure for running GitHub Actions
- Use case: private network, GPU, specific HW, compliance
- Scaling: actions-runner-controller (Kubernetes), auto-scaling groups
- Security: job isolation, ephemeral runners
## GitLab CI Detail
```yaml
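stages:
  - lint
  - test
  - build
  - deploy
variables:
  DOCKER_IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
lint:
  stage: lint
  image: node:22
  script:
    - npm ci
    - npm run lint
test:
  stage: test
  image: node:22
  needs: ["lint"]
  script:
    - npm test
  artifacts:
    paths:
      - coverage/
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
# Illustrative build job so that `needs: ["build"]` below resolves; adapt to your registry setup
build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  needs: ["test"]
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build -t $DOCKER_IMAGE .
    - docker push $DOCKER_IMAGE
deploy-staging:
  stage: deploy
  needs: ["build"]
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - kubectl set image deployment/app app=$DOCKER_IMAGE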
```
**Concepts**:
- **Stages**: sequential phases (each stage can have multiple parallel jobs)
- **Rules**: execution conditions (branch, tag, changes, variables); replaces `only/except`
- **Needs**: DAG dependencies (a job does not have to wait for the entire previous stage)
- **Artifacts**: files passed between jobs (binaries, reports); dependency caching uses the separate `cache` keyword
- **Environments**: deployment tracking (rollback, history, approvals)
### DAG Pipelines (Needs)
```
lint ──→ test ──→ build ──→ deploy-staging ──→ deploy-prod
build-arm ──→ test-arm
```
- Defines dependencies between jobs (not necessarily stages)
- Enables parallelization of independent jobs
- Reduces overall pipeline time
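How `needs` turns a job list into parallel waves can be shown with a topological sort from the standard library. The job names come from the diagram above, but the exact `needs` relationships (in particular that `build-arm` has no prerequisites) are an assumed reading of it, and `execution_waves` is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def execution_waves(needs: dict[str, list[str]]) -> list[list[str]]:
    """Group jobs into waves; each job starts as soon as all of its needs are done."""
    sorter = TopologicalSorter(needs)  # maps job -> jobs it depends on
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


PIPELINE = {
    "lint": [],
    "test": ["lint"],
    "build": ["test"],
    "build-arm": [],
    "test-arm": ["build-arm"],
    "deploy-staging": ["build"],
    "deploy-prod": ["deploy-staging"],
}
```

In this reading, `build-arm` starts alongside `lint` and `test-arm` runs alongside `test`, instead of each job waiting for a whole stage to finish.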
## Infrastructure as Code (IaC)
| Tool | Type | Language |
|------|------|----------|
| Terraform | Declarative | HCL |
| OpenTofu | Declarative | HCL (Terraform fork) |
| Pulumi | Declarative | TypeScript, Python, Go, C# |
| AWS CDK | Declarative | TypeScript, Python, Java, C# |
| CloudFormation | Declarative | YAML/JSON (AWS) |
| Azure ARM/Bicep | Declarative | Bicep, JSON |
| Ansible | Imperative/Config | YAML |
| Chef/Puppet | Config mgmt | Ruby DSL |
### Infrastructure as Code (2nd Edition) — Kief Morris
Key reference for designing and operating dynamic cloud infrastructure with IaC. The book is tool-agnostic — it focuses on patterns and practices, not specific tools.
#### Three Fundamental Practices
| Practice | Description |
|----------|-------------|
| **Define everything as code** | All infrastructure defined in code, version control, repeatability |
| **Continuously test and deliver** | Every change goes through a pipeline with automated tests |
| **Small, independent pieces** | Small, loosely coupled components — easier change and testing |
#### Principles of Cloud Infrastructure
- **Systems reproducible** — infrastructure can be recreated from code at any time
- **Systems disposable** — instances can be destroyed and recreated
- **Systems consistent** — all environments identical (no snowflake servers)
- **Processes repeatable** — automation instead of manual procedures
- **Design always changing** — infrastructure is constantly evolving (not build-and-forget)
#### Anti-patterns (Pitfalls)
| Anti-pattern | Description |
|--------------|-------------|
| **Snowflake server** | Each server different, cannot reproduce |
| **Configuration drift** | Manual changes → deviations from defined state |
| **Server sprawl** | Too many servers without management |
| **Fragile infrastructure** | Changes often break the system |
| **Automation fear** | Fear of automation → manual interventions |
#### Book Structure (4 Parts)
1. **Foundations** — framework of tools and technologies for cloud platforms
2. **Working with infrastructure stacks** — defining, provisioning, testing and CD of infrastructure changes
3. **Working with servers and application runtime platforms** — provisioning and configuring servers and clusters
4. **Working with large systems and teams** — workflow, governance, architectural patterns for multiple teams
#### IaC Code Organization
| Pattern | Description |
|---------|-------------|
| **Monorepo** | One repository for everything — build-time integration, suitable for small teams |
| **Microrepo** | Separate repository for each project — isolation, suitable for large teams |
| **Domain organization** | Organizing code by domain concepts (not by technology) |
**Recommendations:**
- Infrastructure and applications can be in the same or separate repository depending on organizational structure (Team Topologies)
- Per-environment configuration files (test, staging, production) stored within the project
- Tests belong to the project, integration tests can be in a separate project
- Infrastructure code should not directly deploy applications — use OS packaging (RPM, deb)
#### Expand-Contract Pattern for Infrastructure Changes
Same principle as database migrations:
1. **Expand** — add new resource (old version still running)
2. **Migrate** — move traffic / dependencies to the new resource
3. **Contract** — remove old resource
Prevents outages when refactoring infrastructure.
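The three steps above can be sketched with a small in-memory model. This is only an illustration: the resource names (`lb-v1`, `lb-v2`), the `routes` map and the helper functions are hypothetical and do not come from the book or from any tool.

```python
# Hypothetical in-memory model of the expand-contract pattern.
# Resource names and the routing map are illustrative assumptions.

def expand(resources: set[str], new: str) -> set[str]:
    """Expand: add the new resource while the old one keeps running."""
    return resources | {new}


def migrate(routes: dict[str, str], old: str, new: str) -> dict[str, str]:
    """Migrate: repoint every consumer that still targets the old resource."""
    return {consumer: (new if target == old else target)
            for consumer, target in routes.items()}


def contract(resources: set[str], routes: dict[str, str], old: str) -> set[str]:
    """Contract: remove the old resource, but only once nothing depends on it."""
    if old in routes.values():
        raise RuntimeError(f"{old} still has consumers; migrate first")
    return resources - {old}


resources = {"lb-v1"}
routes = {"web": "lb-v1", "api": "lb-v1"}

resources = expand(resources, "lb-v2")          # both versions exist
routes = migrate(routes, "lb-v1", "lb-v2")      # traffic moves over
resources = contract(resources, routes, "lb-v1")  # old version removed
```

The guard in `contract` is the point of the pattern: removal is refused while any consumer still depends on the old resource.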
## Terraform Detail
#### State Locking Mechanism
| Backend | Locking Mechanism | Note |
|---------|-------------------|------|
| **S3 + DynamoDB** | DynamoDB (ConditionalPut) | Most common, cheap, simple |
| **Terraform Cloud** | Built-in (API) | SaaS, audit logs, VCS integration |
| **Azure Storage** | Azure Blob Lease | Similar to S3 model |
| **GCS** | Lock object (`.tflock`) created with a does-not-exist precondition | Limited |
| **Consul** | Consul KV lock held by a session | High-availability |
| **PostgreSQL** | pg_advisory_lock / row lock | Custom backend |
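Most of these mechanisms reduce to the same idea: a write that succeeds only if no lock record exists yet. A minimal sketch of that idea follows, assuming an in-memory dictionary as a stand-in for the lock store. The `LockTable` class and its method names are hypothetical, not the API of DynamoDB or of Terraform.

```python
class LockTable:
    """In-memory stand-in for a store that supports conditional writes."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def put_if_absent(self, lock_id: str, owner: str) -> bool:
        # Mirrors a conditional put: succeed only if the key does not exist.
        if lock_id in self._items:
            return False
        self._items[lock_id] = owner
        return True

    def delete_if_owner(self, lock_id: str, owner: str) -> bool:
        # Only the current holder may release the lock.
        if self._items.get(lock_id) != owner:
            return False
        del self._items[lock_id]
        return True


table = LockTable()
first = table.put_if_absent("prod/terraform.tfstate", "ci-run-1")
second = table.put_if_absent("prod/terraform.tfstate", "ci-run-2")  # blocked
released = table.delete_if_owner("prod/terraform.tfstate", "ci-run-1")
third = table.put_if_absent("prod/terraform.tfstate", "ci-run-2")   # now free
```

A real backend performs the check and the write atomically on the server side; the sketch only shows the resulting behavior for two competing runs.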
#### State Backends Comparison
| Property | S3 + DynamoDB | Terraform Cloud | Consul |
|----------|---------------|----------------|--------|
| Cost | $ (S3 + DynamoDB) | $$ (free tier limited) | $$ (infra) |
| Team workflow | GitHub Actions + OIDC | Native RBAC, runs | Custom |
| Locking | DynamoDB | Built-in | Consul session |
| History | S3 versioning | Full history, diff | None |
| Remote ops | No (state only) | Yes (remote runs) | No |
| Encryption | SSE-S3/KMS | At rest + in transit | TLS |
#### Workspaces vs Terragrunt
| Aspect | Terraform Workspaces | Terragrunt |
|--------|---------------------|------------|
| **State separation** | One backend, key: `env:/workspace` | Separate backend per env |
| **Code reuse** | Same code, different variables | DRY configuration, modules |
| **Risk** | Accidentally `apply` to wrong workspace | Isolated backends |
| **When to use** | Simple projects, <5 envs | Microservices, multi-env, multi-team |
| **Extra features** | — | Dependency, include, before_hook |
#### Provider Versioning
```hcl
terraform {
  required_version = ">= 1.5, < 2.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.23"
    }
  }
}
```
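- `~> 5.0` — any 5.x release (`>= 5.0, < 6.0`); only the rightmost component given may increase
- `~> 5.0.0` — patch releases only (`>= 5.0.0, < 5.1.0`)
- `>= 2.23` — no upper bound, so a future 3.x also matches; use `>= 2.23, < 3.0` to stay on 2.x
- `~>` constraints guard against breaking changes by capping the major (or minor) version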
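The pessimistic operator is easier to reason about when written out as a range check. The sketch below is an illustration only: `parse` and `satisfies_pessimistic` are hypothetical helpers, they handle plain numeric versions with at least two components, and they ignore pre-release suffixes.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '5.0.1' into (5, 0, 1)."""
    return tuple(int(part) for part in version.split("."))


def satisfies_pessimistic(constraint: str, candidate: str) -> bool:
    """Check candidate against `~> constraint`.

    Only the rightmost component written in the constraint may increase,
    so the upper bound bumps the second-to-last component.
    """
    base = parse(constraint)
    if len(base) < 2:
        raise ValueError("this sketch expects at least MAJOR.MINOR")
    cand = parse(candidate)
    width = max(len(base), len(cand))
    base_padded = base + (0,) * (width - len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    upper = base[:-2] + (base[-2] + 1,)  # exclusive upper bound
    return cand_padded >= base_padded and cand_padded[:len(upper)] < upper


allowed_minor = satisfies_pessimistic("5.0", "5.7.2")     # within 5.x
blocked_major = satisfies_pessimistic("5.0", "6.0.0")     # next major
allowed_patch = satisfies_pessimistic("5.0.0", "5.0.9")   # within 5.0.x
blocked_minor = satisfies_pessimistic("5.0.0", "5.1.0")   # next minor
```

Writing the constraint with three components instead of two is what narrows the allowed range from minor releases to patch releases.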
### Terraform Workflow
```
terraform init      → Initialize the backend, download providers and modules
terraform plan      → Show changes
terraform apply     → Apply changes
terraform destroy   → Destroy infrastructure
terraform validate  → Validate syntax and internal consistency
terraform fmt       → Format HCL
```
### State Management
- Remote state (S3, Terraform Cloud, Azure Storage)
- State locking (DynamoDB, Consul)
- Workspaces for environment separation
### Terraform: Up and Running (3rd ed.) — Yevgeniy Brikman
Practical guide to Terraform from the founder of Gruntwork. The 3rd edition (2022) adds over 100 pages of new content, updates from Terraform 0.12 to 1.2, and two new chapters.
#### What's New in the 3rd Edition
| New Feature | Description |
|-------------|-------------|
| **Chapter: Secrets management** | Managing secrets with Terraform — Vault, AWS Secrets Manager, KMS, OIDC, `sensitive` variables |
| **Chapter: Multiple providers** | Working with multiple regions, accounts, clouds including Kubernetes (AWS EKS) |
| **Terraform 1.0+** | Backward compatibility promise, stability, HashiCorp IPO |
| **Provider versioning** | `required_providers` block + `terraform.lock.hcl` (lock file) |
| **Module iteration** | `count` and `for_each` on modules (since Terraform 0.13) |
| **Variable validation** | `validation {}` blocks, `precondition` / `postcondition` |
| **Refactoring** | `moved` blocks — safe refactoring without manual state manipulation |
| **CI/CD security** | OIDC authentication, isolated workers for `terraform apply` |
#### Secrets Management with Terraform
```hcl
# Variable marked as sensitive: redacted in plan/apply output (still stored in state)
variable "db_password" {
  type      = string
  sensitive = true
}

# Reading secrets from AWS Secrets Manager
data "aws_secretsmanager_secret" "db" {
  name = "production/db/master"
}

data "aws_secretsmanager_secret_version" "db" {
  secret_id = data.aws_secretsmanager_secret.db.id
}
```
**Recommended Security Hierarchy:**
1. **OIDC** — most secure, no creds on CI server (GitHub Actions → IAM role)
2. **IAM role** — instance profile (EC2, ECS, EKS)
3. **Environment variables** — limited, risk of log leakage
4. **Isolated workers** — separate worker with admin permissions that exposes only `plan`/`apply` through its API
#### Testing Terraform Code
| Layer | Tools | Description |
|-------|-------|-------------|
| **Static analysis** | `terraform validate`, `tflint`, `tfsec`, `checkov` | Code analysis without execution |
| **Plan testing** | `conftest` + OPA (Rego), `terraform plan` parse | Plan validation against policy |
| **Unit tests** | Terratest (Go), `terraform fmt`, `terraform validate` | Testing modules in isolation |
| **Integration tests** | Terratest (Go) | Actual provisioning + assert |
| **End-to-end tests** | Terratest | Full stack, smoke tests |
#### Policy Enforcement
```rego
# OPA / conftest — deny public S3 bucket
package main
deny[msg] {
resource := input.resource_changes[_]
resource.type == "aws_s3_bucket"
resource.change.after.acl == "public-read"
msg = sprintf("%s must not be public", [resource.address])
}
```
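For readers less familiar with Rego, the same rule can be expressed as a plain function over the JSON plan. This is a sketch under assumptions: the input is the structure produced by `terraform show -json`, the sample plan is made up, and `deny_public_buckets` is a hypothetical helper rather than part of conftest or OPA.

```python
import json


def deny_public_buckets(plan: dict) -> list[str]:
    """Return one message per S3 bucket whose planned ACL is public-read."""
    messages = []
    for change in plan.get("resource_changes", []):
        after = (change.get("change") or {}).get("after") or {}
        if change.get("type") == "aws_s3_bucket" and after.get("acl") == "public-read":
            messages.append(f"{change['address']} must not be public")
    return messages


# Made-up plan excerpt with one violating and one compliant bucket.
sample_plan = json.loads("""
{
  "resource_changes": [
    {"address": "aws_s3_bucket.assets", "type": "aws_s3_bucket",
     "change": {"after": {"acl": "public-read"}}},
    {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
     "change": {"after": {"acl": "private"}}}
  ]
}
""")

violations = deny_public_buckets(sample_plan)
```

A CI job would fail the pipeline when `violations` is non-empty, which is what conftest does with the `deny` rule.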
#### Production-grade Checklist by Brikman
1. **Small modules** — one module = one thing (single responsibility)
2. **Composable modules** — modules can be composed into larger units
3. **Testable modules** — each module has tests (Terratest)
4. **Releasable modules** — versioning (Git tags, Terraform Registry)
5. **Version control** — everything in git, including `.terraform.lock.hcl`
6. **Remote state** — S3 + DynamoDB or Terraform Cloud
7. **CI/CD pipeline** — `plan` on MR, `apply` after merge to main
8. **Secrets management** — no secrets in plaintext in code
9. **Policy as code** — OPA / Sentinel for compliance
10. **Sandbox environment** — each developer has their own isolated environment
#### Golden Rule of Terraform
> **The main branch of the live repository must always be a 1:1 representation of what is deployed in production.**
> Never run `terraform apply` manually locally on production — always via CI/CD.
## Dockerfile Best Practices
```dockerfile
# Multi-stage build
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
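# Install all dependencies; the build step usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only production modules reach the runtime stage
RUN npm prune --omit=dev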
# Runtime stage — distroless
FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER nonroot:nonroot
EXPOSE 3000
CMD ["dist/server.js"]
```
**Rules**:
- **Multi-stage build** — separate build tools from runtime
- **Distroless images** — minimal attack surface (no shell, package manager)
- **Non-root user** — USER nonroot (security best practice)
- **Layer caching** — copy less-frequently changing files first (package.json → npm ci → code)
- **Small base image** — Alpine (5 MB), distroless (minimal), scratch (Go static binary)
- **Healthcheck** — `HEALTHCHECK` instruction for Docker and Swarm; Kubernetes ignores it and uses liveness/readiness probes instead
- **Labels** — LABEL maintainer, version, git commit
- **.dockerignore** — minimize build context
## Artifact Management
### Docker Registries
| Registry | Public/Private | Cost | Integration |
|----------|---------------|------|-------------|
| **Docker Hub** | Both | Public free, private $5/month | GitHub Actions, GitLab |
| **ECR (AWS)** | Private | $0.10/GB/month + data transfer | IAM, ECS, EKS |
| **GHCR (GitHub)** | Both | Public free, private 500 MB free | GitHub Actions, npm |
| **GCR / Artifact Registry** | Private | $0.10/GB/month | GKE, Cloud Build |
| **ACR (Azure)** | Private | $0.11/GB/month | AKS, Azure DevOps |
| **Harbor** | Private (self-hosted) | Free (open source) | Custom, CNCF |
### Helm Charts
- **Repository** — index.yaml + chart .tgz on HTTP server (S3, GitHub Pages, ChartMuseum)
- **OCI registry** — Helm 3.8+ supports storing charts in OCI registries (ECR, GHCR, Harbor)
- **Versioning** — chart version (package) + app version (application)
### SBOM (Software Bill of Materials)
- **SPDX** / **CycloneDX** — standard SBOM formats
- Generation: Trivy, Syft; Grype scans the resulting SBOM for vulnerabilities
- Use case: supply chain security, compliance (EO 14028, EU CRA)
## Configuration and Secrets
| Tool | Description |
|------|-------------|
| Vault (HashiCorp) | Dynamic secrets, encryption-as-a-service |
| AWS Secrets Manager | Managed, auto-rotation |
| Azure Key Vault | Managed, HSM support |
| GCP Secret Manager | Managed |
| SOPS | Encryption in git repos |
| Sealed Secrets | Encrypted secrets for Kubernetes |
### Secret Management Workflows
**Vault Agent Injector** (Kubernetes)
- Sidecar container (vault-agent) injects secrets into the pod
- Secrets mounted as tmpfs volume (not into environment variables)
- Auto-rotation: vault-agent periodically refreshes secrets
**External Secrets Operator** (Kubernetes)
- CRD: `ExternalSecret` → creates `Secret` in K8s
- Backend: AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, Vault
- Poll-based refresh: the operator re-reads the external store every `refreshInterval` and updates the K8s `Secret`
**Sealed Secrets**
- `kubeseal` encrypts the Secret on the client with the controller's public key (only the in-cluster controller holds the private key)
- Encrypted manifest (SealedSecret) can be safely committed to git
- Controller decrypts on deploy
## GitOps
- **Principle**: Git is the single source of truth
- **Tools**: ArgoCD, Flux, Rancher Fleet
- Pull-based deploy — agent in the cluster watches repo and applies changes
- Auto-sync + drift detection
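Drift detection and auto-sync amount to comparing the state declared in Git with the state observed in the cluster. The sketch below assumes both states are flat dictionaries; the keys and the functions `detect_drift` and `reconcile` are hypothetical and much simpler than what ArgoCD or Flux actually do.

```python
def detect_drift(desired: dict, live: dict) -> dict:
    """Return {key: (desired_value, live_value)} for every difference."""
    drift = {}
    for key in desired.keys() | live.keys():
        if desired.get(key) != live.get(key):
            drift[key] = (desired.get(key), live.get(key))
    return drift


def reconcile(desired: dict, live: dict) -> dict:
    """Auto-sync: the live state is replaced by what Git declares."""
    return dict(desired)


desired = {"image": "app:1.4.0", "replicas": 3}             # from the Git repo
live = {"image": "app:1.4.0", "replicas": 5, "debug": True}  # manual changes

drift = detect_drift(desired, live)
synced = reconcile(desired, live)
```

After reconciliation the manual change to `replicas` and the extra `debug` key are gone, because Git is treated as the only source of truth.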
## Environment Promotion (dev → staging → prod)
```
Code → Dev (auto-deploy) → Staging (auto + smoke tests) → Prod (manual approval + gating)
```
**Quality Gates**:
1. **Unit tests** — pass rate 100 %, code coverage ≥ 80 %
2. **Integration tests** — all critical paths pass
3. **SAST scan** — no critical/high vulnerabilities
4. **SCA scan** — no known critical CVEs
5. **Container scan** — all fixable vulns addressed
6. **Smoke tests** — after staging deploy (health endpoint, basic flow)
7. **Manual approval** — for production (skipped under continuous deployment)
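The automated gates in this list can be evaluated by a single function that reports which ones failed. This is a sketch only: the `BuildReport` fields and the `failed_gates` helper are hypothetical, and the thresholds are the ones quoted in the list.

```python
from dataclasses import dataclass


@dataclass
class BuildReport:
    unit_pass_rate: float          # 0.0 to 1.0
    coverage: float                # 0.0 to 1.0
    sast_critical_or_high: int     # findings from the SAST scan
    sca_critical_cves: int         # findings from the SCA scan
    smoke_tests_passed: bool


def failed_gates(report: BuildReport) -> list[str]:
    """Return the names of the gates that block promotion."""
    checks = {
        "unit tests": report.unit_pass_rate == 1.0,
        "coverage": report.coverage >= 0.80,
        "SAST": report.sast_critical_or_high == 0,
        "SCA": report.sca_critical_cves == 0,
        "smoke tests": report.smoke_tests_passed,
    }
    return [name for name, passed in checks.items() if not passed]


good = BuildReport(1.0, 0.85, 0, 0, True)
bad = BuildReport(1.0, 0.72, 1, 0, True)
```

Promotion to the next environment proceeds only when the returned list is empty; otherwise the list doubles as the failure message.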
## Deployment Strategies
| Strategy | Description | Risk |
|----------|-------------|------|
| **Rolling update** | Gradual instance replacement | Low |
| **Blue/Green** | Two identical environments, traffic switch | Medium |
| **Canary** | % traffic to new version, gradual increase | Low |
| **Feature flag** | Toggle feature on/off without deploy | Very low |
| **A/B testing** | Different versions for different users | Low |
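A canary rollout from the table can be sketched as a loop over traffic percentages that aborts at the first unhealthy step. The step list, the 1 % error budget and the `error_rate_at` callback are assumptions made for illustration; real tools read these signals from monitoring.

```python
from typing import Callable


def canary_rollout(
    steps: list[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> tuple[str, int]:
    """Increase traffic step by step; roll back on the first bad reading."""
    for percent in steps:
        if error_rate_at(percent) > max_error_rate:
            return ("rolled_back", percent)
    return ("promoted", steps[-1])


steps = [5, 25, 50, 100]

# Hypothetical health signals: one healthy release, one that degrades at 25 %.
healthy = canary_rollout(steps, lambda percent: 0.001)
degraded = canary_rollout(steps, lambda percent: 0.05 if percent >= 25 else 0.001)
```

The degraded release never reaches more than a quarter of the traffic, which is why the table rates the strategy as low risk.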
## Git Branching Strategies
| Strategy | Description | Suitable For |
|----------|-------------|--------------|
| **Trunk-based** | Single main branch, short feature branches (< 1 day) | CD, microservices, mature teams |
| **GitHub Flow** | Main + feature branches, PRs, simple | Startups, web apps |
| **GitLab Flow** | Main + environment branches (staging, prod) + feature branches | Enterprise, regulated |
| **GitFlow** | Develop + main + feature/release/hotfix branches | Release-based, enterprise legacy |
| **One Flow** | Simplified GitFlow (no develop branch) | Medium teams |
## Rollback Strategies
| Strategy | Description | Speed | Risk |
|----------|-------------|-------|------|
| **Forward fix** | New deploy with hotfix | Slow (build + deploy) | Low |
| **Rollback (revert commit)** | Git revert, new deploy | Medium | Low |
| **Blue/Green switchback** | Switch back to old version | Instant | DB incompatibility |
| **Database rollback** | Revert DB migration (migrate down) | Slow | Data loss risk |
### Database Rollback Challenges
- **Breaking changes** — removing a column/table means rollback problem (data lost)
- **Best practice**: Expand → Migrate → Contract (never remove in a single deploy)
- **Tooling**: Flyway undo (limited), Liquibase rollback, pgroll (Postgres)
- **Feature flags** as prevention — new code is behind a flag, rollback = disable flag
## CI/CD Design Patterns
Modern CI/CD pipelines solve recurring problems using design patterns:
| Pattern | Description |
|---------|-------------|
| **Pipeline as Code** | Pipeline defined in YAML/Kotlin DSL (`.gitlab-ci.yml`, `.github/workflows/`) |
| **Immutable Pipeline** | Each build is an artifact, never changed |
| **Quality Gate** | Branch protection, required checks, code coverage threshold |
| **Deployment Strategy** | Blue/Green, Canary, Rolling (see the Deployment Strategies table) |
| **GitOps** | Pull-based deploy with auto-sync and drift detection |
| **Shift-Left Security** | SAST/DAST/SCA part of the pipeline |
| **Dependency Caching** | Cache layer between pipeline runs |
## Shift Left Security
### SCA (Software Composition Analysis)
| Tool | Type | Integration |
|------|------|-------------|
| **Dependabot** | GitHub native | GitHub, auto-PR for fix |
| **Renovate** | Multi-platform | GitHub, GitLab, Bitbucket |
| **Snyk** | SaaS + CLI | All platforms, Docker, IaC |
| **Trivy** | CLI, OSS | CI/CD pipeline (GitHub Actions, GitLab) |
### SAST (Static Application Security Testing)
| Tool | Languages | Characteristics |
|------|-----------|----------------|
| **Semgrep** | 30+ (Python, Java, Go, JS/TS) | Fast, custom rules, CI-native |
| **SonarQube** | 30+ | Comprehensive, quality gates, tech debt |
| **CodeQL** | 12 (C++, C#, Go, Java, JS/TS, Python) | GitHub native, query-based |
| **Checkmarx** | 30+ | Enterprise, CxSAST, CxFlow |
| **Fortify** | 30+ | Enterprise, SAST + DAST |
### Container Scanning
| Tool | Description |
|------|-------------|
| **Trivy** | OSS, scans OS packages + language-specific + IaC |
| **Grype** | OSS, from Anchore, fast, pairs with Syft for SBOM generation |
| **Clair** | Red Hat, OSS, OCI-compatible |
| **Docker Scout** | Docker Desktop / CLI, integration with Docker Hub |
## AI-Native Software Delivery (2025-2026)
AI is reshaping software delivery (sometimes called DevOps 2.0):
- **AI-assisted CI/CD** — automatic pipeline failure diagnosis, resource allocation optimization
- **Agent Control Protocol (ACP)** / **Model Context Protocol (MCP)** — standards for AI agent interaction with tooling
- **AI-driven cost management** — FinOps cloud optimization
- **Intelligent test selection** — ML determines which tests to run based on code changes
- **Self-healing pipelines** — AI auto-detects and fixes common issues
New tools: Harness (AI-native CD), GitLab 19.0 (agentic MR workflows, secrets manager), Octopus Deploy.
## Pipeline Tools
- **GitHub Actions** — integrated with GitHub, large marketplace
- **GitLab CI** — native in GitLab, auto DevOps
- **Jenkins** — oldest, extensible, self-hosted
- **CircleCI** — SaaS, fast
- **Argo Workflows** — Kubernetes native
- **Buildkite** — hybrid (own agents, SaaS orchestrator)
## Best Practices
- **Idempotent pipeline** — repeated runs give the same result
- **Immutable infrastructure** — never modify a running server, always redeploy
- **Shift left** — tests and security as early as possible in the pipeline
- **Artifact management** — all builds versioned in registry (Docker Hub, ECR, GHCR)
- **Dependency caching** — speed up pipeline (npm ci, pip cache, Docker layer caching)
- **Fail fast** — pipeline fails as early as possible on error
- **Quality gates** — automated checks before every promotion to the next environment
- **Pipeline visibility** — dashboard with current status of all pipelines (GitHub, GitLab, ArgoCD)
## Resources
Links, books and standards: [sources/cicd/sources.md](sources/cicd/sources.md)
### Recommended Reading
| Book | Authors | ISBN | Key Contribution |
|------|---------|------|-----------------|
| The DevOps Handbook | Kim, Humble, Debois, Willis | 978-1942788003 | CALMS principles (Culture, Automation, Lean, Measurement, Sharing), flow map, deployment pipeline |
| Continuous Delivery | Humble, Farley | 978-0321601912 | Deployment pipeline, commit stage, acceptance tests, capacity testing, zero-downtime release |
| CI/CD Design Patterns | Bajpai, Schildmeijer, Piwosz, Mishra | 978-1-83588-965-7 | 30+ design patterns for CI/CD — pipeline patterns, GitOps, security, testing, deployment strategies |
| DevOps Frameworks, Techniques, and Tools | Vijayakumaran, Kofler, Öggl, Springer | 978-1-4932-2670-2 | Framework for DevOps adoption, tool comparison (Jenkins vs GitLab vs GitHub Actions), techniques for monitoring and observability |
## OpenStack CI/CD
OpenStack ecosystem uses its own CI/CD tools:
### Zuul
- CI/CD system developed by the OpenStack community (now standalone, used outside OpenStack)
- **Gating** — changes are tested before merge (not after merge) — prevents breaking main branch
- **Ansible-based** — jobs are Ansible playbooks
- **Nodepool** — dynamic test VM allocation in the cloud (OpenStack, AWS)
- **Pipeline** — check, gate, post, periodic, tag, release
### OpenStack Infra (OpenDev)
- Public CI infrastructure for OpenStack projects
- Tools: Gerrit (code review), Zuul (CI), Nodepool (test nodes), Storyboard (issue tracking)
- Base jobs: tempest (integration tests), grenade (upgrade tests), devstack-gate (gate tests)
### Integration with External Tools
- **Terraform** — OpenStack provider for provisioning (terraform-provider-openstack)
- **Ansible** — openstack.cloud collection for managing OpenStack resources
- **Packer** — build OpenStack images (openstack builder)
- **Jenkins** — older CI, still used in some distributions
*Last revised: 2026-06-03*

CICD.md Normal file

@@ -0,0 +1,679 @@
# 🔄 CI/CD and DevOps
## CI/CD Pipeline
```
Code Commit → Build → Test → Package → Deploy to Staging → Integration Tests → Deploy to Production
```
### Detailed Pipeline Stages
```
1. Checkout ──→ 2. Lint ──→ 3. Test ──→ 4. Build ──→ 5. Scan ──→ 6. Publish ──→ 7. Deploy
                    │           │                        │
                 ESLint/   Unit/Integ/               SAST/SCA/
                 Prettier  e2e tests                 Container scan
```
| Stage | Tools | What Happens |
|-------|-------|--------------|
| **Checkout** | git clone, fetch | Fetch code from the repository, including submodules |
| **Lint** | ESLint, Prettier, RuboCop, golangci-lint | Static code analysis, formatting |
| **Test (unit)** | Jest, pytest, JUnit | Fast tests (milliseconds to seconds), no dependencies |
| **Test (integration)** | Testcontainers, Docker Compose | Tests with a DB, message queue, external services |
| **Test (e2e)** | Playwright, Cypress, Selenium | Full-stack tests in the browser |
| **Build** | Docker build, go build, npm run build, Maven | Compilation, artifact assembly |
| **Scan (SAST)** | Semgrep, SonarQube, CodeQL | Static security analysis |
| **Scan (DAST)** | OWASP ZAP, Burp Suite | Dynamic analysis (running application) |
| **Scan (SCA)** | Dependabot, Snyk, Trivy | Dependency and CVE analysis |
| **Publish** | Docker push, npm publish, Maven deploy | Upload the artifact to a registry |
| **Deploy** | ArgoCD, Terraform, Helm, kubectl | Deployment to the target environment |
### Continuous Integration (CI)
- Automatic build and tests on every commit
- Fast feedback loop (< 10 min)
- Linting, type checking, unit tests, security scan (SAST)
### Continuous Delivery (CD)
- Automatic deploys to staging / test environments
- Manual approval for production (optional)
- Smoke tests after deploy
### Continuous Deployment
- Fully automatic deploy to production
- Requires high confidence in tests and monitoring
- Feature flags for risk control
## GitHub Actions detail
### Workflow syntax
```yaml
name: CI Pipeline
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
env:
  NODE_VERSION: "22"
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
      - run: npm ci
      - run: npm run lint
  test:
    runs-on: ubuntu-latest
    needs: lint
    strategy:
      matrix:
        node-version: [22, 24]
    steps:
      - uses: actions/checkout@v4
      # Install the Node.js version selected by the matrix
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - name: Run tests
        run: npm test
```
### Matrix builds
- Runs the same jobs with different parameters (OS, language version, architecture)
- `strategy.matrix` — combinations of parameters (Cartesian product)
- `strategy.fail-fast` — cancels the remaining matrix jobs when one fails
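The Cartesian product mentioned above can be made concrete with a short sketch. The `expand_matrix` function and the two-axis matrix are assumptions for illustration; GitHub Actions additionally supports `include` and `exclude`, which are not modeled here.

```python
from itertools import product


def expand_matrix(matrix: dict[str, list]) -> list[dict]:
    """Expand a matrix definition into one parameter set per job."""
    keys = list(matrix)
    return [dict(zip(keys, combo))
            for combo in product(*(matrix[key] for key in keys))]


jobs = expand_matrix({
    "os": ["ubuntu-latest", "windows-latest"],
    "node-version": [22, 24],
})
```

Two operating systems and two Node.js versions yield four jobs, so every added axis multiplies the number of runner minutes consumed.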
### Reusable workflows
```yaml
# .github/workflows/deploy.yml (called workflow; its jobs are omitted here)
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
    secrets:
      cloud_role:
        required: true

# Calling it from the caller workflow (a separate file)
jobs:
  deploy:
    uses: ./.github/workflows/deploy.yml
    with:
      environment: staging
    secrets:
      cloud_role: ${{ secrets.STAGING_ROLE }}
```
### Composite actions
- Custom actions without the need for a separate repository
- Combination of `run`, `uses`, `shell` steps
- Use case: standardizing lint/test/build across repositories
### Self-hosted runners
- Own infrastructure for running GitHub Actions
- Use case: private network, GPU, specific HW, compliance
- Scaling: actions-runner-controller (Kubernetes), auto-scaling groups
- Security: job isolation, ephemeral runners
## GitLab CI detail
```yaml
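stages:
  - lint
  - test
  - build
  - deploy
variables:
  DOCKER_IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
lint:
  stage: lint
  image: node:22
  script:
    - npm ci
    - npm run lint
test:
  stage: test
  image: node:22
  needs: ["lint"]
  script:
    - npm ci
    - npm test
  artifacts:
    paths:
      - coverage/
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
# Assumed build job (not in the original) so that `needs: ["build"]` below resolves
build:
  stage: build
  image: docker:27
  services:
    - docker:27-dind
  needs: ["test"]
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t "$DOCKER_IMAGE" .
    - docker push "$DOCKER_IMAGE"
deploy-staging:
  stage: deploy
  needs: ["build"]
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - kubectl set image deployment/app app=$DOCKER_IMAGE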
```
**Concepts**:
- **Stages** — sequential phases (each stage can run multiple jobs in parallel)
- **Rules** — conditions for running a job (branch, tag, changes, variables) — replaces `only/except`
- **Needs** — DAG dependencies (a job does not have to wait for the whole stage)
- **Artifacts** — passing files between jobs (binaries, reports, cache)
- **Environments** — deployment tracking (rollback, history, approvals)
### DAG pipelines (needs)
```
lint ──→ test ──→ build ──→ deploy-staging ──→ deploy-prod
build-arm ──→ test-arm
```
- Defines dependencies between jobs (not necessarily stages)
- Allows independent jobs to run in parallel
- Reduces total pipeline time
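How `needs` shortens a pipeline can be shown by grouping jobs into batches that may run at the same time. The sketch uses Python's standard `graphlib`; the job graph is an assumption read from the diagram above, with `build-arm` and `test-arm` treated as an independent branch.

```python
from graphlib import TopologicalSorter


def parallel_batches(needs: dict[str, list[str]]) -> list[list[str]]:
    """Group jobs into batches; every job in a batch has all its needs met."""
    sorter = TopologicalSorter(needs)  # maps job -> jobs it depends on
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Assumed job graph: the ARM branch does not depend on the main chain.
needs = {
    "lint": [],
    "test": ["lint"],
    "build": ["test"],
    "deploy-staging": ["build"],
    "deploy-prod": ["deploy-staging"],
    "build-arm": [],
    "test-arm": ["build-arm"],
}

batches = parallel_batches(needs)
```

Seven jobs finish in five batches because the ARM branch runs alongside `lint` and `test` instead of waiting for a whole stage to complete.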
## Infrastructure as Code (IaC)
| Tool | Type | Language |
|------|------|----------|
| Terraform | Declarative | HCL |
| OpenTofu | Declarative | HCL (Terraform fork) |
| Pulumi | Declarative | TypeScript, Python, Go, C# |
| AWS CDK | Declarative | TypeScript, Python, Java, C# |
| CloudFormation | Declarative | YAML/JSON (AWS) |
| Azure ARM/Bicep | Declarative | Bicep, JSON |
| Ansible | Imperative/Config | YAML |
| Chef/Puppet | Config mgmt | Ruby DSL |
### Infrastructure as Code (2nd ed.) — Kief Morris
Key reference for designing and operating dynamic cloud infrastructure with IaC. The book is tool-agnostic: it focuses on patterns and practices, not on specific tools.
#### Three Core Practices
| Practice | Description |
|----------|-------------|
| **Define everything as code** | All infrastructure defined in code, version control, repeatability |
| **Continuously test and deliver** | Every change goes through a pipeline with automated tests |
| **Small, independent pieces** | Small, loosely coupled components — easier change and testing |
#### Principles of Cloud Infrastructure
- **Systems reproducible** — infrastructure can be recreated from code at any time
- **Systems disposable** — instances can be destroyed and recreated
- **Systems consistent** — all environments identical (no snowflake servers)
- **Processes repeatable** — automation instead of manual procedures
- **Design always changing** — infrastructure is constantly evolving (not build-and-forget)
#### Anti-patterns (Pitfalls)
| Anti-pattern | Description |
|--------------|-------------|
| **Snowflake server** | Each server different, cannot reproduce |
| **Configuration drift** | Manual changes → deviations from defined state |
| **Server sprawl** | Too many servers without management |
| **Fragile infrastructure** | Changes often break the system |
| **Automation fear** | Fear of automation → manual interventions |
#### Book Structure (4 Parts)
1. **Foundations** — framework of tools and technologies for cloud platforms
2. **Working with infrastructure stacks** — defining, provisioning, testing and CD of infrastructure changes
3. **Working with servers and application runtime platforms** — provisioning and configuring servers and clusters
4. **Working with large systems and teams** — workflow, governance, architectural patterns for multiple teams
#### IaC Code Organization
| Pattern | Description |
|---------|-------------|
| **Monorepo** | One repository for everything — build-time integration, suitable for small teams |
| **Microrepo** | Separate repository for each project — isolation, suitable for large teams |
| **Domain organization** | Organizing code by domain concepts (not by technology) |
**Recommendations:**
- Infrastructure and applications can be in the same or separate repository depending on organizational structure (Team Topologies)
- Per-environment configuration files (test, staging, production) stored within the project
- Tests belong to the project, integration tests can be in a separate project
- Infrastructure code should not directly deploy applications — use OS packaging (RPM, deb)
#### Expand-Contract Pattern for Infrastructure Changes
Same principle as database migrations:
1. **Expand** — add new resource (old version still running)
2. **Migrate** — move traffic / dependencies to the new resource
3. **Contract** — remove old resource
Prevents outages when refactoring infrastructure.
## Terraform detail
#### State Locking Mechanism
| Backend | Locking Mechanism | Note |
|---------|-------------------|------|
| **S3 + DynamoDB** | DynamoDB (ConditionalPut) | Most common, cheap, simple |
| **Terraform Cloud** | Built-in (API) | SaaS, audit logs, VCS integration |
| **Azure Storage** | Azure Blob Lease | Similar to S3 model |
| **GCS** | Lock object (`.tflock`) created with a does-not-exist precondition | Limited |
| **Consul** | Consul KV lock held by a session | High-availability |
| **PostgreSQL** | pg_advisory_lock / row lock | Custom backend |
#### State Backends Comparison
| Property | S3 + DynamoDB | Terraform Cloud | Consul |
|----------|---------------|----------------|--------|
| Cost | $ (S3 + DynamoDB) | $$ (free tier limited) | $$ (infra) |
| Team workflow | GitHub Actions + OIDC | Native RBAC, runs | Custom |
| Locking | DynamoDB | Built-in | Consul session |
| History | S3 versioning | Full history, diff | None |
| Remote ops | No (state only) | Yes (remote runs) | No |
| Encryption | SSE-S3/KMS | At rest + in transit | TLS |
#### Workspaces vs Terragrunt
| Aspect | Terraform Workspaces | Terragrunt |
|--------|---------------------|------------|
| **State separation** | One backend, key: `env:/workspace` | Separate backend per env |
| **Code reuse** | Same code, different variables | DRY configuration, modules |
| **Risk** | Accidentally `apply` to wrong workspace | Isolated backends |
| **When to use** | Simple projects, <5 envs | Microservices, multi-env, multi-team |
| **Extra features** | — | Dependency, include, before_hook |
#### Provider versioning
```hcl
terraform {
  required_version = ">= 1.5, < 2.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.23"
    }
  }
}
```
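- `~> 5.0` — any 5.x release (`>= 5.0, < 6.0`); only the rightmost component given may increase
- `~> 5.0.0` — patch releases only (`>= 5.0.0, < 5.1.0`)
- `>= 2.23` — no upper bound, so a future 3.x also matches; use `>= 2.23, < 3.0` to stay on 2.x
- `~>` constraints guard against breaking changes by capping the major (or minor) version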
### Terraform workflow
```
terraform init      → Initialize the backend, download providers and modules
terraform plan      → Show changes
terraform apply     → Apply changes
terraform destroy   → Destroy infrastructure
terraform validate  → Validate syntax and internal consistency
terraform fmt       → Format HCL
```
### State management
- Remote state (S3, Terraform Cloud, Azure Storage)
- State locking (DynamoDB, Consul)
- Workspaces for environment separation
### Terraform: Up and Running (3rd ed.) — Yevgeniy Brikman
Practical guide to Terraform from the founder of Gruntwork. The 3rd edition (2022) adds over 100 pages of new content, updates from Terraform 0.12 to 1.2, and two new chapters.
#### What's New in the 3rd Edition
| New Feature | Description |
|-------------|-------------|
| **Chapter: Secrets management** | Managing secrets with Terraform — Vault, AWS Secrets Manager, KMS, OIDC, `sensitive` variables |
| **Chapter: Multiple providers** | Working with multiple regions, accounts, clouds including Kubernetes (AWS EKS) |
| **Terraform 1.0+** | Backward compatibility promise, stability, HashiCorp IPO |
| **Provider versioning** | `required_providers` block + `terraform.lock.hcl` (lock file) |
| **Module iteration** | `count` and `for_each` on modules (since Terraform 0.13) |
| **Variable validation** | `validation {}` blocks, `precondition` / `postcondition` |
| **Refactoring** | `moved` blocks — safe refactoring without manual state manipulation |
| **CI/CD security** | OIDC authentication, isolated workers for `terraform apply` |
#### Secrets Management with Terraform
```hcl
# Variable marked as sensitive: redacted in plan/apply output (still stored in state)
variable "db_password" {
  type      = string
  sensitive = true
}

# Reading secrets from AWS Secrets Manager
data "aws_secretsmanager_secret" "db" {
  name = "production/db/master"
}

data "aws_secretsmanager_secret_version" "db" {
  secret_id = data.aws_secretsmanager_secret.db.id
}
```
**Recommended Security Hierarchy:**
1. **OIDC** — most secure, no creds on CI server (GitHub Actions → IAM role)
2. **IAM role** — instance profile (EC2, ECS, EKS)
3. **Environment variables** — limited, risk of log leakage
4. **Isolated workers** — separate worker with admin permissions that exposes only `plan`/`apply` through its API
#### Testing Terraform Code
| Layer | Tools | Description |
|-------|-------|-------------|
| **Static analysis** | `terraform validate`, `tflint`, `tfsec`, `checkov` | Code analysis without execution |
| **Plan testing** | `conftest` + OPA (Rego), `terraform plan` parse | Plan validation against policy |
| **Unit tests** | Terratest (Go), `terraform fmt`, `terraform validate` | Testing modules in isolation |
| **Integration tests** | Terratest (Go) | Actual provisioning + assert |
| **End-to-end tests** | Terratest | Full stack, smoke tests |
#### Policy enforcement
```rego
# OPA / conftest — deny public S3 bucket
package main
deny[msg] {
resource := input.resource_changes[_]
resource.type == "aws_s3_bucket"
resource.change.after.acl == "public-read"
msg = sprintf("%s must not be public", [resource.address])
}
```
#### Production-grade Checklist by Brikman
1. **Small modules** — one module = one thing (single responsibility)
2. **Composable modules** — modules can be composed into larger units
3. **Testable modules** — each module has tests (Terratest)
4. **Releasable modules** — versioning (Git tags, Terraform Registry)
5. **Version control** — everything in git, including `.terraform.lock.hcl`
6. **Remote state** — S3 + DynamoDB or Terraform Cloud
7. **CI/CD pipeline** — `plan` on MR, `apply` after merge to main
8. **Secrets management** — no secrets in plaintext in code
9. **Policy as code** — OPA / Sentinel for compliance
10. **Sandbox environment** — each developer has their own isolated environment
#### Golden Rule of Terraform
> **The main branch of the live repository must always be a 1:1 representation of what is deployed in production.**
> Never run `terraform apply` manually locally on production — always via CI/CD.
## Dockerfile best practices
```dockerfile
# Multi-stage build
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
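# Install all dependencies; the build step usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only production modules reach the runtime stage
RUN npm prune --omit=dev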
# Runtime stage — distroless
FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER nonroot:nonroot
EXPOSE 3000
CMD ["dist/server.js"]
```
**Rules**:
- **Multi-stage build** — separate build tools from runtime
- **Distroless images** — minimal attack surface (no shell, package manager)
- **Non-root user** — USER nonroot (security best practice)
- **Layer caching** — copy less-frequently changing files first (package.json → npm ci → code)
- **Small base image** — Alpine (5 MB), distroless (minimal), scratch (Go static binary)
- **Healthcheck** — `HEALTHCHECK` instruction for Docker and Swarm; Kubernetes ignores it and uses liveness/readiness probes instead
- **Labels** — LABEL maintainer, version, git commit
- **.dockerignore** — minimize build context
## Artifact management
### Docker registries
| Registry | Public/Private | Price | Integration |
|----------|---------------|------|-----------|
| **Docker Hub** | Both | Public free, private $5/month | GitHub Actions, GitLab |
| **ECR (AWS)** | Private | $0.10/GB/month + data transfer | IAM, ECS, EKS |
| **GHCR (GitHub)** | Both | Public free, private 500 MB free | GitHub Actions, npm |
| **GCR / Artifact Registry** | Private | $0.10/GB/month | GKE, Cloud Build |
| **ACR (Azure)** | Private | $0.11/GB/month | AKS, Azure DevOps |
| **Harbor** | Private (self-hosted) | Free (open source) | Self-hosted, CNCF |
### Helm charts
- **Repository** — index.yaml + chart .tgz on an HTTP server (S3, GitHub Pages, ChartMuseum)
- **OCI registry** — Helm 3.8+ supports storing charts in OCI registries (ECR, GHCR, Harbor)
- **Versioning** — chart version (the package) + app version (the application)
### SBOM (Software Bill of Materials)
- **SPDX** / **CycloneDX** — standard SBOM formats
- Generation: Trivy, Syft (Grype then scans the resulting SBOM for vulnerabilities)
- Use case: supply chain security, compliance (EO 14028, EU CRA)
## Configuration and secrets
| Tool | Description |
|---------|-------|
| Vault (HashiCorp) | Dynamic secrets, encryption-as-a-service |
| AWS Secrets Manager | Managed, auto-rotation |
| Azure Key Vault | Managed, HSM support |
| GCP Secret Manager | Managed |
| SOPS | Encryption in Git repos |
| Sealed Secrets | Encrypted secrets for Kubernetes |
### Secret management workflows
**Vault agent injector** (Kubernetes)
- A sidecar container (vault-agent) injects secrets into the pod
- Secrets are mounted as a tmpfs volume (not exposed as environment variables)
- Auto-rotation: vault-agent periodically refreshes secrets
**External Secrets Operator** (Kubernetes)
- CRD: `ExternalSecret` → creates a `Secret` in K8s
- Backend: AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, Vault
- Periodic refresh: the operator polls the external store every `refreshInterval` and propagates changes to K8s
**Sealed Secrets**
- `kubeseal` encrypts the Secret on the client with the controller's public key (only the in-cluster controller holds the private key)
- The encrypted manifest (SealedSecret) can safely live in Git
- The controller decrypts it on deploy
## GitOps
- **Principle**: Git is the single source of truth
- **Tools**: ArgoCD, Flux, Rancher Fleet
- Pull-based deploy — an agent in the cluster watches the repo and applies changes
- Auto-sync + drift detection
## Environment promotion (dev → staging → prod)
```
Code → Dev (auto-deploy) → Staging (auto + smoke tests) → Prod (manual approval + gating)
```
**Quality gates**:
1. **Unit tests** — pass rate 100 %, code coverage ≥ 80 %
2. **Integration tests** — all critical paths pass
3. **SAST scan** — no critical/high vulnerabilities
4. **SCA scan** — no known critical CVEs
5. **Container scan** — all fixable vulns addressed
6. **Smoke tests** — after deploy to staging (health endpoint, basic flow)
7. **Manual approval** — for production (optional with continuous deployment)
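The automated gates in this list can be sketched as a single promotion check. This is a minimal illustration, not a real CI integration: `PipelineReport` and its field names are hypothetical, and the thresholds are the ones quoted in the list (100 % pass rate, coverage of at least 80 %).

```python
from dataclasses import dataclass


@dataclass
class PipelineReport:
    """Hypothetical summary a CI job might assemble from tool outputs."""
    unit_pass_rate: float          # 0.0 to 1.0
    coverage: float                # 0.0 to 1.0
    critical_paths_passed: bool
    sast_critical_or_high: int
    sca_critical_cves: int
    fixable_container_vulns: int


def failed_gates(report: PipelineReport) -> list[str]:
    """Return the names of gates that block promotion (empty list = promote)."""
    checks = {
        "unit tests": report.unit_pass_rate == 1.0 and report.coverage >= 0.80,
        "integration tests": report.critical_paths_passed,
        "SAST scan": report.sast_critical_or_high == 0,
        "SCA scan": report.sca_critical_cves == 0,
        "container scan": report.fixable_container_vulns == 0,
    }
    return [name for name, ok in checks.items() if not ok]


good = PipelineReport(1.0, 0.85, True, 0, 0, 0)
bad = PipelineReport(1.0, 0.72, True, 2, 0, 0)
print(failed_gates(good))  # []
print(failed_gates(bad))   # ['unit tests', 'SAST scan']
```

Returning the list of failed gates rather than a bare boolean lets the pipeline log exactly why a promotion was blocked.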
## Deployment strategies
| Strategy | Description | Risk |
|-----------|-------|--------|
| **Rolling update** | Instances are replaced gradually | Low |
| **Blue/Green** | Two identical environments, traffic is switched over | Medium |
| **Canary** | A percentage of traffic goes to the new version and is increased step by step | Low |
| **Feature flag** | Turn a feature on or off without a deploy | Very low |
| **A/B testing** | Different versions for different users | Low |
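One common way to implement the canary split is to hash a stable request attribute into a bucket. The sketch below assumes routing by user ID; `routes_to_canary` is a hypothetical helper, and real setups usually do this in a load balancer or service mesh rather than in application code.

```python
import hashlib


def routes_to_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically place a user in the canary cohort.

    Hashing the user ID keeps each user on the same version between
    requests, and raising canary_percent only ever adds users to the cohort.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < canary_percent


users = [f"user-{i}" for i in range(1000)]
for percent in (0, 5, 25, 100):
    share = sum(routes_to_canary(u, percent) for u in users) / len(users)
    print(f"{percent:>3} % target -> {share:.1%} of sample users on canary")
```

Because the bucket depends only on the user ID, a user who saw the new version at 5 % still sees it at 25 %, which avoids flip-flopping between versions during the rollout.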
## Git branching strategies
| Strategy | Description | Suited to |
|-----------|-------|-----------|
| **Trunk-based** | One main branch (main), short-lived feature branches (< 1 day) | CD, microservices, mature teams |
| **GitHub Flow** | Main + feature branches, PRs, simple | Startups, web apps |
| **GitLab Flow** | Main + environment branches (staging, prod) + feature branches | Enterprise, regulated |
| **GitFlow** | Develop + main + feature/release/hotfix branches | Release-based, enterprise legacy |
| **One Flow** | Simplified GitFlow (no develop branch) | Mid-sized teams |
## Rollback strategies
| Strategy | Description | Speed | Risk |
|-----------|-------|----------|--------|
| **Forward fix** | New deploy with a hotfix | Slow (build + deploy) | Low |
| **Rollback (revert commit)** | Git revert, new deploy | Medium | Low |
| **Blue/Green switchback** | Switch traffic back to the old version | Immediate | DB incompatibility |
| **Database rollback** | Revert the DB migration (migrate down) | Slow | Data loss risk |
### Database rollback challenges
- **Breaking changes** — dropping a column or table makes rollback a problem (the data is gone)
- **Best practice**: Expand → Migrate → Contract (never remove the old schema in the same deploy that introduces the new one)
- **Tooling**: Flyway undo (limited), Liquibase rollback, pgroll (Postgres)
- **Feature flags** as prevention — new code sits behind a flag, rollback = turning the flag off
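The Expand → Migrate → Contract sequence can be walked through on an in-memory SQLite database. This is only a sketch: the `users` table and the `fullname` to `display_name` rename are invented for illustration, and in practice each step would ship as its own deploy with a migration tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")

# Deploy 1, expand: add the new column; old code keeps using fullname.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# Deploy 2, migrate: backfill existing rows; new code writes both columns.
conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")

# Deploy 3, contract: only once nothing reads fullname any more.
# The table is rebuilt instead of using DROP COLUMN so the sketch
# also runs on older SQLite versions.
conn.executescript("""
    CREATE TABLE users_new (id INTEGER PRIMARY KEY, display_name TEXT);
    INSERT INTO users_new (id, display_name) SELECT id, display_name FROM users;
    DROP TABLE users;
    ALTER TABLE users_new RENAME TO users;
""")

columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(columns)  # ['id', 'display_name']
print(conn.execute("SELECT display_name FROM users").fetchone()[0])  # Ada Lovelace
```

Rolling back after deploy 1 or 2 loses nothing, because the old column is still populated; only the contract step is irreversible, which is why it goes last and alone.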
## CI/CD Design Patterns
Modern CI/CD pipelines address recurring problems with design patterns:
| Pattern | Description |
|------|-------|
| **Pipeline as Code** | Pipeline defined in YAML/Kotlin DSL (`.gitlab-ci.yml`, `.github/workflows/`) |
| **Immutable Pipeline** | Every build is an artifact that never changes |
| **Quality Gate** | Branch protection, required checks, code coverage threshold |
| **Deployment Strategy** | Blue/Green, Canary, Rolling (see the deployment strategies table above) |
| **GitOps** | Pull-based deploy with auto-sync and drift detection |
| **Shift-Left Security** | SAST/DAST/SCA as part of the pipeline |
| **Dependency Caching** | Cache layer between pipeline runs |
## Shift left security
### SCA (Software Composition Analysis)
| Tool | Type | Integration |
|---------|-----|-----------|
| **Dependabot** | GitHub native | GitHub, automatic fix PRs |
| **Renovate** | Multi-platform | GitHub, GitLab, Bitbucket |
| **Snyk** | SaaS + CLI | All platforms, Docker, IaC |
| **Trivy** | CLI, OSS | CI/CD pipeline (GitHub Actions, GitLab) |
### SAST (Static Application Security Testing)
| Tool | Languages | Characteristics |
|---------|--------|----------------|
| **Semgrep** | 30+ (Python, Java, Go, JS/TS) | Fast, custom rules, CI-native |
| **SonarQube** | 30+ | Comprehensive, quality gates, tech debt |
| **CodeQL** | 12 (C++, C#, Go, Java, JS/TS, Python) | GitHub native, query-based |
| **Checkmarx** | 30+ | Enterprise, CxSAST, CxFlow |
| **Fortify** | 30+ | Enterprise, SAST + DAST |
### Container scanning
| Tool | Description |
|---------|-------|
| **Trivy** | OSS, scans OS packages + language-specific packages + IaC |
| **Grype** | OSS, from Anchore, fast, pairs with Syft for SBOMs |
| **Clair** | Red Hat, OSS, OCI-compatible |
| **Docker Scout** | Docker Desktop / CLI, integrates with Docker Hub |
## AI-Native Software Delivery (2025-2026)
AI is reshaping DevOps ("DevOps 2.0"):
- **AI-assisted CI/CD** — automatic diagnosis of pipeline failures, optimization of resource allocation
- **Agent Control Protocol (ACP)** / **Model Context Protocol (MCP)** — standards for how AI agents interact with tooling
- **AI-driven cost management** — FinOps optimization of cloud spend
- **Intelligent test selection** — ML decides which tests to run based on code changes
- **Self-healing pipelines** — AI automatically detects and fixes common problems
New tools: Harness (AI-native CD), GitLab 19.0 (agentic MR workflows, secrets manager), Octopus Deploy.
## Pipeline tools
- **GitHub Actions** — integrated with GitHub, large marketplace
- **GitLab CI** — native to GitLab, Auto DevOps
- **Jenkins** — the oldest, extensible, self-hosted
- **CircleCI** — SaaS, fast
- **Argo Workflows** — Kubernetes-native
- **Buildkite** — hybrid (self-hosted agents, SaaS orchestrator)
## Best practices
- **Idempotent pipeline** — re-running gives the same result
- **Immutable infrastructure** — never modify a running server, always redeploy
- **Shift left** — tests and security as early in the pipeline as possible
- **Artifact management** — all builds versioned in a registry (Docker Hub, ECR, GHCR)
- **Dependency caching** — speeds up the pipeline (npm cache, pip cache, Docker layer caching)
- **Fail fast** — the pipeline fails as early as possible on error
- **Quality gates** — automated checks before every promotion to the next environment
- **Pipeline visibility** — dashboard with the current state of all pipelines (GitHub, GitLab, ArgoCD)
## Sources
Links, books and standards: [sources/cicd/sources.md](sources/cicd/sources.md)
### Recommended reading
| Book | Authors | ISBN | Key contribution |
|-------|--------|------|----------------|
| The DevOps Handbook | Kim, Humble, Debois, Willis | 978-1942788003 | CALMS principles (Culture, Automation, Lean, Measurement, Sharing), flow map, deployment pipeline |
| Continuous Delivery | Humble, Farley | 978-0321601912 | Deployment pipeline, commit stage, acceptance tests, capacity testing, zero-downtime release |
| CI/CD Design Patterns | Bajpai, Schildmeijer, Piwosz, Mishra | 978-1-83588-965-7 | 30+ design patterns for CI/CD: pipeline patterns, GitOps, security, testing, deployment strategies |
| DevOps Frameworks, Techniques, and Tools | Vijayakumaran, Kofler, Öggl, Springer | 978-1-4932-2670-2 | Framework for DevOps adoption, tool comparison (Jenkins vs GitLab vs GitHub Actions), techniques for monitoring and observability |
## OpenStack CI/CD
The OpenStack ecosystem uses its own CI/CD tooling:
### Zuul
- CI/CD system developed by the OpenStack community (now a standalone project, also used outside OpenStack)
- **Gating** — changes are tested before merge (not after), which keeps the main branch from breaking
- **Ansible-based** — jobs are Ansible playbooks
- **Nodepool** — dynamic allocation of test VMs in a cloud (OpenStack, AWS)
- **Pipeline** — check, gate, post, periodic, tag, release
### OpenStack Infra (OpenDev)
- Public CI infrastructure for OpenStack projects
- Tools: Gerrit (code review), Zuul (CI), Nodepool (test nodes), StoryBoard (issue tracking)
- Base jobs: tempest (integration tests), grenade (upgrade tests), devstack-gate (gate tests)
### Integration with external tools
- **Terraform** — OpenStack provider for provisioning (terraform-provider-openstack)
- **Ansible** — openstack.cloud collection for managing OpenStack resources
- **Packer** — builds OpenStack images (openstack builder)
- **Jenkins** — older CI, still used in some distributions
*Last revised: 2026-06-03*

495
CLOUD.en.md Normal file

@@ -0,0 +1,495 @@
# ☁️ Cloud Architecture
## Providers
- **AWS** — largest market share, broadest portfolio
- **Azure** — strong integration with Microsoft ecosystem
- **GCP** — Kubernetes (GKE), data & ML, network connectivity
## Deployment Models
| Model | Description |
|-------|-------------|
| Public cloud | Shared provider infrastructure |
| Private cloud | Dedicated infrastructure (on-prem or hosted) |
| Hybrid cloud | Public + private interconnection |
| Multi-cloud | Multiple public providers |
## Multi-cloud Strategy
### Reasons for Multi-cloud
- **Vendor lock-in prevention** — risk diversification
- **Regulatory requirements** — data residency in specific regions
- **Best-of-breed** — each provider has strengths (AWS networking, Azure enterprise, GCP data/ML)
- **Acquisition scenarios** — mergers and acquisitions bring in workloads that already run on another provider
### Multi-cloud Connectivity
| Method | Latency | Throughput | Cost |
|--------|---------|------------|------|
| Site-to-Site VPN | Medium | Limited | Low |
| Private interconnect (Direct Connect / ExpressRoute / Dedicated Interconnect) | Low | High | High |
| Cloud-to-cloud VPN | Medium | Medium | Medium |
| SD-WAN | Low | High | Medium |
### Challenges
- **Network complexity** — different VPC/VNet concepts, security models
- **IAM federation** — unified identities across clouds (SSO, SAML, OIDC)
- **Data gravity** — moving data between clouds is expensive and slow
- **Monitoring** — single pane of glass across clouds (Grafana, Datadog)
### Cloud Adoption Frameworks (CAF)
Each major provider has its own Cloud Adoption Framework for a structured approach to cloud adoption:
| Provider | Framework | Focus |
|----------|-----------|-------|
| AWS | AWS CAF | 6 perspectives: Business, People, Governance, Platform, Security, Operations |
| Azure | Microsoft CAF | 8 methodologies: Strategy, Plan, Ready, Migrate, Innovate, Govern, Manage, Secure |
| GCP | Google CAF | 4 themes: Learn, Lead, Scale, Secure |
Multi-Cloud Administration Guide (Mulder, 2024) recommends combining CAF frameworks across providers for unified governance models, especially in:
- **Interoperability** — standardization of APIs and IaC across clouds (Terraform, Pulumi)
- **Data governance** — unified policy for data residency and lifecycle
- **Compliance automation** — automated audits across clouds (AWS Config, Azure Policy, GCP Org Policies)
- **Access management** — identity federation and centralized RBAC
## Migration Strategies — 6 Rs
| Strategy | Description | Difficulty | Typical Scenario |
|----------|-------------|------------|------------------|
| **Rehost** (Lift & Shift) | Move VMs as-is without changes | Low | Quick migration, datacenter exit, minimal risk |
| **Replatform** (Lift & Reshape) | Migration with minor adjustments (e.g., RDS instead of self-managed DB) | Medium | Optimization without rewriting the application |
| **Refactor** (Re-architect) | Rewrite application as cloud-native (microservices, serverless) | High | Maximize cloud benefit, long-term strategy |
| **Repurchase** | Move to SaaS (e.g., Salesforce, Workday) | Low | Application is outdated, SaaS alternative exists |
| **Retire** | Decommission unused applications | Low | Application no longer in use |
| **Retain** | Keep on-prem | None | Regulatory reasons, too high migration risk |
### Decision Framework for 6 Rs
```
Start: Is the application needed?
├── No → Retire
└── Yes → Does a SaaS alternative exist?
    ├── Yes → Repurchase
    └── No → Is refactoring worthwhile?
        ├── Yes → Refactor
        └── No → Is platform change sufficient?
            ├── Yes → Replatform
            └── No → Rehost
```
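The same decision tree can be expressed as a small function. This is a sketch of the tree as drawn, so it never returns Retain (the tree has no branch for it); `AppAssessment` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AppAssessment:
    """Hypothetical answers collected for one application."""
    needed: bool
    saas_alternative: bool = False
    refactor_worthwhile: bool = False
    replatform_sufficient: bool = False


def migration_strategy(app: AppAssessment) -> str:
    """Walk the decision tree top-down and return the first match."""
    if not app.needed:
        return "Retire"
    if app.saas_alternative:
        return "Repurchase"
    if app.refactor_worthwhile:
        return "Refactor"
    if app.replatform_sufficient:
        return "Replatform"
    return "Rehost"


print(migration_strategy(AppAssessment(needed=False)))                              # Retire
print(migration_strategy(AppAssessment(needed=True, saas_alternative=True)))        # Repurchase
print(migration_strategy(AppAssessment(needed=True, replatform_sufficient=True)))   # Replatform
print(migration_strategy(AppAssessment(needed=True)))                               # Rehost
```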
## Well-Architected Framework (AWS)
1. **Operational Excellence** — automation, monitoring, documentation
2. **Security** — IAM, encryption, compliance
3. **Reliability** — recovery, scaling, backup plans
4. **Performance Efficiency** — right-sizing, choosing the right services
5. **Cost Optimization** — FinOps, reserved instances, spot instances
6. **Sustainability** (since December 2021) — carbon footprint, energy efficiency
Analogues: Azure Well-Architected Framework, GCP Architecture Framework
### Key Questions from Well-Architected Review (~60 questions)
**Operational Excellence (12 questions)**
- How are changes managed and automated?
- How are operations documented and shared within the team?
- How does the team respond to planned and unplanned operational events?
- What runbooks exist for common operational scenarios?
- How are incident management and postmortems conducted?
**Security (12 questions)**
- How is identity & access management implemented?
- How is data protected at rest and in transit?
- How is security incident detection ensured?
- What are the procedures for patch management and vulnerability remediation?
- How are infrastructure credentials and secrets managed?
**Reliability (12 questions)**
- How is service availability ensured during a component failure?
- How is backup and disaster recovery implemented?
- How do service limits (quotas, throttling) affect reliability?
- How does automatic scaling work under changing load?
- What are the SLI/SLO metrics and how are they monitored?
**Performance Efficiency (12 questions)**
- How is the correct type and size of compute/storage selected?
- How is the database layer optimized (indexes, queries, caching)?
- How is monitoring used to identify bottlenecks?
- How is scaling implemented (vertical vs horizontal)?
**Cost Optimization (12 questions)**
- How are costs allocated to teams/projects (chargeback/showback)?
- What tools are used for cost analysis?
- How are unused resources identified and eliminated?
- How is licensing optimized (BYOL, hybrid benefit)?
## Key Components
### Compute Layer
- **VM / instances** — EC2, Azure VMs, GCE
- **Container orchestration** — EKS, AKS, GKE
- **Serverless** — Lambda, Azure Functions, Cloud Functions
- **PaaS** — App Engine, Elastic Beanstalk, Azure App Service
### Compute Comparison Matrix (AWS EC2)
| Family | Type | vCPU:Memory | Use Case | Example Pricing (on-demand, us-east-1) |
|--------|------|-------------|----------|----------------------------------------|
| **General purpose** | M7g, m7i | 1:4 | Web servers, microservices, dev/test | m7i.large ~$0.088/h |
| **Compute optimized** | C7g, c7i | 1:2 | HPC, batch processing, CI/CD, gaming | c7i.large ~$0.078/h |
| **Memory optimized** | R7g, r7i, x2idn | 1:8 to 1:32 | In-memory DB (Redis), SAP HANA, real-time analytics | r7i.large ~$0.118/h |
| **Storage optimized** | I4i, im4gn | 1:4 + NVMe | Transactional DB, data warehousing, Kafka | i4i.large ~$0.138/h |
| **GPU / ML** | P5, g5, trn1 | GPU attach | AI training (P5), inference (g5), ML (trn1) | g5.xlarge ~$1.006/h |
See [GPU.md](GPU.md) for GPU model and configuration details.
### Storage
- **Object storage** — S3, Blob Storage, Cloud Storage
- **Block storage** — EBS, managed disks, persistent disks
- **File storage** — EFS, Azure Files, Filestore
- **CDN** — CloudFront, Azure CDN, Cloud CDN
### S3 Storage Classes
| Class | Availability | Retrieval Time | Price / GB / Month | Use Case |
|-------|-------------|----------------|--------------------|----------|
| **S3 Standard** | 99.99 % | milliseconds | ~$0.023 | Active data, frequent access |
| **S3 Intelligent-Tiering** | 99.9 % | milliseconds | ~$0.023 + monitoring fee | Unknown/variable access patterns |
| **S3 Standard-IA** | 99.9 % | milliseconds | ~$0.0125 | Less frequent but fast access |
| **S3 One Zone-IA** | 99.5 % | milliseconds | ~$0.01 | Reproducible data |
| **S3 Glacier Instant** | 99.9 % | milliseconds | ~$0.004 | Archive with occasional access |
| **S3 Glacier Flexible** | 99.99 % | 1-5 min (expedited) / 3-5 h (standard) | ~$0.0036 | Long-term archive |
| **S3 Glacier Deep Archive** | 99.99 % | 12 h (standard) / 48 h (bulk) | ~$0.00099 | Cheapest, compliance archives |
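To make the price gap concrete, the sketch below multiplies the approximate per-GB prices from the table by a data volume. It deliberately ignores request, retrieval, monitoring and minimum-storage-duration charges, which can dominate for the archive classes, so treat the output as a lower bound rather than a quote. Intelligent-Tiering is left out because its price depends on the access pattern.

```python
# Approximate storage prices per GB-month, copied from the table above.
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.023,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.01,
    "S3 Glacier Instant": 0.004,
    "S3 Glacier Flexible": 0.0036,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(size_gb: float, storage_class: str) -> float:
    """Storage-only cost; request, retrieval and minimum-duration fees are ignored."""
    return size_gb * PRICE_PER_GB_MONTH[storage_class]


size_gb = 10_000  # roughly 10 TB
for storage_class in PRICE_PER_GB_MONTH:
    cost = monthly_storage_cost(size_gb, storage_class)
    print(f"{storage_class:<24} ${cost:>8.2f} / month")
```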
## Multi-AZ and Multi-Region Architecture
```
Region ┌──────────────────────────────┐
       │  AZ-1       AZ-2       AZ-3  │
       │  ┌───┐      ┌───┐      ┌───┐ │
       │  │APP│──────│APP│──────│APP│ │
       │  └─┬─┘      └─┬─┘      └─┬─┘ │
       │    │          │          │   │
       │  ┌─▼──────────▼──────────▼─┐ │
       │  │      Load Balancer      │ │
       │  └────────────┬────────────┘ │
       │               │              │
       │  ┌────────────▼────────────┐ │
       │  │   Database (Primary)    │ │
       │  │   + Read Replica        │ │
       │  └─────────────────────────┘ │
       └──────────────────────────────┘
```
## Disaster Recovery Strategies
### DR Strategies on AWS (from least to most prepared)
| Strategy | RTO | RPO | Cost | Description |
|----------|-----|-----|------|-------------|
| **Backup & Restore** | hours | 24 h | Low | Regular data backups to S3/Glacier, restore in DR region |
| **Pilot Light** | tens of minutes | minutes | Medium | Minimal running copy (DB, core services), scale on failover |
| **Warm Standby** | minutes | seconds | High | Reduced production copy running, scale on failover |
| **Active-Active (Multi-Region)** | seconds | < 1 s | Very high | Fully active in multiple regions, traffic routing (Route 53, Global Accelerator) |
Key books on the topic:
- **Engineering Resilient Systems on AWS** (Schwarz, Moran, Bachmeier, 2024) — practical labs for resilience patterns: back off and retry, multi-Region failover, circuit breaker, chaos engineering using AWS Fault Injection Simulator
- **Building Resilient Architectures on AWS** (2025) — data security, backup strategies, recovery plan automation
### Chaos Engineering
Deliberate fault injection to verify system resilience:
- **AWS Fault Injection Simulator (FIS)** — managed fault injection for EC2, ECS, EKS, RDS
- **Tools**: Chaos Mesh (Kubernetes), Gremlin, Litmus
- **Process**: define hypothesis → run experiment → measure impact → improve system
- **Safety**: experiments in isolated environment, safety controls, automatic rollback
## Cloud Design Patterns
### Strangler Fig
Gradually replacing parts of a monolithic application with microservices.
- Legacy functionality is progressively redirected to new services
- Strangler Fig proxy (route headers, feature flags) controls traffic migration
- Advantage: incremental value delivery without big-bang rewrite
### Circuit Breaker
Prevents cascading failures when a dependent service fails.
- Three states: **Closed** (normal operation), **Open** (requests immediately fail), **Half-Open** (test request after timeout)
- Parameters: failure threshold, timeout (reset timeout), half-open max requests
- Implementations: resilience4j, Hystrix (legacy), Istio (envoy), AWS App Mesh
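The three states and the two main parameters can be sketched in a few lines. This is a single-threaded illustration, not a substitute for the libraries listed above: the class and parameter names are made up, the half-open state lets calls through until one outcome is recorded instead of enforcing a maximum number of trial requests, and the clock is injectable only so the example can fake the passage of time.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock            # injectable so the example can fake time
        self.failures = 0
        self.opened_at = None         # None means the circuit is not open

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("dependency is failing, call rejected")
        try:
            result = func(*args, **kwargs)   # closed, or the half-open trial call
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # (re)open and restart the timeout
            raise
        self.failures = 0
        self.opened_at = None                # success closes the circuit
        return result


def flaky():
    raise ConnectionError("backend down")


now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
for _ in range(2):
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass
print(breaker.state)                              # open
now[0] = 11.0
print(breaker.state)                              # half-open
print(breaker.call(lambda: "ok"), breaker.state)  # ok closed
```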
### Saga
Distributed transaction across microservices — a series of local transactions with compensating actions.
- **Choreography** — each service publishes an event, the next service reacts (Kafka, EventBridge)
- **Orchestration** — central orchestrator manages steps (Step Functions, Temporal, Camunda)
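The orchestration variant reduces to a loop over (action, compensation) pairs. The sketch below is an in-process toy with invented step names; real orchestrators such as the ones listed above also persist saga state and retry failed compensations.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, the compensations of all completed steps run in
    reverse order and the function returns False.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


log = []


def create_shipment():
    raise RuntimeError("no courier available")


order_saga = [
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge payment"), lambda: log.append("refund payment")),
    (create_shipment, lambda: log.append("cancel shipment")),
]

print(run_saga(order_saga))  # False
print(log)
# ['reserve stock', 'charge payment', 'refund payment', 'release stock']
```

The failed step itself is not compensated, since it never completed; only the steps before it are undone, newest first.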
### CQRS (Command Query Responsibility Segregation)
Separation of write (Command) and read (Query) models.
- Command model: optimized for writes (normalized, transactional)
- Query model: optimized for reads (denormalized, read-optimized views)
- Eventual consistency between models (event bus propagates changes)
- Use case: reporting, audit logs, high-throughput systems
### Event Sourcing
Storing state as a sequence of events, not the current state.
- Each change is an append-only event in an event store
- Current state = fold of all events
- Advantages: audit trail, time travel, CQRS compatibility
- Implementations: EventStoreDB, Kafka (log), DynamoDB + CDC
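"Current state = fold of all events" maps directly onto `functools.reduce`. The bank-account events below are a hypothetical example; a list stands in for the event store.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Event:
    kind: str      # "deposited" or "withdrawn" (hypothetical event types)
    amount: int


def apply(balance: int, event: Event) -> int:
    """Pure transition function: previous state + event -> next state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


event_store = [           # append-only; existing entries are never edited
    Event("deposited", 100),
    Event("withdrawn", 30),
    Event("deposited", 5),
]

current = reduce(apply, event_store, 0)
print(current)                              # 75
print(reduce(apply, event_store[:2], 0))    # 70, the state after the second event
```

Folding a prefix of the log is what gives the "time travel" property: any past state can be rebuilt without having stored it.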
### Additional Cloud Patterns (Wilder — Cloud Architecture Patterns)
| Pattern | Category | Description |
|---------|----------|-------------|
| **Horizontally Scaling Compute** | Scalability | Adding/removing instances based on load, elasticity |
| **Queue-Centric Workflow** | Scalability | Decoupling components via queues (SQS, RabbitMQ), async processing |
| **Auto-Scaling** | Scalability | Automatic scaling based on metrics (CPU, memory, request count) |
| **MapReduce** | Big Data | Distributed data processing (Hadoop, EMR, BigQuery) |
| **Database Sharding** | Big Data | Horizontal data partitioning across databases |
| **Busy Signal** | Failure Handling | Graceful degradation under overload (HTTP 503, throttling, backpressure) |
| **Node Failure** | Failure Handling | Detection and automatic recovery from compute node failure |
| **Colocation** | Distributed Users | Placing compute close to data to reduce latency |
| **Valet Key** | Distributed Users | Delegated storage access (SAS tokens, S3 presigned URLs) |
| **Multi-Site Deployment** | Distributed Users | Active deployment in multiple geographic locations |
## Evolutionary Architecture
Definition (Ford, Parsons, Kua, 2022): *An evolutionary architecture supports guided, incremental change across multiple dimensions.*
### Fitness Functions
Automated checks of architectural characteristics — analogous to tests for architecture:
| Type | Description | Example |
|------|-------------|---------|
| **Atomic** | Checks a single metric | Cyclomatic complexity < 10 |
| **Holistic** | Checks the overall system | End-to-end latency < 200 ms |
| **Triggered** | Triggered by event (CI/CD commit, deployment) | API contract verification |
| **Continuous** | Runs continuously in production | Monitoring dependency freshness |
| **Static** | Code analysis without execution | SonarQube, ESLint |
| **Dynamic** | Runtime analysis | Load tests, chaos experiments |
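An atomic, static fitness function such as the "cyclomatic complexity < 10" example can be approximated with the standard `ast` module. The count below (1 plus the number of branching nodes) is a rough stand-in for the real metric, and the function names are illustrative; a CI job could fail the build when `violations` is non-empty.

```python
import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.IfExp, ast.BoolOp, ast.comprehension)


def complexity(func: ast.AST) -> int:
    """Rough cyclomatic complexity: 1 + number of branching nodes."""
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func))


def violations(source: str, limit: int = 10) -> list[str]:
    """Names of functions whose approximate complexity is not below the limit."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and complexity(node) >= limit
    ]


SAMPLE = '''
def simple(x):
    return x + 1

def branchy(x):
    if x > 0 and x < 10:
        for i in range(x):
            if i % 2:
                x += i
    return x
'''

print(violations(SAMPLE, limit=4))  # ['branchy']
print(violations(SAMPLE))           # []
```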
### Principles of Evolutionary Architecture
1. **Incremental change** — small, safe changes thanks to CI/CD, deployment pipelines, mature DevOps
2. **Fitness functions** — automated protection of architectural characteristics (scalability, performance, security)
3. **Coupling management** — conscious work with component connections (affinity, volatility, cycles)
4. **Evolutionary data** — database migrations as first-class citizens (evolutionary schemas, expand-contract pattern)
### Antipatterns
- **Big Design Up Front (BDUF)** — trying to design everything upfront, ignoring change
- **No Design at All** — absence of architectural thinking, purely emergent design
- **Premature Standardization** — introducing standards before the domain is understood
## Hybrid Cloud Connectivity
See also: [NETWORKING.md](NETWORKING.md) — network architecture (VPN, BGP, VPC design).
- **Site-to-Site VPN** — IPSec tunnel over the internet
- **Direct Connect / ExpressRoute / Dedicated Interconnect** — private physical connection
- **Cloud VPN / Transit Gateway** — hub-and-spoke topology
## Cost Optimization Detail
### Savings Plans vs Reserved Instances
| Property | Compute Savings Plan | EC2 Instance Savings Plan | Reserved Instances |
|----------|----------------------|---------------------------|-------------------|
| Flexibility | Any instance family, size, region, OS (also Fargate and Lambda) | Any size, OS and tenancy within one instance family and region | Specific instance |
| Term | 1 or 3 years | 1 or 3 years | 1 or 3 years |
| Discount (typical) | ~30-50 % | ~40-60 % | ~40-60 % |
| Change instance | Yes (any) | Yes (within family) | No |
| Change region | Yes | No | No |
| Payment options | All Upfront / Partial / No Upfront | All Upfront / Partial / No Upfront | All Upfront / Partial / No Upfront |
### Spot Instance Best Practices
- **Diversification** — use a mix of instance types (spot fleet) for higher availability
- **Graceful handling** — application must handle termination notice (2 minute warning)
- **Checkpointing** — regular state saving for restart after spot interruption
- **Spot blocks** (defined-duration Spot, 1-6 h) — discontinued by AWS, so do not rely on them for interruption protection
- **Use cases**: batch processing, CI/CD runners, stateless microservices, ML training
- **Avoid**: stateful workloads, databases (without special design)
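Checkpointing can be as simple as persisting progress after each unit of work and reloading it on start. The sketch below writes to a local temp directory so it runs anywhere; on a real Spot instance the checkpoint would have to live on durable storage (for example S3 or EFS), since the local disk may disappear with the instance. The `stop_after` parameter only simulates an interruption.

```python
import json
import os
import tempfile


def load_checkpoint(path: str) -> dict:
    """Resume from the last saved state, or start fresh."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"next_item": 0, "total": 0}


def save_checkpoint(path: str, state: dict) -> None:
    """Write to a temp file and rename so a kill mid-write cannot corrupt it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def process(items, path, stop_after=None):
    """Sum items, checkpointing after each one; stop_after simulates an interruption."""
    state = load_checkpoint(path)
    for index in range(state["next_item"], len(items)):
        if stop_after is not None and index >= stop_after:
            return None                      # instance reclaimed mid-job
        state["total"] += items[index]
        state["next_item"] = index + 1
        save_checkpoint(path, state)
    return state["total"]


checkpoint = os.path.join(tempfile.mkdtemp(), "job.json")
items = [1, 2, 3, 4, 5]
print(process(items, checkpoint, stop_after=3))  # None: interrupted after 3 items
print(process(items, checkpoint))                # 15: resumes at the fourth item
```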
## Organization and Governance
### AWS Organizations
```
Root OU
├── Security OU
│   ├── Audit Account (CloudTrail, Config)
│   └── Security Tooling Account (GuardDuty, Security Hub)
├── Infrastructure OU
│   ├── Network Account (Transit Gateway, VPN)
│   ├── Shared Services Account (AD, SSO)
│   └── Log Archive Account
├── Workloads OU
│   ├── Dev OU → individual dev accounts
│   ├── Staging OU → staging accounts
│   └── Prod OU → production accounts
└── Sandbox OU → isolated experimental accounts
```
- **SCP** (Service Control Policies) — whitelist/blacklist services at OU level
- **Tag policies** — enforce tagging across accounts
- **AI services opt-out** — control data usage in AWS AI services
### Azure Management Groups
```
Tenant Root Group
├── Platform MG
│   ├── Connectivity (hub VNet, ExpressRoute)
│   ├── Management (Log Analytics, Automation)
│   └── Identity (AD DS, PIM)
├── Application MG
│   ├── DEV (dev subscriptions)
│   ├── TEST (test subscriptions)
│   └── PROD (production subscriptions)
└── Sandbox MG
```
- **Azure Policy** — built-in and custom policies (similar to SCP)
- **Management Group hierarchy** — up to 6 levels deep
- **Management group limits** — up to 10,000 management groups per directory
### GCP Projects
```
Organization Node
├── Folder: Platform
│   ├── Project: Shared Networking (VPC, Cloud NAT, VPN)
│   ├── Project: Security (Cloud KMS, Secret Manager, Chronicle)
│   └── Project: Monitoring (Cloud Monitoring, Logging)
├── Folder: Workloads
│   ├── Folder: Dev
│   │   └── Project: [app]-dev
│   ├── Folder: Staging
│   │   └── Project: [app]-staging
│   └── Folder: Prod
│       └── Project: [app]-prod
└── Folder: Sandbox
    └── Project: [user]-sandbox
```
- **Organization policies** — constraints at organization/folder level
- **Resource Manager** — hierarchy: Organization → Folder → Project → Resources
- **Project limits** — max 30 projects (can be increased), 10k resources per project
## 12-Factor App Methodology
Methodology for building cloud-native applications (Heroku, 2011), expanded by the book **Multi-Cloud Handbook for Developers** (Natarajan, Jacob, 2024).
| # | Factor | Description | Cloud Implementation |
|---|--------|-------------|----------------------|
| 1 | **Codebase** | One repo, many deployments | Git + CI/CD pipeline |
| 2 | **Dependencies** | Explicit dependency declaration | package.json, requirements.txt, Docker image |
| 3 | **Config** | Configuration in environment variables | Secrets Manager, Parameter Store, env vars |
| 4 | **Backing services** | Dependent services as attached resources | RDS, S3, Redis — connection via connection string |
| 5 | **Build, release, run** | Strict separation of the build, release, and run stages | CI/CD pipeline (GitHub Actions, GitLab CI) |
| 6 | **Processes** | Application as stateless processes | Horizontal scaling, session in Redis |
| 7 | **Port binding** | Service is self-contained and exposes itself by binding to a port, rather than being deployed into an external web server | Express, FastAPI, Spring Boot on own port |
| 8 | **Concurrency** | Scaling via process model | Horizontal Pod Autoscaler (K8s), EC2 Auto Scaling |
| 9 | **Disposability** | Fast startup and graceful shutdown | Health checks, SIGTERM handling, preStop hooks |
| 10 | **Dev/Prod parity** | Minimal difference between environments | Docker, IaC (Terraform), same backing services |
| 11 | **Logs** | Logs as event streams | stdout/stderr → CloudWatch, ELK, Datadog |
| 12 | **Admin processes** | Admin tasks as one-off processes | DB migrations, data backfill — run in isolation |
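Factor 3 (Config) is the easiest one to show in code: the application reads everything environment-specific from environment variables and fails fast when a required value is missing. The variable names and defaults below are assumptions for illustration, not a convention the methodology prescribes.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    port: int
    debug: bool


def load_settings(env=os.environ) -> Settings:
    """Build settings from the environment; fail fast if a required value is missing."""
    try:
        database_url = env["DATABASE_URL"]
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set") from None
    return Settings(
        database_url=database_url,
        port=int(env.get("PORT", "8080")),
        debug=env.get("DEBUG", "false").lower() == "true",
    )


# A plain dict stands in for the process environment in this example.
settings = load_settings({"DATABASE_URL": "postgres://db.example.internal/app", "PORT": "3000"})
print(settings)
```

Passing the mapping as a parameter keeps the loader testable; in the running service it would be called with no arguments and read `os.environ`.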
### Multi-cloud Extensions (Multi-Cloud Handbook for Developers)
- **API-first design** — consistent API interfaces across clouds (REST, gRPC)
- **Domain-Driven Design (DDD)** — bounded contexts mapped to cloud services
- **Service Mesh** — Istio, Linkerd for observability, traffic management and security across clouds
- **GitOps** — declarative deployment with ArgoCD/Flux across Kubernetes clusters in different clouds
## Azure Cloud Native Architecture (Map Book)
Based on **The Azure Cloud Native Architecture Mapbook (2nd ed.)** (Eyskens, 2025) — 40+ architectural maps across domains:
### Domains of Architectural Maps
| Domain | Key Azure Services | Architectural Patterns |
|--------|-------------------|----------------------|
| **Infrastructure** | VNet, Azure Firewall, ExpressRoute, VPN Gateway | Hub-and-spoke, Virtual WAN, Private Link |
| **Applications** | App Service, API Management, Service Bus, Functions | Event-driven, Strangler Fig, Backend for Frontend |
| **Data** | Cosmos DB, SQL Database, Synapse, Data Lake | CQRS, Event Sourcing, Polyglot Persistence |
| **Container Orchestrators** | AKS, Azure Container Apps (ACA) | Sidecar, Ambassador, Adapter (service mesh) |
| **AI** | Azure OpenAI, Cognitive Services, ML Studio | RAG, model fine-tuning, MLOps |
| **Security** | Entra ID, Defender for Cloud, Key Vault, Sentinel | Zero Trust, Defense in depth, JIT Access |
### Cloud Adoption Framework on Azure
- **Strategy** — business case, application catalog, portfolio rationalization
- **Plan** — landing zone design, governance baseline, subscription taxonomy
- **Ready** — landing zone implementation (ALZ), Azure Policy, Networking, Identity
- **Migrate** — assessment (Azure Migrate), rehost/replatform, test and cutover
- **Govern** — cost management, policy enforcement, compliance monitoring
## Cloud Provider Comparison
Based on **Cloud Computing: AWS, Azure, Google Cloud** (Sario, 2025):
| Area | AWS | Azure | GCP |
|------|-----|-------|-----|
| **Compute** | EC2, Lambda, ECS/EKS | VMs, Functions, AKS | GCE, Cloud Functions, GKE |
| **Storage** | S3, EBS, EFS | Blob, Disk, Files | Cloud Storage, Persistent Disk, Filestore |
| **Relational DB** | RDS (MySQL, PG, SQL Server, Oracle, MariaDB) | SQL Database, MySQL/PostgreSQL | Cloud SQL (MySQL, PG, SQL Server) |
| **NoSQL DB** | DynamoDB, ElastiCache | Cosmos DB, Redis Cache | Firestore, Bigtable, Memorystore |
| **Message queue** | SQS, SNS | Service Bus, Queue Storage | Pub/Sub, Cloud Tasks |
| **Observability** | CloudWatch, X-Ray | Monitor, Application Insights | Cloud Monitoring, Cloud Trace |
| **AI/ML** | SageMaker, Bedrock | Azure ML, Azure OpenAI | Vertex AI, AutoML |
| **Pricing (compute)** | On-demand, Reserved, Spot, Savings Plan | Pay-as-you-go, Reserved, Spot | On-demand, Committed Use, Spot |
## OpenStack as Private Cloud
OpenStack is the dominant open-source platform for building private clouds (IaaS). It provides compute (Nova), networking (Neutron), and storage services (Cinder/Swift/Manila) with a unified API.
### Advantages over Commercial Solutions
- **Vendor-neutral API** — avoids lock-in (VMware, Hyper-V)
- **Multi-tenancy** — Keystone identity, RBAC, projects, quotas
- **Hybrid cloud ready** — federation with AWS/Azure/GCP, Terraform provisioning
- **Ecosystem** — hundreds of services (Heat orchestration, Magnum containers, Designate DNS)
### Suitable Scenarios
| Scenario | Key Services |
|----------|--------------|
| Data center with multi-tenancy and self-service | Nova, Neutron, Cinder, Horizon |
| Telco / NFVI / MEC | Neutron (DPDK, SR-IOV), Nova (NUMA pinning) |
| Science and HPC | Cyborg (GPU), Manila (NAS), Ironic (bare metal) |
| Academic clouds | Keystone federation, Trove (DBaaS) |
### Challenges
- Significant deployment and operations complexity
- Breaking changes and deprecations between releases (new release every six months)
- Limited enterprise support outside commercial distributions (Red Hat, Canonical, Mirantis)
## Best Practices
- Use **infrastructure as code** (Terraform, Pulumi, CDK)
- Design for **failure** — every component can fail
- Implement **defense in depth** — security at every layer
- Monitor **costs** — tagging, budget alerts, anomaly detection
- Use **managed services** where it makes sense (less operations)
- **Least privilege** for all IAM roles and policies
## Resources
Links, books and standards: [sources/cloud/sources.md](sources/cloud/sources.md)
- **Cost tagging** — assign tags for chargeback/showback (Environment, Team, Cost Center, Application)
- **Automated compliance** — AWS Config, Azure Policy, GCP Org Policies for guardrails
- **Multi-account strategy** — AWS Control Tower, Azure Landing Zones, GCP Resource Hierarchy
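The cost-tagging practice above can be checked mechanically. The following is a minimal sketch, assuming resources are available as plain dictionaries with an `id` and a `tags` mapping; that shape and the function names are hypothetical and do not correspond to any provider's API. The required tag keys are the ones listed above.

```python
# Required tag keys taken from the cost-tagging bullet above.
REQUIRED_TAGS = ("Environment", "Team", "Cost Center", "Application")


def missing_tags(resource: dict) -> list[str]:
    """Return the required tag keys that are absent or blank on a resource."""
    tags = resource.get("tags", {})
    return [key for key in REQUIRED_TAGS if not str(tags.get(key, "")).strip()]


def untagged_report(resources: list[dict]) -> dict[str, list[str]]:
    """Map resource id -> missing tag keys, for non-compliant resources only."""
    report = {}
    for resource in resources:
        gaps = missing_tags(resource)
        if gaps:
            report[resource["id"]] = gaps
    return report
```

In practice the same rule would usually be enforced by the provider's own guardrails (AWS Config, Azure Policy, GCP Org Policies); a script like this is only useful for ad hoc reports.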
### Recommended Reading
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| The AI Cloud Infrastructure Blueprint | Thummarakoti, Vududala, Madupati, Kaushik | 978-1-041-16642-9 | End-to-end guide to designing, deploying, and managing AI systems on cloud platforms. Covers public/private/hybrid/multi-cloud models for AI, infrastructure for ML training and inference, MLOps. Target audience: architects, data scientists, DevOps. |
| AWS for Solutions Architects (3rd ed.) | Shrivastava, Srivastav, Thakur | 978-1-83664-193-3 | Practical guide to AWS architecture — compute (EC2, Lambda), storage (S3, EBS), databases (RDS, DynamoDB), networking, security, Well-Architected Framework, migration, cost optimization. Suitable for AWS Solutions Architect certification preparation. |
*Last revised: 2026-06-03*

495
CLOUD.md Normal file

@@ -0,0 +1,495 @@
# ☁️ Cloud Architecture
## Providers
- **AWS** — largest market share, broadest portfolio
- **Azure** — strong integration with the Microsoft ecosystem
- **GCP** — Kubernetes (GKE), data & ML, network connectivity
## Deployment Models
| Model | Description |
|-------|-------------|
| Public cloud | Shared provider infrastructure |
| Private cloud | Dedicated infrastructure (on-prem or hosted) |
| Hybrid cloud | Public and private cloud connected |
| Multi-cloud | Multiple public providers |
## Multi-Cloud Strategy
### Reasons for Multi-Cloud
- **Vendor lock-in prevention** — risk diversification
- **Regulatory requirements** — data residency in specific regions
- **Best-of-breed** — each provider has its strengths (AWS networking, Azure enterprise, GCP data/ML)
- **M&A scenarios** — consolidating environments after mergers and acquisitions
### Multi-Cloud Connectivity
| Method | Latency | Throughput | Cost |
|--------|---------|------------|------|
| Site-to-Site VPN | Medium | Limited | Low |
| Private interconnect (Direct Connect / ExpressRoute / Dedicated Interconnect) | Low | High | High |
| Cloud-to-cloud VPN | Medium | Medium | Medium |
| SD-WAN | Low | High | Medium |
### Challenges
- **Network complexity** — differing VPC/VNet concepts and security models
- **IAM federation** — unified identities across clouds (SSO, SAML, OIDC)
- **Data gravity** — moving data between clouds is expensive and slow
- **Monitoring** — a single pane of glass across clouds (Grafana, Datadog)
### Cloud Adoption Frameworks (CAF)
Each major provider has its own Cloud Adoption Framework for a structured approach to cloud adoption:
| Provider | Framework | Focus |
|----------|-----------|-------|
| AWS | AWS CAF | 6 perspectives: Business, People, Governance, Platform, Security, Operations |
| Azure | Microsoft CAF | 8 methodologies: Strategy, Plan, Ready, Migrate, Innovate, Govern, Manage, Secure |
| GCP | Google CAF | 4 themes: Learn, Lead, Scale, Secure |
The Multi-Cloud Administration Guide (Mulder, 2024) recommends combining the providers' CAFs into a unified governance model, particularly in these areas:
- **Interoperability** — standardized APIs and IaC across clouds (Terraform, Pulumi)
- **Data governance** — a single policy for data residency and the data lifecycle
- **Compliance automation** — automated audits across clouds (AWS Config, Azure Policy, GCP Org Policies)
- **Access management** — identity federation and centralized RBAC
## Migration Strategies — the 6 Rs
| Strategy | Description | Effort | Typical scenario |
|----------|-------------|--------|------------------|
| **Rehost** (Lift & Shift) | Move VMs as-is, without changes | Low | Fast migration, data center exit, minimal risk |
| **Replatform** (Lift & Reshape) | Migrate with minor adjustments (e.g. RDS instead of a self-managed DB) | Medium | Optimization without rewriting the application |
| **Refactor** (Re-architect) | Rewrite the application as cloud-native (microservices, serverless) | High | Maximum use of the cloud, long-term strategy |
| **Repurchase** | Switch to SaaS (e.g. Salesforce, Workday) | Low | The application is outdated and a SaaS alternative exists |
| **Retire** | Shut down applications that are not needed | Low | The application is no longer used; decommission |
| **Retain** | Keep on-prem | None | Regulatory reasons, migration risk too high |
### Decision Framework for the 6 Rs
```
Start: Is the application needed?
├── No → Retire
└── Yes → Is there a SaaS alternative?
    ├── Yes → Repurchase
    └── No → Is refactoring worthwhile?
        ├── Yes → Refactor
        └── No → Is a platform change sufficient?
            ├── Yes → Replatform
            └── No → Rehost
```
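The decision tree can be expressed as a small function. This is a sketch only: the `AppProfile` class and its field names are hypothetical, and each field simply holds the answer to one question in the tree. Like the tree, it never returns Retain; that outcome would need an additional question (for example about regulatory constraints).

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Answers to the questions in the decision tree (hypothetical field names)."""
    needed: bool
    saas_alternative: bool
    refactor_worthwhile: bool
    platform_change_sufficient: bool


def choose_strategy(app: AppProfile) -> str:
    """Walk the tree top to bottom and return the first matching strategy."""
    if not app.needed:
        return "Retire"
    if app.saas_alternative:
        return "Repurchase"
    if app.refactor_worthwhile:
        return "Refactor"
    if app.platform_change_sufficient:
        return "Replatform"
    return "Rehost"
```

The order of the checks matters: an application that has a SaaS alternative is classified as Repurchase even if refactoring would also pay off, exactly as in the tree.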
## Well-Architected Framework (AWS)
1. **Operational Excellence** — automation, monitoring, documentation
2. **Security** — IAM, encryption, compliance
3. **Reliability** — recovery, scaling, fallback plans
4. **Performance Efficiency** — right-sizing, choosing the right services
5. **Cost Optimization** — FinOps, reserved instances, spot instances
6. **Sustainability** (added in late 2021) — carbon footprint, energy efficiency
Counterparts: Azure Well-Architected Framework, Google Cloud Architecture Framework
### Key Questions from the Well-Architected Review (~60 questions)
**Operational Excellence (12 questions)**
- How are changes managed and automated?
- How are operations documented and shared within the team?
- How are expected and unexpected events reflected in operations?
- Which runbooks exist for common operational scenarios?
- How do incident management and the postmortem process work?
**Security (12 questions)**
- How is identity & access management implemented?
- How is data protected at rest and in transit?
- How are security incidents detected?
- What are the procedures for patch management and vulnerability remediation?
- How are infrastructure credentials and secrets managed?
**Reliability (12 questions)**
- How is service availability ensured when a component fails?
- How are backup and disaster recovery implemented?
- How do service limits (quotas, throttling) affect reliability?
- How does automatic scaling react to load changes?
- What are the SLI/SLO metrics and how are they monitored?
**Performance Efficiency (12 questions)**
- How is the right type and size of compute and storage selected?
- How is the database layer optimized (indexes, queries, caching)?
- How is monitoring used to identify bottlenecks?
- How is scaling implemented (vertical vs horizontal)?
**Cost Optimization (12 questions)**
- How are costs allocated to teams/projects (chargeback/showback)?
- Which tools are used for cost analysis?
- How are unused resources identified and eliminated?
- How is licensing optimized (BYOL, hybrid benefit)?
## Key Components
### Compute Layer
- **VM / instance** — EC2, Azure VMs, GCE
- **Container orchestration** — EKS, AKS, GKE
- **Serverless** — Lambda, Azure Functions, Cloud Functions
- **PaaS** — App Engine, Elastic Beanstalk, Azure App Service
### Compute Comparison Matrix (AWS EC2)
| Family | Types | vCPU:Memory | Use case | Example price (on-demand, us-east-1) |
|--------|-------|-------------|----------|--------------------------------------|
| **General purpose** | M7g, M7i | 1:4 | Web servers, microservices, dev/test | m7i.large ~$0.088/h |
| **Compute optimized** | C7g, C7i | 1:2 | HPC, batch processing, CI/CD, gaming | c7i.large ~$0.078/h |
| **Memory optimized** | R7g, R7i, X2idn | 1:8 to 1:32 | In-memory DB (Redis), SAP HANA, real-time analytics | r7i.large ~$0.118/h |
| **Storage optimized** | I4i, Im4gn | 1:4 + NVMe | Transactional DB, data warehousing, Kafka | i4i.large ~$0.138/h |
| **GPU / ML** | P5, G5, Trn1 | GPU attach | AI training (P5), inference (G5), ML (Trn1) | g5.xlarge ~$1.006/h |
See [GPU.md](GPU.md) for details on GPU models and configurations.
### Storage
- **Object storage** — S3, Blob Storage, Cloud Storage
- **Block storage** — EBS, managed disks, persistent disks
- **File storage** — EFS, Azure Files, Filestore
- **CDN** — CloudFront, Azure CDN, Cloud CDN
### S3 Storage Classes
| Class | Availability | Retrieval time | Price / GB / month | Use case |
|-------|--------------|----------------|--------------------|----------|
| **S3 Standard** | 99.99 % | milliseconds | ~$0.023 | Active data, frequent access |
| **S3 Intelligent-Tiering** | 99.9 % | milliseconds | ~$0.023 + monitoring fee | Unknown / changing access patterns |
| **S3 Standard-IA** | 99.9 % | milliseconds | ~$0.0125 | Infrequent but fast access |
| **S3 One Zone-IA** | 99.5 % | milliseconds | ~$0.01 | Data that can be recreated |
| **S3 Glacier Instant** | 99.9 % | milliseconds | ~$0.004 | Archive with occasional access |
| **S3 Glacier Flexible** | 99.99 % | 1-5 min (expedited) / 3-5 h (standard) | ~$0.0036 | Long-term archive |
| **S3 Glacier Deep Archive** | 99.99 % | 12 h (standard) / 48 h (bulk) | ~$0.00099 | Cheapest, compliance archives |
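To get a feel for the price differences, the per-GB figures from the table can be turned into a rough monthly estimate. This is a sketch under stated assumptions: the prices are the approximate values from the table, the dictionary keys are illustrative labels rather than API identifiers, and the estimate covers storage only (no request, retrieval, monitoring, or minimum-duration charges).

```python
# Approximate storage prices in USD per GB per month, copied from the table above.
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.023,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.01,
    "S3 Glacier Instant": 0.004,
    "S3 Glacier Flexible": 0.0036,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(gigabytes: float, storage_class: str) -> float:
    """Storage-only cost estimate in USD, rounded to cents."""
    return round(gigabytes * PRICE_PER_GB_MONTH[storage_class], 2)


def compare_classes(gigabytes: float) -> dict[str, float]:
    """Estimate the monthly cost of the same data volume in every class."""
    return {name: monthly_storage_cost(gigabytes, name) for name in PRICE_PER_GB_MONTH}
```

For 10,000 GB this gives roughly $230 per month in S3 Standard versus about $9.90 in Glacier Deep Archive, which is why retrieval time and retrieval fees, not storage price, usually decide the class.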
## Multi-AZ and Multi-Region Architecture
```
Region ┌──────────────────────────────┐
│ AZ-1 AZ-2 AZ-3 │
│ ┌───┐ ┌───┐ ┌───┐ │
│ │APP│──────│APP│──────│APP│ │
│ └─┬─┘ └─┬─┘ └─┬─┘ │
│ │ │ │ │
│ ┌─▼──────────▼──────────▼─┐ │
│ │ Load Balancer │ │
│ └────────────┬────────────┘ │
│ │ │
│ ┌────────────▼────────────┐ │
│ │ Database (Primary) │ │
│ │ + Read Replica │ │
│ └─────────────────────────┘ │
└──────────────────────────────┘
```
## Disaster Recovery Strategies
### DR Strategies on AWS (from least to most prepared)
| Strategy | RTO | RPO | Cost | Description |
|----------|-----|-----|------|-------------|
| **Backup & Restore** | hours | 24 h | Low | Regular data backups to S3/Glacier, restore in the DR region |
| **Pilot Light** | tens of minutes | minutes | Medium | Minimal running copy (DB, core services), scaled up on failover |
| **Warm Standby** | minutes | seconds | High | A scaled-down copy of production is running, scaled up on failover |
| **Active-Active (Multi-Region)** | seconds | < 1 s | Very high | Fully active in multiple regions, traffic routing (Route 53, Global Accelerator) |
Key books on the topic:
- **Engineering Resilient Systems on AWS** (Schwarz, Moran, Bachmeier, 2024) — hands-on labs for resilience patterns: back off and retry, multi-Region failover, circuit breaker, chaos engineering with AWS Fault Injection Simulator
- **Building Resilient Architectures on AWS** (2025) — data security, backup strategies, automation of recovery plans
### Chaos Engineering
Deliberately injecting faults into a system to verify its resilience:
- **AWS Fault Injection Service (FIS, formerly Fault Injection Simulator)** — managed fault injection for EC2, ECS, EKS, RDS
- **Tools**: Chaos Mesh (Kubernetes), Gremlin, Litmus
- **Procedure**: define a hypothesis → run the experiment → measure the impact → improve the system
- **Safety**: experiments in an isolated environment, safety controls, automatic rollback
## Cloud Design Patterns
### Strangler Fig
Gradual replacement of parts of a monolithic application with microservices.
- Legacy functionality is redirected to new services step by step
- A Strangler Fig proxy (header-based routing, feature flags) controls the traffic shift
- Benefit: continuous delivery of value without a big-bang rewrite
### Circuit Breaker
Prevents cascading failures when a dependent service is down.
- Three states: **Closed** (normal operation), **Open** (requests fail immediately), **Half-Open** (trial request after the timeout)
- Parameters: failure threshold, timeout (reset timeout), half-open max requests
- Implementations: resilience4j, Hystrix (legacy), Istio (Envoy), AWS App Mesh
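The three states can be illustrated with a small state machine. The following is a minimal single-threaded sketch, not how resilience4j or Envoy implement the pattern; the class, its parameter names, and the injectable `clock` are assumptions made for illustration. It allows exactly one trial request in the half-open state.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.clock = clock                          # injectable for testing
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; failing fast")
            self.state = "half-open"  # timeout elapsed: let one trial request through
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        self.state = "closed"
        self.failures = 0

    def _on_failure(self):
        self.failures += 1
        # A failed trial request reopens the circuit immediately.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

A caller wraps each remote call as `breaker.call(fetch_price, item_id)` (hypothetical names) and treats `CircuitOpenError` as a signal to use a fallback. Production implementations add thread safety, sliding failure windows, and metrics.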
### Saga
A distributed transaction across microservices: a sequence of local transactions with compensating actions.
- **Choreography** — each service publishes an event and the next service reacts to it (Kafka, EventBridge)
- **Orchestration** — a central orchestrator drives the steps (Step Functions, Temporal, Camunda)
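The orchestration variant can be sketched in a few lines. This is an in-process illustration only, assuming each step is a pair of plain callables (action, compensation); the function name is hypothetical, and real orchestrators such as Step Functions or Temporal also persist progress and retry steps, which this sketch does not.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, the compensations of all previously completed
    steps are executed in reverse order and False is returned.
    Returns True when every action succeeded.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True
```

The failed step itself is not compensated, since its local transaction never committed. Compensations should be idempotent, because a real orchestrator may retry them.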
### CQRS (Command Query Responsibility Segregation)
Separation of the write (Command) and read (Query) models.
- Command model: optimized for writes (normalized, transactional)
- Query model: optimized for reads (denormalized, read-optimized views)
- Eventual consistency between the models (an event bus propagates changes)
- Use case: reporting, audit logs, high-throughput systems
### Event Sourcing
Storing state as a sequence of events rather than as the current state.
- Every change is an append-only event in the event store
- Current state = a fold over all events
- Benefits: audit trail, time travel, CQRS compatibility
- Implementations: EventStoreDB, Kafka (log), DynamoDB + CDC
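The statement that the current state is a fold over all events can be made concrete with a toy account ledger. This is a sketch under assumptions: the `Event` type and the two event kinds are invented for illustration, and an in-memory list stands in for the event store.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int


def apply(balance: int, event: Event) -> int:
    """Pure transition function: previous state + event -> next state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def current_balance(events: list[Event]) -> int:
    """Rebuild the state by folding all events, starting from zero."""
    return reduce(apply, events, 0)
```

Replaying only a prefix of the log, for example `current_balance(events[:2])`, yields the state as it was at that point, which is the "time travel" benefit mentioned above.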
### Other Cloud Patterns (Wilder — Cloud Architecture Patterns)
| Pattern | Category | Description |
|---------|----------|-------------|
| **Horizontally Scaling Compute** | Scalability | Adding/removing instances according to load, elasticity |
| **Queue-Centric Workflow** | Scalability | Decoupling components through queues (SQS, RabbitMQ), asynchronous processing |
| **Auto-Scaling** | Scalability | Automatic scaling based on metrics (CPU, memory, request count) |
| **MapReduce** | Big Data | Distributed data processing (Hadoop, EMR, BigQuery) |
| **Database Sharding** | Big Data | Horizontal partitioning of data across databases |
| **Busy Signal** | Failure Handling | Graceful degradation under overload (HTTP 503, throttling, backpressure) |
| **Node Failure** | Failure Handling | Detection of and automatic recovery from a compute node failure |
| **Colocation** | Distributed Users | Placing compute close to the data to reduce latency |
| **Valet Key** | Distributed Users | Delegated access to storage (SAS tokens, S3 presigned URLs) |
| **Multi-Site Deployment** | Distributed Users | Active deployment in multiple geographic locations |
## Evolutionary Architecture
Definition (Ford, Parsons, Kua, 2022): *An evolutionary architecture supports guided, incremental change across multiple dimensions.*
### Fitness Functions
Automated checks of architectural characteristics, the architectural counterpart of tests:
| Type | Description | Example |
|------|-------------|---------|
| **Atomic** | Checks a single metric | Cyclomatic complexity < 10 |
| **Holistic** | Checks the system as a whole | End-to-end latency < 200 ms |
| **Triggered** | Run in response to an event (CI/CD commit, deployment) | API contract verification |
| **Continuous** | Runs continuously in production | Monitoring dependency freshness |
| **Static** | Code analysis without execution | SonarQube, ESLint |
| **Dynamic** | Analysis at runtime | Load tests, chaos experiments |
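The holistic example from the table (end-to-end latency under 200 ms) could be written as a fitness function roughly as follows. This is a sketch, assuming latency samples in milliseconds have already been collected; the function names, the choice of the 95th percentile, and the nearest-rank method are illustrative assumptions, not something the book prescribes.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct in 1..100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_fitness(samples_ms, threshold_ms=200, pct=95):
    """Return True when the chosen latency percentile stays under the threshold."""
    return percentile(samples_ms, pct) < threshold_ms
```

Run from a CI/CD pipeline after a load test, this would be a triggered, dynamic fitness function; evaluated continuously against production metrics, it would be a continuous one.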
### Principles of Evolutionary Architecture
1. **Incremental change** — small, safe changes enabled by CI/CD, deployment pipelines, and mature DevOps
2. **Fitness functions** — automated protection of architectural characteristics (scalability, performance, security)
3. **Coupling management** — deliberate handling of connections between components (affinity, volatility, cycles)
4. **Evolutionary data** — database migrations as a first-class citizen (evolutionary schemas, expand-contract pattern)
### Antipatterns
- **Big Design Up Front (BDUF)** — trying to design everything in advance, ignoring change
- **No Design at All** — absence of architectural thinking, purely emergent design
- **Premature Standardization** — introducing standards before the domain is understood
## Hybrid Cloud Connectivity
See also: [NETWORKING.md](NETWORKING.md) — network architecture (VPN, BGP, VPC design).
- **Site-to-Site VPN** — IPsec tunnel over the internet
- **Direct Connect / ExpressRoute / Dedicated Interconnect** — private physical connection
- **Cloud VPN / Transit Gateway** — hub-and-spoke topology
## Cost Optimization Details
### Savings Plans vs Reserved Instances
| Property | Compute Savings Plan | EC2 Instance Savings Plan | Reserved Instances |
|----------|---------------------|---------------------------|-------------------|
| Flexibility | Any instance family, region, OS | Any size and OS within one instance family and region | Specific instance configuration |
| Term | 1 or 3 years | 1 or 3 years | 1 or 3 years |
| Typical discount | ~30-50 % | ~40-60 % | ~40-60 % |
| Change instance | Yes (any) | Yes (within the family) | No (Standard RIs; Convertible RIs can be exchanged) |
| Change region | Yes | No | No |
| Payment options | All Upfront / Partial / No Upfront | All Upfront / Partial / No Upfront | All Upfront / Partial / No Upfront |
### Spot Instance Best Practices
- **Diversification** — use a mix of instance types (Spot Fleet) for higher availability
- **Graceful handling** — the application must handle the termination notice (2-minute warning)
- **Checkpointing** — save state regularly so work can restart after a Spot interruption
- **Spot Blocks** (AWS) — defined-duration Spot capacity (1-6 h); retired by AWS and no longer available
- **Use for**: batch processing, CI/CD runners, stateless microservices, ML training
- **Avoid for**: stateful workloads, databases (without special design)
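Checkpointing and graceful handling can be combined in a simple resumable loop. The sketch below makes several assumptions: work is a list of independent items, progress is stored in a local JSON file (a real job would use durable storage such as S3 or a database), and `should_stop` is a hypothetical callback that would, in practice, poll for the Spot interruption notice.

```python
import json
import os


def _save_checkpoint(path, next_index):
    """Write the checkpoint atomically: temp file first, then rename."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump({"next_index": next_index}, fh)
    os.replace(tmp_path, path)


def process_items(items, checkpoint_path, handle, should_stop=lambda: False):
    """Process items in order, resuming from the last checkpoint if one exists.

    Returns the index of the next unprocessed item (len(items) when done).
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            start = json.load(fh)["next_index"]
    for index in range(start, len(items)):
        if should_stop():          # e.g. interruption notice received
            return index
        handle(items[index])
        _save_checkpoint(checkpoint_path, index + 1)
    return len(items)
```

Because the checkpoint is written after each item, an item may be processed twice if the instance disappears between `handle` and the checkpoint write, so `handle` should be idempotent.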
## Organization and Governance
### AWS Organizations
```
Root OU
├── Security OU
│   ├── Audit Account (CloudTrail, Config)
│   └── Security Tooling Account (GuardDuty, Security Hub)
├── Infrastructure OU
│   ├── Network Account (Transit Gateway, VPN)
│   ├── Shared Services Account (AD, SSO)
│   └── Log Archive Account
├── Workloads OU
│   ├── Dev OU → individual dev accounts
│   ├── Staging OU → staging accounts
│   └── Prod OU → production accounts
└── Sandbox OU → isolated experimental accounts
```
- **SCP** (Service Control Policies) — allow-lists/deny-lists of services at the OU level
- **Tag policies** — enforcement of tagging across accounts
- **AI services opt-out** — control over the use of data in AWS AI services
### Azure Management Groups
```
Tenant Root Group
├── Platform MG
│ ├── Connectivity (hub VNet, ExpressRoute)
│ ├── Management (Log Analytics, Automation)
│ └── Identity (AD DS, PIM)
├── Application MG
│ ├── DEV (dev subscriptions)
│ ├── TEST (test subscriptions)
│ └── PROD (production subscriptions)
└── Sandbox MG
```
- **Azure Policy** — built-in and custom policies (similar to SCPs)
- **Management Group hierarchy** — up to 6 levels deep
- **Directory limits** — up to 10,000 management groups per directory
### GCP Projects
```
Organization Node
├── Folder: Platform
│   ├── Project: Shared Networking (VPC, Cloud NAT, VPN)
│   ├── Project: Security (Cloud KMS, Secret Manager, Chronicle)
│   └── Project: Monitoring (Cloud Monitoring, Logging)
├── Folder: Workloads
│   ├── Folder: Dev
│   │   └── Project: [app]-dev
│   ├── Folder: Staging
│   │   └── Project: [app]-staging
│   └── Folder: Prod
│       └── Project: [app]-prod
└── Folder: Sandbox
    └── Project: [user]-sandbox
```
- **Organization policies** — constraints at the organization/folder level
- **Resource Manager** — hierarchy: Organization → Folder → Project → Resources
- **Project limits** — max 30 projects (quota can be raised), 10k resources per project
## 12-Factor App Methodology
A methodology for building cloud-native applications (Heroku, 2011), extended by the book **Multi-Cloud Handbook for Developers** (Natarajan, Jacob, 2024).
| # | Factor | Description | Cloud implementation |
|---|--------|-------------|----------------------|
| 1 | **Codebase** | One repository, many deployments | Git + CI/CD pipeline |
| 2 | **Dependencies** | Explicitly declared dependencies | package.json, requirements.txt, Docker image |
| 3 | **Config** | Configuration in environment variables | Secrets Manager, Parameter Store, env vars |
| 4 | **Backing services** | Dependent services treated as attached resources | RDS, S3, Redis — attached via connection string |
| 5 | **Build, release, run** | Strict separation of the build, release, and run stages | CI/CD pipeline (GitHub Actions, GitLab CI) |
| 6 | **Processes** | The application runs as stateless processes | Horizontal scaling, sessions in Redis |
| 7 | **Port binding** | The service exports a port and is not embedded in a server | Express, FastAPI, Spring Boot on their own port |
| 8 | **Concurrency** | Scaling through the process model | Horizontal Pod Autoscaler (K8s), EC2 Auto Scaling |
| 9 | **Disposability** | Fast startup and graceful shutdown | Health checks, SIGTERM handling, preStop hooks |
| 10 | **Dev/Prod parity** | Keep environments as similar as possible | Docker, IaC (Terraform), same backing services |
| 11 | **Logs** | Logs as event streams | stdout/stderr → CloudWatch, ELK, Datadog |
| 12 | **Admin processes** | Administrative tasks as one-off processes | DB migrations, data backfill — run in isolation |
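Factors 3 (Config) and 7 (Port binding) are easy to show in code. The sketch below is illustrative only: the variable names `DATABASE_URL`, `PORT`, and `LOG_LEVEL` and the default values are assumptions, not something the methodology mandates.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    port: int
    log_level: str


def load_settings(env=os.environ) -> Settings:
    """Build settings from environment variables (factor 3: Config)."""
    return Settings(
        database_url=env["DATABASE_URL"],    # required: raises KeyError if missing
        port=int(env.get("PORT", "8080")),   # factor 7: the service binds its own port
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing the environment as a parameter keeps the function testable, and failing fast on a missing required variable surfaces misconfiguration at startup rather than at the first request.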
### Extensions for Multi-Cloud (Multi-Cloud Handbook for Developers)
- **API-first design** — consistent API interfaces across clouds (REST, gRPC)
- **Domain-Driven Design (DDD)** — bounded contexts mapped to cloud services
- **Service Mesh** — Istio, Linkerd for observability, traffic management, and security across clouds
- **GitOps** — declarative deployment with ArgoCD/Flux across Kubernetes clusters in different clouds
## Azure Cloud Native Architecture (Mapbook)
Based on **The Azure Cloud Native Architecture Mapbook (2nd ed.)** (Eyskens, 2025) — 40+ architecture maps across domains:
### Architecture Map Domains
| Domain | Key Azure services | Architecture patterns |
|--------|--------------------|-----------------------|
| **Infrastructure** | VNet, Azure Firewall, ExpressRoute, VPN Gateway | Hub-and-spoke, Virtual WAN, Private Link |
| **Applications** | App Service, API Management, Service Bus, Functions | Event-driven, Strangler Fig, Backend for Frontend |
| **Data** | Cosmos DB, SQL Database, Synapse, Data Lake | CQRS, Event Sourcing, Polyglot Persistence |
| **Container Orchestrators** | AKS, Azure Container Apps (ACA) | Sidecar, Ambassador, Adapter (service mesh) |
| **AI** | Azure OpenAI, Cognitive Services, ML Studio | RAG, model fine-tuning, MLOps |
| **Security** | Entra ID, Defender for Cloud, Key Vault, Sentinel | Zero Trust, Defense in depth, JIT Access |
### Cloud Adoption Framework on Azure
- **Strategy** — business case, application catalog, portfolio rationalization
- **Plan** — landing zone design, governance baseline, subscription taxonomy
- **Ready** — landing zone implementation (ALZ), Azure Policy, Networking, Identity
- **Migrate** — assessment (Azure Migrate), rehost/replatform, test and cutover
- **Govern** — cost management, policy enforcement, compliance monitoring
## Cloud Provider Comparison
Based on **Cloud Computing: AWS, Azure, Google Cloud** (Sario, 2025):
| Area | AWS | Azure | GCP |
|------|-----|-------|-----|
| **Compute** | EC2, Lambda, ECS/EKS | VMs, Functions, AKS | GCE, Cloud Functions, GKE |
| **Storage** | S3, EBS, EFS | Blob, Disk, Files | Cloud Storage, Persistent Disk, Filestore |
| **Relational DB** | RDS (MySQL, PG, SQL Server, Oracle, MariaDB) | SQL Database, MySQL/PostgreSQL | Cloud SQL (MySQL, PG, SQL Server) |
| **NoSQL DB** | DynamoDB, ElastiCache | Cosmos DB, Azure Cache for Redis | Firestore, Bigtable, Memorystore |
| **Message queue** | SQS, SNS | Service Bus, Queue Storage | Pub/Sub, Cloud Tasks |
| **Observability** | CloudWatch, X-Ray | Monitor, Application Insights | Cloud Monitoring, Cloud Trace |
| **AI/ML** | SageMaker, Bedrock | Azure ML, Azure OpenAI | Vertex AI, AutoML |
| **Pricing (compute)** | On-demand, Reserved, Spot, Savings Plan | Pay-as-you-go, Reserved, Spot | On-demand, Committed Use, Spot |
## OpenStack as Private Cloud
OpenStack is the dominant open-source platform for building private clouds (IaaS). It provides compute (Nova), networking (Neutron), and storage services (Cinder/Swift/Manila) with a unified API.
### Advantages over Commercial Solutions
- **Vendor-neutral API** — avoids lock-in (VMware, Hyper-V)
- **Multi-tenancy** — Keystone identity, RBAC, projects, quotas
- **Hybrid cloud ready** — federation with AWS/Azure/GCP, Terraform provisioning
- **Ecosystem** — hundreds of services (Heat orchestration, Magnum containers, Designate DNS)
### Suitable Scenarios
| Scenario | Key Services |
|----------|--------------|
| Data center with multi-tenancy and self-service | Nova, Neutron, Cinder, Horizon |
| Telco / NFVI / MEC | Neutron (DPDK, SR-IOV), Nova (NUMA pinning) |
| Science and HPC | Cyborg (GPU), Manila (NAS), Ironic (bare metal) |
| Academic clouds | Keystone federation, Trove (DBaaS) |
### Challenges
- Significant deployment and operations complexity
- Breaking changes and deprecations between releases (new release every six months)
- Limited enterprise support outside commercial distributions (Red Hat, Canonical, Mirantis)
## Best Practices
- Use **infrastructure as code** (Terraform, Pulumi, CDK)
- Design for **failure** — every component can fail
- Implement **defense in depth** — security at every layer
- Monitor **costs** — tagging, budget alerts, anomaly detection
- Use **managed services** where it makes sense (less operations)
- **Least privilege** for all IAM roles and policies
## Resources
Links, books and standards: [sources/cloud/sources.md](sources/cloud/sources.md)
- **Cost tagging** — assign tags for chargeback/showback (Environment, Team, Cost Center, Application)
- **Automated compliance** — AWS Config, Azure Policy, GCP Org Policies for guardrails
- **Multi-account strategy** — AWS Control Tower, Azure Landing Zones, GCP Resource Hierarchy
### Recommended Reading
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| The AI Cloud Infrastructure Blueprint | Thummarakoti, Vududala, Madupati, Kaushik | 978-1-041-16642-9 | End-to-end guide to designing, deploying, and managing AI systems on cloud platforms. Covers public/private/hybrid/multi-cloud models for AI, infrastructure for ML training and inference, MLOps. Target audience: architects, data scientists, DevOps. |
| AWS for Solutions Architects (3rd ed.) | Shrivastava, Srivastav, Thakur | 978-1-83664-193-3 | Practical guide to AWS architecture — compute (EC2, Lambda), storage (S3, EBS), databases (RDS, DynamoDB), networking, security, Well-Architected Framework, migration, cost optimization. Suitable for AWS Solutions Architect certification preparation. |
*Last revised: 2026-06-03*

270
CONNECTIVITY.en.md Normal file

@@ -0,0 +1,270 @@
# 🔌 Server connectivity — network and storage connectivity
## Ethernet — network connectivity
### Speeds and formats
| Speed | Designation | Form factor | Cabling | Standard year | Use case |
|----------|----------|-------------|---------|---------------|----------|
| **1 GbE** | 1000BASE-T | RJ45 (copper) | Cat5e/Cat6 | 1999 | Management, legacy |
| **10 GbE** | 10GBASE-T / SFP+ | RJ45 / SFP+ | Cat6 (55m) / Cat6A (100m) / DAC / SR/LR | 2006 | Common server, storage |
| **25 GbE** | 25GBASE-R | SFP28 | Cat8 (30m) / DAC (5m) / SR/LR (100m/10km) | 2016 | Standard for servers (2020+) |
| **40 GbE** | 40GBASE-R | QSFP+ | DAC (7m) / SR (150m) / LR (10km) | 2010 | Legacy, spine |
| **50 GbE** | 50GBASE-R | SFP56 | DAC / SR / LR | 2018 | Emerging server |
| **100 GbE** | 100GBASE-R | QSFP28 | DAC (3m) / SR4 (100m) / LR4 (10km) / PSM4 (500m) | 2015 | Spine, storage, AI |
| **200 GbE** | 200GBASE-R | QSFP56 | DAC / SR4 / DR4 | 2017 | AI/ML, HPC |
| **400 GbE** | 400GBASE-R | QSFP-DD / OSFP | DAC (2.5m) / SR8 (100m) / DR4 (500m) / FR4 (2km) | 2017 | AI training, hyperscale |
| **800 GbE** | 800GBASE-R | QSFP-DD800 / OSFP | DAC (2m) / SR8 (100m) / DR8 (500m) | 2024 | Next-gen AI/ML |
**Recommendations for servers (2026)**:
- **Standard**: 2× 25 GbE (management + data) or 2× 100 GbE for demanding workloads
- **AI/ML training**: 8× 400 GbE (InfiniBand preferred for GPU communication)
- **Storage**: 2× 25/100 GbE (iSCSI/NFS) or dedicated FC (16/32 Gbps)
### NIC form factor
| Form factor | PCIe lanes | Speed | Use case |
|------------|-----------|----------|----------|
| **OCP 3.0** | x8/x16 | 25/100/200 GbE | Modern servers (Dell, HPE), small form factor |
| **PCIe HHHL** | x8 | 25/50 GbE | Standard 1U/2U servers |
| **PCIe FHHL** | x16 | 100/200/400 GbE | GPU servers, high-density |
| **Mezzanine** | x8 | 10/25 GbE | Blade servers (HPE Synergy, Dell MX) |
| **LOM (LAN on Motherboard)** | — | 1/10/25 GbE | Integrated, basic connectivity |
### NIC features
| Feature | Description | Benefit |
|---------|-------|---------|
| **TSO/GRO** | TCP Segmentation Offload / Generic Receive Offload | Reduced CPU load for TCP |
| **LRO/LSO** | Large Receive/Send Offload | Older hardware counterpart of GRO (LRO); LSO is the generic name for TSO |
| **RSS** | Receive Side Scaling | Distribution of incoming packets across multiple CPU cores |
| **RPS/RFS** | Receive Packet Steering / Flow Steering | Software RSS, cache affinity |
| **XDP** | eXpress Data Path | BPF-based packet processing (DDoS, load balancer) |
| **RDMA (RoCE v2)** | RDMA over Converged Ethernet | GPU direct communication, storage (NVMe-oF) |
| **iWARP** | RDMA over TCP | RDMA without special switch (higher latency) |
| **DPDK** | Data Plane Development Kit | Userspace packet processing that bypasses the kernel network stack (VNF, vSwitch) |
| **VXLAN/NVGRE offload** | HW offload for tunneling | Overlay networking (VMware NSX, OpenStack) |
| **SR-IOV** | Single Root I/O Virtualization | Direct NIC access for VMs (VF), low latency |
| **Flow Bifurcation** | Split NIC traffic between kernel and DPDK | Concurrent management and high-speed data path |
| **PTP (IEEE 1588)** | Precision Time Protocol | Financial services, 5G, telco |
### NIC selection per workload
| Workload | Recommended NIC | Rationale |
|----------|---------------|------------|
| **Web / API servers** | 2× 25 GbE SFP28, OCP | Low cost, sufficient bandwidth |
| **Virtualization (VMware)** | 2× 25 GbE (SR-IOV, VXLAN offload) | SR-IOV for VMs, VXLAN for NSX |
| **Database (OLTP)** | 2× 25/100 GbE (RSS, low latency) | Low latency, RSS for CPU scaling |
| **Storage (NFS/iSCSI)** | 2× 25/100 GbE (RoCE v2) | RDMA for NVMe-oF, low latency |
| **Storage (FC SAN)** | 2× 32 Gb FC HBA | SAN for VMware VMFS, block storage |
| **AI/ML training** | 8× 400 GbE + InfiniBand NDR | GPU communication, data ingestion |
| **AI/ML inference** | 4× 100 GbE (RoCE v2) | Model serving, GPU direct |
| **HPC** | InfiniBand NDR 400 Gbps | MPI communication, low latency |
| **Telco / Edge** | 2× 25 GbE (DPDK, PTP) | VNF, 5G UPF, low latency |
---
## Storage connectivity
### Fibre Channel (FC) SAN
| Generation | Speed | Designation | Form factor | Reach (SMF) | Use case |
|----------|----------|----------|-------------|-------------|----------|
| **Gen 5** | 16 Gbps | 16GFC | SFP+ | 10 km | Legacy SAN |
| **Gen 6** | 32 Gbps | 32GFC | SFP28 | 10 km | Current standard |
| **Gen 7** | 64 Gbps | 64GFC | SFP56 | 10 km | Emerging, high-performance |
| **Gen 8** | 128 Gbps | 128GFC | QSFP28 | 10 km | Emerging (first production deployments) |
**HBA (Host Bus Adapter)**:
| Manufacturer | Model | Speed | PCIe | Ports | Features |
|---------|-------|----------|------|-------|----------|
| **Broadcom / Emulex** | LPe35000 | 32 GFC | PCIe 3.0 x8 | 1-2 | NVMe-FC, T10-PI, SR-IOV |
| **Broadcom / Emulex** | LPe36000 | 64 GFC | PCIe 4.0 x16 | 1-2 | NVMe-FC, FC-NVMe |
| **Marvell / QLogic** | QLE2770 | 32 GFC | PCIe 3.0 x8 | 1-2 | FC-NVMe, T10-PI |
| **Marvell / QLogic** | QLE2870 | 64 GFC | PCIe 4.0 x8 | 1-2 | NVMe-FC, 64GFC |
**FC SAN topology**:
```
Server ──HBA── FC Switch ──── Storage Array (FC port)
   │               │
   │          ┌────┴────┐
   │          │ Fabric  │
   │          └─────────┘
   ──── ISL (Inter-Switch Link) ──── backup fabric (B)
```
**Zoning** (FC):
```
Zone A: Server1_HBA1 + Storage_Port1 (production)
Zone B: Server1_HBA2 + Storage_Port2 (backup fabric)
Zone C: Backup_Server + Storage_Target (backup)
```
### iSCSI
| Property | iSCSI | Note |
|-----------|-------|----------|
| **Transport** | TCP/IP (port 3260) | Over standard Ethernet |
| **Speed** | 1/10/25/100 GbE | Same as Ethernet |
| **Initiator** | SW (OS) or HW (TOE) | SW initiator free, ~5-10 % CPU load |
| **Multipathing** | MPIO (Multipath I/O) or MC/S (Multiple Connections per Session) | Up to 8 paths, active/active or active/passive |
| **CHAP** | Authentication | Mutual CHAP recommended |
| **Jumbo frames** | Recommended MTU 9000 | Reduced CPU overhead, higher throughput |
| **Use case** | Small and medium SAN, backup, DR | Cheaper than FC, lower performance |
**iSCSI configuration**:
```
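# Software initiator (Linux, open-iscsi): discover targets on the portal
iscsiadm -m discovery -t sendtargets -p 10.0.0.100:3260
# Log in to the discovered target on that portal
iscsiadm -m node -T iqn.2024-05.storage:array01 -p 10.0.0.100:3260 --login
# Multipath (dm-multipath, RHEL-family helper): enable and start multipathd
mpathconf --enable --with_multipathd y
# Then tune /etc/multipath.conf: aliases, failback, rr_min_io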
```
### NVMe-oF (NVMe over Fabrics)
| Transport | Protocol | Latency | CPU overhead | Use case |
|-----------|----------|---------|-------------|----------|
| **NVMe over FC** | FC-NVMe (FC Gen 6/7) | <10 µs | Low | Enterprise SAN, VMware |
| **NVMe over RDMA (RoCE v2)** | RDMA (RoCE) | <5 µs | Very low | AI/ML, HPC, K8s (CSI) |
| **NVMe over TCP** | TCP | ~50 µs | Moderate (10-20 % CPU) | Standard Ethernet, no RDMA |
| **NVMe over InfiniBand** | IB RC/UC | <3 µs | Lowest | HPC, AI training |
**NVMe-oF comparison**:
| Property | FC-NVMe | NVMe/RoCE | NVMe/TCP | NVMe/IB |
|-----------|---------|-----------|----------|---------|
| **Latency (target)** | ~8 µs | ~4 µs | ~50 µs | ~3 µs |
| **Bandwidth** | 64 Gbps | 100/200 GbE | 25/100 GbE | NDR 400 Gbps |
| **Requires special HW** | FC HBA + switch | RoCE NIC + DCB switch | Standard NIC | IB HCA + switch |
| **Ecosystem** | Broadcom, Marvell | NVIDIA, Broadcom | OS built-in | NVIDIA Mellanox |
| **Use case** | VMware, enterprise SAN | AI/ML, K8s, HPC | SMB, K8s, cost-effective | HPC, large AI |
### SAS (Serial Attached SCSI)
| Generation | Speed | Cabling | Reach | Use case |
|----------|----------|---------|-------|----------|
| **SAS 3** | 12 Gbps | SAS cable (SFF-8644) | 6-10 m | Legacy storage, DAS |
| **SAS 4** | 22.5 Gbps | SAS cable (SFF-8644) | 6-10 m | Current standard |
| **SAS 5** | 45 Gbps | SAS cable (SFF-8644) | 6-10 m | Emerging |
**SAS topology**: Server → SAS HBA → SAS expander → SAS disk (point-to-point, not shared like FC)
---
## Server connectivity — decision matrix
| Workload | Primary | Secondary | Management |
|----------|----------|-----------|------------|
| **Web / API** | 2× 25 GbE (LACP) | — | 1× 1 GbE BMC |
| **Database** | 2× 25/100 GbE (RSS) | 2× 32 Gb FC (SAN) | 1× 1 GbE BMC |
| **Virtualization** | 4× 25 GbE (SR-IOV) | 2× 32 Gb FC (VMFS) | 1× 1 GbE BMC |
| **Kubernetes** | 2× 25/100 GbE | — | 1× 1 GbE BMC |
| **Storage node** | 2× 100 GbE (RoCE) | 2× 25 GbE (management) | 1× 1 GbE BMC |
| **AI training** | 8× 400 GbE + IB NDR | 4× 100 GbE (storage) | 1× 1 GbE BMC |
| **AI inference** | 4× 100 GbE (RoCE) | 2× 25 GbE (management) | 1× 1 GbE BMC |
| **HPC** | InfiniBand NDR | 2× 100 GbE (storage) | 1× 1 GbE BMC |
---
## Server NIC placement (PCIe slot optimization)
```
2U Server (GPU/AI):
┌─────────────────────────────────────────────────┐
│ PCIe 0: GPU (x16) — NVLink / InfiniBand (x16) │
│ PCIe 1: GPU (x16) — NIC 100 GbE (x16) │
│ PCIe 2: GPU (x16) │
│ PCIe 3: GPU (x16) │
│ PCIe 4: GPU (x16) │
│ PCIe 5: GPU (x16) — NIC 100 GbE (x16) │
│ PCIe 6: Storage HBA / NIC (x8) │
│ PCIe 7: Management / OCP (x8) │
└─────────────────────────────────────────────────┘
1U Standard:
┌─────────────────────────────────┐
│ OCP: 2× 25 GbE (management) │
│ PCIe 0: NIC 25 GbE (x8) │
│ PCIe 1: Storage HBA / FC (x8) │
│ PCIe 2: GPU (x16, optional) │
│ PCIe 3: NVMe (x4, M.2) │
└─────────────────────────────────┘
```
### NVIDIA Mellanox ConnectX NICs
NVIDIA (formerly Mellanox) is a leading vendor of network adapters for AI/HPC and cloud data centers.
| Model | PCIe | Max speed | Form factor | Key features |
|-------|------|-------------|-------------|------------------|
| **ConnectX-5** | PCIe 3.0 x16 | 100 GbE (dual) | HHHL | RoCE, NVMe-oF target offload, MPI offload |
| **ConnectX-6 Dx** | PCIe 4.0 x16 | 200 GbE (1-port) / 100 GbE (2-port) | HHHL, OCP 3.0 | ASAP² vSwitch offload, IPsec/TLS inline crypto, AES-XTS, 215 Mpps DPDK |
| **ConnectX-6 Lx** | PCIe 4.0 x8 | 25 GbE (dual) | HHHL, OCP 3.0 | RoCE, Secure Boot, low-power |
| **ConnectX-7** | PCIe 5.0 x16 | 400 GbE (1-port) / 200 GbE (2-port) | HHHL | NDR InfiniBand + 400GbE, GPUDirect, SHARP |
| **ConnectX-8** | PCIe 6.0 x16 | 800 GbE (1-port) / 400 GbE (2-port) | HHHL | XDR InfiniBand, sub-500ns latency, in-network computing, multi-host |
**Platforms**: Spectrum-X Ethernet (end-to-end AI networking), Quantum InfiniBand, BlueField DPU.
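When matching an adapter to a slot, it helps to check whether the PCIe link can feed all ports at line rate. The sketch below is a rough back-of-the-envelope check, assuming 128b/130b line encoding (PCIe 3.0 to 5.0) and ignoring TLP and flow-control overhead, so real throughput is somewhat lower. The function names are illustrative, not part of any vendor tool.

```python
# Approximate PCIe bandwidth per direction vs. aggregate NIC port speed.
# Assumption: 128b/130b encoding (PCIe 3.0-5.0); protocol overhead is ignored.
GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}  # transfer rate per lane in GT/s


def pcie_gbps(gen: int, lanes: int) -> float:
    """Raw bandwidth of a PCIe link in Gb/s, per direction."""
    return GT_PER_LANE[gen] * lanes * 128 / 130


def host_limited(gen: int, lanes: int, ports: int, port_gbps: float) -> bool:
    """True when the ports together can carry more than the PCIe link."""
    return ports * port_gbps > pcie_gbps(gen, lanes)


# Port counts and speeds taken from the ConnectX table above.
adapters = [
    ("ConnectX-5, 2x 100 GbE", 3, 16, 2, 100),
    ("ConnectX-6 Dx, 2x 100 GbE", 4, 16, 2, 100),
    ("ConnectX-6 Lx, 2x 25 GbE", 4, 8, 2, 25),
    ("ConnectX-7, 1x 400 GbE", 5, 16, 1, 400),
]
for name, gen, lanes, ports, speed in adapters:
    link = pcie_gbps(gen, lanes)
    verdict = "host-limited" if host_limited(gen, lanes, ports, speed) else "ok"
    print(f"{name}: PCIe {gen}.0 x{lanes} ~{link:.0f} Gb/s -> {verdict}")
```

Under these assumptions a dual-port 100 GbE card in a PCIe 3.0 x16 slot cannot drive both ports at line rate simultaneously, which is one reason to prefer PCIe 4.0 or newer slots for 2× 100 GbE.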
### Broadcom Emulex FC HBA
| Model | Speed | PCIe | Ports | Features |
|-------|----------|------|-------|----------|
| **LPe35000** (Gen 7) | 32 GFC | PCIe 3.0 x8 | 1-2 | NVMe-FC, T10-PI (DIF), SR-IOV, Silicon Root of Trust |
| **LPe35002** (Gen 7) | 32 GFC | PCIe 3.0 x8 | 2 | NVMe-FC, Secure Boot, digitally signed firmware |
| **LPe36000** (Gen 7) | 64 GFC | PCIe 4.0 x16 | 1-2 | First 64GFC HBA on the market, 10M IOPS, 3× better latency than Gen 6 |
**Key features**: NVMe over FC support, T10 DIF (Data Integrity Field), an MTBF of 10 million hours, and NIST SP 800-193 compliance. According to the vendor, Gen 7 delivers up to 10M IOPS and 3× lower latency than Gen 6.
### NVMe-oF specification
NVMe over Fabrics (NVMe-oF) extends the NVMe protocol from local PCIe to network transports. The first specification (1.0) was released in June 2016; NVMe-oF is now part of the NVMe 2.x specification family (NVMe 2.3, August 2025). Supported transports:
| Transport | Specification | Use case |
|-----------|------------|----------|
| **NVMe over PCIe** | NVMe Base | Local NVMe SSD |
| **NVMe over RDMA** (RoCE, InfiniBand, iWARP) | NVMe Transport | AI/ML, HPC, lowest latency <5 µs |
| **NVMe over TCP** | NVMe Transport | Standard Ethernet, no RDMA, latency ~50 µs |
| **NVMe over FC** (FC-NVMe) | INCITS T11 | Enterprise SAN, FC fabric |
The NVMe 2.x family adds further command sets: Zoned Namespaces (ZNS, since NVMe 2.0) and the Computational Programs and Subsystem Local Memory (SLM) command sets (since NVMe 2.1). NVMe-MI defines the management interface.
### Dell PowerEdge R760 — NIC placement
Dell R760 server supports:
- **OCP 3.0** adapters (up to 2×) — 1/10/25/100 GbE
- **PCIe Gen5** slots — 8× slots (6× FHHL + 2× LP)
- **LOM** — 2× 1 GbE Broadcom 5720 on motherboard
- Maximum NIC speed: 100 GbE (QSFP56)
- Supported types: RJ45, SFP+, SFP28, QSFP28, QSFP56
Recommended configurations:
- Standard: OCP 3.0 2× 25 GbE + PCIe storage HBA
- AI/ML: PCIe 100 GbE (riser config 1, slot 1-2) + GPU in other slots
### HPE Gen11 NIC options
HPE ProLiant Gen11 (DL360/DL380) supports:
- **OCP 3.0** slots (up to 2) — 10/25/100/200 GbE (Broadcom, Intel, NVIDIA Mellanox)
- **PCIe Gen5** adapters — 8× slots (DL380) / 3× slots (DL360)
- **iLO 6** dedicated management port (1 GbE)
- Supported NICs: Broadcom BCM57412 (10GbE), BCM57504 (25GbE), NVIDIA ConnectX-6 Dx (100GbE)
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended literature
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| AI Data Center Network Design and Technologies (1st ed., 2026) | Subramaniam, Styszynski, Tambakuwala | 978-0-13-543628-8 | First vendor-agnostic guide to network design for AI training and inference. Covers high-radix fabric, lossless Ethernet/IP, UEC technologies, cooling and power for AI clusters. Authors from HPE Juniper Networking. |
*Last revision: 2026-06-03*

270
CONNECTIVITY.md Normal file
View File

@@ -0,0 +1,270 @@
# 🔌 Server connectivity — network and storage connectivity
## Ethernet — network connectivity
### Speeds and form factors
| Speed | Designation | Form factor | Cabling | Standard year | Use case |
|----------|----------|-------------|---------|---------------|----------|
| **1 GbE** | 1000BASE-T | RJ45 (copper) | Cat5e/Cat6 | 1999 | Management, legacy |
| **10 GbE** | 10GBASE-T / SFP+ | RJ45 / SFP+ | Cat6 (up to 55 m) / Cat6A (100 m) / DAC / SR/LR | 2006 | General-purpose server, storage |
| **25 GbE** | 25GBASE-R | SFP28 | Cat8 (30m) / DAC (5m) / SR/LR (100m/10km) | 2016 | Standard for servers (2020+) |
| **40 GbE** | 40GBASE-R | QSFP+ | DAC (7m) / SR (150m) / LR (10km) | 2010 | Legacy, spine |
| **50 GbE** | 50GBASE-R | SFP56 | DAC / SR / LR | 2018 | Emerging server |
| **100 GbE** | 100GBASE-R | QSFP28 | DAC (3m) / SR4 (100m) / LR4 (10km) / PSM4 (500m) | 2015 | Spine, storage, AI |
| **200 GbE** | 200GBASE-R | QSFP56 | DAC / SR4 / DR4 | 2019 | AI/ML, HPC |
| **400 GbE** | 400GBASE-R | QSFP-DD / OSFP | DAC (2.5m) / SR8 (100m) / DR4 (500m) / FR4 (2km) | 2017 | AI training, hyperscale |
| **800 GbE** | 800GBASE-R | QSFP-DD800 / OSFP | DAC (2m) / SR8 (100m) / DR8 (500m) | 2024 | Next-gen AI/ML |
**Server recommendations (2026)**:
- **Standard**: 2× 25 GbE (management + data) or 2× 100 GbE for demanding workloads
- **AI/ML training**: 8× 400 GbE (InfiniBand preferred for GPU communication)
- **Storage**: 2× 25/100 GbE (iSCSI/NFS) or dedicated FC (16/32 Gbps)
### NIC form factor
| Form factor | PCIe lanes | Speed | Use case |
|------------|-----------|----------|----------|
| **OCP 3.0** | x8/x16 | 25/100/200 GbE | Modern servers (Dell, HPE), small form factor |
| **PCIe HHHL** | x8 | 25/50 GbE | Standard 1U/2U servers |
| **PCIe FHHL** | x16 | 100/200/400 GbE | GPU servers, high-density |
| **Mezzanine** | x8 | 10/25 GbE | Blade servers (HPE Synergy, Dell MX) |
| **LOM (LAN on Motherboard)** | — | 1/10/25 GbE | Integrated, basic connectivity |
### NIC features
| Feature | Description | Benefit |
|---------|-------|---------|
| **TSO/GRO** | TCP Segmentation Offload / Generic Receive Offload | Lower CPU load for TCP |
| **LRO/LSO** | Large Receive/Send Offload | Legacy counterpart of TSO/GRO |
| **RSS** | Receive Side Scaling | Distributes incoming packets across multiple CPU cores |
| **RPS/RFS** | Receive Packet Steering / Receive Flow Steering | Software RSS, cache affinity |
| **XDP** | eXpress Data Path | eBPF-based packet processing (DDoS mitigation, load balancing) |
| **RDMA (RoCE v2)** | RDMA over Converged Ethernet | GPU direct communication, storage (NVMe-oF) |
| **iWARP** | RDMA over TCP | RDMA without special switch support (higher latency) |
| **DPDK** | Data Plane Development Kit | User-space packet processing (VNF, vSwitch) |
| **VXLAN/NVGRE offload** | HW offload for tunneling | Overlay networking (VMware NSX, OpenStack) |
| **SR-IOV** | Single Root I/O Virtualization | Direct NIC access for VMs (VF), low latency |
| **Flow Bifurcation** | Split NIC traffic between kernel and DPDK | Concurrent management and high-speed data path |
| **PTP (IEEE 1588)** | Precision Time Protocol | Financial services, 5G, telco |
### NIC selection per workload
| Workload | Recommended NIC | Rationale |
|----------|---------------|------------|
| **Web / API servers** | 2× 25 GbE SFP28, OCP | Low cost, sufficient bandwidth |
| **Virtualization (VMware)** | 2× 25 GbE (SR-IOV, VXLAN offload) | SR-IOV for VMs, VXLAN for NSX |
| **Database (OLTP)** | 2× 25/100 GbE (RSS, low latency) | Low latency, RSS for CPU scaling |
| **Storage (NFS/iSCSI)** | 2× 25/100 GbE (RoCE v2) | RDMA for NVMe-oF, low latency |
| **Storage (FC SAN)** | 2× 32 Gb FC HBA | SAN for VMware VMFS, block storage |
| **AI/ML training** | 8× 400 GbE + InfiniBand NDR | GPU communication, data ingestion |
| **AI/ML inference** | 4× 100 GbE (RoCE v2) | Model serving, GPU direct |
| **HPC** | InfiniBand NDR 400 Gbps | MPI communication, low latency |
| **Telco / Edge** | 2× 25 GbE (DPDK, PTP) | VNF, 5G UPF, low latency |
---
## Storage connectivity
### Fibre Channel (FC) SAN
| Generation | Speed | Designation | Form factor | Reach (SMF) | Use case |
|----------|----------|----------|-------------|-------------|----------|
| **Gen 5** | 16 Gbps | 16GFC | SFP+ | 10 km | Legacy SAN |
| **Gen 6** | 32 Gbps | 32GFC | SFP28 | 10 km | Current standard |
| **Gen 7** | 64 Gbps | 64GFC | SFP56 | 10 km | Emerging, high-performance |
| **Gen 8** | 128 Gbps | 128GFC | QSFP28 | 10 km | Emerging (first production deployments) |
**HBA (Host Bus Adapter)**:
| Manufacturer | Model | Speed | PCIe | Ports | Features |
|---------|-------|----------|------|-------|----------|
| **Broadcom / Emulex** | LPe35000 | 32 GFC | PCIe 3.0 x8 | 1-2 | FC-NVMe, T10-PI, SR-IOV |
| **Broadcom / Emulex** | LPe36000 | 64 GFC | PCIe 4.0 x16 | 1-2 | FC-NVMe |
| **Marvell / QLogic** | QLE2770 | 32 GFC | PCIe 3.0 x8 | 1-2 | FC-NVMe, T10-PI |
| **Marvell / QLogic** | QLE2870 | 64 GFC | PCIe 4.0 x8 | 1-2 | FC-NVMe |
**FC SAN topology**:
```
Server ──HBA── FC Switch ──── Storage Array (FC port)
   │               │
   │          ┌────┴────┐
   │          │ Fabric  │
   │          └─────────┘
   ──── ISL (Inter-Switch Link) ──── backup fabric (B)
```
**Zoning** (FC):
```
Zone A: Server1_HBA1 + Storage_Port1 (production)
Zone B: Server1_HBA2 + Storage_Port2 (backup fabric)
Zone C: Backup_Server + Storage_Target (backup)
```
### iSCSI
| Property | iSCSI | Note |
|-----------|-------|----------|
| **Transport** | TCP/IP (port 3260) | Over standard Ethernet |
| **Speed** | 1/10/25/100 GbE | Same as Ethernet |
| **Initiator** | SW (OS) or HW (TOE) | SW initiator free, ~5-10 % CPU load |
| **Multipathing** | MPIO (Multipath I/O) or MC/S (Multiple Connections per Session) | Up to 8 paths, active/active or active/passive |
| **CHAP** | Authentication | Mutual CHAP recommended |
| **Jumbo frames** | Recommended MTU 9000 | Reduced CPU overhead, higher throughput |
| **Use case** | Small and medium SAN, backup, DR | Cheaper than FC, lower performance |
**iSCSI configuration**:
```
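# Software initiator (Linux, open-iscsi): discover targets on the portal
iscsiadm -m discovery -t sendtargets -p 10.0.0.100:3260
# Log in to the discovered target on that portal
iscsiadm -m node -T iqn.2024-05.storage:array01 -p 10.0.0.100:3260 --login
# Multipath (dm-multipath, RHEL-family helper): enable and start multipathd
mpathconf --enable --with_multipathd y
# Then tune /etc/multipath.conf: aliases, failback, rr_min_io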
```
### NVMe-oF (NVMe over Fabrics)
| Transport | Protocol | Latency | CPU overhead | Use case |
|-----------|----------|---------|-------------|----------|
| **NVMe over FC** | FC-NVMe (FC Gen 6/7) | <10 µs | Low | Enterprise SAN, VMware |
| **NVMe over RDMA (RoCE v2)** | RDMA (RoCE) | <5 µs | Very low | AI/ML, HPC, K8s (CSI) |
| **NVMe over TCP** | TCP | ~50 µs | Moderate (10-20 % CPU) | Standard Ethernet, no RDMA |
| **NVMe over InfiniBand** | IB RC/UC | <3 µs | Lowest | HPC, AI training |
**NVMe-oF comparison**:
| Property | FC-NVMe | NVMe/RoCE | NVMe/TCP | NVMe/IB |
|-----------|---------|-----------|----------|---------|
| **Latency (target)** | ~8 µs | ~4 µs | ~50 µs | ~3 µs |
| **Bandwidth** | 64 Gbps | 100/200 GbE | 25/100 GbE | NDR 400 Gbps |
| **Requires special HW** | FC HBA + switch | RoCE NIC + DCB switch | Standard NIC | IB HCA + switch |
| **Ecosystem** | Broadcom, Marvell | NVIDIA, Broadcom | OS built-in | NVIDIA Mellanox |
| **Use case** | VMware, enterprise SAN | AI/ML, K8s, HPC | SMB, K8s, cost-effective | HPC, large AI |
### SAS (Serial Attached SCSI)
| Generation | Speed | Cabling | Reach | Use case |
|----------|----------|---------|-------|----------|
| **SAS 3** | 12 Gbps | SAS cable (SFF-8644) | 6-10 m | Legacy storage, DAS |
| **SAS 4** | 22.5 Gbps | SAS cable (SFF-8644) | 6-10 m | Current standard |
| **SAS 5** | 45 Gbps | SAS cable (SFF-8644) | 6-10 m | Emerging |
**SAS topology**: Server → SAS HBA → SAS expander → SAS disk (point-to-point, not shared like FC)
---
## Server connectivity — decision matrix
| Workload | Primary | Secondary | Management |
|----------|----------|-----------|------------|
| **Web / API** | 2× 25 GbE (LACP) | — | 1× 1 GbE BMC |
| **Database** | 2× 25/100 GbE (RSS) | 2× 32 Gb FC (SAN) | 1× 1 GbE BMC |
| **Virtualization** | 4× 25 GbE (SR-IOV) | 2× 32 Gb FC (VMFS) | 1× 1 GbE BMC |
| **Kubernetes** | 2× 25/100 GbE | — | 1× 1 GbE BMC |
| **Storage node** | 2× 100 GbE (RoCE) | 2× 25 GbE (management) | 1× 1 GbE BMC |
| **AI training** | 8× 400 GbE + IB NDR | 4× 100 GbE (storage) | 1× 1 GbE BMC |
| **AI inference** | 4× 100 GbE (RoCE) | 2× 25 GbE (management) | 1× 1 GbE BMC |
| **HPC** | InfiniBand NDR | 2× 100 GbE (storage) | 1× 1 GbE BMC |
---
## Server NIC placement (PCIe slot optimization)
```
2U Server (GPU/AI):
┌─────────────────────────────────────────────────┐
│ PCIe 0: GPU (x16) — NVLink / InfiniBand (x16) │
│ PCIe 1: GPU (x16) — NIC 100 GbE (x16) │
│ PCIe 2: GPU (x16) │
│ PCIe 3: GPU (x16) │
│ PCIe 4: GPU (x16) │
│ PCIe 5: GPU (x16) — NIC 100 GbE (x16) │
│ PCIe 6: Storage HBA / NIC (x8) │
│ PCIe 7: Management / OCP (x8) │
└─────────────────────────────────────────────────┘
1U Standard:
┌─────────────────────────────────┐
│ OCP: 2× 25 GbE (management) │
│ PCIe 0: NIC 25 GbE (x8) │
│ PCIe 1: Storage HBA / FC (x8) │
│ PCIe 2: GPU (x16, optional) │
│ PCIe 3: NVMe (x4, M.2) │
└─────────────────────────────────┘
```
### NVIDIA Mellanox ConnectX NICs
NVIDIA (formerly Mellanox) is a leading vendor of network adapters for AI/HPC and cloud data centers.
| Model | PCIe | Max speed | Form factor | Key features |
|-------|------|-------------|-------------|------------------|
| **ConnectX-5** | PCIe 3.0 x16 | 100 GbE (dual) | HHHL | RoCE, NVMe-oF target offload, MPI offload |
| **ConnectX-6 Dx** | PCIe 4.0 x16 | 200 GbE (1-port) / 100 GbE (2-port) | HHHL, OCP 3.0 | ASAP² vSwitch offload, IPsec/TLS inline crypto, AES-XTS, 215 Mpps DPDK |
| **ConnectX-6 Lx** | PCIe 4.0 x8 | 25 GbE (dual) | HHHL, OCP 3.0 | RoCE, Secure Boot, low-power |
| **ConnectX-7** | PCIe 5.0 x16 | 400 GbE (1-port) / 200 GbE (2-port) | HHHL | NDR InfiniBand + 400GbE, GPUDirect, SHARP |
| **ConnectX-8** | PCIe 6.0 x16 | 800 GbE (1-port) / 400 GbE (2-port) | HHHL | XDR InfiniBand, sub-500ns latency, in-network computing, multi-host |
**Platforms**: Spectrum-X Ethernet (end-to-end AI networking), Quantum InfiniBand, BlueField DPU.
### Broadcom Emulex FC HBA
| Model | Speed | PCIe | Ports | Features |
|-------|----------|------|-------|----------|
| **LPe35000** (Gen 7) | 32 GFC | PCIe 3.0 x8 | 1-2 | NVMe-FC, T10-PI (DIF), SR-IOV, Silicon Root of Trust |
| **LPe35002** (Gen 7) | 32 GFC | PCIe 3.0 x8 | 2 | NVMe-FC, Secure Boot, digitally signed firmware |
| **LPe36000** (Gen 7) | 64 GFC | PCIe 4.0 x16 | 1-2 | First 64GFC HBA on the market, 10M IOPS, 3× better latency than Gen 6 |
**Key features**: NVMe over FC support, T10 DIF (Data Integrity Field), an MTBF of 10 million hours, and NIST SP 800-193 compliance. According to the vendor, Gen 7 delivers up to 10M IOPS and 3× lower latency than Gen 6.
### NVMe-oF specification
NVMe over Fabrics (NVMe-oF) extends the NVMe protocol from local PCIe to network transports. The first specification (1.0) was released in June 2016; NVMe-oF is now part of the NVMe 2.x specification family (NVMe 2.3, August 2025). Supported transports:
| Transport | Specification | Use case |
|-----------|------------|----------|
| **NVMe over PCIe** | NVMe Base | Local NVMe SSD |
| **NVMe over RDMA** (RoCE, InfiniBand, iWARP) | NVMe Transport | AI/ML, HPC, lowest latency <5 µs |
| **NVMe over TCP** | NVMe Transport | Standard Ethernet, no RDMA, latency ~50 µs |
| **NVMe over FC** (FC-NVMe) | INCITS T11 | Enterprise SAN, FC fabric |
The NVMe 2.x family adds further command sets: Zoned Namespaces (ZNS, since NVMe 2.0) and the Computational Programs and Subsystem Local Memory (SLM) command sets (since NVMe 2.1). NVMe-MI defines the management interface.
### Dell PowerEdge R760 — NIC placement
The Dell R760 server supports:
- **OCP 3.0** adapters (up to 2×) — 1/10/25/100 GbE
- **PCIe Gen5** slots — 8× slots (6× FHHL + 2× LP)
- **LOM** — 2× 1 GbE Broadcom 5720 on motherboard
- Maximum NIC speed: 100 GbE (QSFP56)
- Supported types: RJ45, SFP+, SFP28, QSFP28, QSFP56
Recommended configurations:
- Standard: OCP 3.0 2× 25 GbE + PCIe storage HBA
- AI/ML: PCIe 100 GbE (riser config 1, slot 1-2) + GPU in other slots
### HPE Gen11 NIC options
HPE ProLiant Gen11 (DL360/DL380) supports:
- **OCP 3.0** slots (up to 2) — 10/25/100/200 GbE (Broadcom, Intel, NVIDIA Mellanox)
- **PCIe Gen5** adapters — 8× slots (DL380) / 3× slots (DL360)
- **iLO 6** dedicated management port (1 GbE)
- Supported NICs: Broadcom BCM57412 (10GbE), BCM57504 (25GbE), NVIDIA ConnectX-6 Dx (100GbE)
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended literature
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| AI Data Center Network Design and Technologies (1st ed., 2026) | Subramaniam, Styszynski, Tambakuwala | 978-0-13-543628-8 | First vendor-agnostic guide to network design for AI training and inference. Covers high-radix fabric, lossless Ethernet/IP, UEC technologies, cooling and power for AI clusters. Authors from HPE Juniper Networking. |
*Last revision: 2026-06-03*

101
DATABASE-ENGINES.en.md Normal file
View File

@@ -0,0 +1,101 @@
# ⚙️ Storage Engines and Transaction Models
## B-Tree vs LSM-Tree
Two dominant storage engine approaches in modern databases.
| Property | B-Tree | LSM-Tree |
|-----------|--------|----------|
| **Write** | In-place update (random I/O on page) | Append-only (sequential I/O) |
| **Read** | Fast (directly in page, O(log N)) | Slower (merge from multiple SSTables, bloom filters) |
| **Write amplification** | Lower (page rewrite) | Higher (compaction, SSTable merge) |
| **Read amplification** | Lower (1 page read) | Higher (multiple SSTables to search) |
| **Compression** | Worse (page fragmentation) | Better (compact SSTable, block compression) |
| **Range scan** | Fast (linked list at leaf level) | Fast (SSTables are sorted) |
| **Space amplification** | Low | Higher (awaits compaction) |
| **Typical DBs** | PostgreSQL, MySQL (InnoDB), SQLite, Oracle, MongoDB (WiredTiger, B-tree by default) | Cassandra, RocksDB, LevelDB, ScyllaDB |
### When to Choose Which Engine
**B-Tree** — when:
- You need fast point lookups (PK lookup, unique ID)
- Workload is read-heavy (most queries = SELECT by key)
- You need range queries on primary key
- Transactional workload (OLTP) with short queries
**LSM-Tree** — when:
- You need high write throughput (write-heavy)
- Append-only workload (logs, time-series, IoT)
- Data compression is important (saves space)
- Write amplification is not a concern (sufficient I/O capacity)
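The LSM write and read paths described above can be sketched as a toy in-memory structure. This is only an illustration under simplifying assumptions (no disk I/O, no bloom filters, no tombstones); the class name `TinyLSM` and its methods are hypothetical and do not correspond to any real engine's API.

```python
from bisect import bisect_left


class TinyLSM:
    """Toy LSM-tree: an in-memory memtable plus immutable sorted runs (SSTables)."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable = {}   # most recent writes
        self.sstables = []   # sorted lists of (key, value), oldest run first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch the memtable; nothing is updated in place on "disk".
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # A full memtable is written out as one new sorted, immutable run.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Read amplification: memtable first, then every run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):
            i = bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; newer values overwrite older ones.
        merged = {}
        for run in self.sstables:
            merged.update(run)
        self.sstables = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
for k, v in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.put(k, v)
print(db.get("a"), len(db.sstables))
db.compact()
print(db.get("a"), len(db.sstables))
```

Before compaction the old value of `a` still occupies space in the first run, which is the space amplification mentioned in the table; compaction trades extra writes for fewer runs to search.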
## Write-Ahead Log (WAL)
An append-only log that guarantees no committed change is lost in a crash:
```text
1. Transaction BEGIN → WAL entry
2. Data modification → WAL entry (before page modification)
3. Transaction COMMIT → flush WAL to disk (COMMIT confirmed only after flush)
4. Checkpoint → flush dirty pages → WAL up to checkpoint point can be deleted
```
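- **Write-ahead** — the WAL record is written (and flushed) before the data page is modified on disk
- **Checkpoint** — the point up to which all changes are already in the data files; recovery replays the WAL from the last checkpoint onward
- **Redo log** (InnoDB) — similar concept, used to replay changes that had not yet reached the data files
- **Group commit** — multiple transactions share a single WAL flush (higher throughput)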
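The write-ahead rule can be illustrated with a minimal key-value store that logs every change durably before applying it, and rebuilds its state from the log on startup. This is a sketch under stated assumptions: the JSON-lines log format, the class name `TinyWAL`, and the absence of checkpoints and transactions are all simplifications chosen for the example.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store: every change is logged and fsynced before it is applied."""

    def __init__(self, path: str):
        self.path = path
        self.data = {}
        self._replay()                      # crash recovery: redo the log
        self.log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                try:
                    rec = json.loads(line)
                except json.JSONDecodeError:
                    break                   # torn last record: stop replaying
                self.data[rec["key"]] = rec["value"]

    def put(self, key, value):
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())         # durable before the write is acknowledged
        self.data[key] = value              # only now modify the "data page"

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as tmp:
    wal_path = os.path.join(tmp, "wal.log")
    db = TinyWAL(wal_path)
    db.put("balance", 100)
    db.put("balance", 80)
    db.close()                              # stand-in for a crash: memory state is gone
    recovered = TinyWAL(wal_path)
    print(recovered.data)
    recovered.close()
```

A real engine would additionally write checkpoints so that the log can be truncated, and would batch several commits into one `fsync` (group commit).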
## MVCC (Multi-Version Concurrency Control)
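Each transaction reads from a consistent snapshot (taken at transaction start or, under Read Committed, at statement start), so readers do not block writers. Old row versions are kept until no transaction can still see them, either in the table itself or in a separate undo/version store.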
### Implementations
| DB | Mechanism | Vacuum/GC | Isolation Levels |
|----|------------|-----------|-----------------|
| **PostgreSQL** | Heap tuple (xmin/xmax) — old versions in main table | VACUUM (autovacuum) | RU, RC, RR, Serializable (SSI) |
| **MySQL InnoDB** | Undo log — old versions in undo segments | Purge (automatic) | RU, RC, RR, Serializable |
| **MSSQL** | Tempdb version store | Automatic (row versioning) | RU, RC (optionally RCSI), RR, Snapshot, Serializable |
| **Oracle** | Undo tablespace | Automatic (undo retention) | RC, Serializable, Read-only |
| **MongoDB WiredTiger** | MVCC at document level | Automatic (eviction) | Snapshot isolation |
| **Cassandra** | No MVCC (value overwrite) | Compaction (merge SSTable) | — |
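The PostgreSQL-style visibility rule (xmin/xmax) can be sketched as follows. This is a deliberately simplified model, assuming a snapshot is just the set of transaction IDs that had committed when it was taken; it ignores a transaction's own writes, aborted transactions, and hint bits. The names `Version`, `visible`, and `read` are illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Version:
    value: str
    xmin: int                   # transaction that created this row version
    xmax: Optional[int] = None  # transaction that deleted/replaced it, if any


def visible(v: Version, snapshot: FrozenSet[int]) -> bool:
    """A version is visible if its creator committed before the snapshot
    and its deleter (if any) did not."""
    created = v.xmin in snapshot
    deleted = v.xmax is not None and v.xmax in snapshot
    return created and not deleted


def read(versions: List[Version], snapshot: FrozenSet[int]) -> Optional[str]:
    """Return the value of the one version visible to this snapshot, if any."""
    for v in versions:
        if visible(v, snapshot):
            return v.value
    return None


# Transaction 1 inserted the row; transaction 2 later updated it.
row = [Version("v1", xmin=1, xmax=2), Version("v2", xmin=2)]
print(read(row, frozenset({1})))     # snapshot taken before txn 2 committed
print(read(row, frozenset({1, 2})))  # snapshot taken after txn 2 committed
```

Both versions stay in the table until no snapshot can see `v1` any more, which is what VACUUM (or purge in undo-based engines) later cleans up.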
### Anomalies
| Level | Dirty Read | Non-repeatable Read | Phantom Read | Serialization Anomaly |
|--------|-----------|---------------------|-------------|----------------------|
| **Read Uncommitted** | Yes | Yes | Yes | Yes |
| **Read Committed** | No | Yes | Yes | Yes |
| **Repeatable Read** | No | No | Allowed by the SQL standard; prevented in PostgreSQL (snapshot) and MySQL InnoDB (next-key locking) | Yes |
| **Serializable** | No | No | No | No |
- **Dirty Read** — reading data from an uncommitted transaction
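- **Non-repeatable Read** — re-reading the same row within one transaction returns different values (another transaction updated it in the meantime)
- **Phantom Read** — re-running the same query within one transaction returns additional rows (another transaction inserted matching data)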
- **Serialization Anomaly** — result of transactions is not equivalent to any serial order
## Index Types
| Type | Algorithm | Use Case | DB Support |
|-----|-----------|----------|------------|
| **B-tree** | Balanced tree | `=`, `<`, `>`, `BETWEEN`, `IN`, `LIKE (prefix)` | All (default) |
| **Hash** | Hash table | Only `=` (equality) | PostgreSQL (hash index), MySQL (MEMORY) |
| **GiST** | Generalized Search Tree | Geometry, full-text, intervals, IP ranges | PostgreSQL |
| **GIN** | Generalized Inverted Index | JSONB, arrays, full-text (contains, overlaps) | PostgreSQL |
| **BRIN** | Block Range Index | Time-series, logs (data in order) — extremely small | PostgreSQL |
| **SP-GiST** | Space-partitioned | Quadrants, KD-tree, radix tree | PostgreSQL |
| **R-tree** | Spatial tree | Geospatial data | MySQL (MyISAM/InnoDB), SQLite |
| **Clustered index** | B-tree + data in leaves | PK lookup (InnoDB) — data stored with index | MySQL InnoDB, MSSQL |
| **Full-text** | Inverted index | Text search (stemming, relevance) | MySQL, PostgreSQL, MSSQL |
## Resources
Links, books and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended Reading
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| Database Internals | Alex Petrov | 978-1492040346 | In-depth explanation of storage engines (B-Tree, LSM-Tree, WAL, MVCC), distributed systems (partitioning, replication, consensus) |
*Last revision: 2026-06-03*

323
DATABASES.en.md Normal file
View File

@@ -0,0 +1,323 @@
# 🗄️ Database Architecture
## Database Classification
### Relational (SQL)
| DB | License | Use Case | Details |
|----|---------|----------|--------|
| **PostgreSQL** | Open source | Universal, geospatial, analytics, AI | [POSTGRESQL.md](POSTGRESQL.md) |
| **MySQL / MariaDB** | Open source | Web, LAMP stack, e-commerce | [MYSQL.md](MYSQL.md) |
| **Microsoft SQL Server** | Proprietary | Enterprise .NET, Windows ecosystem | — |
| **Oracle DB** | Proprietary | Enterprise, finance, mainframe, RAC cluster | [ORACLE.md](ORACLE.md) |
| **Amazon Aurora** | Managed | MySQL/PostgreSQL compatible, cloud-native | — |
### NoSQL
| Type | DB | Use Case | Details |
|-----|----|----------|--------|
| **Document** | MongoDB, Couchbase | JSON data, flexible schema | [MONGODB.md](MONGODB.md) |
| **Key-Value / Cache** | Redis, Memcached, DynamoDB | Cache, session store, real-time | [REDIS.md](REDIS.md) |
| **Wide-column** | Cassandra, ScyllaDB | Time-series, IoT, big data | [CASSANDRA.md](CASSANDRA.md) |
| **Vector** | Pinecone, Qdrant, Milvus, pgvector | Embeddings, RAG, semantic search | [VEKTOROVE-DB.md](VEKTOROVE-DB.md) |
| **Graph** | Neo4j, Dgraph | Relationships, recommendations, social graphs | — |
### Storage Engines
Common concepts across databases: [DATABASE-ENGINES.en.md](DATABASE-ENGINES.en.md)
---
## Transaction Isolation Levels
| Level | Dirty Read | Non-repeatable Read | Phantom Read | Serialization Anomaly |
|--------|-----------|---------------------|-------------|----------------------|
| **Read Uncommitted** | Yes (possible) | Yes | Yes | Yes |
| **Read Committed** | No (prevented) | Yes | Yes | Yes |
| **Repeatable Read** | No | No | Allowed by the SQL standard (PostgreSQL: prevented) | Yes |
| **Serializable** | No | No | No | No |
**Anomalies**:
- **Dirty Read** — reading data from an uncommitted transaction (data may be rolled back)
- **Non-repeatable Read** — same query returns different data (another transaction updated the row in the meantime)
- **Phantom Read** — same query returns new rows (another transaction inserted data matching the condition)
- **Serialization Anomaly** — the result of transactions is not equivalent to any serial order
### PostgreSQL vs MySQL Differences
- **PostgreSQL**: Read Uncommitted behaves like Read Committed. Repeatable Read = Snapshot Isolation (also prevents phantom reads). Serializable = SSI.
- **MySQL InnoDB**: Repeatable Read uses next-key locking (prevents phantom reads).
---
## CAP Theorem
In a distributed system, only 2 out of 3 are possible: **C**onsistency, **A**vailability, **P**artition tolerance.
In practice: P is always required, we choose between CP (consistency) and AP (availability).
### PACELC Extension
PACELC extends CAP with behavior under normal conditions (no partition):
- **P**artition → **A**vailability vs **C**onsistency
- **E**lse (no partition) → **L**atency vs **C**onsistency
| DB | Partition choice (A or C) | Else choice (L or C) |
|----|----------------|------------|
| Cassandra | A (availability) | L (low latency, eventual consistency) |
| DynamoDB (default) | A | L |
| MongoDB | C (writes go through the primary) | L |
| PostgreSQL (single) | C | C |
| CockroachDB | C | C |
### Quorum Details
- **R** (read quorum) + **W** (write quorum) > **N** (replication factor)
- Typical: N=3, R=2, W=2 (tolerates 1 node down)
- **Sloppy quorum** — when a node is unavailable, data is temporarily stored on another node
- **Hinted handoff** — temporary write to another node with a hint, data is transferred upon recovery
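The quorum condition is simple arithmetic, so a small helper makes the trade-off explicit. The function names below are illustrative; the second function assumes that both reads and writes must keep working, so the larger of R and W determines how many nodes may be down.

```python
def quorum_overlaps(n: int, r: int, w: int) -> bool:
    """True when every read quorum shares at least one node with every write quorum."""
    return r + w > n


def tolerated_failures(n: int, r: int, w: int) -> int:
    """Nodes that can be down while both reads and writes still reach their quorum."""
    return n - max(r, w)


for n, r, w in [(3, 2, 2), (3, 1, 1), (5, 3, 3), (3, 1, 3)]:
    print(
        f"N={n} R={r} W={w}: overlap={quorum_overlaps(n, r, w)}, "
        f"tolerates {tolerated_failures(n, r, w)} node(s) down"
    )
```

With N=3, R=1, W=3 reads are fast and still overlap with writes, but a single failed node blocks all writes; N=3, R=2, W=2 balances the two.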
---
## Replication
| Type | Description | Latency |
|-----|-------|---------|
| Synchronous | Write confirmed only after replication to all nodes | High, but consistent |
| Asynchronous | Write confirmed immediately, replication in the background | Low, possible data loss |
| Semi-synchronous | Write confirmed once at least one replica (or a configured subset) acknowledges it | Compromise |
### Topologies
- **Leader-Follower** (Master-Slave) — writes go to the leader, reads can be served from replicas
- **Leader-Leader** (Multi-master) — writes are accepted on multiple nodes, which requires conflict resolution
- **Quorum-based** — R + W > N (Cassandra, DynamoDB)
---
## Sharding
Data distribution across nodes based on a shard key.
```
           ┌─────────┐
           │  Proxy  │
           │ Router  │
           └────┬────┘
     ┌──────────┼──────────┐
┌────▼───┐  ┌───▼────┐ ┌───▼────┐
│Shard A │  │Shard B │ │Shard C │
│ 0-100  │  │101-200 │ │201-300 │
└────────┘  └────────┘ └────────┘
```
### Methods
| Method | Description | Advantage | Disadvantage |
|--------|-------|--------|----------|
| **Hash-based** | `shard_id = hash(key) % N` | Even distribution | Loss of range queries |
| **Range-based** | Data by range (A-M, N-Z) | Preserves ordering | Hot spots |
| **Consistent hashing** | Hash ring, vnodes | Min. rebalancing when number of shards changes | More complex |
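The difference between modulo hashing and consistent hashing shows up when the number of shards changes. The sketch below compares how many keys move when going from 3 to 4 shards. It assumes MD5 as the hash function and 100 virtual nodes per shard; the shard names and the `HashRing` class are illustrative, not taken from any particular database.

```python
import hashlib
from bisect import bisect_right


def h(key: str) -> int:
    """Stable hash (Python's built-in hash() is randomized per process)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def modulo_shard(key: str, n: int) -> int:
    return h(key) % n


class HashRing:
    """Consistent hashing ring with virtual nodes."""

    def __init__(self, nodes, vnodes: int = 100):
        self.ring = sorted((h(f"{node}#{i}"), node) for node in nodes for i in range(vnodes))
        self.points = [point for point, _ in self.ring]

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around at the end.
        i = bisect_right(self.points, h(key)) % len(self.ring)
        return self.ring[i][1]


keys = [f"user:{i}" for i in range(1000)]

moved_modulo = sum(modulo_shard(k, 3) != modulo_shard(k, 4) for k in keys)

old_ring = HashRing(["A", "B", "C"])
new_ring = HashRing(["A", "B", "C", "D"])
moved_ring = [k for k in keys if old_ring.node_for(k) != new_ring.node_for(k)]

print(f"modulo: {moved_modulo} of {len(keys)} keys moved")
print(f"ring:   {len(moved_ring)} of {len(keys)} keys moved")
```

On the ring, every key that moves lands on the newly added shard; the existing shards never exchange data among themselves, which is what keeps rebalancing small.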
### Routing
- **Proxy-based** — application goes to proxy, which routes (Vitess, ProxySQL, mongos)
- **Client-side** — application knows which shard to target
- **DNS-based** — each shard has its own endpoint
---
## Data Consistency Patterns
| Pattern | Description | Example |
|---------|-------|---------|
| **Strong consistency** | After a write, every read sees the latest data | Single DB, Raft, Spanner |
| **Eventual consistency** | After a write, data propagates over time | DNS, DynamoDB (default), Cassandra |
| **Read-after-write** | The author always sees their own write (others are eventual) | Social networks, comments |
| **Causal consistency** | Causally dependent operations are seen in the correct order | COPS, Orbe, MongoDB (causally consistent sessions) |
| **Monotonic reads** | You do not see older data after seeing newer data | Client pinned to one replica (sticky sessions) |
| **Monotonic writes** | Writes from a single client are in order | Queue-based, single leader |
---
## Data Migration
### Schema Migration
```
V1__initial_schema.sql
V2__add_users_table.sql
V3__add_email_index.sql
V4__add_orders_table.sql
```
### Zero-Downtime Migration
1. **Expand** — add new column/table (application tolerates both states)
2. **Migrate** — backfill data, update application to new schema
3. **Contract** — remove old column/table
### Tools
| Tool | Language | Strategy | Zero-Downtime | Rollback |
|---------|-------|-----------|--------------|----------|
| **Flyway** | Java (multi-lang CLI) | Versioned SQL | Limited (additive only) | `undo` (limited, enterprise) |
| **Liquibase** | Java (multi-lang CLI) | Changesets (XML/YAML/JSON/SQL) | Yes (changeset design) | `rollback <count>` |
| **Alembic** | Python | Auto-generation, versioned | Yes (branching) | `downgrade` |
| **Prisma Migrate** | TypeScript | Declarative schema → diff | Yes (shadow DB) | `migrate diff` |
| **gh-ost** | Go | Triggerless online DDL (MySQL) | Yes (binlog stream) | No (progressive) |
| **pgroll** | Go | Online schema migration (PG) | Yes (views, multiple versions) | Yes (immediate) |
---
## SQL Antipatterns
Based on *More SQL Antipatterns* (Karwin, 2026) — 14 new antipatterns:
### Language Antipatterns
| Antipattern | Problem | Solution |
|-------------|---------|--------|
| **Fear of JOINs** | Manual pairing in application instead of JOIN | Use JOIN correctly |
| **Relational Division** | Finding sets in WHERE | Relational division (subquery with GROUP BY/HAVING) |
| **Pagination via OFFSET** | OFFSET is O(n) — the larger the offset, the slower | Keyset pagination (WHERE id > last_seen) |
| **Non-Sargable queries** | Functions on columns in WHERE (`WHERE YEAR(date) = 2026`) | Rewrite as range condition |
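
The last two rows can be tried out with the standard-library `sqlite3` module. This is a sketch under assumptions: the `events` table, its columns, and the helper names are invented for the example, and SQLite's `strftime` stands in for MySQL's `YEAR()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO events (id, created_at) VALUES (?, ?)",
    [(i, f"2026-01-{(i % 28) + 1:02d}") for i in range(1, 1001)],
)
conn.execute("CREATE INDEX idx_events_created_at ON events (created_at)")


def page_by_offset(page: int, size: int = 100):
    """OFFSET pagination: the engine still walks past every skipped row."""
    return conn.execute(
        "SELECT id FROM events ORDER BY id LIMIT ? OFFSET ?", (size, page * size)
    ).fetchall()


def page_by_keyset(last_seen_id: int, size: int = 100):
    """Keyset pagination: seek directly to the row after the last one seen."""
    return conn.execute(
        "SELECT id FROM events WHERE id > ? ORDER BY id LIMIT ?", (last_seen_id, size)
    ).fetchall()


# Both calls return the same fifth page; only the access path differs.
same_page = page_by_offset(4) == page_by_keyset(400)

# Non-sargable: the column is wrapped in a function, so the index is not usable.
non_sargable = conn.execute(
    "SELECT COUNT(*) FROM events WHERE strftime('%Y', created_at) = '2026'"
).fetchone()[0]

# Sargable rewrite: a range condition on the bare column.
sargable = conn.execute(
    "SELECT COUNT(*) FROM events "
    "WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01'"
).fetchone()[0]
```

Keyset pagination needs a unique, ordered key and only supports moving to the next or previous page, which is the trade-off for the constant seek cost.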
### Optimization Antipatterns
| Antipattern | Problem | Solution |
|-------------|---------|--------|
| **Premature denormalization** | Denormalization without reason | Measure, then optimize |
| **JSON overuse** | JSON as a universal solution | Use JSON only for genuinely flexible data |
| **Cacheless transactions** | Relying on query cache (removed in MySQL 8) | Application-level caching |
### Application Antipatterns
| Antipattern | Problem | Solution |
|-------------|---------|--------|
| **Polling** | Regularly querying for changes | LISTEN/NOTIFY, Kafka, Change Data Capture |
| **Transaction encapsulation** | Each model manages its own transaction | Unit of Work pattern |
| **Fear of deadlocks** | Trying to prevent all deadlocks | Mitigation, not prevention |
| **Data hoarding** | Storing everything forever | Data retention policies, archiving |
### Mini-Antipatterns
- `LIMIT` without `ORDER BY` — nondeterministic results
- `NATURAL JOIN` — fragile, implicit join condition
- `N+1 queries` — query in a loop instead of JOIN/batch
- Redundant indexes — duplicate/overlapping indexes unnecessarily slow writes
---
## Designing Data-Intensive Applications (2nd Edition)
*Kleppmann, Riccomini (2026)* — substantially revised edition.
### What's New Compared to 1st Edition
| Area | What's New |
|--------|-----------|
| **Cloud-native** | Storage = object store (S3, Blob), not local disk. Separation of control/data/compute plane |
| **AI workloads** | Vector indexes, DataFrames as a data model, batch processing for training data |
| **Local-first software** | DuckDB, PGlite, SQLite — databases running on laptop/edge, sync when connected |
| **Formal methods** | Randomized testing, formal verification (important for AI-generated code) |
| **Legal & ethics** | GDPR, ethics of predictive analytics, bias, algorithmic accountability |
| **Streaming → SQL views** | Materialize, incremental view maintenance — streaming as SQL |
### Key Principles (unchanged)
**Reliability**, **Scalability**, **Maintainability** — the three pillars of good data systems.
---
## Apache Iceberg Lakehouse
Based on *Architecting an Apache Iceberg Lakehouse* (Merced, 2026):
### What is a Data Lakehouse
An architecture combining the flexibility and low cost of a **data lake** (object storage) with the performance and governance of a **data warehouse**. Apache Iceberg is an open-source table format that supplies the table layer (schema, snapshots, partitioning) on top of the object store.
### Iceberg Metadata Architecture
```
Table metadata (.metadata.json)
└── Snapshot manifest list
    └── Manifests (file-level stats)
        └── Data files (Parquet/ORC/Avro)
```
### Key Features
| Feature | Description |
|-----------|-------|
| **ACID transactions** | Safe concurrent read/write |
| **Schema evolution** | Add/drop/rename columns without rewrite |
| **Time travel** | Query historical snapshots |
| **Partition evolution** | Change partition strategy without data rewrite |
| **Hidden partitioning** | Automatic partition filters (user does not need to specify) |
| **Multi-engine** | Spark, Flink, Trino, Dremio, Snowflake over the same data |
### When to Use Iceberg
- Multi-tool access to the same governed data
- ACID on lake data
- Streaming + batch in a single table
- Reducing duplication (one canonical copy instead of ETL to warehouse)
---
## Best Practices
- **Connection pooling** — PgBouncer, RDS Proxy, ProxySQL
- **Indexing based on query patterns** — avoid unnecessary indexes
- **Read replicas** for reporting and analytics
- **Backup & recovery** — point-in-time recovery (PITR), regular tests
- **Query monitoring** — slow query log, pg_stat_statements, performance_schema
- **Encryption at rest & in transit**
- **Migrations in CI/CD** — part of the pipeline, not manual
- **Choose DB based on workload** — no single universal DB (polyglot persistence)
---
## Database License Model Comparison
| DB | License | Price (self-hosted) | Price (managed cloud) | Vendor lock-in | Note |
|----|---------|-------------------|---------------------|----------------|----------|
| **PostgreSQL** | PostgreSQL license (MIT-like) | $0 | ~$0.10-1.00/hr (RDS, CloudSQL, Aurora) | Low | Fully open source, no restrictions |
| **MySQL** | GPL v2 / Commercial (Oracle) | $0 (GPL) / ~$2,000/server/year (commercial) | ~$0.10-1.00/hr (RDS, PlanetScale) | Medium (Oracle owned) | GPL obligations to release source apply only when MySQL is distributed with the application |
| **MariaDB** | GPL v2 / Business Source | $0 (GPL) | ~$0.10-1.00/hr (SkySQL) | Low | MySQL fork, largely but no longer fully compatible; independent of Oracle |
| **Oracle SE2** | Proprietary (per socket) | ~$17,500/socket + 22% support/year | ~$1-5/hr (RDS, OCI) | High | Max 2 sockets per server, max 16 CPU threads per instance; no core factor |
| **Oracle EE** | Proprietary (per core + options) | ~$47,500/processor license (core factor 0.5 on EPYC/Xeon) + options + 22% support | ~$2-30/hr (OCI, RDS) | High | Options can double the price (RAC, partitioning, compression) |
| **SQL Server Standard** | Proprietary (per core, or server + CAL) | ~$2,000/core, or ~$1,000/server + ~$200/CAL | ~$0.20-1.00/hr (Azure SQL) | Medium | OS license is extra on Windows Server (SQL Server also runs on Linux) |
| **SQL Server Enterprise** | Proprietary (per core) | ~$7,000/core | ~$1-5/hr (Azure SQL) | Medium | AlwaysOn, partitioning, in-memory OLTP |
| **MongoDB** | SSPL (Community) / Commercial (Enterprise) | $0 (Community) / ~$10k/server/year (Enterprise) | ~$0.10-5.00/hr (Atlas) | Medium | SSPL restricts managed cloud services |
| **Redis** | RSALv2 / SSPLv1 (7.4+), AGPLv3 option (8.0+) / BSD (Valkey) | $0 (Valkey) | ~$0.10-1.00/hr (ElastiCache, Memorystore → Valkey) | Low (Valkey) | Redis 7.4+ license change → Valkey fork |
| **Cassandra** | Apache 2.0 | $0 | ~$0.10-1.00/hr (Amazon Keyspaces) | Low | Fully open source, no restrictions |
| **ScyllaDB** | AGPL v3 (OSS) / Enterprise | $0 (OSS) / Enterprise subscription | ~$0.50-3.00/hr (ScyllaDB Cloud) | Low (OSS) | Enterprise: monitoring, security, support |
| **CockroachDB** | BSL (Business Source License) / Enterprise | $0 (core) / Enterprise subscription | ~$0.50-3.00/hr (CockroachDB Cloud) | Medium | BSL: converts to Apache 2.0 after 3 years. Enterprise: multi-region, backup |
**Key Recommendations**:
- **Lowest TCO**: PostgreSQL (no license, broadest cloud support)
- **Highest vendor lock-in**: Oracle (PL/SQL, proprietary options, expensive migration)
- **License risk**: Redis (license change) → use Valkey for new projects
- **Cloud-native licensing**: MongoDB Atlas, CockroachDB Cloud, ScyllaDB Cloud — pay-per-use, no license management
## Resources
Links, books and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended Reading
| Book | Authors | ISBN | Key Takeaway |
|-------|--------|------|----------------|
| Database Internals | Alex Petrov | 978-1492040347 | In-depth explanation of storage engines (B-Tree, LSM-Tree, WAL, MVCC), distributed systems |
| Designing Data-Intensive Applications (2nd ed.) | Kleppmann, Riccomini | — | Cloud-native, AI, local-first, formal methods |
| High Performance MySQL (4th ed.) | Botros, Tinley | 978-1492080510 | MySQL architecture, schema/index optimization |
| Expert Oracle Database Architecture (3rd ed.) | Kyte, Kuhn | 978-1430262985 | Oracle architecture, RAC, Data Guard, tuning |
| AI-Ready PostgreSQL 18 | Kumar, Linster | — | PostgreSQL as a unified platform for AI |
| More SQL Antipatterns | Bill Karwin (2026) | — | 14 antipatterns, keyset pagination |
| Vector Databases | Borwankar (2026) | — | Embeddings, vector indexes, RAG |
| Architecting an Apache Iceberg Lakehouse | Merced (2026) | — | Lakehouse architecture, Iceberg metadata |
*Last revision: 2026-06-03*

323
DATABASES.md Normal file
View File

@@ -0,0 +1,323 @@
# 🗄️ Database Architecture
## Database Classification
### Relational (SQL)
| DB | License | Use case | Detail |
|----|---------|----------|--------|
| **PostgreSQL** | Open source | General-purpose, geospatial, analytics, AI | [POSTGRESQL.md](POSTGRESQL.md) |
| **MySQL / MariaDB** | Open source | Web, LAMP stack, e-commerce | [MYSQL.md](MYSQL.md) |
| **Microsoft SQL Server** | Proprietary | Enterprise .NET, Windows ecosystem | — |
| **Oracle DB** | Proprietary | Enterprise, finance, mainframe, RAC cluster | [ORACLE.md](ORACLE.md) |
| **Amazon Aurora** | Managed | MySQL/PostgreSQL-compatible, cloud-native | — |
### NoSQL
| Type | DB | Use case | Detail |
|-----|----|----------|--------|
| **Document** | MongoDB, Couchbase | JSON data, flexible schema | [MONGODB.md](MONGODB.md) |
| **Key-Value / Cache** | Redis, Memcached, DynamoDB | Cache, session store, real-time | [REDIS.md](REDIS.md) |
| **Wide-column** | Cassandra, ScyllaDB | Time-series, IoT, large data volumes | [CASSANDRA.md](CASSANDRA.md) |
| **Vector** | Pinecone, Qdrant, Milvus, pgvector | Embeddings, RAG, semantic search | [VEKTOROVE-DB.md](VEKTOROVE-DB.md) |
| **Graph** | Neo4j, Dgraph | Relationships, recommendations, social graphs | — |
### Storage Engines
Concepts shared across databases: [DATABAZOVE-ENGINY.md](DATABAZOVE-ENGINY.md)
---
## Transaction Isolation Levels
| Level | Dirty Read | Non-repeatable Read | Phantom Read | Serialization Anomaly |
|--------|-----------|---------------------|-------------|----------------------|
| **Read Uncommitted** | Yes (possible) | Yes | Yes | Yes |
| **Read Committed** | No (prevented) | Yes | Yes | Yes |
| **Repeatable Read** | No | No | Yes per the SQL standard (PostgreSQL: No) | Yes |
| **Serializable** | No | No | No | No |
**Anomalies**:
- **Dirty Read** — reading data from an uncommitted transaction (the data may be rolled back)
- **Non-repeatable Read** — the same query returns different data (another transaction updated the row in the meantime)
- **Phantom Read** — the same query returns new rows (another transaction inserted data matching the condition)
- **Serialization Anomaly** — the outcome of the transactions is not equivalent to any serial ordering
### PostgreSQL vs MySQL Differences
- **PostgreSQL**: Read Uncommitted behaves like Read Committed. Repeatable Read = Snapshot Isolation (also prevents phantom reads). Serializable = SSI.
- **MySQL InnoDB**: Repeatable Read uses next-key locking, which prevents phantom reads for locking reads; plain SELECTs read from a consistent snapshot.
---
## CAP Theorem
In a distributed system you can have only 2 of 3: **C**onsistency, **A**vailability, **P**artition tolerance.
In practice, P is always required, so the choice is between CP (consistency) and AP (availability).
### PACELC Extension
PACELC extends CAP with behavior under normal conditions (no partition):
- **P**artition → **A**vailability vs **C**onsistency
- **E**lse (no partition) → **L**atency vs **C**onsistency
| DB | Partition choice | Else choice |
|----|----------------|------------|
| Cassandra | AP (availability) | EL (low latency, eventual consistency) |
| DynamoDB (default) | AP | EL |
| MongoDB | CP (primary) | EL |
| PostgreSQL (single) | CP | EC |
| CockroachDB | CP | EC |
### Quorum Detail
- **R** (read quorum) + **W** (write quorum) > **N** (replication factor)
- Typical: N=3, R=2, W=2 (tolerates 1 node down)
- **Sloppy quorum** — when a node is unavailable, data is temporarily stored on another node
- **Hinted handoff** — temporary write to another node with a hint; the data is transferred once the original node recovers
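
The R + W > N rule can be checked by brute force for small clusters. The sketch below enumerates every possible read and write quorum; the function names are hypothetical and the model ignores sloppy quorums and hinted handoff.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum shares at least one replica with every write quorum."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


def tolerated_failures(n: int, r: int, w: int) -> int:
    """Replicas that may be down while reads and writes can both still reach a quorum."""
    return n - max(r, w)


for n, r, w in [(3, 2, 2), (3, 1, 1), (5, 3, 3)]:
    print(n, r, w, quorums_overlap(n, r, w), r + w > n, tolerated_failures(n, r, w))
```

The overlap is what guarantees that a read quorum contains at least one replica holding the latest acknowledged write. With N=3, R=1, W=1 the sets can be disjoint, so reads may be stale.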
---
## Replication
| Type | Description | Latency |
|-----|-------|---------|
| Synchronous | Write acknowledged only after replication to all nodes | High, but consistent |
| Asynchronous | Write acknowledged immediately, replication in the background | Low, possible data loss |
| Semi-synchronous | Write acknowledged once a subset of replicas confirms (at least one, or a majority, depending on the system) | Compromise |
### Topologies
- **Leader-Follower** (Master-Slave) — reads served from replicas
- **Leader-Leader** (Multi-master) — writes accepted on multiple nodes
- **Quorum-based** — R + W > N (Cassandra, DynamoDB)
---
## Sharding
Data distribution across nodes based on a shard key.
```
           ┌─────────┐
           │  Proxy  │
           │ Router  │
           └────┬────┘
     ┌──────────┼──────────┐
┌────▼───┐ ┌───▼────┐ ┌───▼────┐
│Shard A │ │Shard B │ │Shard C │
│ 0-100  │ │101-200 │ │201-300 │
└────────┘ └────────┘ └────────┘
```
### Methods
| Method | Description | Advantage | Disadvantage |
|--------|-------|--------|----------|
| **Hash-based** | `shard_id = hash(key) % N` | Even distribution | Loss of range queries |
| **Range-based** | Data by range (A-M, N-Z) | Preserves ordering | Hot spots |
| **Consistent hashing** | Hash ring, vnodes | Min. rebalancing when number of shards changes | More complex |
### Routing
- **Proxy-based** — application goes to proxy, which routes (Vitess, ProxySQL, mongos)
- **Client-side** — application knows which shard to target
- **DNS-based** — each shard has its own endpoint
---
## Data Consistency Patterns
| Pattern | Description | Example |
|---------|-------|---------|
| **Strong consistency** | After a write, every read sees the latest data | Single DB, Raft, Spanner |
| **Eventual consistency** | After a write, data propagates over time | DNS, DynamoDB (default), Cassandra |
| **Read-after-write** | The author always sees their own write (others are eventual) | Social networks, comments |
| **Causal consistency** | Causally dependent operations are seen in the correct order | COPS, Orbe, MongoDB (causally consistent sessions) |
| **Monotonic reads** | You do not see older data after seeing newer data | Reads pinned to a single replica (sticky sessions) |
| **Monotonic writes** | Writes from a single client are in order | Queue-based, single leader |
---
## Data Migration
### Schema Migration
```
V1__initial_schema.sql
V2__add_users_table.sql
V3__add_email_index.sql
V4__add_orders_table.sql
```
### Zero-Downtime Migration
1. **Expand** — add new column/table (application tolerates both states)
2. **Migrate** — backfill data, update application to new schema
3. **Contract** — remove old column/table
### Tools
| Tool | Language | Strategy | Zero-Downtime | Rollback |
|---------|-------|-----------|--------------|----------|
| **Flyway** | Java (multi-lang CLI) | Versioned SQL | Limited (additive only) | `undo` (limited, enterprise) |
| **Liquibase** | Java (multi-lang CLI) | Changesets (XML/YAML/JSON/SQL) | Yes (changeset design) | `rollback <count>` |
| **Alembic** | Python | Auto-generation, versioned | Yes (branching) | `downgrade` |
| **Prisma Migrate** | TypeScript | Declarative schema → diff | Yes (shadow DB) | `migrate diff` |
| **gh-ost** | Go | Triggerless online DDL (MySQL) | Yes (binlog stream) | No (progressive) |
| **pgroll** | Go | Online schema migration (PG) | Yes (views, multiple versions) | Yes (immediate) |
---
## SQL Antipatterns
Based on *More SQL Antipatterns* (Karwin, 2026) — 14 new antipatterns:
### Language Antipatterns
| Antipattern | Problem | Solution |
|-------------|---------|--------|
| **Fear of JOINs** | Manual pairing in application instead of JOIN | Use JOIN correctly |
| **Relational Division** | Finding sets in WHERE | Relational division (subquery with GROUP BY/HAVING) |
| **Pagination via OFFSET** | OFFSET is O(n) — the larger the offset, the slower | Keyset pagination (WHERE id > last_seen) |
| **Non-Sargable queries** | Functions on columns in WHERE (`WHERE YEAR(date) = 2026`) | Rewrite as range condition |
### Optimization Antipatterns
| Antipattern | Problem | Solution |
|-------------|---------|--------|
| **Premature denormalization** | Denormalization without reason | Measure, then optimize |
| **JSON overuse** | JSON as a universal solution | Use JSON only for genuinely flexible data |
| **Cacheless transactions** | Relying on query cache (removed in MySQL 8) | Application-level caching |
### Application Antipatterns
| Antipattern | Problem | Solution |
|-------------|---------|--------|
| **Polling** | Regularly querying for changes | LISTEN/NOTIFY, Kafka, Change Data Capture |
| **Transaction encapsulation** | Each model manages its own transaction | Unit of Work pattern |
| **Fear of deadlocks** | Trying to prevent all deadlocks | Mitigation, not prevention |
| **Data hoarding** | Storing everything forever | Data retention policies, archiving |
### Mini-Antipatterns
- `LIMIT` without `ORDER BY` — nondeterministic results
- `NATURAL JOIN` — fragile, implicit join condition
- `N+1 queries` — query in a loop instead of JOIN/batch
- Redundant indexes — duplicate/overlapping indexes unnecessarily slow writes
---
## Designing Data-Intensive Applications (2nd Edition)
*Kleppmann, Riccomini (2026)* — substantially revised edition.
### What's New Compared to 1st Edition
| Area | What's New |
|--------|-----------|
| **Cloud-native** | Storage = object store (S3, Blob), not local disk. Separation of control/data/compute plane |
| **AI workloads** | Vector indexes, DataFrames as a data model, batch processing for training data |
| **Local-first software** | DuckDB, PGlite, SQLite — databases running on laptop/edge, sync when connected |
| **Formal methods** | Randomized testing, formal verification (important for AI-generated code) |
| **Legal & ethics** | GDPR, ethics of predictive analytics, bias, algorithmic accountability |
| **Streaming → SQL views** | Materialize, incremental view maintenance — streaming as SQL |
### Key Principles (unchanged)
**Reliability**, **Scalability**, **Maintainability** — the three pillars of good data systems.
---
## Apache Iceberg Lakehouse
Based on *Architecting an Apache Iceberg Lakehouse* (Merced, 2026):
### What is a Data Lakehouse
An architecture combining the flexibility and low cost of a **data lake** (object storage) with the performance and governance of a **data warehouse**. Apache Iceberg is an open-source table format.
### Iceberg Metadata Architecture
```
Table metadata (.metadata.json)
└── Snapshot manifest list
    └── Manifests (file-level stats)
        └── Data files (Parquet/ORC/Avro)
```
### Key Features
| Feature | Description |
|-----------|-------|
| **ACID transactions** | Safe concurrent read/write |
| **Schema evolution** | Add/drop/rename columns without rewrite |
| **Time travel** | Query historical snapshots |
| **Partition evolution** | Change partition strategy without data rewrite |
| **Hidden partitioning** | Automatic partition filters (user does not need to specify) |
| **Multi-engine** | Spark, Flink, Trino, Dremio, Snowflake over the same data |
### When to Use Iceberg
- Multi-tool access to the same governed data
- ACID on lake data
- Streaming + batch in a single table
- Reducing duplication (one canonical copy instead of ETL to warehouse)
---
## Best Practices
- **Connection pooling** — PgBouncer, RDS Proxy, ProxySQL
- **Indexing based on query patterns** — avoid unnecessary indexes
- **Read replicas** for reporting and analytics
- **Backup & recovery** — point-in-time recovery (PITR), regular tests
- **Query monitoring** — slow query log, pg_stat_statements, performance_schema
- **Encryption at rest & in transit**
- **Migrations in CI/CD** — part of the pipeline, not manual
- **Choose DB based on workload** — no single universal DB (polyglot persistence)
---
## Database License Model Comparison
| DB | License | Price (self-hosted) | Price (managed cloud) | Vendor lock-in | Note |
|----|---------|-------------------|---------------------|----------------|----------|
| **PostgreSQL** | PostgreSQL license (MIT-like) | $0 | ~$0.10-1.00/hr (RDS, CloudSQL, Aurora) | Low | Fully open source, no restrictions |
| **MySQL** | GPL v2 / Commercial (Oracle) | $0 (GPL) / ~$2,000/server/year (commercial) | ~$0.10-1.00/hr (RDS, PlanetScale) | Medium (Oracle owned) | GPL obligations to release source apply only when MySQL is distributed with the application |
| **MariaDB** | GPL v2 / Business Source | $0 (GPL) | ~$0.10-1.00/hr (SkySQL) | Low | MySQL fork, largely but no longer fully compatible; independent of Oracle |
| **Oracle SE2** | Proprietary (per socket) | ~$17,500/socket + 22% support/year | ~$1-5/hr (RDS, OCI) | High | Max 2 sockets per server, max 16 CPU threads per instance; no core factor |
| **Oracle EE** | Proprietary (per core + options) | ~$47,500/processor license (core factor 0.5 on EPYC/Xeon) + options + 22% support | ~$2-30/hr (OCI, RDS) | High | Options can double the price (RAC, partitioning, compression) |
| **SQL Server Standard** | Proprietary (per core, or server + CAL) | ~$2,000/core, or ~$1,000/server + ~$200/CAL | ~$0.20-1.00/hr (Azure SQL) | Medium | OS license is extra on Windows Server (SQL Server also runs on Linux) |
| **SQL Server Enterprise** | Proprietary (per core) | ~$7,000/core | ~$1-5/hr (Azure SQL) | Medium | AlwaysOn, partitioning, in-memory OLTP |
| **MongoDB** | SSPL (Community) / Commercial (Enterprise) | $0 (Community) / ~$10k/server/year (Enterprise) | ~$0.10-5.00/hr (Atlas) | Medium | SSPL restricts managed cloud services |
| **Redis** | RSALv2 / SSPLv1 (7.4+), AGPLv3 option (8.0+) / BSD (Valkey) | $0 (Valkey) | ~$0.10-1.00/hr (ElastiCache, Memorystore → Valkey) | Low (Valkey) | Redis 7.4+ license change → Valkey fork |
| **Cassandra** | Apache 2.0 | $0 | ~$0.10-1.00/hr (Amazon Keyspaces) | Low | Fully open source, no restrictions |
| **ScyllaDB** | AGPL v3 (OSS) / Enterprise | $0 (OSS) / Enterprise subscription | ~$0.50-3.00/hr (ScyllaDB Cloud) | Low (OSS) | Enterprise: monitoring, security, support |
| **CockroachDB** | BSL (Business Source License) / Enterprise | $0 (core) / Enterprise subscription | ~$0.50-3.00/hr (CockroachDB Cloud) | Medium | BSL: converts to Apache 2.0 after 3 years. Enterprise: multi-region, backup |
**Key Recommendations**:
- **Lowest TCO**: PostgreSQL (no license, broadest cloud support)
- **Highest vendor lock-in**: Oracle (PL/SQL, proprietary options, expensive migration)
- **License risk**: Redis (license change) → use Valkey for new projects
- **Cloud-native licensing**: MongoDB Atlas, CockroachDB Cloud, ScyllaDB Cloud — pay-per-use, no license management
## Resources
Links, books and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended Reading
| Book | Authors | ISBN | Key Takeaway |
|-------|--------|------|----------------|
| Database Internals | Alex Petrov | 978-1492040347 | In-depth explanation of storage engines (B-Tree, LSM-Tree, WAL, MVCC), distributed systems |
| Designing Data-Intensive Applications (2nd ed.) | Kleppmann, Riccomini | — | Cloud-native, AI, local-first, formal methods |
| High Performance MySQL (4th ed.) | Botros, Tinley | 978-1492080510 | MySQL architecture, schema/index optimization |
| Expert Oracle Database Architecture (3rd ed.) | Kyte, Kuhn | 978-1430262985 | Oracle architecture, RAC, Data Guard, tuning |
| AI-Ready PostgreSQL 18 | Kumar, Linster | — | PostgreSQL as a unified platform for AI |
| More SQL Antipatterns | Bill Karwin (2026) | — | 14 antipatterns, keyset pagination |
| Vector Databases | Borwankar (2026) | — | Embeddings, vector indexes, RAG |
| Architecting an Apache Iceberg Lakehouse | Merced (2026) | — | Lakehouse architecture, Iceberg metadata |
*Last revision: 2026-06-03*

101
DATABAZOVE-ENGINY.md Normal file
View File

@@ -0,0 +1,101 @@
# ⚙️ Storage Engines and Transaction Models
## B-Tree vs LSM-Tree
The two dominant storage engine approaches in modern databases.
| Property | B-Tree | LSM-Tree |
|-----------|--------|----------|
| **Writes** | In-place update (random page I/O) | Append-only (sequential I/O) |
| **Reads** | Fast (directly in the page, O(log N)) | Slower (merge across multiple SSTables, bloom filters) |
| **Write amplification** | Lower (page rewrite) | Higher (compaction, merging SSTables) |
| **Read amplification** | Lower (typically one leaf page read once upper levels are cached) | Higher (multiple SSTables to search) |
| **Compression** | Worse (page fragmentation) | Better (compact SSTables, block compression) |
| **Range scan** | Fast (linked list at the leaf level) | Fast (SSTables are sorted) |
| **Space amplification** | Low | Higher (pending compaction) |
| **Typical DBs** | PostgreSQL, MySQL (InnoDB), SQLite, Oracle, MongoDB (WiredTiger, B-tree by default) | Cassandra, RocksDB, LevelDB, ScyllaDB |
### When to Choose Which Engine
**B-Tree** — when:
- You need fast point lookups (PK lookup, unique ID)
- The workload is read-heavy (most queries = SELECT by key)
- You need range queries on the primary key
- Transactional workload (OLTP) with short queries
**LSM-Tree** — when:
- You need high write throughput (write-heavy)
- Append-only workload (logs, time-series, IoT)
- Data compression matters (saves space)
- Write amplification is acceptable (enough I/O capacity)
## Write-Ahead Log (WAL)
An append-only log guaranteeing that no committed operation is lost in a crash:
```text
1. Transaction BEGIN → record written to WAL
2. Data modification → record written to WAL (before the page is modified)
3. Transaction COMMIT → WAL flushed to disk (COMMIT acknowledged only after the flush)
4. Checkpoint → dirty pages flushed → WAL up to the checkpoint can be deleted
```
- **Write-ahead** — WAL is written before the data page
- **Checkpoint** — the point from which the WAL is needed during recovery
- **Redo log** (InnoDB) — similar concept, used to replay missing changes
- **Group commit** — multiple transactions flush the WAL together (higher throughput)
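
The log-before-data ordering and the redo pass can be sketched with a toy key-value store. This is an illustration under assumptions, not how any particular database implements its WAL: `TinyWalStore`, the JSON record format, and the in-memory dict standing in for data pages are invented for the example, and checkpointing, group commit, and torn-record handling are left out.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Toy store: every change is logged before it touches the in-memory 'pages'."""

    def __init__(self, wal_path: str):
        self.wal_path = wal_path
        self.data = {}
        self._recover()
        self._wal = open(wal_path, "a", encoding="utf-8")

    def _recover(self):
        """Redo pass: replay only the changes of transactions that reached COMMIT."""
        if not os.path.exists(self.wal_path):
            return
        pending = {}
        with open(self.wal_path, encoding="utf-8") as wal:
            for line in wal:
                rec = json.loads(line)
                if rec["op"] == "set":
                    pending.setdefault(rec["tx"], []).append((rec["key"], rec["value"]))
                elif rec["op"] == "commit":
                    for key, value in pending.pop(rec["tx"], []):
                        self.data[key] = value

    def _log(self, record: dict):
        self._wal.write(json.dumps(record) + "\n")

    def commit(self, tx_id: int, changes: dict):
        for key, value in changes.items():
            self._log({"op": "set", "tx": tx_id, "key": key, "value": value})
        self._log({"op": "commit", "tx": tx_id})
        self._wal.flush()
        os.fsync(self._wal.fileno())  # COMMIT is acknowledged only after this returns
        self.data.update(changes)     # data is modified after the log is durable


path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyWalStore(path)
store.commit(1, {"a": 1})

# Simulate a crash in the middle of transaction 2: a change is logged, COMMIT is not.
store._log({"op": "set", "tx": 2, "key": "a", "value": 99})
store._wal.flush()

recovered = TinyWalStore(path)  # restart: state is rebuilt from the log alone
print(recovered.data)
```

After the simulated crash only transaction 1 is replayed, because the record for transaction 2 has no matching commit record.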
## MVCC (Multi-Version Concurrency Control)
Each transaction sees a snapshot of the data as of its start (per statement under Read Committed). Old row versions are retained, in the table itself or in a separate undo area, until no transaction needs them.
### Implementation
| DB | Mechanism | Vacuum/GC | Isolation levels |
|----|------------|-----------|-----------------|
| **PostgreSQL** | Heap tuple (xmin/xmax) — old versions in the main table | VACUUM (autovacuum) | RU, RC, RR, Serializable (SSI) |
| **MySQL InnoDB** | Undo log — old versions in undo segments | Purge (automatic) | RU, RC, RR, Serializable |
| **MSSQL** | Tempdb version store | Automatic (row versioning) | RC (snapshot), Serializable |
| **Oracle** | Undo tablespace | Automatic (undo retention) | RC, Serializable, Read-only |
| **MongoDB WiredTiger** | Document-level MVCC | Automatic (eviction) | Snapshot isolation |
| **Cassandra** | No MVCC (values are overwritten; last write wins by timestamp) | Compaction (SSTable merge) | — |
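
The xmin/xmax idea from the PostgreSQL row can be illustrated with a deliberately simplified visibility check. It is a sketch only: real PostgreSQL snapshots also track in-progress transactions and hint bits, and `RowVersion`, `is_visible`, and `read` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class RowVersion:
    value: str
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that deleted or replaced it, if any


def is_visible(version: RowVersion, snapshot_xid: int, committed: Set[int]) -> bool:
    """A version is visible if its creator counts for the snapshot and its deleter does not."""

    def counts(xid: Optional[int]) -> bool:
        return xid is not None and xid in committed and xid <= snapshot_xid

    return counts(version.xmin) and not counts(version.xmax)


def read(versions, snapshot_xid: int, committed: Set[int]):
    return [v.value for v in versions if is_visible(v, snapshot_xid, committed)]


# Transaction 10 inserted "v1"; transaction 20 updated it to "v2".
row_history = [RowVersion("v1", xmin=10, xmax=20), RowVersion("v2", xmin=20)]

old_snapshot = read(row_history, snapshot_xid=15, committed={10, 20})
new_snapshot = read(row_history, snapshot_xid=25, committed={10, 20})
uncommitted_update = read(row_history, snapshot_xid=25, committed={10})
```

A snapshot taken before the update keeps reading `v1`, a later one reads `v2`, and while transaction 20 is still uncommitted every reader sees `v1`, which is why readers never block on writers.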
### Anomalies
| Level | Dirty Read | Non-repeatable Read | Phantom Read | Serialization Anomaly |
|--------|-----------|---------------------|-------------|----------------------|
| **Read Uncommitted** | Yes | Yes | Yes | Yes |
| **Read Committed** | No | Yes | Yes | Yes |
| **Repeatable Read** | No | No | Yes per the SQL standard (PG: No; MySQL: limited by next-key locking) | Yes |
| **Serializable** | No | No | No | No |
- **Dirty Read** — reading data from an uncommitted transaction
- **Non-repeatable Read** — the same query returns different data
- **Phantom Read** — the same query returns new rows
- **Serialization Anomaly** — the outcome of the transactions is not equivalent to any serial ordering
## Index Types
| Type | Algorithm | Use case | DB support |
|-----|-----------|----------|------------|
| **B-tree** | Balanced tree | `=`, `<`, `>`, `BETWEEN`, `IN`, `LIKE (prefix)` | All (default) |
| **Hash** | Hash table | `=` only (equality) | PostgreSQL (hash index), MySQL (MEMORY) |
| **GiST** | Generalized Search Tree | Geometry, full-text, intervals, IP ranges | PostgreSQL |
| **GIN** | Generalized Inverted Index | JSONB, arrays, full-text (contains, overlaps) | PostgreSQL |
| **BRIN** | Block Range Index | Time-series, logs (data in physical order); extremely small | PostgreSQL |
| **SP-GiST** | Space-partitioned | Quadtrees, KD-tree, radix tree | PostgreSQL |
| **R-tree** | Spatial tree | Geospatial data | MySQL (MyISAM/InnoDB), SQLite |
| **Clustered index** | B-tree + data in leaves | PK lookup (InnoDB): rows are stored with the index | MySQL InnoDB, MSSQL |
| **Full-text** | Inverted index | Text search (stemming, relevance) | MySQL, PostgreSQL, MSSQL |
## Resources
Links, books and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended Reading
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| Database Internals | Alex Petrov | 978-1492040347 | In-depth explanation of storage engines (B-Tree, LSM-Tree, WAL, MVCC), distributed systems (partitioning, replication, consensus) |
*Last revision: 2026-06-03*

788
DATACENTERS.en.md Normal file
View File

@@ -0,0 +1,788 @@
# 🏭 Datacenters
## Tier classification (TIA-942 / Uptime Institute)
| Tier | Availability | Downtime / year | Redundancy |
|------|-------------|-----------------|------------|
| **Tier I** | 99.671 % | 28.8 h | N — no redundancy |
| **Tier II** | 99.741 % | 22.7 h | N+1 — redundant components |
| **Tier III** | 99.982 % | 1.6 h | N+1 — concurrently maintainable |
| **Tier IV** | 99.995 % | 26.3 min | 2N+1 — fault tolerant |
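
The downtime column follows directly from the availability percentage. A small sketch of the arithmetic, assuming a 365-day year (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 h, ignoring leap years


def annual_downtime_hours(availability_pct: float) -> float:
    """Expected downtime per year implied by an availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


tiers = {"Tier I": 99.671, "Tier II": 99.741, "Tier III": 99.982, "Tier IV": 99.995}
for tier, pct in tiers.items():
    hours = annual_downtime_hours(pct)
    print(f"{tier}: {hours:.1f} h ({hours * 60:.0f} min)")
```

The same function is handy for checking SLA figures from colocation offers against the tier they claim.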
## Key subsystems
| System | Description |
|--------|-------------|
| **Power** | UPS, generators (diesel), ATS, PDU, redundant feeds (A/B feed) |
| **Cooling** | CRAC/CRAH, chilled water, free cooling, containment (hot/cold aisle) |
| **Physical security** | CCTV, biometric access, mantrap, rack security locks |
| **Cabling** | Structured cabling (Cat6A/7/8 copper, OM3/OM4 multimode and OS2 single-mode fiber), patch panels |
| **Fire suppression** | Alarm, clean agents (Novec 1230, FM-200) or inert gases (Inergen, Argonite), VESDA (very early smoke detection apparatus) |
| **Monitoring** | DCIM (Data Center Infrastructure Management), SNMP, BMS (Building Management System) |
## Aisle containment
```
         ┌────────────────────────────────────┐
         │              Rack Row              │
         │   ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐    │
Cold     │   │  │ │  │ │  │ │  │ │  │ │  │    │     Cold
Aisle <──│   └──┘ └──┘ └──┘ └──┘ └──┘ └──┘    │──> Aisle
         │   ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐    │
Hot      │   │  │ │  │ │  │ │  │ │  │ │  │    │     Hot
Aisle ──>│   └──┘ └──┘ └──┘ └──┘ └──┘ └──┘    │<── Aisle
         └────────────────────────────────────┘
```
## Environmental classes (ASHRAE TC 9.9)
ASHRAE Technical Committee 9.9 defines temperature and humidity envelopes for IT equipment in data centers.
| Class | Temperature (recommended) | Temperature (allowable) | Usage |
|-------|--------------------------|-------------------------|-------|
| **A1** | 18-27 °C | 15-32 °C | Enterprise DC, strict control |
| **A2** | 18-27 °C | 10-35 °C | Standard DC |
| **A3** | 18-27 °C | 5-40 °C | Looser environment |
| **A4** | 18-27 °C | 5-45 °C | Maximum cooling savings |
| **H1** | 18-22 °C | 5-25 °C | High-density air-cooled (AI/ML) |
- 5th edition (2021) added class H1 for high-density and extended liquid cooling W-classes (W17, W27, W32, W40, W45, W+)
- 2024: new S-classes for Technology Cooling System (TCS) liquid cooling
- Humidity: recommended −9 °C DP to 15 °C DP and 70 % RH (at low pollutant levels); max 50 % RH in highly corrosive environments
## Power
### Power chain
```
Grid ──> Transformer ──> UPS ──> PDU ──> Rack PDU ──> Server PSU
├──> Generator (ATS switches on outage)
└──> STS/ATS (Static Transfer Switch)
```
A/B feed topology:
```
Grid A ──> UPS A ──> PDU A1 ──> Rack PDU A ──> PSU A (server)
Grid B ──> UPS B ──> PDU B1 ──> Rack PDU B ──> PSU B (server)
```
Each server has 2 PSUs — each powered from a different branch (A/B). On failure of one branch, the server continues without interruption.
### UPS types
| IEC 62040-3 class | Topology | Description | Transfer time | Use case |
|--------------|-------------|-------------|-----------|----------|
| **VFD** (Voltage & Frequency Dependent) | Passive standby | UPS in bypass, switches to inverter on failure | 4-10 ms | SOHO, edge |
| **VI** (Voltage Independent) | Line-interactive | Voltage regulation via autotransformer | 2-4 ms | Smaller racks, office |
| **VFI** (Voltage & Frequency Independent) | Double-conversion | AC → DC → AC, full isolation, zero switching time | 0 ms | Enterprise DC, Tier III/IV |
For data centers the standard is **VFI (double-conversion)**: an online UPS with zero transfer time and full isolation from the grid.
### Battery technologies
| Type | Density (Wh/L) | Lifespan (cycles) | Lifespan (years) | Temperature | Cost/kWh | Note |
|------|---------------|-------------------|------------------|-------------|----------|------|
| **VRLA** (AGM/Gel) | 50-80 | 200-500 | 3-5 | 20-25 °C | ~$150-200 | Cheap, large, heavy, temperature sensitive |
| **Li-ion (LFP)** | 200-350 | 3000-5000 | 10-15 | 0-40 °C | ~$300-500 | Small, light, long life, BMS required |
| **Li-ion (NMC)** | 250-400 | 1000-2000 | 8-12 | 0-40 °C | ~$250-400 | Higher density, thermal runaway risk |
| **NiCd** | 80-150 | 1000-2000 | 10-15 | 20-50 °C | ~$400-600 | Extreme temperatures, memory effect |
| **Flow battery** (V/Zn/Br) | 20-40 | 10,000+ | 20+ | 10-35 °C | ~$500-800 | Unlimited cycles, large, long-term backup |
Li-ion (LFP) is becoming the standard for new DCs due to longer life, smaller footprint, and better behavior at high temperatures.
### Generator sizing
| Variant | Size | Fuel | Start time | Run time | Use case |
|---------|------|------|------------|----------|----------|
| **Diesel** | 500-2500 kVA | Diesel | 10-30 s | 24-72 h (depending on tank) | Standard for enterprise DC |
| **Nat. gas** | 200-1500 kVA | Natural gas | 10-30 s | Unlimited (pipeline) | Less common, lower emissions |
| **CHP** (cogeneration) | 500-2000 kVA | Natural gas | 5-15 min | Unlimited | Combined power + cooling (absorption chiller) |
Sizing: the generator should cover 100 % of the IT load plus 100 % of the cooling load (incl. chillers), typically 1.3-1.8× the IT load. The diesel tank should hold fuel for at least 24 h of operation, commonly 48-72 h. Fuel consumption is roughly 0.3-0.4 L per kWh generated.
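
The sizing rule can be sketched as a small calculation. The function name and the default values (1.5× load factor, 48 h runtime, 0.35 L/kWh) are assumptions picked from the middle of the ranges quoted above; the result is in kW, and converting to kVA would additionally need the load power factor.

```python
def generator_sizing(it_load_kw, load_factor=1.5, runtime_h=48, fuel_l_per_kwh=0.35):
    """Return (generator load in kW, tank volume in litres) for a given IT load.

    load_factor    multiplier covering cooling and losses (1.3-1.8 in the text)
    runtime_h      required autonomy on stored fuel
    fuel_l_per_kwh litres of diesel burned per kWh generated
    """
    generator_kw = it_load_kw * load_factor
    tank_l = generator_kw * runtime_h * fuel_l_per_kwh
    return generator_kw, tank_l


if __name__ == "__main__":
    gen_kw, tank_l = generator_sizing(100.0)
    print(f"generator load: {gen_kw:.0f} kW, tank: {tank_l:.0f} L")
```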
### ATS vs STS
| Feature | ATS (Automatic Transfer Switch) | STS (Static Transfer Switch) |
|---------|--------------------------------|-----------------------------|
| **Switching** | 4-10 ms (mechanical relay) | < 4 ms (thyristor) |
| **Lifespan** | ~10,000 switches | Unlimited (solid-state) |
| **Cost** | Low | High (~3-5× ATS) |
| **Use case** | Generator → UPS feed | Between two UPS outputs |
### PDU types
| Type | Description | Use case |
|------|-------------|----------|
| **Basic** | Passive splitter (no monitoring) | Edge, office |
| **Metered** | Current measurement at PDU level | Standard DC |
| **Monitored** | Measurement per outlet, SNMP, web GUI | Enterprise DC |
| **Switched** | On/off per outlet, remote reboot | Enterprise DC, colo |
| **High-density** | 3-phase, 60-100 A, C19 outlets | GPU/HPC/AI racks |
### Power calculation
```
Total Power = Σ(P_server + P_storage + P_network + P_cooling + P_losses)
P_server = P_idle + (P_max - P_idle) × Utilization%
P_cooling + P_losses = P_IT × (PUE - 1)
Example:
100 servers × 500 W (avg) = 50 kW IT load
PUE = 1.5 → total 75 kW
UPS + generator → sized for 75 kW × 1.2 (safety factor) = 90 kW
```
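
The worked example can be reproduced in a few lines. This is a sketch only: the function names are hypothetical, and the total is derived as IT load times PUE, which is how the example arrives at 75 kW.

```python
def server_power_w(p_idle_w, p_max_w, utilization):
    """Linear power model: idle draw plus the utilized share of the dynamic range."""
    return p_idle_w + (p_max_w - p_idle_w) * utilization


def facility_power_kw(it_load_kw, pue):
    """Total facility power implied by an IT load and a PUE value."""
    return it_load_kw * pue


def sizing_kw(it_load_kw, pue, safety_factor=1.2):
    """Capacity to provision for UPS and generator, including a safety margin."""
    return facility_power_kw(it_load_kw, pue) * safety_factor


if __name__ == "__main__":
    it_load_kw = 100 * 500 / 1000  # 100 servers at 500 W average
    print(f"IT load:  {it_load_kw:.0f} kW")
    print(f"Facility: {facility_power_kw(it_load_kw, 1.5):.0f} kW")
    print(f"Sizing:   {sizing_kw(it_load_kw, 1.5):.0f} kW")
```

The 500 W average in the example corresponds, for instance, to a server with 200 W idle and 800 W peak draw running at 50 % utilization.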
### PUE (Power Usage Effectiveness)
```
PUE = Total Facility Energy / IT Equipment Energy
```
| PUE | Efficiency | Type |
|-----|-----------|------|
| 1.0-1.1 | Excellent | Hyperscale (Google, Meta) |
| 1.1-1.3 | Very good | Modern DC |
| 1.3-1.6 | Good / average | Enterprise DC |
| 1.6-2.0 | Below average | Older DC |
| >2.0 | Poor | Legacy |
PUE is measured for the whole facility, not per rack. It includes UPS losses, cooling, lighting, and distribution losses, and excludes upstream fuel production (well-to-tank) and embodied carbon. Target for a modern DC: PUE < 1.2.
### WUE and CUE
| Metric | Description | Formula | Target |
|--------|-------------|---------|--------|
| **WUE** (Water Usage Effectiveness) | Water consumption per IT energy | WUE = Annual Water Usage / IT Energy (L/kWh) | < 0.5 L/kWh |
| **CUE** (Carbon Usage Effectiveness) | CO₂ emissions per IT energy | CUE = Total CO₂ / IT Energy (kg CO₂/kWh) | < 0.2 kg CO₂/kWh |
WUE is critical in dry regions (southwest US, Australia, Middle East). Adiabatic cooling consumes significantly more water than closed-loop cooling.
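
The three metrics share the same denominator, so they are easy to compute side by side from annual totals. The sketch below assumes hypothetical function names and treats the upper bound of each PUE band in the table as inclusive, since the table ranges overlap at their edges.

```python
def pue(total_facility_kwh, it_kwh):
    """Power Usage Effectiveness (dimensionless)."""
    return total_facility_kwh / it_kwh


def wue(annual_water_l, it_kwh):
    """Water Usage Effectiveness in L/kWh."""
    return annual_water_l / it_kwh


def cue(total_co2_kg, it_kwh):
    """Carbon Usage Effectiveness in kg CO2/kWh."""
    return total_co2_kg / it_kwh


# Upper bound of each band from the PUE table, treated as inclusive.
PUE_BANDS = [
    (1.1, "Excellent"),
    (1.3, "Very good"),
    (1.6, "Good / average"),
    (2.0, "Below average"),
]


def pue_rating(value):
    """Map a PUE value to the efficiency label used in the table."""
    for upper, label in PUE_BANDS:
        if value <= upper:
            return label
    return "Poor"


if __name__ == "__main__":
    value = pue(1_400_000, 1_000_000)
    print(f"PUE {value:.2f}: {pue_rating(value)}")
```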
### 3-phase vs Single-phase
| Feature | Single-phase (230 V) | 3-phase (400 V) |
|---------|---------------------|-----------------|
| **Voltage** | 230 V (L-N) | 230/400 V (L-N/L-L) |
| **Power per feed** | ~7.4 kW (32 A) | ~22 kW (32 A, 3-ph) |
| **Efficiency** | Lower (more losses) | Higher (lower current) |
| **Use case** | Smaller racks, office | Standard in DC, high-density |
| **PDU** | Single-phase (C13/C19) | 3-phase (C13/C19, 3-ph monitoring) |
| **Balancing** | Not applicable | Phase balancing required (L1/L2/L3) |
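
The "Power per feed" figures follow from the standard formulas P = V × I for single-phase and P = √3 × V_LL × I for three-phase. The sketch below assumes unity power factor and no breaker derating; the function names are illustrative.

```python
import math


def single_phase_kw(v_ln, amps):
    """Apparent power of a single-phase feed (line-to-neutral voltage), in kW at PF = 1."""
    return v_ln * amps / 1000


def three_phase_kw(v_ll, amps):
    """Apparent power of a balanced 3-phase feed (line-to-line voltage), in kW at PF = 1."""
    return math.sqrt(3) * v_ll * amps / 1000


if __name__ == "__main__":
    print(f"1-ph 230 V / 32 A: {single_phase_kw(230, 32):.2f} kW")
    print(f"3-ph 400 V / 32 A: {three_phase_kw(400, 32):.2f} kW")
```

A 3-phase feed at the same current therefore delivers three times the power of a single-phase feed, which is why it is the default for dense racks.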
### Rack power density
| Cat. | Type | kW/rack | Power | Cooling |
|------|------|---------|-------|---------|
| Low | Office, storage | 1-3 kW | 1-ph, 16 A | Air (free cooling) |
| Medium | Standard compute | 5-10 kW | 3-ph, 32 A | Air (CRAC/CRAH) |
| High | GPU, HPC | 15-30 kW | 3-ph, 60 A | Air + liquid assist |
| Ultra | AI/ML clusters | 40-100+ kW | 3-ph, 100+ A | Direct-to-chip / immersion |
### Rack PDU connectors
| Connector | Max current | Device type |
|-----------|-------------|-------------|
| **C13** | 10 A (250 V) | Servers, switches, 1U |
| **C19** | 16 A (250 V) | Higher power servers, UPS |
| **IEC 60309** (3-ph) | 16-125 A | Rack PDU inputs |
| **NEMA L6-30** | 30 A (250 V) | US spec |
## Cooling
### Cooling — technology overview
| Technology | Type | Output (kW/rack) | Typical PUE | CAPEX | Use case |
|-----------|------|-----------------|-------------|-------|----------|
| **Free air cooling** | Air | < 5 | 1.05-1.15 | Low | Climatically suitable locations |
| **CRAC (DX)** | Air | 5-10 | 1.4-1.8 | Medium | Smaller DC, retrofit |
| **CRAH (CW)** | Air | 5-15 | 1.2-1.5 | High | Enterprise DC |
| **In-row cooling** | Air | 10-25 | 1.2-1.4 | High | High-density racks |
| **Rear-door HX** | Hybrid | 15-30 | 1.1-1.3 | Medium | Retrofits, GPU |
| **Direct-to-chip** | Liquid | 40-100+ | 1.05-1.15 | High | AI/ML, HPC |
| **Immersion (single-phase)** | Liquid | 50-100+ | 1.03-1.10 | High | Bitcoin, hyperscale |
| **Immersion (two-phase)** | Liquid | 100-200+ | 1.03-1.08 | Very high | Extreme density |
### Chilled water vs Direct Expansion (DX)
| Feature | Chilled water (CW) | Direct Expansion (DX) |
|---------|-------------------|----------------------|
| **Medium** | Water + glycol | Refrigerant (R134a, R410A, R454B) |
| **CRAC/CRAH** | CRAH (air handler with chilled-water coil) | CRAC (air conditioner with refrigerant compressor) |
| **Efficiency** | Higher (COP 5-7) | Lower (COP 2-4) |
| **Water temperature** | 7-12 °C (standard), 18-22 °C (high-temp) | 5-10 °C (evaporator) |
| **Complexity** | Higher (chillers, pumps, pipes, cooling tower) | Simpler |
| **Maintenance** | Higher (water treatment, legionella prevention) | Lower |
| **Use case** | Large DC > 500 kW, enterprise | Smaller DC, edge, retrofit |
### Containment types
| Type | Description | Efficiency | Implementation |
|------|-------------|------------|----------------|
| **Cold aisle containment (CAC)** | Enclosed cold aisle, warm air returns to room | High | Doors at aisle ends, ceiling panels |
| **Hot aisle containment (HAC)** | Enclosed hot aisle, warm air goes directly to return | Higher | Doors + ceiling panels, return to CRAH |
| **Chimney / rear duct** | Each rack has its own exhaust chimney to ceiling | Highest | Individual ducts per rack, expensive |
| **Open aisle** | No containment, cold and warm air mix | Low | Legacy, cheap |
Recommendation: CAC/HAC at density > 5 kW/rack. HAC is 5-10 % more efficient than CAC (warm air is directly extracted, does not mix with room).
### CFD modeling
Computational Fluid Dynamics (CFD) simulates airflow in DC before physical implementation:
- Identification of hot spots (warm air recirculation into cold aisle)
- Optimization of perforated tile positions
- Reduction of bypass airflow (unsealed cable openings, missing blanking panels)
- Simulation of CRAH unit failure (what-if scenarios)
- Tools: Future Facilities (6Sigma DC), Ansys Fluent, OpenFOAM
### Free cooling
- **Air-side**: intake of outside air at a suitable temperature (filtration, humidification)
- **Water-side**: water from cooling towers or dry coolers chills the loop directly (strainer cycle), with the chiller compressors off
- **Climate zone**: free cooling is usable ~2000-8000 hours/year depending on location
  - Scandinavia: 7000-8000 h/year
  - Central Europe: 4000-6000 h/year
  - Southern Europe: 2000-4000 h/year
- **Hybrid**: combination of free cooling + mechanical cooling (most common)
- **Economizer types**: Class A1 (dry cooler), Class A2 (evaporative), Class B (air-side)
### Liquid cooling detail
| Type | Inlet temperature | Capacity (kW/rack) | Medium | Installation |
|------|-----------------|-------------------|--------|-------------|
| **Cold plate (D2C)** | 20-45 °C | 40-100+ | Water, propylene glycol | CDU per rack or per row |
| **Rear-door HX** | 18-27 °C | 15-30 | Water | Passive, no server modification |
| **Immersion (1-ph)** | 35-50 °C | 50-100+ | Dielectric oil | Tank, CDU, heat exchanger |
| **Immersion (2-ph)** | 25-35 °C | 100-200+ | Dielectric (boiling) | Tank + condenser |
**CDU (Coolant Distribution Unit)**:
- Regulates the temperature, flow, and pressure of the coolant delivered to the racks
- Primary loop (facility water) + secondary loop (rack coolant)
- Sizing: 1 CDU per 4-8 racks (40-100 kW per CDU)
- Redundancy: N+1 CDU, dual coolant loops
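
The CDU rules above translate into a simple count. This is a sketch under the assumption that the racks-per-CDU ratio is the binding constraint; the function name and the default of 6 racks per CDU (the middle of the 4-8 range) are illustrative.

```python
import math


def cdu_count(racks, racks_per_cdu=6, spares=1):
    """Number of CDUs for a liquid-cooled row, including N+1 style spares."""
    if racks <= 0:
        return 0
    return math.ceil(racks / racks_per_cdu) + spares


if __name__ == "__main__":
    for racks in (4, 20, 48):
        print(racks, "racks ->", cdu_count(racks), "CDUs")
```

A real design would also compare the summed rack heat load against the rated capacity of the chosen CDU model.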
**Water quality requirements**:
- Conductivity: < 1 µS/cm (demineralized water)
- pH: 6.5-8.0
- Particulates: < 50 µm (filtration)
- Corrosion prevention: inhibitors, glycol (10-30 %)
- Biological growth prevention: UV, biocides
### Adiabatic cooling
Using water evaporation to cool air:
- **Direct adiabatic** — air passes through water (media pad), cools and humidifies
- **Indirect adiabatic** — air cools via heat exchanger without direct contact with water
- **Water consumption**: 3-5 L/kWh (direct), 1-2 L/kWh (indirect)
- Efficiency depends on air humidity — more effective in dry climates
## Cabling and structured cabling
### TIA-942 cabling hierarchy
```
Entrance Room (ER)
├── Backbone cabling (fiber single-mode / multi-mode)
│ │
│ ├── Main Distribution Area (MDA)
│ │ │
│ │ ├── Horizontal Distribution Area (HDA)
│ │ │ │
│ │ │ └── Equipment Distribution Area (EDA) → rack
│ │ │
│ │ └── Intermediate Distribution Area (IDA) — optional
│ │
│ └── Telecommunication Room (TR) — for office
└── Backbone cabling (fiber / copper)
```
### Copper cabling categories
| Category | Frequency | Speed | Length | Connector | Use case |
|----------|-----------|-------|--------|-----------|----------|
| **Cat5e** | 100 MHz | 1 GbE | 100 m | RJ45 | Legacy, voice |
| **Cat6** | 250 MHz | 1 GbE (10 GbE up to 55 m) | 100 m (10 GbE: 55 m) | RJ45 | Standard DC, enterprise |
| **Cat6A** | 500 MHz | 10 GbE | 100 m | RJ45 | Standard for new DC |
| **Cat7** (GG45) | 600 MHz | 10 GbE | 100 m | GG45/TERA | Niche, replaced by Cat6A/8 |
| **Cat8.1** | 2000 MHz | 25/40 GbE | 30 m | RJ45 | Top-of-rack, storage |
| **Cat8.2** | 2000 MHz | 25/40 GbE | 30 m | GG45/TERA | Top-of-rack, storage |
In DC, **Cat6A** (10 GbE up to 100 m) is standard for horizontal cabling. Cat8 only for patch cables within a rack (up to 30 m).
### Fiber optic types
| Type | Core | Modal BW | Speed | Max length | Use case |
|------|------|----------|-------|-----------|----------|
| **OS1** (SM) | 9 µm | — | 100 GbE - 800 GbE | 10-80 km | Backbone, campus, WAN |
| **OS2** (SM) | 9 µm | — | 100 GbE - 800 GbE | 2-80 km (CWDM/DWDM) | Backbone, DWDM |
| **OM1** (MM) | 62.5 µm | 200 MHz·km | 1 GbE | 275 m | Legacy |
| **OM2** (MM) | 50 µm | 500 MHz·km | 10 GbE | 82 m | Legacy |
| **OM3** (MM) | 50 µm | 2000 MHz·km | 10 GbE up to 300 m, 100 GbE up to 100 m | 300 m (10G) | Standard DC, VCSEL |
| **OM4** (MM) | 50 µm | 4700 MHz·km | 100 GbE up to 150 m, 400 GbE up to 100 m | 400 m (10G per IEEE 802.3) | High-performance DC standard |
| **OM5** (MM) | 50 µm | 4700+ MHz·km | 200/400 GbE SWDM | 150 m (100G) | Emerging, SWDM |
For new DC: **OM4** as standard for multi-mode, **OS2** for single-mode backbone (LR, DWDM). OM5 is not widely deployed — OM4 + parallel optics (SR4) is more common.
### Connector types
| Connector | Type | Insertion loss | Fiber count | Use case |
|-----------|------|---------------|-------------|----------|
| **LC** | Duplex | < 0.15 dB | 2 | Standard for SFP/SFP+/QSFP |
| **SC** | Duplex | < 0.2 dB | 2 | Older installations, patch panels |
| **MPO/MTP** (12-f) | Multi-fiber | < 0.35 dB | 12 | 40/100/400 GbE parallel |
| **MPO/MTP** (24-f) | Multi-fiber | < 0.5 dB | 24 | 400 GbE (SR4.2, DR4) |
| **SN** | Duplex (mini) | < 0.15 dB | 2 | High-density (QSFP-DD, OSFP) |
| **CS** | Duplex (mini) | < 0.15 dB | 2 | High-density (QSFP-DD, OSFP) |
### MPO/MTP polarity
| Method | Description | Use case |
|--------|-------------|----------|
| **Type A** (Straight) | Fiber 1→1, 2→2, ... | Duplex applications; one end needs a crossed (A-to-A) patch cord |
| **Type B** (Reversed) | Fiber 1→12, 2→11, ... | Parallel optics (SR4, SR8), the usual choice |
| **Type C** (Pairs flipped) | Pairs 1-2→2-1, 3-4→4-3 | Duplex applications; not suited to parallel optics such as 40 GbE SR4 |
### Breakout cassettes
```
MPO (12-f) ──> Breakout cassette ──> 6× LC duplex (12 fibers = 6× duplex)
MPO (24-f) ──> Breakout cassette ──> 12× LC duplex (24 fibers = 12× duplex)
```
Use case: connecting MPO ports (switch) to LC ports (servers, storage). Cassettes are passive components mounted in the patch panel; they contain no active electronics.
### Copper vs fiber decision
| Criterion | Copper (Cat6A/8) | Fiber (OM4/OS2) |
|-----------|-----------------|-----------------|
| **Reach** | 30-100 m | 100 m - 80 km |
| **Speed** | 1-40 GbE | 1-800 GbE |
| **Transceiver cost** | Lower (RJ45) | Higher (SFP+/QSFP) |
| **Cable cost** | Lower | Higher (patch cord) |
| **Port power** | 2-5 W (25 GbE) | 1-3 W (25 GbE SR) |
| **EMI immunity** | Susceptible | Immune |
| **Weight (100 m)** | ~3-4 kg | ~0.5-1 kg |
| **Recommendation** | Up to 30 m, server→ToR switch | Backbone, storage, >30 m |
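
The Recommendation row can be expressed as a small decision helper. The thresholds below are a simplification taken from the tables in this section (copper up to 30 m and 40 GbE, OM4 up to 100 m, OS2 beyond); the function name is hypothetical and real designs also weigh cost, port power, and EMI.

```python
def choose_media(distance_m, speed_gbe):
    """Suggest a cable medium for a link of the given length and speed."""
    if distance_m <= 30 and speed_gbe <= 40:
        return "copper (Cat6A/Cat8)"
    if distance_m <= 100:
        return "multi-mode fiber (OM4)"
    return "single-mode fiber (OS2)"


if __name__ == "__main__":
    for distance_m, speed_gbe in [(3, 25), (80, 100), (2000, 400)]:
        print(f"{distance_m} m at {speed_gbe} GbE: {choose_media(distance_m, speed_gbe)}")
```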
### Cabling best practices
- **Horizontal cabling**: max 90 m permanent link + 10 m patch cords (TIA-942)
- **Fiber management**: slack spools, cable managers, minimum bend radius 10× cable diameter
- **Color coding**: OS1/OS2 (yellow), OM3 (aqua), OM4 (magenta/purple), OM5 (lime green)
- **Labeling**: both ends, patch panels, faceplates — standard ANSI/TIA-606-B
- **Overhead vs underfloor**: overhead (ladder rack) is preferred in DC (better airflow, easier changes)
- **MPO cassettes**: plan 15-20 % fiber reserve for future needs
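
The fiber reserve rule combines with the breakout arithmetic (one duplex LC port uses two fibers) into a quick trunk count. The sketch below assumes 12-fiber trunks and a 20 % reserve by default; the function name is illustrative.

```python
import math


def mpo_trunks_needed(duplex_links, fibers_per_trunk=12, reserve_pct=20):
    """Number of MPO trunks for a given count of duplex links, including spare fibers."""
    fibers = duplex_links * 2
    fibers_with_reserve = math.ceil(fibers * (100 + reserve_pct) / 100)
    return math.ceil(fibers_with_reserve / fibers_per_trunk)


if __name__ == "__main__":
    for links in (6, 40, 100):
        print(links, "duplex links ->", mpo_trunks_needed(links), "trunks")
```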
## Physical security
### Multi-layer security model (defense in depth)
```
Layer 1: Perimeter (fence, gate, guards)
Layer 2: Building (walls, locks, CCTV, card readers)
Layer 3: DC hall (biometrics, mantrap, CCTV, motion detection)
Layer 4: Rack / Cage (electronic locks, sensors)
Layer 5: Data (encryption, HSM, access control)
```
### Access control
| Method | Factor | Level | Note |
|--------|--------|-------|------|
| **RFID / proximity card** | Something you have | Standard | Basic access, cheap |
| **Smart card (PKI)** | Something you have + PIN | Medium | Certificate on card, anti-passback |
| **Biometric (fingerprint)** | Something you are | High | Fast; touchless readers avoid the hygiene concerns of contact sensors |
| **Biometric (palm/finger vein)** | Something you are | Very high | Hard to forge, contactless |
| **Biometric (iris/retina)** | Something you are | Highest | Very accurate, slow, expensive |
| **Multi-factor** | 2+ factors | Highest | Card + biometrics + PIN — Tier IV DC |
### Mantrap design
```
Outer door ──> Mantrap (vestibule) ──> Inner door
├── Weight sensor (anti-tailgating)
├── CCTV (both doors)
├── Intercom (emergency exit)
└── Motion detector (in mantrap)
```
- Only one door opens at a time
- Anti-tailgating: weight sensor detects multiple persons
- Exit via breakout button + motion detection
- Emergency exit: panic bar + alarm
### CCTV
| Element | Recommendation |
|---------|----------------|
| **Resolution** | Min. 1080p, ideally 4K (8 MP) |
| **FPS** | 15-30 FPS (recording), 30+ FPS (realtime monitoring) |
| **Retention** | Min. 30 days (90 days for audit) |
| **Storage** | NVR (on-prem), cloud (AWS KVS, Azure Video Indexer) |
| **AI analytics** | Face detection, ANPR (license plate), object detection |
| **Field of view** | Every door, every aisle — no blind spots |
### Asset tracking
| Technology | Accuracy | Cost | Use case |
|-----------|----------|------|----------|
| **Barcode** | Rack-level | Very low | Manual inventory |
| **RFID (passive)** | Rack-level (door sweep) | Low | Automatic rack open detection |
| **RFID (active, UWB)** | 10-30 cm | Medium | Real-time tracking |
| **Bluetooth BLE** | 1-3 m | Low | Approximate position |
| **GPS** | 1-10 m | Medium | Outdoor tracking |
## DC layout and design
### Raised floor vs Slab
| Feature | Raised floor | Slab (solid floor) |
|---------|-------------|-------------------|
| **Airflow** | Underfloor air distribution (raised floor as plenum) | Overhead air, in-row cooling |
| **Flexibility** | Easy addition of perforated tiles | Limited (overhead cooling required) |
| **Weight** | Limit 500-1000 kg/m² (depends on height) | Limited only by the structural rating of the slab |
| **Cost** | Higher (~$200-400/m²) | Lower (~$100-200/m²) |
| **Height** | 600-900 mm (standard), 900-1200 mm (high-density) | — |
| **Trend** | Declining (shift to in-row/overhead cooling) | Growing (new DC, high-density) |
Modern high-density DCs (AI/ML, GPU) are moving away from raised floors to slab with overhead or in-row cooling, because racks weigh 1000-2000 kg and an underfloor plenum cannot deliver enough airflow at these densities.
### Rack layout and dimensions
| Parameter | Standard | High-density | Note |
|-----------|----------|-------------|------|
| **Rack width** | 600 mm (19") | 600-750 mm | 750 mm for GPU (cabling, cooling) |
| **Rack depth** | 1000-1200 mm | 1200-1500 mm | GPU servers, longer cables |
| **Rack height** | 42U | 48U / 52U | Higher rack = better power density |
| **Aisle width (cold)** | 1200-1500 mm | 1500-1800 mm | Service access, airflow |
| **Aisle width (hot)** | 900-1200 mm | 1200-1500 mm | Narrower than cold |
| **Max rack load** | 500-800 kg | 1000-2000 kg | Floor reinforcement required |
### Space planning
```
For Tier III DC (example):
IT space: 1000 m²
└── 20 rows × 10 racks = 200 racks at 42U
└── 200 racks × 5 kW avg = 1 MW IT load
└── PUE 1.4 → 1.4 MW facility
Support spaces:
└── UPS + batteries: 200 m²
└── Generators: 100 m² (outdoor)
└── Cooling (chillers, cooling tower): 300 m²
└── Offices, storage, loading dock: 400 m²
Total: ~2000 m² (50% IT, 50% support)
```
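
The same plan can be recomputed for other inputs with a short helper. This is a sketch only; the function name and the returned dictionary keys are hypothetical, and the figures in the usage example are the ones from the block above.

```python
def space_plan(rows, racks_per_row, kw_per_rack, pue, it_area_m2, support_areas_m2):
    """Derive rack count, loads, and area split for a data hall."""
    racks = rows * racks_per_row
    it_load_kw = racks * kw_per_rack
    support_area_m2 = sum(support_areas_m2)
    total_area_m2 = it_area_m2 + support_area_m2
    return {
        "racks": racks,
        "it_load_kw": it_load_kw,
        "facility_kw": it_load_kw * pue,
        "total_area_m2": total_area_m2,
        "it_share": it_area_m2 / total_area_m2,
    }


if __name__ == "__main__":
    plan = space_plan(
        rows=20, racks_per_row=10, kw_per_rack=5, pue=1.4,
        it_area_m2=1000,
        support_areas_m2=[200, 100, 300, 400],  # UPS, generators, cooling, offices
    )
    for key, value in plan.items():
        print(key, value)
```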
### Zone approach (TIA-942)
| Zone | Description | Access | Security |
|------|-------------|--------|----------|
| **Z1** (Public) | Reception, offices | Free | Minimal |
| **Z2** (Office) | Administration, NOC | Employees + guests | RFID |
| **Z3** (DC support) | UPS, generators, cooling | DC operators | RFID + biometrics |
| **Z4** (DC hall) | Servers, storage, networking | DC operators + approved | RFID + biometrics + mantrap |
| **Z5** (Rack/cage) | Specific rack or cage | Only authorized personnel | Electronic lock |
## Fire suppression
### Detection
| System | Type | Detection time | False alarms | Use case |
|--------|------|----------------|--------------|----------|
| **VESDA** (Very Early Smoke Detection Apparatus) | Aspiration, laser sensor | < 30 s (4 alarm levels) | Very low | Standard for DC |
| **Spot detection** | Ionization / optical smoke detector | 2-5 min | Medium | Legacy, smaller DC |
| **Heat detection** | Thermal detector (temperature / rate of rise) | 5-10 min | Very low | Backup for VESDA |
| **Line-type (LHD)** | Linear heat detection cable | 2-5 min | Low | Cable trays, above ceiling |
VESDA is the standard — active aspiration draws air from DC, laser sensor detects smoke particles at 4 levels (Alert → Action → Fire 1 → Fire 2). Enables intervention before visible smoke.
### Suppression systems
| System | Medium | Advantages | Disadvantages | Typical DC |
|--------|--------|------------|---------------|-----------|
| **Novec 1230** (FK-5-1-12) | Gas | Safe for people, zero ODP, short atmospheric lifetime (5 days) | Higher cost | Enterprise DC |
| **FM-200** (HFC-227ea) | Gas | Fast (10 s), effective, zero ODP | High GWP (3220) | Legacy DC |
| **Inergen** (IG-541) | Inert gas (52% N₂, 40% Ar, 8% CO₂) | Safe for people, made of naturally occurring gases | Large volume, high pressure | Enterprise DC |
| **Argonite** (IG-55) | 50% Ar, 50% N₂ | Safe, natural | Large volume, higher pressure | Enterprise DC |
| **Water mist** | Water (fine mist) | Cooling, smoke suppression, low cost | Water in DC (risk), local application only | Retrofits |
| **Pre-action sprinkler** | Water | Dual activation (detection + sprinkler) | Water risk, drainage required | Tier I-II |
**Concentration**: Novec (4-6 % by volume), FM-200 (7-9 %), Inergen (35-50 %). Novec and Inergen remain breathable at design concentration, which leaves at least 5-7 min for evacuation.
### Detection zones
```
DC hall ──> zones of ~200 m² (max)
├── VESDA (each zone its own aspirator)
├── Smoke detectors (ceiling + floor)
└── Heat detection (backup)
```
## DCIM (Data Center Infrastructure Management)
### What DCIM covers
| Area | Metrics | Output |
|------|---------|--------|
| **Power** | Per PDU, per outlet, per rack, total | Capacity planning, PUE, kW/rack |
| **Cooling** | Temperature, humidity, airflow (sensors per rack) | Hot spot maps, airflow optimization |
| **Asset** | What is in which rack, U position, serial, warranty | Asset inventory, lease management |
| **Network** | Port utilization, patch panel connections | Patch management, port tracking |
| **Space** | Free U in rack, free racks | Capacity planning, "what-if" simulations |
### Tools
| Tool | Type | Platform | Cost | Note |
|------|------|----------|------|------|
| **Nlyte (Carrier)** | Enterprise DCIM | On-prem / Cloud | $$$ | Market leader, complex |
| **Sunbird DCIM** | Enterprise DCIM | Cloud | $$$ | Power monitoring, asset tracking |
| **Device42** | DCIM + IPAM | On-prem / Cloud | $$ | Integrated IPAM, CMDB |
| **NetBox** | Open source DCIM | On-prem | Free | IPAM, DCIM, asset tracking |
| **OpenDCIM** | Open source | On-prem | Free | Basic DCIM, asset management |
| **RackTables** | Open source | On-prem | Free | Simple, asset + networking |
| **Dell OME, HPE OneView** | Vendor-specific | On-prem | Part of HW | Each covers only its own vendor's hardware |
## Site selection
### Criteria for DC site selection
| Category | Criterion | Weight |
|----------|-----------|--------|
| **Power** | Electricity availability (grid capacity), cost/kWh, possibility of two independent feeds | High |
| **Connectivity** | Fiber backbone availability, number of connectivity providers, latency to major POP | High |
| **Natural risks** | Earthquakes, floods, hurricanes, tornadoes, wildfires — historical data + predictions | High |
| **Climate** | Average temperature, humidity (free cooling potential) | Medium |
| **Workforce** | Availability of technicians, DC operators, network/admin engineers | Medium |
| **Taxes and regulation** | Tax incentives, environmental regulations, building permits | Medium |
| **Security** | Crime, political stability, terrorist risk | High |
| **Transport accessibility** | Proximity to airport, highway (for HW deliveries, personnel) | Low |
### Natural risks — mapping
| Risk | Areas | Mitigation |
|------|-------|------------|
| **Earthquakes** | Pacific Ring of Fire (CA, Japan, Chile) | Base isolation, seismic bracing, flexible connections |
| **Hurricanes** | Caribbean, southeastern US, southeast Asia | Reinforced construction, generators above flood level |
| **Floods** | River valleys, coastal areas | Location outside flood zone, barriers |
| **Wildfires** | California, Australia, Mediterranean | Defensive zones, air filtration, monitoring |
### Power availability by region
| Region | Grid reliability | Cost/kWh (industrial) | Note |
|--------|-----------------|------------------------|------|
| **Northern Europe** (SE, NO, FI) | High (99.99 %) | $0.04-0.08 | Cheap green energy, cool climate |
| **Central Europe** (DE, NL, CZ) | High (99.99 %) | $0.10-0.20 | Stable, growing renewables |
| **Eastern US** (VA, NC) | High | $0.05-0.08 | Largest DC hub (Ashburn, VA) |
| **Western US** (CA, OR) | Medium (PG&E issues) | $0.10-0.15 | CAISO grid, blackout risk |
| **Singapore** | High | $0.15-0.20 | Moratorium on new DC (2019-2022), water scarcity |
| **Dubai / UAE** | High | $0.06-0.10 | Cheap energy, high temperature (cooling) |
## Compliance and certification
| Standard / Certification | Area | Description |
|-------------------------|------|-------------|
| **TIA-942** (Rated 1-4) | DC design | Classification of redundancy, cabling, security (analogous to Uptime Tier) |
| **Uptime Institute** (Tier I-IV) | DC design | Operational certification, construction documentation |
| **ISO 27001** | ISMS | Information security, risk management |
| **ISO 27701** | Privacy | Extension of ISO 27001 for GDPR compliance |
| **SOC 2** (Type I/II) | Service org | Trust Services Criteria: Security, Availability, Processing Integrity, Confidentiality, Privacy |
| **PCI DSS** | Payment cards | Physical security, access to cardholder data |
| **HIPAA** | Healthcare | USA, health data protection |
| **FedRAMP** | US government | Cloud service authorization, DC security |
| **GDPR** | EU | Personal data protection, data residency |
| **NIST SP 800-53** | DC security | Security control catalog for US federal |
| **ISO 14001** | EMS | Environmental management, sustainability |
## Sustainability
### Carbon footprint of DC
```
Total emissions = Scope 1 (direct) + Scope 2 (energy) + Scope 3 (supply chain)
Scope 1: Generators (diesel), refrigerant leaks
Scope 2: Purchased electricity (grid mix)
Scope 3: HW manufacturing, transport, EOL recycling (~60-80 % of total emissions)
```
### Emission reduction
| Measure | Impact on PUE | Emission reduction | Payback |
|---------|--------------|-------------------|---------|
| **Temperature increase** (22→27 °C) | 0.1-0.2 | 10-20 % cooling | Immediate |
| **Free cooling** | 0.1-0.3 | 20-40 % cooling | 1-2 years |
| **Liquid cooling** | 0.2-0.4 | 30-50 % cooling | 2-4 years |
| **LED lighting + sensors** | 0.01-0.02 | < 1 % | 1 year |
| **PPA (Power Purchase Agreement)** | — | 100 % Scope 2 | Variable |
| **Renewable sources** (rooftop solar) | — | 5-15 % consumption | 5-10 years |
| **Green generator** (HVO biodiesel) | — | 90 % CO₂ reduction | +30 % fuel cost |
### Sustainability certifications
| Certification | Description |
|--------------|-------------|
| **LEED** (BD+C: DC) | U.S. Green Building Council — design and construction |
| **BREEAM** | UK, European sustainability assessment |
| **Climate Neutral Data Centre Pact** (EU) | Self-regulatory, PUE < 1.4 by 2030 |
| **ISO 50001** | Energy management system |
| **Energy Star** | EPA, energy efficiency (US only) |
## Decision diagram — DC topology design
```mermaid
flowchart TD
Start(["DC design"]) --> TIER{"Required Tier?"}
TIER -->|"Tier I / II"| T1["N / N+1 redundancy<br/>Simple power, single path<br/>CRAC/CRAH, free cooling<br/>PUE 1.4-1.6, cost 1×"]
TIER -->|"Tier III"| T3["N+1, concurrently maintainable<br/>Dual path (A/B feed)<br/>Hot aisle containment<br/>PUE 1.2-1.4, cost 2×"]
TIER -->|"Tier IV"| T4["2N+1, fault tolerant<br/>Dual redundant + STS<br/>Hot + cold containment<br/>PUE 1.1-1.3, cost 3×"]
TIER --> POWER{"Power chain"}
POWER -->|"UPS"| UPS{"UPS type"}
UPS -->|"Enterprise DC"| UPS1["VFI double-conversion<br/>Li-ion (LFP), 10-15 years<br/>N+1 or 2N modular"]
UPS -->|"Edge / office"| UPS2["VI line-interactive<br/>VRLA, 3-5 years"]
POWER -->|"Generator"| GEN["Diesel 500-2500 kVA<br/>Tank for 24-72 h<br/>ATS 4-10 ms switching"]
POWER -->|"PDU"| PDU["3-phase 400 V<br/>Monitored/Switched<br/>A/B feed to racks"]
Start --> DENS{"Power density"}
DENS -->|"< 10 kW/rack"| COOL1["Air cooling<br/>CRAC/CRAH, raised floor<br/>Hot aisle containment<br/>ASHRAE A1-A2"]
DENS -->|"10-25 kW/rack"| COOL2["Hybrid<br/>In-row cooling<br/>Rear door HX<br/>ASHRAE A1-H1"]
DENS -->|"> 25 kW/rack"| COOL3["Liquid cooling<br/>CDU, direct-to-chip<br/>Immersion single/two-phase<br/>ASHRAE W-classes"]
Start --> CLIM{"Climate zone"}
CLIM -->|"Moderate (CZ, DE)"| FC1["Free cooling 4000-6000 h/year<br/>Chiller + economizer<br/>PUE saving 0.2-0.3"]
CLIM -->|"Warm (ES, US South)"| FC2["Chiller year-round<br/>Adiabatic cooling<br/>PUE 1.3-1.6"]
CLIM -->|"Cold (SE, NO)"| FC3["Free cooling 7000+ h/year<br/>Air-side economizer<br/>PUE < 1.2"]
```
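
The power density branch of the diagram maps directly onto a threshold function. The sketch below copies the thresholds and labels from the `DENS` branch; the function name is hypothetical, and the other branches (Tier, power chain, climate) would need their own inputs.

```python
def cooling_for_density(kw_per_rack):
    """Pick the cooling approach from the DENS branch of the decision diagram."""
    if kw_per_rack < 10:
        return "air (CRAC/CRAH, hot aisle containment)"
    if kw_per_rack <= 25:
        return "hybrid (in-row cooling, rear-door HX)"
    return "liquid (CDU, direct-to-chip or immersion)"


if __name__ == "__main__":
    for density in (5, 18, 60):
        print(f"{density} kW/rack: {cooling_for_density(density)}")
```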
## Disk monitoring — S.M.A.R.T.
Self-Monitoring, Analysis and Reporting Technology — predictive monitoring of HDD/SSD.
| Key attribute | ID | Description |
|--------------|----|-------------|
| Reallocated Sectors Count | 5 | Number of remapped sectors (a rising count signals that the disk is nearing end of life) |
| Power-On Hours | 9 | Total operating time in hours |
| Reported Uncorrectable Errors | 187 | Uncorrectable errors (red flag) |
| CRC Error Count | 199 | Errors on SATA link (cable/controller) |
| SSD Life Left | 231 | % remaining SSD life |
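| Media Wearout Indicator | 233 | Vendor-specific: normalized remaining wear on some SSDs, total NAND writes on others |
Tools: `smartmontools` (smartctl, smartd), Prometheus (`smartctl_exporter`, or `node_exporter` with the textfile collector), OpenTelemetry Collector.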
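
A minimal sketch of how the key attributes might be checked in a script, assuming the JSON layout that `smartctl --json -A` emits for ATA drives (`ata_smart_attributes.table` with `id` and `raw.value`). The sample data is made up, and the rule "any non-zero raw value is a warning" is an illustrative assumption rather than a vendor threshold.

```python
import json

# IDs from the table above that are treated as warning signs when non-zero.
WATCHED = {
    5: "Reallocated Sectors Count",
    187: "Reported Uncorrectable Errors",
    199: "CRC Error Count",
}


def smart_warnings(report):
    """Return warning strings for watched attributes with a non-zero raw value."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    warnings = []
    for attr in table:
        attr_id = attr.get("id")
        raw = attr.get("raw", {}).get("value", 0)
        if attr_id in WATCHED and raw > 0:
            warnings.append(f"{WATCHED[attr_id]} (ID {attr_id}): raw value {raw}")
    return warnings


# Made-up sample, trimmed to the fields this sketch reads.
SAMPLE = json.loads("""
{"ata_smart_attributes": {"table": [
  {"id": 5, "name": "Reallocated_Sector_Ct", "raw": {"value": 8}},
  {"id": 9, "name": "Power_On_Hours", "raw": {"value": 21000}},
  {"id": 199, "name": "UDMA_CRC_Error_Count", "raw": {"value": 0}}
]}}
""")

if __name__ == "__main__":
    for line in smart_warnings(SAMPLE):
        print(line)
```

In practice the report would come from running `smartctl` per device; NVMe drives expose a different health log and are not covered here.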
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended literature
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| The Data Center as a Computer (4th ed., 2025) | Barroso, Hölzle, Ranganathan | 978-3-031-99488-3 | Comprehensive design evolution of warehouse-scale computer (WSC) by Google architects. Covers hardware, software, power, cooling, networking and 25 years of WSC experience. Key publication for datacenter architecture. |
| Electronics Cooling: From the Chip to the Datacenter (Vol. 62) | Abraham et al. | 978-0-443-47084-4 | Practical guide to thermal management from transistor level to datacenter. Covers conduction, convection, liquid immersion and phase change cooling. Essential resource for DC cooling design. |
## Datacenter backbone services
When building a new DC, basic infrastructure services must be deployed first — without them, higher layers cannot operate:
### DNS
| Role | Service | Description |
|------|---------|-------------|
| **Authoritative** | BIND, PowerDNS, NSD | Primary DNS zone for internal domains |
| **Recursive** | Unbound, BIND (caching), CoreDNS | Resolver for internal + external queries |
| **Anycast** | DNS anycast (BGP) | Redundancy, lower latency |
| **Integration** | Infoblox, BlueCat, dnsmasq | IPAM + DNS + DHCP in one |
Best practices: separate auth and recursive resolvers, DNSSEC, split-horizon (internal vs external view), TSIG for zone transfers, monitoring (DNS query latency, NXDOMAIN rate).
### NTP (time synchronization)
- **Primary**: GPS-disciplined NTP servers (Microchip S600, Meinberg)
- **Secondary**: Stratum 1/2 NTP (ntpd, chrony, NTPsec)
- **All nodes**: chrony (modern replacement for ntpd), local NTP server on each rack switch (boundary clock)
- **Precision**: PTP (IEEE 1588) for telco/fintech — sub-microsecond accuracy
- **DC topology**: GPS antenna → Grandmaster (PTP) → Boundary clock (rack switch) → Ordinary clock (server)
### DHCP + IPAM
| Tool | Description |
|------|-------------|
| **ISC DHCP** | Legacy (end-of-life since 2022), still widely deployed |
| **Kea** | Modern replacement for ISC DHCP, developed by ISC |
| **Infoblox / BlueCat** | Enterprise IPAM + DHCP + DNS |
| **NetBox / phpIPAM** | Open-source IPAM |
### LDAP / Identity Management
| Tool | Description |
|------|-------------|
| **FreeIPA** | Integrated IDM (LDAP + Kerberos + DNS + CA) — Linux |
| **Active Directory** | Microsoft, LDAP + Kerberos + Group Policy |
| **389 Directory Server** | Open-source LDAP (Red Hat) |
| **OpenLDAP** | Classic open-source LDAP |
| **Keycloak / Authentik** | Modern OIDC/SAML/LDAP gateways |
### PKI and certificates
- **Enterprise CA**: EJBCA, Smallstep, HashiCorp Vault (PKI engine)
- **ACME**: cert-manager (Kubernetes), certbot (Let's Encrypt)
- **mTLS**: Vault PKI, SPIRE (SPIFFE), Cilium
- **Best practices**: root CA offline, intermediate CA per environment, short-lived certificates (max 90 days), revocation (CRL/OCSP)
### Monitoring and observability
See [MONITORING.md](MONITORING.md). Before running first workloads, DC must have:
- Metric collection (Prometheus, Zabbix)
- Centralized logs (Loki, ELK)
- Alerting (Alertmanager, PagerDuty)
- Uptime monitoring (heartbeat checks)
### Deployment logistics — step order
```
1. DNS (at least recursive + local resolver)
2. NTP (time synchronization)
3. DHCP + IPAM (first servers get IPs)
4. LDAP / IAM (users, groups, access rights)
5. PKI (certificates for encryption)
6. Configuration management (Ansible, Puppet)
7. Monitoring + logging (see what's happening)
8. Container registry / Package repo (docker registry, apt/yum mirror)
9. Load balancer (for services)
10. Storage backend (Ceph, NFS, SAN)
11. Orchestration (Kubernetes, OpenStack)
```
## OpenStack in the datacenter
OpenStack adds a software abstraction layer on top of the DC infrastructure, enabling multi-tenancy and self-service.
### Control plane architecture
- **Controller nodes** — management services (Keystone, Nova API, Neutron API, Horizon, RabbitMQ, DB)
- **Compute nodes** — hypervisor (KVM), Nova Compute, Neutron agent
- **Storage nodes** — Ceph OSD, Cinder volumes, Swift object storage
- **Network nodes** — Neutron L3 router, DHCP agent, DVR
### Requirements for DC infrastructure
| Component | Requirement |
|-----------|-------------|
| **Controller** | 3-5 node HA cluster, 16+ vCPU, 32+ GB RAM, SSD |
| **Compute** | Dense performance per rack (GPU, high-core), NUMA-aware design |
| **Storage (Ceph)** | 10-25 GbE networking, NVMe/SSD OSDs, 3× replication or more |
| **Network** | 25/100 GbE spine-leaf, L3 BGP underlay, VXLAN overlay |
| **Rack power** | 10-30 kW/rack for GPU compute |
### Use cases
- Private cloud for enterprise (multi-tenant, self-service Horizon)
- NFVI for telco (DPDK, SR-IOV, low-latency)
- Academic / HPC clusters (Ironic, Cyborg, Manila)
- Government / regulated environments (on-prem, audit trail)
*Last revision: 2026-06-03*

DATACENTERS.md Normal file

@@ -0,0 +1,788 @@
# 🏭 Data centers
## Tier classification (TIA-942 / Uptime Institute)
| Tier | Availability | Downtime / year | Redundancy |
|------|-----------|----------------|------------|
| **Tier I** | 99.671 % | 28.8 h | N (no redundancy) |
| **Tier II** | 99.741 % | 22.7 h | N+1 (redundant components) |
| **Tier III** | 99.982 % | 1.6 h | N+1 (concurrently maintainable) |
| **Tier IV** | 99.995 % | 26.3 min | 2N+1 (fault tolerant) |
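The downtime column follows directly from the availability percentage. A minimal sketch, assuming a non-leap year of 8760 hours (the function name is illustrative, not part of any standard):

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year


def annual_downtime_hours(availability_pct: float) -> float:
    """Expected downtime in hours per year for an availability given in percent."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


tiers = {"Tier I": 99.671, "Tier II": 99.741, "Tier III": 99.982, "Tier IV": 99.995}
for name, pct in tiers.items():
    hours = annual_downtime_hours(pct)
    print(f"{name}: {hours:.1f} h/year ({hours * 60:.0f} min/year)")
```

The same function can be used to check an SLA figure that is not in the table, for example `annual_downtime_hours(99.9)`.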
## Key subsystems
| System | Description |
|--------|-------|
| **Power** | UPS, generators (diesel), ATS, PDU, redundant feeds (A/B feed) |
| **Cooling** | CRAC/CRAH, chilled water, free cooling, containment (hot/cold aisle) |
| **Physical security** | CCTV, biometric access, mantrap, rack locks |
| **Cabling** | Structured cabling (Cat6A/7/8 copper, OM3/OM4 multi-mode and OS2 single-mode fiber), patch panels |
| **Fire suppression** | Alarm, clean agents (Novec 1230, FM-200), inert gases (Inergen), VESDA (very early smoke detection) |
| **Monitoring** | DCIM (Data Center Infrastructure Management), SNMP, BMS (Building Management System) |
## Aisle containment
```
┌────────────────────────────────────┐
│ Rack Row │
│ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐ │
Cold │ │ │ │ │ │ │ │ │ │ │ │ │ │ Cold
Aisle <──│ └──┘ └──┘ └──┘ └──┘ └──┘ └──┘ ──> Aisle
│ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐ │
Hot │ │ │ │ │ │ │ │ │ │ │ │ │ │ Hot
Aisle ──>│ └──┘ └──┘ └──┘ └──┘ └──┘ └──┘ <── Aisle
└────────────────────────────────────┘
```
## Environmental classes (ASHRAE TC 9.9)
ASHRAE Technical Committee 9.9 defines temperature and humidity envelopes for IT equipment in data centers.
| Class | Temperature (recommended) | Temperature (allowable) | Use |
|-------|---------------------|---------------------|---------|
| **A1** | 18-27 °C | 15-32 °C | Enterprise DC, tight control |
| **A2** | 18-27 °C | 10-35 °C | Typical DC |
| **A3** | 18-27 °C | 5-40 °C | More relaxed environments |
| **A4** | 18-27 °C | 5-45 °C | Maximum cooling savings |
| **H1** | 18-22 °C | 5-25 °C | High-density air-cooled (AI/ML) |
- The 5th edition (2021) added class H1 for high-density equipment and extended the liquid-cooling W classes (W17, W27, W32, W40, W45, W+)
- 2024: new S classes for liquid cooling via the Technology Cooling System (TCS)
- Humidity: recommended from -9 °C dew point up to 70 % RH (at low pollutant levels); max 50 % RH in highly corrosive environments
## Power
### Power chain
```
Grid ──> Transformer ──> UPS ──> PDU ──> Rack PDU ──> Server PSU
├──> Generator (ATS switches over on outage)
└──> STS/ATS (Static Transfer Switch)
```
A/B feed topology:
```
Grid A ──> UPS A ──> PDU A1 ──> Rack PDU A ──> PSU A (server)
Grid B ──> UPS B ──> PDU B1 ──> Rack PDU B ──> PSU B (server)
```
Each server has two PSUs, each fed from a different branch (A/B). If one branch fails, the server keeps running without interruption.
### UPS types
| IEC 62040-3 class | Topology | Description | Transfer time | Use case |
|-----------|-------------|-------|-----------|----------|
| **VFD** (Voltage & Frequency Dependent) | Passive standby | Load runs on bypass; the UPS switches to the inverter on outage | 4-10 ms | SOHO, edge |
| **VI** (Voltage Independent) | Line-interactive | Voltage regulation via autotransformer | 2-4 ms | Smaller racks, office |
| **VFI** (Voltage & Frequency Independent) | Double-conversion | AC → DC → AC, full isolation, no transfer time | 0 ms | Enterprise DC, Tier III/IV |
The DC standard is **VFI (double-conversion)**: an online UPS with zero transfer time and full isolation from the grid.
### Battery technologies
| Type | Density (Wh/L) | Cycle life | Service life (years) | Temperature | Cost/kWh | Notes |
|-----|---------------|-------------------|------------------|---------|----------|----------|
| **VRLA** (AGM/Gel) | 50-80 | 200-500 | 3-5 | 20-25 °C | ~$150-200 | Cheap, large, heavy, temperature-sensitive |
| **Li-ion (LFP)** | 200-350 | 3000-5000 | 10-15 | 0-40 °C | ~$300-500 | Small, light, long life, BMS required |
| **Li-ion (NMC)** | 250-400 | 1000-2000 | 8-12 | 0-40 °C | ~$250-400 | Higher density, thermal runaway risk |
| **NiCd** | 80-150 | 1000-2000 | 10-15 | 20-50 °C | ~$400-600 | Extreme temperatures, memory effect |
| **Flow battery** (V/Zn/Br) | 20-40 | 10,000+ | 20+ | 10-35 °C | ~$500-800 | Very high cycle count, large footprint, long-duration backup |
Li-ion (LFP) is becoming the standard for new DCs thanks to longer life, a smaller footprint, and better behavior at high temperatures.
### Generator sizing
| Variant | Size | Fuel | Start time | Run time | Use case |
|----------|---------|------|-----------|----------|----------|
| **Diesel** | 500-2500 kVA | Diesel | 10-30 s | 24-72 h (depending on tank) | Standard for enterprise DC |
| **Nat. gas** | 200-1500 kVA | Natural gas | 10-30 s | Unlimited (pipeline) | Less common, lower emissions |
| **CHP** (cogeneration) | 500-2000 kVA | Natural gas | 5-15 min | Unlimited | Combined power + cooling (absorption chiller) |
Sizing: the generator should cover 100 % of the IT load plus 100 % of the cooling load (including chillers), typically 1.3-1.8× IT load. The diesel tank should hold at least 24 h of runtime, commonly 48-72 h. Fuel consumption is roughly 0.3-0.4 L per kWh generated.
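The sizing rule can be turned into a quick estimate. This is a rough sketch only: the 1000 kW IT load is a made-up example, the default factor of 1.5 is simply a value inside the 1.3-1.8 range above, and 0.35 L/kWh is the midpoint of the stated consumption range.

```python
def generator_load_kw(it_load_kw: float, sizing_factor: float = 1.5) -> float:
    """Generator load covering IT plus cooling, as a multiple of the IT load."""
    return it_load_kw * sizing_factor


def tank_litres(load_kw: float, runtime_h: float, litres_per_kwh: float = 0.35) -> float:
    """Diesel volume needed to carry `load_kw` for `runtime_h` hours."""
    return load_kw * runtime_h * litres_per_kwh


# Hypothetical site: 1000 kW of IT load, 24 h minimum autonomy.
load = generator_load_kw(1000)
print(f"Generator load: {load:.0f} kW")
print(f"Tank for 24 h: {tank_litres(load, 24):.0f} L")
print(f"Tank for 72 h: {tank_litres(load, 72):.0f} L")
```

Real sizing also has to account for generator derating, step-load behavior, and fuel delivery contracts, none of which are modeled here.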
### ATS vs STS
| Property | ATS (Automatic Transfer Switch) | STS (Static Transfer Switch) |
|-----------|-------------------------------|-----------------------------|
| **Transfer time** | 4-10 ms (mechanical relay) | < 4 ms (thyristor) |
| **Service life** | ~10,000 transfers | No mechanical wear (solid-state) |
| **Cost** | Low | High (~3-5× ATS) |
| **Use case** | Generator → UPS feed | Between two UPS outputs |
### PDU types
| Type | Description | Use case |
|-----|-------|----------|
| **Basic** | Passive power distribution (no monitoring) | Edge, office |
| **Metered** | Current metering at PDU level | Standard DC |
| **Monitored** | Per-outlet metering, SNMP, web GUI | Enterprise DC |
| **Switched** | On/off per outlet, remote reboot | Enterprise DC, colo |
| **High-density** | 3-phase, 60-100 A, C19 outlets | GPU/HPC/AI racky |
### Power calculation
```
Total Power = Σ(P_server + P_storage + P_network + P_cooling + P_losses)
P_server = P_idle + (P_max - P_idle) × Utilization%
P_cooling + P_losses = P_IT × (PUE − 1)
Example:
100 servers × 500 W (avg) = 50 kW IT load
PUE = 1.5 → 75 kW total
UPS + generator → sized for 75 kW × 1.2 (safety factor) = 90 kW
```
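The worked example can be reproduced in a few lines. A minimal sketch: the function names are illustrative, and the 200 W / 800 W idle and peak figures in the last line are assumed values chosen to land on the 500 W average used above.

```python
def server_power_w(p_idle_w: float, p_max_w: float, utilization: float) -> float:
    """P_server = P_idle + (P_max - P_idle) × utilization, with utilization in 0.0-1.0."""
    return p_idle_w + (p_max_w - p_idle_w) * utilization


def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power follows from the PUE definition: total = IT × PUE."""
    return it_load_kw * pue


def design_capacity_kw(facility_kw: float, safety_factor: float = 1.2) -> float:
    """UPS and generator capacity including a safety margin."""
    return facility_kw * safety_factor


it_load_kw = 100 * 500 / 1000                      # 100 servers × 500 W average
facility_kw = facility_power_kw(it_load_kw, pue=1.5)
overhead_kw = facility_kw - it_load_kw             # cooling, UPS and distribution losses
capacity_kw = design_capacity_kw(facility_kw)

print(f"IT load: {it_load_kw:.0f} kW, facility: {facility_kw:.0f} kW")
print(f"Overhead: {overhead_kw:.0f} kW, design capacity: {capacity_kw:.0f} kW")
print(f"One server at 50 % load: {server_power_w(200, 800, 0.5):.0f} W")
```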
### PUE (Power Usage Effectiveness)
```
PUE = Total Facility Energy / IT Equipment Energy
```
| PUE | Efficiency | Type |
|-----|-----------|-----|
| 1.0-1.1 | Excellent | Hyperscale (Google, Meta) |
| 1.1-1.3 | Very good | Modern DC |
| 1.3-1.6 | Good / average | Enterprise DC |
| 1.6-2.0 | Below average | Older DC |
| >2.0 | Poor | Legacy |
PUE is measured for the whole DC, not per rack. It includes UPS losses, cooling, lighting, and distribution losses. It excludes fuel production (well-to-tank) and embodied carbon. Target for a modern DC: PUE < 1.2.
### WUE and CUE
| Metric | Description | Formula | Target |
|---------|-------|--------|-----|
| **WUE** (Water Usage Effectiveness) | Water consumption per unit of IT energy | WUE = Annual Water Usage / IT Energy (L/kWh) | < 0.5 L/kWh |
| **CUE** (Carbon Usage Effectiveness) | CO₂ emissions per unit of IT energy | CUE = Total CO₂ / IT Energy (kg CO₂/kWh) | < 0.2 kg CO₂/kWh |
WUE is critical in arid regions (US Southwest, Australia, Middle East). Adiabatic cooling uses significantly more water than closed-loop cooling.
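PUE, WUE, and CUE all divide an annual total by the same IT energy figure, so they can be computed together. The sketch below uses invented annual values for a site with a 1 MW average IT load; only the formulas come from the tables above.

```python
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """Power Usage Effectiveness (dimensionless)."""
    return total_facility_kwh / it_kwh


def wue(annual_water_l: float, it_kwh: float) -> float:
    """Water Usage Effectiveness in L/kWh."""
    return annual_water_l / it_kwh


def cue(annual_co2_kg: float, it_kwh: float) -> float:
    """Carbon Usage Effectiveness in kg CO2/kWh."""
    return annual_co2_kg / it_kwh


# Hypothetical annual figures (assumptions for illustration only).
it_kwh = 1000 * 8760          # 1 MW average IT load over one year
total_kwh = 10_512_000
water_l = 3_504_000
co2_kg = 1_752_000

print(f"PUE = {pue(total_kwh, it_kwh):.2f}")
print(f"WUE = {wue(water_l, it_kwh):.2f} L/kWh")
print(f"CUE = {cue(co2_kg, it_kwh):.2f} kg CO2/kWh")
```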
### 3-phase vs Single-phase
| Property | Single-phase (230 V) | 3-phase (400 V) |
|-----------|---------------------|-----------------|
| **Voltage** | 230 V (L-N) | 230/400 V (L-N/L-L) |
| **Power per feed** | ~7.4 kW (32 A) | ~22 kW (32 A, 3-phase) |
| **Efficiency** | Lower (more losses) | Higher (lower current) |
| **Use case** | Smaller racks, office | Standard in DC, high-density |
| **PDU** | Single-phase (C13/C19) | 3-phase (C13/C19, per-phase monitoring) |
| **Balancing** | Not needed (single phase) | Phases must be balanced (L1/L2/L3) |
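The "power per feed" row can be checked with the standard formulas P = V × I for single-phase and P = √3 × V_LL × I for a balanced 3-phase feed. The sketch assumes unity power factor and no breaker derating, which is why the results are upper bounds.

```python
import math


def single_phase_kw(v_ln: float, amps: float) -> float:
    """Apparent power of a single-phase feed in kVA (equals kW at unity power factor)."""
    return v_ln * amps / 1000


def three_phase_kw(v_ll: float, amps: float) -> float:
    """Apparent power of a balanced 3-phase feed in kVA (equals kW at unity power factor)."""
    return math.sqrt(3) * v_ll * amps / 1000


print(f"1-phase 230 V / 32 A: {single_phase_kw(230, 32):.2f} kW")
print(f"3-phase 400 V / 32 A: {three_phase_kw(400, 32):.2f} kW")
```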
### Rack power density
| Category | Type | kW/rack | Power feed | Cooling |
|------|-----|---------|----------|---------|
| Low | Office, storage | 1-3 kW | 1-phase, 16 A | Air (free cooling) |
| Medium | Standard compute | 5-10 kW | 3-phase, 32 A | Air (CRAC/CRAH) |
| High | GPU, HPC | 15-30 kW | 3-phase, 60 A | Air + liquid assist |
| Ultra | AI/ML clusters | 40-100+ kW | 3-phase, 100+ A | Direct-to-chip / immersion |
### Rack PDU connectors
| Connector | Max current | Equipment type |
|----------|-----------|-------------|
| **C13** | 10 A (250 V) | Servers, switches, 1U |
| **C19** | 16 A (250 V) | Higher-power servers, UPS |
| **IEC 60309** (3-phase) | 16-125 A | Rack PDU inputs |
| **NEMA L6-30** | 30 A (250 V) | US spec |
## Cooling
### Cooling technology overview
| Technology | Type | Capacity (kW/rack) | Typical PUE | CAPEX | Use case |
|-----------|------|----------------|-------------|-------|----------|
| **Free air cooling** | Air | < 5 | 1.05-1.15 | Low | Climatically suitable locations |
| **CRAC (DX)** | Air | 5-10 | 1.4-1.8 | Medium | Smaller DC, retrofit |
| **CRAH (CW)** | Air | 5-15 | 1.2-1.5 | High | Enterprise DC |
| **In-row cooling** | Air | 10-25 | 1.2-1.4 | High | High-density racks |
| **Rear-door HX** | Hybrid | 15-30 | 1.1-1.3 | Medium | Retrofits, GPU |
| **Direct-to-chip** | Liquid | 40-100+ | 1.05-1.15 | High | AI/ML, HPC |
| **Immersion (single-phase)** | Liquid | 50-100+ | 1.03-1.10 | High | Crypto mining, hyperscale |
| **Immersion (two-phase)** | Liquid | 100-200+ | 1.03-1.08 | Very high | Extreme density |
### Chilled water vs Direct Expansion (DX)
| Property | Chilled water (CW) | Direct Expansion (DX) |
|-----------|-------------------|----------------------|
| **Medium** | Water + glycol | Refrigerant (R134a, R410A, R454B) |
| **CRAC/CRAH** | CRAH (air handler with chilled-water coil) | CRAC (refrigerant compressor) |
| **Efficiency** | Higher (COP 5-7) | Lower (COP 2-4) |
| **Supply temperature** | 7-12 °C water (standard), 18-22 °C (high-temp) | 5-10 °C (evaporator) |
| **Complexity** | Higher (chillers, pumps, pipes, cooling tower) | Simpler |
| **Maintenance** | Higher (water treatment, Legionella prevention) | Lower |
| **Use case** | Large DC > 500 kW, enterprise | Smaller DC, edge, retrofit |
### Containment types
| Type | Description | Efficiency | Implementation |
|-----|-------|-----------|-------------|
| **Cold aisle containment (CAC)** | Enclosed cold aisle; hot air returns through the room | High | Doors at aisle ends, ceiling panels |
| **Hot aisle containment (HAC)** | Enclosed hot aisle; hot air goes straight to the return | Higher | Doors + ceiling panels, return duct to CRAH |
| **Chimney / rear duct** | Each rack has its own exhaust chimney to the ceiling | Highest | Separate ducts per rack, costly |
| **Open aisle** | No containment; cold and hot air mix | Low | Legacy, cheap |
Recommendation: use CAC/HAC at densities > 5 kW/rack. HAC is 5-10 % more efficient than CAC (hot air is removed directly and does not mix with room air).
### CFD modeling
Computational Fluid Dynamics (CFD) simulates airflow in the DC before physical implementation:
- Identifying hot spots (recirculation of hot air into the cold aisle)
- Optimizing the placement of perforated tiles
- Analyzing bypass airflow (cable cutouts, unblanked rack positions)
- Simulating a CRAH unit failure (what-if scenarios)
- Tools: Future Facilities (6Sigma DC), Ansys Fluent, OpenFOAM
### Free cooling
- **Air-side**: drawing in outside air when its temperature is suitable (filtration, humidification)
- **Water-side**: using water cooled by outdoor cooling towers or dry coolers (e.g. strainer cycle), with the chiller compressor off
- **Climate zone**: free cooling is usable ~2000-8000 hours/year depending on location
  - Scandinavia: 7000-8000 h/year
  - Central Europe: 4000-6000 h/year
  - Southern Europe: 2000-4000 h/year
- **Hybrid**: combination of free cooling + mechanical cooling (most common)
- **Economizer types**: Class A1 (dry cooler), Class A2 (evaporative), Class B (air-side)
### Liquid cooling detail
| Type | Inlet temperature | Capacity (kW/rack) | Medium | Installation |
|-----|---------------|-------------------|--------|-----------|
| **Cold plate (D2C)** | 20-45 °C | 40-100+ | Water, propylene glycol | CDU per rack or per row |
| **Rear-door HX** | 18-27 °C | 15-30 | Water | Passive, no server modification |
| **Immersion (1-phase)** | 35-50 °C | 50-100+ | Dielectric oil | Tank, CDU, heat exchanger |
| **Immersion (2-phase)** | 25-35 °C | 100-200+ | Dielectric fluid (boiling) | Tank + condenser |
**CDU (Coolant Distribution Unit)**:
- Controls coolant temperature and pressure to the racks
- Primary loop (facility water) + secondary loop (rack coolant)
- Sizing: 1 CDU per 4-8 racks (40-100 kW per CDU)
- Redundancy: N+1 CDU, dual coolant loops
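The CDU sizing and redundancy rules combine into a simple count. The sketch below is an illustration under assumptions: it takes the stricter of a heat-load limit and a racks-per-CDU limit, then adds one spare for N+1. The 16-rack, 10 kW/rack, 100 kW CDU example values are hypothetical.

```python
import math


def cdus_required(racks: int, kw_per_rack: float, cdu_capacity_kw: float,
                  max_racks_per_cdu: int = 8, spares: int = 1) -> int:
    """Number of CDUs needed, including `spares` redundant units (N+1 by default)."""
    by_heat = math.ceil(racks * kw_per_rack / cdu_capacity_kw)   # thermal limit
    by_racks = math.ceil(racks / max_racks_per_cdu)              # connection limit
    return max(by_heat, by_racks) + spares


# Hypothetical row: 16 liquid-cooled racks at 10 kW each, 100 kW CDUs.
print(cdus_required(racks=16, kw_per_rack=10, cdu_capacity_kw=100))
```

Vendor data sheets specify CDU capacity at a given approach temperature and flow rate, so the capacity figure should come from the actual unit rather than from the range above.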
**Water quality requirements**:
- Conductivity: < 1 µS/cm (demineralized water)
- pH: 6.5-8.0
- Particles: < 50 µm (filtration)
- Corrosion prevention: inhibitors, glycol (10-30 %)
- Biological growth prevention: UV, biocides
### Adiabatic cooling
Evaporation of water is used to cool the air:
- **Direct adiabatic**: air passes through a wetted medium (media pad), where it is cooled and humidified
- **Indirect adiabatic**: air is cooled via a heat exchanger with no direct contact with water
- **Water consumption**: 3-5 L/kWh (direct), 1-2 L/kWh (indirect)
- Effectiveness depends on air humidity; it is higher in dry climates
## Cabling and structured cabling
### TIA-942 cabling hierarchy
```
Entrance Room (ER)
├── Backbone cabling (fiber single-mode / multi-mode)
│ │
│ ├── Main Distribution Area (MDA)
│ │ │
│ │ ├── Horizontal Distribution Area (HDA)
│ │ │ │
│ │ │ └── Equipment Distribution Area (EDA) → rack
│ │ │
│ │ └── Intermediate Distribution Area (IDA), optional
│ │
│ └── Telecommunication Room (TR), for office areas
└── Backbone cabling (fiber / copper)
```
### Copper cabling categories
| Category | Frequency | Speed | Length | Connector | Use case |
|-----------|----------|----------|-------|----------|----------|
| **Cat5e** | 100 MHz | 1 GbE | 100 m | RJ45 | Legacy, voice |
| **Cat6** | 250 MHz | 1 GbE (10 GbE up to 55 m) | 100 m (10 GbE: 55 m) | RJ45 | Common DC, enterprise |
| **Cat6A** | 500 MHz | 10 GbE | 100 m | RJ45 | Standard for new DCs |
| **Cat7** (GG45) | 600 MHz | 10 GbE | 100 m | GG45/TERA | Niche, superseded by Cat6A/8 |
| **Cat8.1** | 2000 MHz | 25/40 GbE | 30 m | RJ45 | Top-of-rack, storage |
| **Cat8.2** | 2000 MHz | 25/40 GbE | 30 m | GG45/TERA | Top-of-rack, storage |
**Cat6A** (10 GbE up to 100 m) is the DC standard for horizontal cabling. Cat8 is used only for short links within a rack or row (up to 30 m).
### Fiber optic types
| Type | Core | Modal BW | Speed | Max length | Use case |
|-----|------|----------|----------|-----------|----------|
| **OS1** (SM) | 9 µm | n/a | 100 GbE - 800 GbE | 10-80 km | Backbone, campus, WAN |
| **OS2** (SM) | 9 µm | n/a | 100 GbE - 800 GbE | 2-80 km (CWDM/DWDM) | Backbone, DWDM |
| **OM1** (MM) | 62.5 µm | 200 MHz·km | 1 GbE | 275 m | Legacy |
| **OM2** (MM) | 50 µm | 500 MHz·km | 10 GbE | 82 m | Legacy |
| **OM3** (MM) | 50 µm | 2000 MHz·km | 10 GbE up to 300 m, 100 GbE up to 100 m | 300 m (10G) | Standard DC, VCSEL |
| **OM4** (MM) | 50 µm | 4700 MHz·km | 100 GbE up to 150 m, 400 GbE up to 100 m | 550 m (10G) | High-performance DC standard |
| **OM5** (MM) | 50 µm | 4700+ MHz·km | 200/400 GbE SWDM | 150 m (100G) | Emerging, SWDM |
For new DCs: **OM4** as the multi-mode standard, **OS2** for the single-mode backbone (LR, DWDM). OM5 is not widely deployed; OM4 with parallel optics (SR4) is more common.
### Connector types
| Connector | Type | Insertion loss | Fiber count | Use case |
|----------|-----|---------------|-------------|----------|
| **LC** | Duplex | < 0.15 dB | 2 | Standard for SFP/SFP+/QSFP |
| **SC** | Duplex | < 0.2 dB | 2 | Older installations, patch panels |
| **MPO/MTP** (12-f) | Multi-fiber | < 0.35 dB | 12 | 40/100/400 GbE parallel optics |
| **MPO/MTP** (24-f) | Multi-fiber | < 0.5 dB | 24 | 100 GbE (SR10), 400 GbE (SR8) |
| **SN** | Duplex (mini) | < 0.15 dB | 2 | High-density (QSFP-DD, OSFP) |
| **CS** | Duplex (mini) | < 0.15 dB | 2 | High-density (QSFP-DD, OSFP) |
### MPO/MTP polarity
| Method | Description | Use case |
|--------|-------|----------|
| **Type A** (Straight) | Fiber 1→1, 2→2, ... | Duplex applications; the polarity flip is made in a patch cord at one end |
| **Type B** (Crossed) | Fiber 1→12, 2→11, ... | Parallel optics (SR4, SR8), the standard choice |
| **Type C** (Pairs crossed) | Pairs 1-2→2-1, 3-4→4-3 | Duplex applications (pair flip in the trunk); not suited to parallel optics |
### Breakout cassettes
```
MPO (12-f) ──> Breakout cassette ──> 6× LC duplex (12 fibers = 6× duplex)
MPO (24-f) ──> Breakout cassette ──> 12× LC duplex (24 fibers = 12× duplex)
```
Use case: connecting an MPO port (switch) to LC ports (servers, storage). Cassettes sit in the patch panel and are passive; they are not active components in the path.
### Copper vs fiber decision
| Criterion | Copper (Cat6A/8) | Fiber (OM4/OS2) |
|-----------|-----------------|-----------------|
| **Reach** | 30-100 m | 100 m - 80 km |
| **Speed** | 1-40 GbE | 1-800 GbE |
| **Transceiver cost** | Lower (RJ45) | Higher (SFP+/QSFP) |
| **Cable cost** | Lower | Higher (patch cord) |
| **Port power draw** | 2-5 W (25 GbE) | 1-3 W (25 GbE SR) |
| **Electromagnetic interference** | Susceptible | Immune |
| **Weight (100 m)** | ~3-4 kg | ~0.5-1 kg |
| **Recommendation** | Up to 30 m, server→ToR switch | Backbone, storage, >30 m |
### Cabling best practices
- **Horizontal cabling**: max 90 m permanent link + 10 m patch cords (TIA-942)
- **Fiber management**: slack spools, cable managers, minimum bend radius 10× cable diameter
- **Color coding**: OS1/OS2 (yellow), OM3 (aqua), OM4 (magenta/purple), OM5 (lime green)
- **Labeling**: both ends, patch panels, faceplates, per ANSI/TIA-606-B
- **Overhead vs underfloor**: overhead (ladder rack) is preferred in DCs (better airflow, easier changes)
- **MPO cassettes**: plan a 15-20 % fiber reserve for future needs
## Physical security
### Multi-layer security model (defense in depth)
```
Layer 1: Perimeter (fence, gate, guards)
Layer 2: Building (walls, locks, CCTV, card readers)
Layer 3: DC hall (biometrics, mantrap, CCTV, motion detection)
Layer 4: Rack / Cage (electronic locks, sensors)
Layer 5: Data (encryption, HSM, access control)
```
### Access control
| Method | Factor | Level | Notes |
|--------|--------|--------|----------|
| **RFID / proximity card** | Something you have | Standard | Basic access, cheap |
| **Smart card (PKI)** | Something you have + PIN | Medium | Certificate on card, anti-passback |
| **Biometric (fingerprint)** | Something you are | High | Fast; contactless readers available for hygiene |
| **Biometric (palm/finger vein)** | Something you are | Very high | Hard to spoof, contactless |
| **Biometric (iris/retina)** | Something you are | Highest | Very accurate, slow, expensive |
| **Multi-factor** | 2+ factors | Highest | Card + biometrics + PIN, used in Tier IV DCs |
### Mantrap design
```
Outer door ──> Mantrap (vestibule) ──> Inner door
├── Weight sensor (anti-tailgating)
├── CCTV (both doors)
├── Intercom (emergency contact)
└── Motion detector (inside the mantrap)
```
- Only one door is open at any time
- Anti-tailgating: a weight sensor detects more than one person
- Exit via a request-to-exit button + motion detection
- Emergency exit: panic bar + alarm
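The interlock and anti-tailgating rules can be expressed as a small state model. This is a toy sketch, not a controller implementation: the class name, the door labels, and the 150 kg single-person threshold are all assumptions made for illustration.

```python
class MantrapInterlock:
    """Toy model of a two-door mantrap: at most one door may be open at a time."""

    def __init__(self, max_single_person_kg: float = 150.0):
        # Hypothetical threshold; a real system would calibrate this per site.
        self.max_single_person_kg = max_single_person_kg
        self.open_door = None  # None, "outer" or "inner"

    def request_open(self, door: str, weight_kg: float = 0.0) -> bool:
        """Return True and open the door only if all interlock rules allow it."""
        if door not in ("outer", "inner"):
            raise ValueError(f"unknown door: {door}")
        if self.open_door is not None:
            return False  # interlock: a door is already open
        if door == "inner" and weight_kg > self.max_single_person_kg:
            return False  # anti-tailgating: weight suggests more than one person
        self.open_door = door
        return True

    def close(self) -> None:
        """Close whichever door is currently open."""
        self.open_door = None


trap = MantrapInterlock()
print(trap.request_open("outer"))        # outer door opens
print(trap.request_open("inner", 80))    # refused while the outer door is open
trap.close()
print(trap.request_open("inner", 80))    # allowed once the outer door is closed
```

Emergency egress (panic bar) must bypass this logic entirely in a real installation; it is deliberately left out of the model.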
### CCTV
| Item | Recommendation |
|-------|-----------|
| **Resolution** | Min. 1080p, ideally 4K (6 MP+) |
| **FPS** | 15-30 FPS (recording), 30+ FPS (real-time monitoring) |
| **Retention** | Min. 30 days (90 days for audit) |
| **Storage** | NVR (on-prem), cloud (AWS KVS, Azure Video Indexer) |
| **AI analytics** | Face detection, ANPR (license plates), object detection |
| **Field of view** | Every door, every aisle, no blind spots |
### Asset tracking
| Technology | Accuracy | Cost | Use case |
|-----------|----------|------|----------|
| **Barcode** | Rack-level | Very low | Manual inventory |
| **RFID (passive)** | Rack-level (door sweep) | Low | Automatic detection when a rack is opened |
| **RFID (active, UWB)** | 10-30 cm | Medium | Real-time location tracking |
| **Bluetooth BLE** | 1-3 m | Low | Approximate position |
| **GPS** | 1-10 m | Medium | Outdoor tracking |
## DC layout and design
### Raised floor vs Slab
| Property | Raised floor | Slab (solid floor) |
|-----------|-------------|----------------------|
| **Airflow** | Underfloor air distribution (raised floor as plenum) | Overhead air, in-row cooling |
| **Flexibility** | Easy to add perforated tiles | Limited (requires overhead cooling) |
| **Load capacity** | Limit 500-1000 kg/m² (depends on height) | Limited only by the structural slab |
| **Cost** | Higher (~$200-400/m²) | Lower (~$100-200/m²) |
| **Height** | 600-900 mm (standard), 900-1200 mm (high-density) | n/a |
| **Trend** | Declining (shift to in-row/overhead cooling) | Growing (new DCs, high-density) |
Modern high-density DCs (AI/ML, GPU) are moving away from raised floors toward slab + overhead/in-row cooling, because racks are heavier (1000-2000 kg) and the floor plenum cannot deliver enough airflow.
### Rack layout and dimensions
| Parameter | Standard | High-density | Notes |
|----------|----------|-------------|----------|
| **Rack width** | 600 mm (19") | 600-750 mm | 750 mm for GPU (cabling, cooling) |
| **Rack depth** | 1000-1200 mm | 1200-1500 mm | GPU servers, longer cables |
| **Rack height** | 42U | 48U / 52U | Taller rack = more equipment per footprint |
| **Aisle width (cold)** | 1200-1500 mm | 1500-1800 mm | Service access, airflow |
| **Aisle width (hot)** | 900-1200 mm | 1200-1500 mm | Narrower than the cold aisle |
| **Max rack load** | 500-800 kg | 1000-2000 kg | Floor reinforcement required |
### Space planning
```
For a Tier III DC (example):
IT space: 1000 m²
└── 20 rows × 10 racks = 200 racks at 42U
└── 200 racks × 5 kW avg = 1 MW IT load
└── PUE 1.4 → 1.4 MW facility
Support spaces:
└── UPS + batteries: 200 m²
└── Generators: 100 m² (outdoor)
└── Cooling (chillers, cooling tower): 300 m²
└── Offices, storage, loading dock: 400 m²
Total: ~2000 m² (50% IT, 50% support)
```
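The planning example above is straightforward arithmetic and can be parameterized for other hall sizes. A minimal sketch; the function name and dictionary keys are illustrative, and the input values are the ones from the example.

```python
def plan_hall(rows: int, racks_per_row: int, kw_per_rack: float, pue: float) -> dict:
    """Derive rack count, IT load and facility load from the hall layout."""
    racks = rows * racks_per_row
    it_kw = racks * kw_per_rack
    return {"racks": racks, "it_kw": it_kw, "facility_kw": it_kw * pue}


hall = plan_hall(rows=20, racks_per_row=10, kw_per_rack=5, pue=1.4)

# Floor areas in m², taken from the example.
it_area = 1000
support_area = 200 + 100 + 300 + 400   # UPS, generators, cooling, offices
total_area = it_area + support_area

print(hall)
print(f"Total area: {total_area} m², IT share: {it_area / total_area:.0%}")
```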
### Zone approach (TIA-942)
| Zone | Description | Access | Security |
|------|-------|---------|----------|
| **Z1** (Public) | Reception, offices | Open | Minimal |
| **Z2** (Office) | Administration, NOC | Employees + guests | RFID |
| **Z3** (DC support) | UPS, generators, cooling | DC operators | RFID + biometrics |
| **Z4** (DC hall) | Servers, storage, networking | DC operators, with approval | RFID + biometrics + mantrap |
| **Z5** (Rack/cage) | A specific rack or cage | Authorized personnel only | Electronic lock |
## Fire suppression
### Detection
| System | Type | Detection time | False alarms | Use case |
|--------|-----|-------------|------------------|----------|
| **VESDA** (Very Early Smoke Detection) | Aspirating, laser sensor | < 30 s (4 alarm levels) | Very low | Standard for DC |
| **Spot detection** | Ionization / optical smoke detector | 2-5 min | Medium | Legacy, smaller DC |
| **Heat detection** | Heat detector (fixed temperature / rate of rise) | 5-10 min | Very low | Backup to VESDA |
| **Line-type (LHD)** | Linear heat cable | 2-5 min | Low | Cable trays, above ceiling |
VESDA is the standard: active aspiration draws air from the DC, and a laser sensor detects smoke particles at four levels (Alert → Action → Fire 1 → Fire 2). This allows intervention before smoke is visible.
### Suppression systems
| System | Medium | Advantages | Disadvantages | DC type |
|--------|--------|--------|----------|--------|
| **Novec 1230** (FK-5-1-12) | Gas | Safe for people, zero ODP, short atmospheric lifetime (5 days) | Higher cost | Enterprise DC |
| **FM-200** (HFC-227ea) | Gas | Fast (10 s), effective | High GWP (3220); zero ODP | Legacy DC |
| **Inergen** (IG-541) | Inert gas (52% N₂, 40% Ar, 8% CO₂) | Safe for people, naturally occurring gases | Large storage volume, high pressure | Enterprise DC |
| **Argonite** (IG-55) | 50% Ar, 50% N₂ | Safe, naturally occurring gases | Large storage volume, higher pressure | Enterprise DC |
| **Water mist** | Water (fine mist) | Cooling, smoke suppression, low cost | Water in the DC (risk), local application only | Retrofits |
| **Pre-action sprinkler** | Water | Double interlock (detection + sprinkler head activation) | Water risk, drainage required | Tier I-II |
**Concentration**: Novec (4-6 % by volume), FM-200 (7-9 %), Inergen (35-50 %). At design concentrations Novec and Inergen are safe to breathe (allowing at least 5-7 min for evacuation).
### Detection zones
```
DC hall ──> zones of ~200 m² (max)
├── VESDA (each zone has its own aspirator)
├── Smoke detectors (ceiling void + underfloor)
└── Heat detection (backup)
```
## DCIM (Data Center Infrastructure Management)
### What DCIM covers
| Area | Metrics | Output |
|--------|---------|--------|
| **Power** | Per PDU, per outlet, per rack, total | Capacity planning, PUE, kW/rack |
| **Cooling** | Temperature, humidity, airflow (sensors per rack) | Hot spot maps, airflow optimization |
| **Asset** | What is in which rack, U position, serial, warranty | Asset inventory, lease management |
| **Network** | Port utilization, patch panel connections | Patch management, port tracking |
| **Space** | Free U per rack, free racks | Capacity planning, "what-if" simulation |
### Tools
| Tool | Type | Platform | Cost | Notes |
|---------|-----|-----------|------|----------|
| **Nlyte (Carrier)** | Enterprise DCIM | On-prem / Cloud | $$$ | Market leader, complex |
| **Sunbird DCIM** | Enterprise DCIM | Cloud | $$$ | Power monitoring, asset tracking |
| **Device42** | DCIM + IPAM | On-prem / Cloud | $$ | Integrated IPAM, CMDB |
| **NetBox** | Open source DCIM | On-prem | Free | IPAM, DCIM, asset tracking |
| **OpenDCIM** | Open source | On-prem | Free | Basic DCIM, asset management |
| **RackTables** | Open source | On-prem | Free | Simple, asset + networking |
| **Vendor-specific** | Dell OME, HPE OneView | On-prem | Included with HW | Single vendor only |
## Site selection
### Site selection criteria
| Category | Criterion | Weight |
|-----------|-----------|------|
| **Power** | Electricity availability (grid capacity), price/kWh, option of two independent feeds | High |
| **Connectivity** | Fiber backbone availability, number of connectivity providers, latency to major POPs | High |
| **Natural hazards** | Earthquakes, floods, hurricanes, tornadoes, wildfires: historical data + forecasts | High |
| **Climate** | Average temperature, humidity (free cooling potential) | Medium |
| **Workforce** | Availability of technicians, DC operators, network/admin engineers | Medium |
| **Taxes and regulation** | Tax incentives, environmental regulations, building permits | Medium |
| **Security** | Crime, political stability, terrorism risk | High |
| **Transport access** | Proximity to airport, highway (for HW deliveries, staff) | Low |
### Natural hazards mapping
| Hazard | Regions | Mitigation |
|--------|---------|----------|
| **Earthquakes** | Pacific Ring of Fire (CA, Japan, Chile) | Base isolation, seismic bracing, flexible connections |
| **Hurricanes** | Caribbean, US Southeast, Southeast Asia | Reinforced structure, generators above flood level |
| **Floods** | River valleys, coastal areas | Siting outside the flood zone, barriers |
| **Wildfires** | California, Australia, Mediterranean | Defensible space, air filtration, monitoring |
### Power availability by region
| Region | Grid reliability | Price/kWh (industrial) | Notes |
|--------|-----------------|------------------------|----------|
| **Northern Europe** (SE, NO, FI) | High (99.99 %) | $0.04-0.08 | Cheap green energy, cold climate |
| **Central Europe** (DE, NL, CZ) | High (99.99 %) | $0.10-0.20 | Stable, renewables growing |
| **Eastern US** (VA, NC) | High | $0.05-0.08 | Largest DC hub (Ashburn, VA) |
| **Western US** (CA, OR) | Medium (PG&E issues) | $0.10-0.15 | CAISO grid, blackout risk |
| **Singapore** | High | $0.15-0.20 | Moratorium on new DCs (2019-2022), capacity still restricted; water constraints |
| **Dubai / UAE** | High | $0.06-0.10 | Cheap energy, high temperatures (cooling) |
## Compliance and certifications
| Standard / Certification | Area | Description |
|----------------------|--------|-------|
| **TIA-942** (Rated 1-4) | DC design | Classification of redundancy, cabling, and security (analogous to Uptime Tier) |
| **Uptime Institute** (Tier I-IV) | DC design | Certification of design documents and operations |
| **ISO 27001** | ISMS | Information security, risk management |
| **ISO 27701** | Privacy | Extension of ISO 27001 for privacy management (supports GDPR compliance) |
| **SOC 2** (Type I/II) | Service org | Trust criteria: Security, Availability, Processing Integrity, Confidentiality, Privacy |
| **PCI DSS** | Payment cards | Physical security, access to cardholder data |
| **HIPAA** | Healthcare | US, protection of health data |
| **FedRAMP** | US government | Cloud service authorization, DC security |
| **GDPR** | EU | Personal data protection, data residency |
| **NIST SP 800-53** | DC security | Security control catalog pro US federal |
| **ISO 14001** | EMS | Environmental management, sustainability |
## Sustainability
### DC carbon footprint
```
Total emissions = Scope 1 (direct) + Scope 2 (energy) + Scope 3 (supply chain)
Scope 1: Generators (diesel), refrigerant leaks
Scope 2: Purchased electricity (grid mix)
Scope 3: HW manufacturing, transport, EOL recycling (~60-80 % of total emissions)
```
### Emission reduction
| Measure | PUE improvement | Emission reduction | Payback |
|----------|-------------|---------------|------------|
| **Temperature increase** (22→27 °C) | 0.1-0.2 | 10-20 % of cooling | Immediate |
| **Free cooling** | 0.1-0.3 | 20-40 % of cooling | 1-2 years |
| **Liquid cooling** | 0.2-0.4 | 30-50 % of cooling | 2-4 years |
| **LED lighting + sensors** | 0.01-0.02 | < 1 % | 1 year |
| **PPA (Power Purchase Agreement)** | n/a | 100 % Scope 2 | Variable |
| **Renewables** (rooftop solar) | n/a | 5-15 % of consumption | 5-10 years |
| **Green generator** (HVO renewable diesel) | n/a | Up to 90 % CO₂ reduction | +30 % fuel cost |
### Sustainability certifications
| Certification | Description |
|-----------|-------|
| **LEED** (BD+C: DC) | U.S. Green Building Council, design and construction |
| **BREEAM** | UK, European sustainability assessment |
| **Climate Neutral Data Centre Pact** (EU) | Self-regulatory initiative; PUE targets of 1.3 (cool climates) / 1.4 (warm climates), climate neutrality by 2030 |
| **ISO 50001** | Energy management system |
| **Energy Star** | EPA, energy efficiency (US only) |
## Decision diagram: DC topology design
```mermaid
flowchart TD
Start(["DC design"]) --> TIER{"Required Tier?"}
TIER -->|"Tier I / II"| T1["N / N+1 redundancy<br/>Simple power, single path<br/>CRAC/CRAH, free cooling<br/>PUE 1.4-1.6, cost 1×"]
TIER -->|"Tier III"| T3["N+1, concurrently maintainable<br/>Dual path (A/B feed)<br/>Hot aisle containment<br/>PUE 1.2-1.4, cost 2×"]
TIER -->|"Tier IV"| T4["2N+1, fault tolerant<br/>Dual redundant + STS<br/>Hot + cold containment<br/>PUE 1.1-1.3, cost 3×"]
Start --> POWER{"Power chain"}
POWER -->|"UPS"| UPS{"UPS type"}
UPS -->|"Enterprise DC"| UPS1["VFI double-conversion<br/>Li-ion (LFP), 10-15 years<br/>N+1 or 2N modular"]
UPS -->|"Edge / office"| UPS2["VI line-interactive<br/>VRLA, 3-5 years"]
POWER -->|"Generator"| GEN["Diesel 500-2500 kVA<br/>Tank for 24-72 h<br/>ATS 4-10 ms transfer"]
POWER -->|"PDU"| PDU["3-phase 400 V<br/>Monitored/Switched<br/>A/B feed to racks"]
Start --> DENS{"Power density"}
DENS -->|"< 10 kW/rack"| COOL1["Air cooling<br/>CRAC/CRAH, raised floor<br/>Hot aisle containment<br/>ASHRAE A1-A2"]
DENS -->|"10-25 kW/rack"| COOL2["Hybrid<br/>In-row cooling<br/>Rear door HX<br/>ASHRAE A1-H1"]
DENS -->|"> 25 kW/rack"| COOL3["Liquid cooling<br/>CDU, direct-to-chip<br/>Immersion single/two-phase<br/>ASHRAE W classes"]
Start --> CLIM{"Climate zone"}
CLIM -->|"Temperate (CZ, DE)"| FC1["Free cooling 4000-6000 h/year<br/>Chiller + economizer<br/>PUE saving 0.2-0.3"]
CLIM -->|"Warm (ES, US South)"| FC2["Chiller year-round<br/>Adiabatic cooling<br/>PUE 1.3-1.6"]
CLIM -->|"Cold (SE, NO)"| FC3["Free cooling 7000+ h/year<br/>Air-side economizer<br/>PUE < 1.2"]
```
## Disk monitoring: S.M.A.R.T.
Self-Monitoring, Analysis and Reporting Technology: predictive monitoring of HDDs and SSDs.
| Key attribute | ID | Description |
|----------------|----|-------|
| Reallocated Sectors Count | 5 | Number of remapped sectors (a rising count signals a failing disk) |
| Power-On Hours | 9 | Total operating time in hours |
| Reported Uncorrectable Errors | 187 | Uncorrectable errors (red flag) |
| CRC Error Count | 199 | Errors on the SATA link (cable/controller) |
| SSD Life Left | 231 | % of remaining SSD lifetime |
| Media Wearout Indicator | 233 | Total data written to NAND |
Tools: `smartmontools` (smartctl, smartd), Prometheus exporters (`smartctl_exporter`, or `node_exporter` with its textfile collector), OTel collector.
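A minimal sketch of how the attributes above could be checked from a script. It assumes the report comes from `smartctl -A -j <device>` and that the JSON has the `ata_smart_attributes.table[].raw.value` layout used by smartmontools 7.x; the thresholds and the function name are illustrative, not recommendations.

```python
# Attribute IDs from the table above -> highest raw value still treated as healthy.
# The zero thresholds are an assumption for illustration only.
WATCHED = {5: 0, 187: 0, 199: 0}


def failing_attributes(report: dict) -> list[str]:
    """Return a message for every watched attribute whose raw value exceeds its limit.

    `report` is the parsed JSON output of `smartctl -A -j <device>`.
    """
    table = report.get("ata_smart_attributes", {}).get("table", [])
    alerts = []
    for attr in table:
        limit = WATCHED.get(attr["id"])
        raw = attr["raw"]["value"]
        if limit is not None and raw > limit:
            alerts.append(f"{attr['name']} (id {attr['id']}): raw={raw}")
    return alerts
```

In practice the report would be obtained with `json.loads(subprocess.run(["smartctl", "-A", "-j", dev], capture_output=True, text=True).stdout)`; NVMe devices report health in a different JSON section, which this sketch does not cover.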
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| The Data Center as a Computer (4th ed., 2025) | Barroso, Hölzle, Ranganathan | 978-3-031-99488-3 | Comprehensive account of how warehouse-scale computer (WSC) design evolved, written by Google architects. Covers hardware, software, power, cooling, networking and 25 years of WSC experience. A key publication on data center architecture. |
| Electronics Cooling: From the Chip to the Datacenter (Vol. 62) | Abraham et al. | 978-0-443-47084-4 | Practical guide to thermal management from the transistor level up to the data center. Covers conduction, convection, liquid immersion and phase change cooling. An essential resource for DC cooling design. |
## Data center backbone services
When building a new DC, the basic infrastructure services must be deployed first; the higher layers cannot run without them:
### DNS
| Role | Service | Description |
|------|--------|-------|
| **Authoritative** | BIND, PowerDNS, NSD | Primary DNS zone for internal domains |
| **Recursive** | Unbound, BIND (caching), CoreDNS | Resolver for internal + external queries |
| **Anycast** | DNS anycast (BGP) | Redundancy, lower latency |
| **Integration** | Infoblox, BlueCat, dnsmasq | IPAM + DNS + DHCP in one |
Best practices: separate authoritative and recursive resolvers, DNSSEC, split-horizon (internal vs external view), TSIG for zone transfers, monitoring (DNS query latency, NXDOMAIN rate).
### NTP (time synchronization)
- **Primary**: GPS-disciplined NTP servers (Microchip S600, Meinberg)
- **Secondary**: Stratum 1/2 NTP (ntpd, chrony, NTPsec)
- **All nodes**: chrony (modern replacement for ntpd), local NTP server on every rack switch (boundary clock)
- **Precision**: PTP (IEEE 1588) for telco/fintech, sub-microsecond accuracy
- **DC topology**: GPS antenna → Grandmaster (PTP) → Boundary clock (rack switch) → Ordinary clock (server)
### DHCP + IPAM
| Tool | Description |
|---------|-------|
| **ISC DHCP** | Legacy, still widely deployed |
| **Kea** | Modern replacement for ISC DHCP (ISC + Linux Foundation) |
| **Infoblox / BlueCat** | Enterprise IPAM + DHCP + DNS |
| **NetBox / phpIPAM** | Open-source IPAM |
### LDAP / Identity Management
| Tool | Description |
|---------|-------|
| **FreeIPA** | Integrated IDM (LDAP + Kerberos + DNS + CA) for Linux |
| **Active Directory** | Microsoft, LDAP + Kerberos + Group Policy |
| **389 Directory Server** | Open-source LDAP (Red Hat) |
| **OpenLDAP** | Classic open-source LDAP |
| **Keycloak / Authentik** | Modern OIDC/SAML/LDAP gateways |
### PKI and certificates
- **Enterprise CA**: EJBCA, Smallstep, HashiCorp Vault (PKI engine)
- **ACME**: Cert-Manager (Kubernetes), certbot (Let's Encrypt)
- **mTLS**: Vault PKI, spire (SPIFFE), Cilium
- **Best practices**: offline root CA, one intermediate CA per environment, short-lived certificates (max 90 days), revocation (CRL/OCSP)
### Monitoring and observability
See [MONITORING.md](MONITORING.md). Before the first workloads start, the DC must have:
- Metrics collection (Prometheus, Zabbix)
- Centralized logs (Loki, ELK)
- Alerting (Alertmanager, PagerDuty)
- Uptime monitoring (heartbeat checks)
### Deployment logistics: order of steps
```
1. DNS (at least recursive + local resolver)
2. NTP (time synchronization)
3. DHCP + IPAM (first servers get an IP)
4. LDAP / IAM (users, groups, access rights)
5. PKI (certificates for encryption)
6. Configuration management (Ansible, Puppet)
7. Monitoring + logging (see what is happening)
8. Container registry / Package repo (docker registry, apt/yum mirror)
9. Load balancer (for services)
10. Storage backend (Ceph, NFS, SAN)
11. Orchestration (Kubernetes, OpenStack)
```
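The ordering above is essentially a dependency graph. A minimal sketch, assuming a hypothetical dependency map (the edges below are illustrative and not taken from this knowledge base), shows how a valid rollout order could be derived and checked with the standard-library `graphlib` module (Python 3.9+):

```python
from graphlib import TopologicalSorter

# service -> services that must already be running (illustrative assumptions)
DEPENDS_ON = {
    "dns": set(),
    "ntp": {"dns"},
    "dhcp_ipam": {"dns", "ntp"},
    "ldap_iam": {"dns", "ntp"},
    "pki": {"ldap_iam"},
    "config_mgmt": {"pki", "dhcp_ipam"},
    "monitoring": {"config_mgmt"},
    "registry": {"config_mgmt"},
    "load_balancer": {"config_mgmt"},
    "storage": {"config_mgmt", "monitoring"},
    "orchestration": {"storage", "registry", "load_balancer"},
}


def deployment_order(deps: dict[str, set[str]]) -> list[str]:
    """Return services in an order where every dependency comes first.

    Raises graphlib.CycleError if the map contains a circular dependency.
    """
    return list(TopologicalSorter(deps).static_order())
```

Several orders can satisfy the same graph, so the result may differ from the numbered list while still being valid; the useful property is that a circular dependency is detected before rollout starts.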
## OpenStack in the data center
OpenStack adds a software abstraction layer to the DC that enables multi-tenancy and self-service:
### Control plane architecture
- **Controller nodes**: management services (Keystone, Nova API, Neutron API, Horizon, RabbitMQ, DB)
- **Compute nodes**: hypervisor (KVM), Nova Compute, Neutron agent
- **Storage nodes**: Ceph OSD, Cinder volumes, Swift object storage
- **Network nodes**: Neutron L3 router, DHCP agent, DVR
### DC infrastructure requirements
| Component | Requirement |
|------------|-----------|
| **Controller** | 3-5 node HA cluster, 16+ vCPU, 32+ GB RAM, SSD |
| **Compute** | High compute density per rack (GPU, high-core), NUMA-aware design |
| **Storage (Ceph)** | 10-25 GbE networking, NVMe/SSD OSD, 3+ replicas |
| **Network** | 25/100 GbE spine-leaf, L3 BGP underlay, VXLAN overlay |
| **Rack power** | 10-30 kW/rack for GPU compute |
### Use cases
- Private cloud for enterprise (multi-tenant, self-service Horizon)
- NFVI for telco (DPDK, SR-IOV, low-latency)
- Academic / HPC clusters (Ironic, Cyborg, Manila)
- Government / regulated environments (on-prem, audit trail)
*Last revision: 2026-06-03*

149
GPU.en.md Normal file
View File

@@ -0,0 +1,149 @@
# 🎮 GPU — architecture, models, virtualization
## GPU models
### NVIDIA
| GPU | Architecture | VRAM | HBM | FP16 (TFLOPS) | FP8 (TFLOPS) | Interconnect | TDP |
|-----|-------------|------|-----|--------------|-------------|-------------|-----|
| **A100** | Ampere (2020) | 40/80 GB | HBM2e | 312 | — | NVLink 3 (600 GB/s) | 400 W |
| **H100** | Hopper (2022) | 80 GB | HBM3 | 1000 | 2000 (dense) | NVLink 4 (900 GB/s) | 700 W |
| **H200** | Hopper (2023) | 141 GB | HBM3e | 1650 | ~3300 | NVLink 4 (900 GB/s) | 700 W |
| **B200** | Blackwell (2024) | 192 GB | HBM3e | 2250 | ~4500 | NVLink 5 (1800 GB/s) | 700 W |
| **B100** | Blackwell (2024) | 192 GB | HBM3e | ~1800 | ~3600 | NVLink 5 | 700 W |
| **GB200** | Blackwell (2024) | — | HBM3e | 4500 (dual) | 9000 (dual) | NVLink 5 | 2700 W |
### AMD
| GPU | Architecture | VRAM | HBM | FP16 (TFLOPS) | Interconnect | TDP |
|-----|-------------|------|-----|--------------|-------------|-----|
| **MI250X** | CDNA 2 (2021) | 128 GB | HBM2e | 383 | Infinity Fabric | 500 W |
| **MI300X** | CDNA 3 (2023) | 192 GB | HBM3 | ~2600 | Infinity Fabric (896 GB/s) | 750 W |
| **MI350** | CDNA 4 (2025) | 288 GB | HBM3e | ~3500 | Infinity Fabric | 750 W |
## GPU interconnects
| Technology | Provider | Bandwidth | Topology | Use case |
|------------|-------------|-----------|-----------|----------|
| **NVLink 4** | NVIDIA | 900 GB/s (18× 50 GB/s) | GPU-GPU direct | AI training (H100, H200) |
| **NVLink 5** | NVIDIA | 1800 GB/s (18× 100 GB/s) | GPU-GPU direct | AI training (B200, GB200) |
| **Infinity Fabric** | AMD | 896 GB/s | GPU-GPU + CPU-GPU | AI training (MI300X, MI350) |
| **NVSwitch** | NVIDIA | 900 GB/s per GPU (NVLink) | Full-mesh (256 GPU) | DGX SuperPOD, HGX |
| **InfiniBand (NDR)** | NVIDIA/Mellanox | 400 Gbps per port | GPU-NIC direct, RDMA | Distributed training, HPC |
| **PCIe 5.0** | Standard | 63 GB/s per x16 | CPU-GPU | Inference, rendering |
| **Ethernet (RoCE v2)** | Standard | 100/200/400 GbE | GPU-NIC, RDMA over converged ethernet | AI inference, storage |
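To put the bandwidth column in perspective, the following sketch estimates the ideal time to move a payload over each link. It uses the peak figures from the table above and ignores protocol overhead, latency and topology, so the results are lower bounds; the dictionary keys and function name are illustrative.

```python
# Peak bandwidth in GB/s, taken from the table above.
# InfiniBand NDR is listed as 400 Gbps per port, i.e. 50 GB/s.
LINK_GB_PER_S = {
    "nvlink4": 900.0,
    "nvlink5": 1800.0,
    "pcie5_x16": 63.0,
    "ib_ndr": 400.0 / 8,
}


def transfer_seconds(size_gb: float, link: str) -> float:
    """Ideal transfer time in seconds for `size_gb` gigabytes over `link`."""
    return size_gb / LINK_GB_PER_S[link]
```

Copying 80 GB of model state takes under 0.1 s at NVLink 4 peak rate but more than a second over a PCIe 5.0 x16 link, which is why tensor-parallel traffic is normally kept inside the NVLink domain.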
### GPU direct communication
```
GPU 0 ──NVLink── GPU 1          GPU 0 ───PCIe─── CPU ───PCIe─── GPU 1
          │                                       │
          │                                       │
       NVSwitch                              InfiniBand
          │                                       │
          │                                       │
GPU 2 ──NVLink── GPU 3          GPU 2 ───PCIe─── CPU ───PCIe─── GPU 3
NVLink topology (GPU direct)    PCIe topology (CPU mediated)
```
- **GPU Direct RDMA** — GPU ↔ NIC without CPU (InfiniBand, RoCE)
- **GPU Direct Storage** — GPU ↔ NVMe without CPU (NVIDIA Magnum IO)
- **NVSwitch** — full bisection bandwidth between all GPUs in a node
## GPU virtualization
| Technology | Description | GPU support | Use case |
|------------|-------|-------------|----------|
| **NVIDIA vGPU (Grid)** | Time slicing + dedicated profiles | A-series (virtual apps), B-series (VDI), Q-series (pro viz), C-series (AI/compute) | VDI, virtualized AI |
| **NVIDIA MIG** | Hardware GPU partitioning | A100 (7 inst.), H100/H200/B200 | AI inference, multi-tenant GPU |
| **AMD MxGPU** | SR-IOV, hardware partitioning | AMD MI (pro), Radeon Pro | VDI, cloud gaming |
| **Intel SG (SG1)** | SR-IOV, hardware partitioning | Intel SG1, Flex, Arc | VDI, media transcoding |
| **GPU passthrough** | Dedicated GPU to whole VM (VFIO-pci) | All GPUs | AI training, HPC, highest performance |
### MIG partition table (A100 / H100)
| GPU | Partition profile | GPU Memory | Compute units |
|-----|------------------|-----------|--------------|
| **A100 40 GB** | 1g.5gb | 5 GB | 1 |
| A100 40 GB | 2g.10gb | 10 GB | 2 |
| A100 40 GB | 3g.20gb | 20 GB | 3 |
| A100 40 GB | 7g.40gb | 40 GB | 7 |
| A100 40 GB | Full (7× 1g) | 7 × 5 GB | 7 instances |
| **H100 80 GB** | 1g.10gb | 10 GB | 1 |
| H100 80 GB | 2g.20gb | 20 GB | 2 |
| H100 80 GB | 3g.40gb | 40 GB | 3 |
| H100 80 GB | 7g.80gb | 80 GB | 7 |
## GPU use cases
### AI Training
- **Models**: LLM (70B-405B+), vision, multimodal
- **GPU**: H100, B200, GB200, MI300X
- **Interconnect**: NVLink 5 / Infinity Fabric (within node), InfiniBand NDR (between nodes)
- **Parallelism**: Data Parallel (DDP), Tensor Parallel (TP), Pipeline Parallel (PP), Fully Sharded (FSDP)
- **Framework**: PyTorch (NCCL), JAX (XLA), DeepSpeed, Megatron-LM
- **Tips**:
- GB200: 2× B200 connected via NVLink, 8 GPU → 4 GB200
- DGX B200 / HGX B200: standard building block
- InfiniBand: fat tree topology for all-reduce optimization
### AI Inference
- **Models**: LLM serving, embedding, image gen
- **GPU**: A100, H200, B200 (larger VRAM for larger models)
- **Techniques**: MIG partition, TensorRT-LLM, vLLM, Triton Inference Server
- **Quantization**: FP8, INT8, INT4 → lower VRAM, higher throughput
- **Latency**: batch size optimization, dynamic batching, continuous batching
- **Scale**: on-prem (2-32 GPU) / cloud (elastic)
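The effect of quantization on VRAM can be estimated with simple arithmetic: weight memory is roughly parameter count times bytes per parameter. The sketch below is a rough sizing aid under stated assumptions; the 1.2 overhead factor for KV cache and activations is a placeholder, not a measured value, and the function names are illustrative.

```python
import math

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billion: float, dtype: str) -> float:
    """Memory for model weights only, in GB (10^9 bytes)."""
    return params_billion * BYTES_PER_PARAM[dtype]


def min_gpus(params_billion: float, dtype: str, vram_gb: float,
             overhead: float = 1.2) -> int:
    """Smallest GPU count whose combined VRAM holds weights plus overhead.

    `overhead` is an assumed multiplier for KV cache and activations.
    """
    needed_gb = weight_memory_gb(params_billion, dtype) * overhead
    return math.ceil(needed_gb / vram_gb)
```

Under these assumptions a 70B model needs about 140 GB for FP16 weights, so it spans two 141 GB H200 cards, while the INT4 version (about 35 GB) fits on one.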
### VDI (Virtual Desktop Infrastructure)
- **GPU**: NVIDIA A16 (1 GPU = 16 users), A10 (1 GPU = 4 users)
- **Technology**: vGPU (Grid), AMD MxGPU
- **Protocols**: VMware Blast, Citrix HDX, Microsoft RDP, PC-over-IP (HP Teradici)
- **Use case**: CAD (CATIA, SolidWorks), Office, engineering, healthcare (PACS)
### Rendering and VFX
- **GPU**: NVIDIA RTX 6000 Ada, RTX A6000, AMD Radeon Pro W7900
- **Rendering**: Blender (Cycles/OptiX), V-Ray, Octane Render, Redshift
- **Denoising**: AI-accelerated denoising on GPU
- **Farm rendering**: Deadline, Qube! (job scheduler)
## GPU server form factors
| Form factor | GPU count | Power | Cooling | Example |
|------------|-----------|-------|---------|---------|
| **1U** | 1-2 | 700-1400 W | Air (high-RPM) | Dell XR4510c |
| **2U** | 4-8 | 3-6 kW | Air / Liquid | Dell R760xa, HPE DL380a |
| **4U** | 8-10 | 5-8 kW | Liquid | NVIDIA DGX H100, Dell R760xa |
| **8U / Chassis** | 8-16 | 10-20 kW | Liquid (CDU) | NVIDIA HGX, Supermicro SYS-821GE |
## OpenStack Cyborg (GPU lifecycle management)
Cyborg is an OpenStack service for managing accelerators (GPU, FPGA, DPU, NPU).
### Key capabilities
- **Discovery** — automatic GPU detection on compute nodes (NVIDIA, AMD, Intel)
- **Inventory** — tracking available accelerators in the cluster
- **Lifecycle** — attach/detach GPU to VM, firmware update, reset
- **Scheduling** — Placement API for GPU-aware scheduling (Nova)
- **Cyborg API** — REST API for accelerator management
### Integration
| Component | Role |
|------------|------|
| **Nova** | VM scheduling with GPU requirements (extra_specs: `accel:device_profile`) |
| **Placement** | Resource provider for GPU (inventory, traits) |
| **Neutron** | SR-IOV VF passthrough for GPU networking |
| **Ironic** | Bare metal + GPU provisioning |
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-03*

149
GPU.md Normal file
View File

@@ -0,0 +1,149 @@
# 🎮 GPU — architektura, modely, virtualizace
## GPU modely
### NVIDIA
| GPU | Architektura | VRAM | HBM | FP16 (TFLOPS) | FP8 (TFLOPS) | Interconnect | TDP |
|-----|-------------|------|-----|--------------|-------------|-------------|-----|
| **A100** | Ampere (2020) | 40/80 GB | HBM2e | 312 | — | NVLink 3 (600 GB/s) | 400 W |
| **H100** | Hopper (2022) | 80 GB | HBM3 | 1000 | 2000 (dense) | NVLink 4 (900 GB/s) | 700 W |
| **H200** | Hopper (2023) | 141 GB | HBM3e | 1650 | ~3300 | NVLink 4 (900 GB/s) | 700 W |
| **B200** | Blackwell (2024) | 192 GB | HBM3e | 2250 | ~4500 | NVLink 5 (1800 GB/s) | 700 W |
| **B100** | Blackwell (2024) | 192 GB | HBM3e | ~1800 | ~3600 | NVLink 5 | 700 W |
| **GB200** | Blackwell (2024) | — | HBM3e | 4500 (dual) | 9000 (dual) | NVLink 5 | 2700 W |
### AMD
| GPU | Architektura | VRAM | HBM | FP16 (TFLOPS) | Interconnect | TDP |
|-----|-------------|------|-----|--------------|-------------|-----|
| **MI250X** | CDNA 2 (2021) | 128 GB | HBM2e | 383 | Infinity Fabric | 500 W |
| **MI300X** | CDNA 3 (2023) | 192 GB | HBM3 | ~2600 | Infinity Fabric (896 GB/s) | 750 W |
| **MI350** | CDNA 4 (2025) | 288 GB | HBM3e | ~3500 | Infinity Fabric | 750 W |
## GPU interconnects
| Technologie | Poskytovatel | Bandwidth | Topologie | Use case |
|------------|-------------|-----------|-----------|----------|
| **NVLink 4** | NVIDIA | 900 GB/s (18× 50 GB/s) | GPU-GPU direct | AI training (H100, H200) |
| **NVLink 5** | NVIDIA | 1800 GB/s (18× 100 GB/s) | GPU-GPU direct | AI training (B200, GB200) |
| **Infinity Fabric** | AMD | 896 GB/s | GPU-GPU + CPU-GPU | AI training (MI300X, MI350) |
| **NVSwitch** | NVIDIA | 900 GB/s per GPU (NVLink) | Full-mesh (256 GPU) | DGX SuperPOD, HGX |
| **InfiniBand (NDR)** | NVIDIA/Mellanox | 400 Gbps per port | GPU-NIC direct, RDMA | Distributed training, HPC |
| **PCIe 5.0** | Standard | 63 GB/s per x16 | CPU-GPU | Inference, rendering |
| **Ethernet (RoCE v2)** | Standard | 100/200/400 GbE | GPU-NIC, RDMA over converged ethernet | AI inference, storage |
### GPU direct communication
```
GPU 0 ──NVLink── GPU 1          GPU 0 ───PCIe─── CPU ───PCIe─── GPU 1
          │                                       │
          │                                       │
       NVSwitch                              InfiniBand
          │                                       │
          │                                       │
GPU 2 ──NVLink── GPU 3          GPU 2 ───PCIe─── CPU ───PCIe─── GPU 3
NVLink topologie (GPU direct)   PCIe topologie (CPU mediated)
```
- **GPU Direct RDMA** — GPU ↔ NIC bez CPU (InfiniBand, RoCE)
- **GPU Direct Storage** — GPU ↔ NVMe bez CPU (NVIDIA Magnum IO)
- **NVSwitch** — full bisection bandwidth mezi všemi GPU v node
## Virtualizace GPU
| Technologie | Popis | GPU support | Use case |
|------------|-------|-------------|----------|
| **NVIDIA vGPU (Grid)** | Časové slicing + dedikované profily | A-series (virtual apps), B-series (VDI), Q-series (pro viz), C-series (AI/compute) | VDI, virtualizované AI |
| **NVIDIA MIG** | Hardwarové partition GPU | A100 (7 inst.), H100/H200/B200 | AI inference, multi-tenant GPU |
| **AMD MxGPU** | SR-IOV, hardwarové partition | AMD MI (pro), Radeon Pro | VDI, cloud gaming |
| **Intel SG (SG1)** | SR-IOV, hardwarové partition | Intel SG1, Flex, Arc | VDI, media transcoding |
| **GPU passthrough** | Dedikovaný GPU celé VM (VFIO-pci) | Všechny GPU | AI training, HPC, nejvyšší výkon |
### MIG partition table (A100 / H100)
| GPU | Partition profile | GPU Memory | Compute units |
|-----|------------------|-----------|--------------|
| **A100 40 GB** | 1g.5gb | 5 GB | 1 |
| A100 40 GB | 2g.10gb | 10 GB | 2 |
| A100 40 GB | 3g.20gb | 20 GB | 3 |
| A100 40 GB | 7g.40gb | 40 GB | 7 |
| A100 40 GB | Full (7× 1g) | 7 × 5 GB | 7 instances |
| **H100 80 GB** | 1g.10gb | 10 GB | 1 |
| H100 80 GB | 2g.20gb | 20 GB | 2 |
| H100 80 GB | 3g.40gb | 40 GB | 3 |
| H100 80 GB | 7g.80gb | 80 GB | 7 |
## GPU use cases
### AI Training
- **Modely**: LLM (70B-405B+), vision, multimodal
- **GPU**: H100, B200, GB200, MI300X
- **Interconnect**: NVLink 5 / Infinity Fabric (v rámci node), InfiniBand NDR (mezi nody)
- **Parallelism**: Data Parallel (DDP), Tensor Parallel (TP), Pipeline Parallel (PP), Fully Sharded (FSDP)
- **Framework**: PyTorch (NCCL), JAX (XLA), DeepSpeed, Megatron-LM
- **Tipy**:
- GB200: 2× B200 propojené NVLink, 8 GPU → 4 GB200
- DGX B200 / HGX B200: standardní building block
- InfiniBand: fat tree topology pro all-reduce optimalizaci
### AI Inference
- **Modely**: LLM serving, embedding, image gen
- **GPU**: A100, H200, B200 (larger VRAM pro větší modely)
- **Techniky**: MIG partition, TensorRT-LLM, vLLM, Triton Inference Server
- **Kvantizace**: FP8, INT8, INT4 → nižší VRAM, vyšší throughput
- **Latency**: batch size optimalizace, dynamic batching, continuous batching
- **Scale**: on-prem (2-32 GPU) / cloud (elastic)
### VDI (Virtual Desktop Infrastructure)
- **GPU**: NVIDIA A16 (1 GPU = 16 users), A10 (1 GPU = 4 users)
- **Technologie**: vGPU (Grid), AMD MxGPU
- **Protokoly**: VMware Blast, Citrix HDX, Microsoft RDP, PC-over-IP (HP Teradici)
- **Use case**: CAD (CATIA, SolidWorks), Office, engineering, healthcare (PACS)
### Rendering a VFX
- **GPU**: NVIDIA RTX 6000 Ada, RTX A6000, AMD Radeon Pro W7900
- **Rendering**: Blender (Cycles/OptiX), V-Ray, Octane Render, Redshift
- **Denoising**: AI-accelerated denoising na GPU
- **Farm rendering**: Deadline, Qube! (job scheduler)
## GPU server form factors
| Form factor | GPU count | Power | Cooling | Příklad |
|------------|-----------|-------|---------|---------|
| **1U** | 1-2 | 700-1400 W | Air (high-RPM) | Dell XR4510c |
| **2U** | 4-8 | 3-6 kW | Air / Liquid | Dell R760xa, HPE DL380a |
| **4U** | 8-10 | 5-8 kW | Liquid | NVIDIA DGX H100, Dell R760xa |
| **8U / Chassis** | 8-16 | 10-20 kW | Liquid (CDU) | NVIDIA HGX, Supermicro SYS-821GE |
## OpenStack Cyborg (GPU lifecycle management)
Cyborg je OpenStack service pro správu akcelerátorů (GPU, FPGA, DPU, NPU).
### Klíčové schopnosti
- **Discovery** — automatická detekce GPU na compute node (NVIDIA, AMD, Intel)
- **Inventory** — tracking dostupných akcelerátorů v clusteru
- **Lifecycle** — attach/detach GPU k VM, firmware update, reset
- **Scheduling** — Placement API pro GPU-aware scheduling (Nova)
- **Cyborg API** — REST API pro správu akcelerátorů
### Integrace
| Komponenta | Role |
|------------|------|
| **Nova** | VM scheduling s GPU požadavky (extra_specs: `accel:device_profile`) |
| **Placement** | Resource provider pro GPU (inventory, traits) |
| **Neutron** | SR-IOV VF passthrough pro GPU networking |
| **Ironic** | Bare metal + GPU provisioning |
## Zdroje
Odkazy, knihy a standardy: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Poslední revize: 2026-06-03*

12
HARDWARE.en.md Normal file
View File

@@ -0,0 +1,12 @@
# 🔧 Hardware and servers
This file has been split into separate areas:
| Area | File |
|--------|--------|
| 🔧 Server hardware — components and architecture | [SERVER-HW.md](SERVER-HW.md) |
| 🎮 GPU — architecture, models, virtualization | [GPU.md](GPU.md) |
| ⚙️ Server configuration — best practices by workload | [SERVER-CONFIG.md](SERVER-CONFIG.md) |
| 📦 Provisioning — boot, installation, server management | [PROVISIONING.md](PROVISIONING.md) |
*Last revision: 2026-06-03*

12
HARDWARE.md Normal file
View File

@@ -0,0 +1,12 @@
# 🔧 Hardware a servery
Tento soubor byl rozdělen do samostatných oblastí:
| Oblast | Soubor |
|--------|--------|
| 🔧 Server hardware — komponenty a architektura | [SERVER-HW.md](SERVER-HW.md) |
| 🎮 GPU — architektura, modely, virtualizace | [GPU.md](GPU.md) |
| ⚙️ Server configuration — best practices podle workloadu | [SERVER-CONFIG.md](SERVER-CONFIG.md) |
| 📦 Provisioning — boot, instalace, správa serverů | [PROVISIONING.md](PROVISIONING.md) |
*Poslední revize: 2026-06-03*

455
HYPERVISORS.en.md Normal file
View File

@@ -0,0 +1,455 @@
# 🖥️ Hypervisors and Virtualization Platforms
## Hypervisor Types
| Type | Description | Examples |
|-----|-------|----------|
| **Type 1** (bare-metal) | Runs directly on hardware | VMware ESXi, Microsoft Hyper-V, KVM, Xen |
| **Type 2** (hosted) | Runs on top of host OS | VirtualBox, VMware Workstation, Parallels |
## Platform Overview
| Platform | Hypervisor | License | Note |
|-----------|-----------|---------|----------|
| **VMware vSphere** | ESXi | Proprietary (Subscription from 2024) | Market leader, wide adoption. After Broadcom acquisition (2023), switched to per-core subscription, perpetual license discontinued |
| **Microsoft Hyper-V** | Hyper-V | Windows Server / standalone | Integration with Azure, SCVMM |
| **Proxmox VE** | KVM + LXC | Open source | Debian-based, web UI, low cost |
| **Red Hat OpenStack / oVirt** | KVM | Open source | Open alternative, complex |
| **Nutanix AHV** | KVM (fork) | Part of Nutanix | Integrated HCI solution |
| **XCP-ng / Xen Server** | Xen | Open source | Successor to Citrix Hypervisor |
| **Oracle VM** | Xen | Proprietary | Oracle ecosystem |
## Key Concepts
- **VM — Virtual Machine** — full virtualization, own kernel
- **Container** — shared host kernel, lighter (Docker, LXC)
- **Paravirtualization** — guest OS knows it runs in a VM (better I/O performance)
- **NUMA** — Non-Uniform Memory Access, CPU/memory allocation optimization (see [SERVER-HW.md](SERVER-HW.md#numa))
- **Overcommit** — allocating more vCPU/RAM than physically available (ratio management)
- **Live Migration** — moving a running VM between hosts (vSphere vMotion, Hyper-V Live Migration)
- **HA (High Availability)** — VM restart on another host upon failure
- **DRS / Load Balancing** — automatic VM distribution based on load
## VMware vSphere
### VMware licensing (post-Broadcom 2024+)
Since 2024, VMware only sells subscription licenses; perpetual + SnS (Support & Subscription) have been discontinued.
| Product | Metric | Price (indicative) | What it includes |
|---------|---------|-------------------|-------------|
| **vSphere Standard** | Per core (min 16 cores/CPU) | ~$140/core/year | ESXi, vCenter, vMotion, HA, DRS basic |
| **vSphere Enterprise Plus** | Per core | ~$220/core/year | All above + DRS advanced, SIOC, NIOC, Big Data Extensions |
| **vSphere Foundation** | Per core (bundle) | ~$350/core/year | vSphere Enterprise Plus + Aria Operations, Aria Operations for Logs, Aria Automation |
| **VMware Cloud Foundation (VCF)** | Per core (bundle) | ~$700/core/year | vSphere + vSAN + NSX + Aria full suite. Required for vSAN and NSX from 2025 |
| **vSAN** | Per core (only as part of VCF from 2025) | No longer standalone | Storage virtualization, dedup, compression, encryption |
| **NSX** | Per core (only as part of VCF from 2025) | No longer standalone | SDN, micro-segmentation, firewall, load balancing |
**Key changes after Broadcom acquisition**:
- Discontinued perpetual license sales (May 2024)
- Discontinued standalone products: vSAN and NSX can no longer be purchased standalone (only within VCF)
- Desktop and ROBO variants cancelled (migrated to VCF)
- Average cost increase: 2-5× compared to the previous model (depends on size and product mix)
- **Impact**: Many customers are migrating to Proxmox VE, Nutanix AHV, or Hyper-V
**Per-core calculation**:
```text
Server: 2× EPYC 9654 (96C each) = 192 cores
vSphere Standard: 192 × $140 = $26,880/year
VCF: 192 × $700 = $134,400/year (incl. vSAN and NSX)
For comparison: previously perpetual + SnS ≈ $15,000 one-time + $3,000/year
```
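The per-core calculation above can be sketched as a small helper. It applies the 16-core-per-CPU minimum from the licensing table; the prices are the indicative figures from that table, and the function names are illustrative.

```python
def licensed_cores(cpus: int, cores_per_cpu: int, min_cores_per_cpu: int = 16) -> int:
    """Cores that must be licensed: each CPU counts for at least the minimum."""
    return cpus * max(cores_per_cpu, min_cores_per_cpu)


def annual_cost(cpus: int, cores_per_cpu: int, price_per_core_year: float) -> float:
    """Yearly subscription cost for one server."""
    return licensed_cores(cpus, cores_per_cpu) * price_per_core_year
```

For the server in the example, `annual_cost(2, 96, 140)` reproduces the vSphere Standard figure and `annual_cost(2, 96, 700)` the VCF figure; a single 8-core CPU is still billed as 16 cores.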
### VMware Exit Strategy (post-Broadcom 2024+)
#### Context
After Broadcom's acquisition of VMware (completed November 2023), the virtualization market experienced the biggest upheaval in its history. Changes include:
- **Discontinuation of perpetual licenses** (February 2024) — mandatory subscription model
- **Forced bundling** — 8,000+ SKUs reduced to 4 bundles (VCF, VVF, vSphere Standard/Foundation)
- **Minimum 72-core commitment** (from April 2025) — small servers can no longer be licensed economically
- **20% late renewal penalty** — no tolerance
- **Price increase of 150-1,500%** depending on size and product mix
- **Standalone products discontinued** — vSAN and NSX only within VCF
- **Collapse of the partner ecosystem** — from 4,500+ partners to ~300 Premier
According to Foundry/CIO.com survey (2025): **56%** of organizations plan to reduce VMware usage, **71%** are actively looking for on-premise alternatives. Gartner predicts a loss of ~35% of workloads within 3 years.
#### Three Strategies
| Strategy | Description | Suitable for |
|-----------|-------|------------|
| **Stay** | Accept new pricing, renew VCF/VVF subscription | Large organizations with deep integration where migration costs more than new licenses |
| **Reduce** | Reduce VMware footprint, migrate part of workloads to alternatives, optimize the rest | Medium and large enterprises with heterogeneous environments |
| **Exit** | Complete migration to an alternative platform | SMEs, organizations facing 3-6× cost increases, greenfield projects |
#### Target Platforms — Comparison
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization |
|-----------|-----------|-------------|-------------------|----------------------------------|
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) |
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30-60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) |
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) |
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) |
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO |
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP |
| **Price (3 years, 3 hosts)** | $0 + support ~$4,500 | ~$45,000-60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) |
| **Price (3 years, 10 hosts)** | $0 + support ~$15,000 | ~$150,000-200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) |
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) |
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) |
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) |
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator |
#### Migration Tools
| Tool | Source Platform | Target Platform | Method |
|---------|-------------------|-------------------|--------|
| **Proxmox VMware Import Wizard** | VMware ESXi | Proxmox VE | Web GUI import via NFS/ESXi API. Limitation: snapshots must be removed, UEFI not supported before Proxmox 8.1 |
| **Nutanix Move** | VMware ESXi, Hyper-V | Nutanix AHV | Virtual appliance, automated migration with minimal downtime, UEFI support, can retain IP/MAC |
| **Veeam Backup & Replication v12.2+** | VMware ESXi | Proxmox VE | Backup/restore via Veeam, hot migration, Proxmox support from v12.2 |
| **StarWind V2V Converter** | VMware ESXi | Proxmox, Hyper-V, XCP-ng | Free GUI tool, VMDK → QCOW2/raw/VHDX, CLI support, hot migrations |
| **virt-v2v** | VMware ESXi, Xen, Hyper-V | KVM (libvirt) | Open source CLI tool, disk + driver conversion (virtio), suitable for bulk migration |
| **Windows Admin Center VM Conversion Extension** | VMware ESXi | Hyper-V | Microsoft WAC extension, free, GUI-based, bulk migration |
| **Platform9 vJailbreak** | VMware ESXi | OpenStack / KVM | In-place migration (no swing gear), open source |
#### TCO Comparison — Example: 3 hosts (2× 20C CPU), 50 VMs
| Platform | Year 1 | 3 Years Total | Note |
|-----------|--------|---------------|----------|
| **VMware VVF** (1-year rate) | $22,800 | $68,400 | 120 cores × $190/core/year |
| **VMware VCF** | $42,000 | $126,000 | 120 cores × $350/core/year |
| **Proxmox VE** (support) | $1,500 | $4,500 | 3× €500/host/year |
| **Nutanix AHV** (average) | ~$18,000 | ~$54,000 | Per node subscription, estimate |
| **Hyper-V** (Windows Server Datacenter) | $12,400 | $37,200 | One-time license per core, without SA |
| **Hyper-V** (Azure Stack HCI) | ~$14,400 | ~$43,200 | ~$10/core/month, 120 cores |
**Real-world example from Spiceworks (2026)**: A user reports VMware Essentials+ increasing from $1,900/year to $14,000/year (VVF) — a 7.4× increase.
#### Decision Framework
```
1. Audit VMware environment
├─ Number of hosts, core count, utilization
├─ Feature dependency (vSAN, NSX, SRM)
├─ Workload profile (Windows vs Linux, DB, GPU)
└─ Hardware refresh cycle
2. Calculate TCO for VMware renewal (3 years)
├─ VVF vs VCF vs current model
└─ Include audit risk, late renewal penalty
3. Select target platform (1-2 candidates)
├─ Proxmox: lowest TCO, Linux-heavy shops
├─ Nutanix: enterprise HCI, low migration difficulty
├─ Hyper-V: Windows-centric, Azure hybrid
└─ OpenShift: Kubernetes-first, platform engineering
4. Plan migration phases
├─ Wave 1: non-critical (dev/test, 1-2 months)
├─ Wave 2: standard production (3-6 months)
├─ Wave 3: mission-critical (6-12 months)
└─ Coexistence: VMware + target running in parallel
5. Allow 18-48 months for complete exit (Gartner)
```
#### Real-World Case Studies
| Organization | Starting Point | Target | Scale | Result |
|-----------|---------|-----|--------|----------|
| **Stanford University** | VMware (60+ nodes) | Proxmox VE (6 clusters) | 1,500 VMs | Completed 2025, increased automation, lower costs |
| **Michelin** | VMware | Platform9 + OpenStack | Dozens of nodes | Platform engineering team, production workload migration |
| **Czech enterprise (50-100 servers)** | VMware | Proxmox VE | ~100 VMs | Annual savings of ~340,000-500,000 CZK on licenses |
#### Timing — Key Deadlines
| Event | Date | Impact |
|---------|-------|-------|
| **Discontinuation of perpetual licenses** | February 2024 | Already done |
| **72-core minimum** | April 2025 | Small server licensing became more expensive |
| **vSphere 7 EOS** | April 2025 | Upgrade to 8.x required |
| **ESXi 8.0 EOS** | October 2027 | Last supported version, migration deadline |
| **Windows Server 2025 Hyper-V** | December 2025 | 64-host cluster, 2,048 vCPU per VM |
| **Proxmox VE 9 + Datacenter Manager** | 2026 | Enterprise features, vCenter alternative |
#### Recommendations
| Scenario | Action |
|--------|------|
| **Small company (< 10 hosts), Linux workloads** | Migrate to Proxmox VE — immediate 100% license savings |
| **Medium company (10-50 hosts), mixed workloads** | Evaluate Nutanix AHV (easy migration) or Proxmox (lower TCO) |
| **Enterprise (50+ hosts), deep VMware integration** | Reduce strategy: optimize existing VMware + migrate selected workloads to OpenShift / Hyper-V |
| **Microsoft shop** | Hyper-V / Azure Stack HCI — native Azure hybrid, no additional hypervisor licenses |
| **Kubernetes-native team** | OpenShift Virtualization / KubeVirt — unify VM and container management |
| **MSP / hosting provider** | Nutanix or OpenStack — multi-tenancy, vCloud Director alternative |
#### Cluster Design
- **Max cluster size**: 64 hosts (vSphere 8/9), 96 hosts (vSphere 8 + enhanced)
- **Datastore limits**: max 256 datastores per host, max 65 TB per VMFS-6 datastore
- **vSAN ready capacity**: recommended max 60-64 hosts per vSAN cluster
- **Fault domains** — cluster division into host groups (rack awareness), min 3 fault domains for stretched cluster
- **Admission control**: reserves resources for HA failover using one of these policies:
  - **Host failures cluster tolerates**: most common (1-4 hosts)
  - **Percentage of cluster resources**: reserve a percentage of CPU/memory
  - **Dedicated failover hosts**: dedicated host(s) for HA
- **Cluster limits (vSphere 8/9)**:
  - 960 VMs per host (vSphere 9 max)
  - 15,000 VMs per cluster (vCenter max)
  - 300 hosts per cluster (vSphere 8/9, hardware vMotion)
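
The "Percentage of cluster resources" policy can be derived from the number of host failures the cluster should survive. This is a rough Python sketch, assuming all hosts are identically sized; `failover_reserve_percent` is a hypothetical helper, not a vSphere API.

```python
import math


def failover_reserve_percent(total_hosts: int, host_failures_to_tolerate: int) -> int:
    """Percentage of cluster CPU/memory to reserve, assuming identical hosts."""
    if not 0 < host_failures_to_tolerate < total_hosts:
        raise ValueError("failures to tolerate must be between 1 and total_hosts - 1")
    # Round up so the reservation never falls short of a whole host.
    return math.ceil(100 * host_failures_to_tolerate / total_hosts)


print(failover_reserve_percent(4, 1))  # 25
print(failover_reserve_percent(8, 1))  # 13
```

With mixed host sizes the reservation would have to be based on the largest hosts instead, which this sketch does not model.
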
### Microsoft Hyper-V Licensing
| Variant | Metric | Price | What it includes |
|----------|---------|------|-------------|
| **Windows Server Standard** | Per core (min 16 licenses/server) + CAL | ~$1,000/core (one-time) + $200/CAL | 2 VM licenses (each with full Windows Server license) |
| **Windows Server Datacenter** | Per core (min 16 licenses/server) + CAL | ~$6,200/core (one-time) + $200/CAL | Unlimited VMs, Storage Spaces Direct, Shielded VMs |
| **Azure Stack HCI** | Per core (monthly) | ~$10-20/core/month (Azure hybrid benefit) | Hyper-V + S2D + Azure management, part of Azure subscription |
| **Hyper-V Server** | Free | $0 | Standalone hypervisor (no management, no GUI, limited support) — no longer distributed as of 2025 |
**Important**:
- Windows Server Standard covers 2 VMs per license. To run 3 VMs on a 2-socket server, you need 2× Standard licenses (covering 4 VMs) or Datacenter
- **Azure Hybrid Benefit** — if you have Windows Server with SA (Software Assurance), you can use licenses in Azure at no additional cost
- **CAL (Client Access License)** — every user or device accessing Windows Server must have a CAL (except Azure Hybrid Benefit)
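
The Standard edition stacking rule can be expressed as a small calculation. This is a minimal sketch, assuming only the two rules stated in this section (16 core licenses minimum per server, 2 VMs per fully licensed server); the function and constant names are illustrative.

```python
import math

MIN_CORE_LICENSES_PER_SERVER = 16  # minimum from the licensing table
VMS_PER_STANDARD_LICENSE_SET = 2   # each full set of core licenses covers 2 VMs


def standard_core_licenses(physical_cores: int, vm_count: int) -> int:
    """Core licenses needed to run vm_count Windows Server VMs under Standard."""
    cores_to_license = max(physical_cores, MIN_CORE_LICENSES_PER_SERVER)
    license_sets = max(1, math.ceil(vm_count / VMS_PER_STANDARD_LICENSE_SET))
    return cores_to_license * license_sets


print(standard_core_licenses(32, 3))  # 64: the server is licensed twice for 3 VMs
print(standard_core_licenses(8, 2))   # 16: the per-server minimum applies
```

Comparing this result against the Datacenter price for the same core count gives the break-even VM count for a given host.
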
## Microsoft Hyper-V
| Feature | Hyper-V | Note |
|-----------|---------|----------|
| **Max hosts per cluster** | 64 (Windows Server 2025) | Shared Nothing Live Migration |
| **Max VMs per host** | 1,024 (WS 2022+) | Generation 2 VMs |
| **Max vCPU per VM** | 240 (WS 2022+) | 64-host cluster |
| **Max RAM per VM** | 12 TB (WS 2022+) | Dynamic memory |
| **Live Migration** | SMB, CSV, RDMA | Compressed or RDMA |
| **Storage** | CSV (Cluster Shared Volumes), ReFS | S2D for HCI |
| **Nested Virtualization** | Yes | Intel VT-x / AMD-V |
| **SCVMM** | System Center VMM | Enterprise management, fabric, P2V |
### Hyper-V vs VMware Comparison
| Feature | VMware vSphere | Microsoft Hyper-V |
|-----------|---------------|-------------------|
| **OS** | VMware ESXi (VMkernel) | Windows Server / Hyper-V Server |
| **License** | Per CPU (subscription) | Windows Server license / Datacenter |
| **Storage** | VMFS, NFS, vSAN, HCI | NTFS, ReFS, SMB, S2D |
| **Live Migration** | vMotion (cross-vSwitch, long distance) | Live Migration (SMB/RDMA) |
| **Storage Migration** | Storage vMotion (online) | Shared Nothing (data disk) |
| **Replication** | vSphere Replication | Hyper-V Replica (ASR) |
| **Management** | vCenter, vSphere Client | SCVMM, Hyper-V Manager, Admin Center |
| **Linux support** | Excellent (open-vm-tools) | Good (Linux Integration Services) |
| **TCO** | Higher | Lower (with Windows license) |
## KVM
### Architecture
```
Hardware ──> QEMU (I/O emulation) + KVM (kernel module, virtualization)
libvirt (API + management)
┌───────┼───────────┐
virt-manager virsh openstack/proxmox
```
### Tuning
- **CPU pinning**: `virsh vcpupin vm1 0 2` (vCPU 0 → physical core 2) keeps the vCPU thread on one core, which avoids scheduler migrations between cores and improves cache locality
- **Huge pages**: 2 MB / 1 GB pages instead of 4 KB, reduces TLB misses (VMs with large RAM): `echo 2048 > /proc/sys/vm/nr_hugepages`
- **NUMA affinity**: VM pinned to one NUMA node (minimizes cross-NUMA memory access)
  - `numactl --cpunodebind=0 --membind=0 <command>` (for a process started by hand)
  - `virsh numatune vm1 --nodeset 0`
- **VirtIO**: paravirtualized I/O (virtio-net, virtio-blk, virtio-scsi) for better performance
- **IO threads**: dedicated threads for QEMU I/O emulation
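
The value written to `nr_hugepages` follows directly from the VM memory size and the page size. This is a small Python sketch of that arithmetic; `hugepages_needed` is a hypothetical helper, and it ignores any extra pages needed for QEMU overhead.

```python
import math


def hugepages_needed(vm_ram_gib: float, page_size_mib: int = 2) -> int:
    """Number of huge pages required to back vm_ram_gib of guest memory."""
    return math.ceil(vm_ram_gib * 1024 / page_size_mib)


print(hugepages_needed(4))         # 2048 pages of 2 MiB back a 4 GiB guest
print(hugepages_needed(64, 1024))  # 64 pages of 1 GiB back a 64 GiB guest
```

The `echo 2048` example therefore reserves 4 GiB of host memory when the default 2 MiB page size is used.
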
### KVM Tuning Checklist
- Verify HW virtualization: `lscpu | grep Virtualization`
- Load KVM modules: `kvm`, `kvm_intel`/`kvm_amd`, `vfio-pci`
- Optimize storage: raw/LVM (avoid qcow2 for performance workloads)
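
The first checklist item can also be scripted. On x86 Linux the CPU flags in `/proc/cpuinfo` contain `vmx` for Intel VT-x and `svm` for AMD-V. This is a minimal sketch that parses that text; the function name and the sample string are illustrative.

```python
from typing import Optional


def detect_hw_virtualization(cpuinfo_text: str) -> Optional[str]:
    """Return 'VT-x', 'AMD-V', or None based on the flags line in cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            if "vmx" in flags:
                return "VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None


# Shortened sample; on a Linux host pass the contents of /proc/cpuinfo instead.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme de pse svm sse2\n"
print(detect_hw_virtualization(SAMPLE))  # AMD-V
```

A result of `None` usually means the feature is disabled in firmware or the machine is itself a VM without nested virtualization.
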
## Storage in Hypervisors
See also: [STORAGE.md](STORAGE.md) — detailed overview of storage protocols and configurations.
| Type | Description | Protocols |
|-----|-------|-----------|
| **Local storage** | Disks directly in the server | SATA, SAS, NVMe |
| **Shared storage** | SAN / NAS accessible to all hosts | Fibre Channel, iSCSI, NFS, SMB |
| **vSAN / HCI** | Hyperconverged storage (server disks = single pool) | VMware vSAN, Nutanix, StarWind |
| **Software-Defined** | SDS separates storage software from hardware | Ceph, GlusterFS, MinIO |
## HCI Details
| Feature | Nutanix (AOS + AHV) | VMware vSAN | Azure Stack HCI |
|-----------|--------------------|-------------|----------------|
| **Hypervisor** | AHV (KVM fork), ESXi optional | ESXi (required) | Hyper-V |
| **Min. nodes** | 3 | 2 (witness) | 2 (witness) |
| **Max nodes** | 80+ | 64 | 16 (typical) |
| **Replication** | 2 or 3 copies + erasure coding | Mirroring (RAID 1), erasure coding | Mirroring + parity |
| **Deduplication** | Cluster-level (post-process) | Disk-level (capacity tier) | ReFS (real-time) |
| **Compression** | Inline (AOS 6+) | Dedup + compression combined | ReFS |
| **Management** | Prism (web UI) | vCenter + vSAN UI | Windows Admin Center |
| **Licensing** | Per node subscription | Per CPU subscription | Per core subscription |
| **Ecosystem** | Built-in DR, backup, security | Broad ISV ecosystem | Azure integration |
| **Use case** | Enterprise VDI, general VM | VMware-centric shops | Azure hybrid, branch offices |
## Virtualization Platforms — Comparison
| Capability | VMware vSphere | Microsoft Hyper-V | Proxmox VE | Nutanix AHV |
|-----------|---------------|-------------------|------------|-------------|
| Live Migration | vMotion | Live Migration | Live Migration | Live Migration |
| HA | vSphere HA | Hyper-V HA | Proxmox HA | Built-in |
| DRS/balancing | DRS | SCVMM / AKS | HA groups | Built-in |
| Storage vMotion | yes | when VM is off | ZFS send/recv | Built-in |
| Snapshots | yes | yes | yes | yes |
| Backup API | CBT (Changed Block Tracking) | Hyper-V WMI / RCT | Proxmox Backup Server | Native |
| GPU passthrough | vGPU (NVIDIA Grid) | DDA | VFIO passthrough | GPU passthrough |
| Licensing | Per CPU / subscription | Windows Server license | Open source (free) | Per node subscription |
## OpenStack
- **Distributions**: Red Hat OpenStack, Canonical Charmed OpenStack
- **Services**: Nova (compute), Cinder (block), Neutron (networking), Glance (images), Swift (object)
- **Use case**: Telco, large private clouds, MNO (MANO, NFVI)
- **Complexity**: High — complex deployment and maintenance
---
## Variant Hypervisor Configurations by Size and Storage Type
### Platform Selection by Use Case
| Use Case | Primary Choice | Alternative | Rationale |
|----------|---------------|-------------|------------|
| **VMware shop, enterprise** | vSphere 8/9 | Hyper-V | Most comprehensive ecosystem, vSAN, SRM, broadest ISV support |
| **Microsoft shop, Azure hybrid** | Hyper-V / Azure Stack HCI | vSphere | Windows Server CAL already in place, S2D, Azure Arc, native Hyper-V Replica |
| **SME / low budget** | Proxmox VE | XCP-ng / Hyper-V (free) | Open source, built-in Ceph, ZFS, PBS, no license costs |
| **HCI greenfield** | Nutanix AHV | VMware vSAN | All-in-one, simple management, built-in DR and backup |
| **Hyperscale / telco** | OpenStack (RHOSP) | — | Multi-tenancy, NFVI, MANO, Neutron SDN, Ceph integration |
### Variant A: Small Deployment (2-3 hosts, local storage)
For small companies, branch offices, edge, dev/test. No shared storage — HA provided at the application level or via VM replication.
| Parameter | Proxmox VE | VMware vSphere | Hyper-V |
|----------|-----------|---------------|---------|
| **CPU** | 1× EPYC 9124-9224 / Xeon 4410Y (8-16C) | 1× EPYC 9124-9224 / Xeon 4410Y | 1× Xeon 4410Y / EPYC 9124 |
| **RAM** | 64-128 GB (DDR5-4800, 1DPC) | 64-128 GB | 64-128 GB |
| **OS disk** | 2× SATA SSD RAID1 (240-480 GB) | 2× SATA SSD RAID1 | 2× SATA SSD RAID1 |
| **VM storage** | ZFS RAID10 (4-6× NVMe/SATA SSD) | VMFS local (4-6× SSD RAID5/10) | ReFS CSV (4-6× SSD RAID10) |
| **Network** | 2× 10/25 GbE LACP | 2× 10/25 GbE LACP + management | 2× 10/25 GbE LACP |
| **Management** | Proxmox web UI (1× node) | vCSA / vCenter (1× appliance) | Windows Admin Center / SCVMM |
| **HA** | Proxmox HA (watchdog, fencing) | vSphere HA (1 host failure) | Hyper-V HA (WS Failover Cluster) |
| **Backup** | Proxmox Backup Server | Veeam B&R (Community) | Windows Server Backup / Veeam |
| **License** | Free (support ~€500/host/year) | vSphere Essentials (~$600/3 hosts) | Windows Server Standard (2 VMs) |
**Use case**: Startup, branch office, dev/test, < 200 VMs, no SAN, minimal budget.
**Advantages**: Low cost, simple management. **Disadvantages**: Limited scalability, host failure = VM unavailability.
### Variant B: Medium HCI (3-6 hosts, vSAN / Ceph)
Hyperconverged infrastructure — storage runs on the same hosts as VMs.
| Parameter | VMware vSAN | Proxmox + Ceph | Nutanix AHV |
|----------|------------|----------------|-------------|
| **CPU** | 1-2× EPYC 9334-9654 (16-32C) | 1-2× EPYC 9224-9334 (12-24C) | 1-2× EPYC 9334-9654 |
| **RAM** | 256-512 GB | 128-256 GB | 256-512 GB |
| **Cache tier** | 1-2× NVMe cache (write buffer) | — (Ceph uses RAM/OSD) | 1-2× NVMe (oplog) |
| **Capacity tier** | 4-8× SSD (SAS/SATA) | 4-8× HBA NVMe/SSD (OSD) | 4-6× SSD (extent store) |
| **Network** | 4× 25 GbE (vSAN + VM + mgmt) | 4× 25 GbE (Ceph public + cluster) | 4× 25 GbE (storage + VM) |
| **Fault domain** | Rack awareness (3 racks min) | CRUSH rack level | Rack awareness |
| **Replication** | RAID-1 mirroring (FTT=1) | 3× replication / EC 8+3 | 2× copies + EC |
| **Dedupe/Compress** | Dedup + compression (capacity) | ZFS / Ceph compression (inline) | Inline compression |
| **HA limit** | 1-3 host failures | 1-2 host failures (replication) | 1-2 host failures |
| **Min. hosts** | 2 + witness | 3 (MON + OSD) | 3 |
**Use case**: Medium company, VDI, general virtualization, 50-500 VMs.
**Recommendation**: For vSAN, use at least 4 hosts for FTT=1 with erasure coding. For Ceph, use at least 3 hosts (ideally 5+) and run one OSD per NVMe device on each OSD host for maximum IOPS.
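
The replication settings in the table translate into usable capacity as follows. This is a rough Python sketch that ignores metadata overhead and free-space headroom; both function names are illustrative.

```python
def usable_replicated(raw_tb: float, copies: int = 3) -> float:
    """Usable capacity when every object is stored `copies` times."""
    return raw_tb / copies


def usable_erasure_coded(raw_tb: float, k: int, m: int) -> float:
    """Usable capacity for a k+m erasure-coding profile (k data, m parity chunks)."""
    return raw_tb * k / (k + m)


raw = 100.0  # TB of raw capacity across the cluster
print(round(usable_replicated(raw, 3), 1))        # 33.3
print(round(usable_replicated(raw, 2), 1))        # 50.0 (two copies, as in RAID-1 FTT=1)
print(round(usable_erasure_coded(raw, 8, 3), 1))  # 72.7
```

Note that with a host-level failure domain an 8+3 profile generally needs at least 11 hosts, so on a 3-6 node cluster replication or a smaller profile is the realistic choice.
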
### Variant C: Enterprise FC SAN (6+ hosts)
Classic 3-tier architecture — compute (hosts) + storage (SAN) + network separated.
| Parameter | VMware vSphere | Hyper-V |
|----------|---------------|---------|
| **CPU** | 2× EPYC 9654-9965 (32-64C) | 2× EPYC 9654-9965 / Xeon 8592+ |
| **RAM** | 512-2048 GB (DDR5) | 512-2048 GB |
| **OS disk** | 2× SATA SSD RAID1 (480 GB) | 2× SATA SSD RAID1 |
| **Storage** | FC SAN LUN (2× FC HBA 32/64G) | FC SAN LUN or CSV over SMB |
| **App network** | 2-4× 25/100 GbE LACP | 2-4× 25/100 GbE LACP |
| **Storage network** | 2× FC 32/64G (multipath) | 2× FC 32/64G or SMB Multichannel |
| **vMotion / Live Migration** | 2× 25 GbE dedicated (vMotion) | 2× 25 GbE dedicated (SMB/RDMA) |
| **Management** | vCenter (VCSA), NSX, Aria | SCVMM, Azure Arc |
| **Cluster max** | 64-96 hosts (vSphere 8/9) | 64 hosts (WS 2025) |
| **Admission control** | 1-4 host failures | Nodes reserve |
| **DRS / Balancing** | DRS (fully automated) | SCVMM / AKS load balancing |
**Use case**: Enterprise, databases, critical applications, 500-5000 VMs.
**Storage variants**: FC SAN (lowest latency), iSCSI (lower CAPEX), NFS (simpler management).
**FC SAN topology**:
```
┌─────────────────────────────────────┐
│ FC Fabric │
│ ┌─────────┐ ┌─────────┐ │
│ │ Switch 1│ │ Switch 2│ │
│ └────┬────┘ └────┬────┘ │
└────────┼─────────────────┼──────────┘
┌─────┴─────┐ ┌─────┴─────┐
┌───┤ FC HBA 1 ├─┐ ┌─┤ FC HBA 2 ├───┐
│ └───────────┘ │ │ └───────────┘ │
┌──┴──┐ ┌──┴──┴──┐ ┌──┴──┐
│Host1│ │Host2 │ │Host3│ ...
└─────┘ └────────┘ └─────┘
```
### Variant D: Hyperscale OpenStack (20+ hosts)
For telco, large private clouds, MANO/NFVI environments.
| Parameter | Red Hat OpenStack | Canonical Charmed OpenStack |
|----------|-------------------|-----------------------------|
| **Compute** | Nova + KVM | Nova + KVM |
| **Storage** | Ceph (Cinder/RBD) + Swift | Ceph + Swift |
| **Network** | Neutron + OVN/OVS + DPDK | Neutron + OVN/OVS |
| **CPU per host** | 2× EPYC 9654-9965 (64-128C) | 2× EPYC 9654-9965 |
| **RAM per host** | 512-1024 GB | 512-1024 GB |
| **Storage per host** | Ceph OSD (4-12× NVMe/SSD) | Ceph OSD |
| **Network per host** | 4-8× 100 GbE (DPDK/VPP) | 4× 100 GbE |
| **Control plane** | 3-9× control node (HA) | 3-7× control node |
| **Orchestration** | TripleO / OpenStack Kolla | Juju + charms |
| **SDN** | OVN, OpenDaylight | OVN |
| **NFVI ready** | Yes (SR-IOV, NUMA, huge pages) | Yes |
| **Min. size** | 9 nodes (3 ctl + 3 compute + 3 ceph) | 7 nodes |
**Use case**: Telco (5G UPF, MNO), hyperscale private cloud, > 5000 VMs.
### Connectivity Summary by Platform
| Platform | App / VM Network | Storage Network | Replication / HA | Management |
|-----------|-------------|-------------|----------------|------------|
| **Proxmox small** | 2× 10/25 GbE LACP | — (local ZFS) | — | 1× 1 GbE |
| **vSAN (3-6)** | 2× 25 GbE LACP | 2× 25 GbE (vSAN) | vSAN traffic | 1× 1 GbE |
| **Proxmox Ceph (3-6)** | 2× 25 GbE | 2× 25 GbE (Ceph public) | 2× 25 GbE (Ceph cluster) | 1× 1 GbE |
| **Nutanix (3-6)** | 2× 25 GbE | Dedicated storage VLAN | Replication traffic | 1× 1 GbE |
| **vSphere FC SAN (6+)** | 2-4× 25/100 GbE LACP | 2× FC 32/64G multipath | 2× 25 GbE (vMotion) | 1× 1 GbE + SAN mgmt |
| **Hyper-V FC SAN (6+)** | 2-4× 25/100 GbE LACP | 2× FC 32/64G or SMB | 2× 25 GbE (Live Migration) | 1× 1 GbE |
| **OpenStack (20+)** | 2-4× 100 GbE | 2× 100 GbE (Ceph) | 2× 100 GbE (OVN) | 1× 1 GbE |
## Resources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended Reading
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| Virtualization Essentials (3rd ed., 2023) | Matthew Portnoy | 978-1119481513 | Practical guide to virtualization: from hypervisor basics (Type 1/Type 2), VM configuration (CPU, memory, storage, networking) to cloud computing and DevOps. "Learning-by-doing" approach with tutorials. Author is a Senior System Engineer at VMware/Splunk. |
| VMware vSphere Design (2nd ed.) | Guthrie, Lowe, Coleman | 978-1119130312 | Comprehensive guide to vSphere infrastructure design: hardware selection, network layout, security, storage and hypervisors. Describes a framework for design, decision analysis and best practices from experienced VMware architects. |
*Last revision: 2026-06-04*

455
HYPERVISORS.md Normal file

@@ -0,0 +1,455 @@
# 🖥️ Hypervisors and Virtualization Platforms
## Hypervisor Types
| Type | Description | Examples |
|-----|-------|----------|
| **Type 1** (bare-metal) | Runs directly on the hardware | VMware ESXi, Microsoft Hyper-V, KVM, Xen |
| **Type 2** (hosted) | Runs on top of a host OS | VirtualBox, VMware Workstation, Parallels |
## Platform Overview
| Platform | Hypervisor | License | Note |
|-----------|-----------|---------|----------|
| **VMware vSphere** | ESXi | Proprietary (subscription since 2024) | Market leader, wide adoption. After the Broadcom acquisition (2023) it moved to per-core subscription and perpetual licenses were discontinued |
| **Microsoft Hyper-V** | Hyper-V | Windows Server / standalone | Azure integration, SCVMM |
| **Proxmox VE** | KVM + LXC | Open source | Debian-based, web UI, low cost |
| **Red Hat OpenStack / oVirt** | KVM | Open source | Open alternative, complex |
| **Nutanix AHV** | KVM (fork) | Part of Nutanix | Integrated HCI solution |
| **XCP-ng / Xen Server** | Xen | Open source | Successor to Citrix Hypervisor |
| **Oracle VM** | Xen | Proprietary | Oracle ecosystem |
## Key Concepts
- **VM (Virtual Machine)**: full virtualization, own kernel
- **Container**: shares the host kernel, lighter weight (Docker, LXC)
- **Paravirtualization**: the guest OS knows it runs in a VM (better I/O performance)
- **NUMA**: Non-Uniform Memory Access, optimizes CPU/memory allocation (see [SERVER-HW.md](SERVER-HW.md#numa))
- **Overcommit**: allocating more vCPU/RAM than is physically available (the ratio must be managed)
- **Live Migration**: moving a running VM between hosts (vSphere vMotion, Hyper-V Live Migration)
- **HA (High Availability)**: restarting a VM on another host after a failure
- **DRS / Load Balancing**: automatic distribution of VMs according to load
## VMware vSphere
### VMware licensing (post-Broadcom 2024+)
Since 2024 VMware has sold only subscription licenses; perpetual licenses with SnS (Support & Subscription) have been discontinued.
| Product | Metric | Price (indicative) | What it includes |
|---------|---------|-------------------|-------------|
| **vSphere Standard** | Per core (min 16 cores/CPU) | ~$140/core/year | ESXi, vCenter, vMotion, HA, DRS basic |
| **vSphere Enterprise Plus** | Per core | ~$220/core/year | Everything above + DRS advanced, SIOC, NIOC, Big Data Extensions |
| **vSphere Foundation** | Per core (bundle) | ~$350/core/year | vSphere Enterprise Plus + Aria Operations, Aria Operations for Logs, Aria Automation |
| **VMware Cloud Foundation (VCF)** | Per core (bundle) | ~$700/core/year | vSphere + vSAN + NSX + the full Aria suite. Required for vSAN and NSX since 2025 |
| **vSAN** | Per core (only as part of VCF since 2025) | No longer standalone | Storage virtualization, dedup, compression, encryption |
| **NSX** | Per core (only as part of VCF since 2025) | No longer standalone | SDN, micro-segmentation, firewall, load balancing |
**Key changes after the Broadcom acquisition**:
- Sales of perpetual licenses ended (May 2024)
- Standalone products discontinued: vSAN and NSX can no longer be purchased standalone (only as part of VCF)
- Desktop and ROBO editions discontinued (migrated to VCF)
- Average cost increase: 2-5× over the previous model (depends on size and product mix)
- **Impact**: Many customers are migrating to Proxmox VE, Nutanix AHV, or Hyper-V
**Per-core calculation**:
```text
Server: 2× EPYC 9654 (96C each) = 192 cores
vSphere Standard: 192 × $140 = $26,880/year
VCF: 192 × $700 = $134,400/year (incl. vSAN and NSX)
For comparison: previously perpetual + SnS ≈ $15,000 one-time + $3,000/year
```
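
The same calculation can be written as a short Python sketch. It assumes the 16-cores-per-CPU minimum from the licensing table and the indicative prices quoted there; `annual_license_cost` is an illustrative helper, not a vendor tool.

```python
MIN_CORES_PER_CPU = 16  # licensing minimum per CPU from the table


def annual_license_cost(sockets: int, cores_per_cpu: int, price_per_core: float) -> float:
    """Yearly subscription cost for one server under per-core licensing."""
    billable_cores = sockets * max(cores_per_cpu, MIN_CORES_PER_CPU)
    return billable_cores * price_per_core


print(annual_license_cost(2, 96, 140))  # 26880 (vSphere Standard)
print(annual_license_cost(2, 96, 700))  # 134400 (VCF)
print(annual_license_cost(1, 8, 140))   # 2240: an 8-core CPU is billed as 16 cores
```

The last call shows why small servers are hit hardest: cores that do not exist are still billed up to the minimum.
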
### VMware Exit Strategy (post-Broadcom 2024+)
#### Context
Broadcom's acquisition of VMware (completed November 2023) triggered the largest upheaval in the history of the virtualization market. The changes include:
- **End of perpetual licenses** (February 2024): mandatory subscription model
- **Forced bundling**: 8,000+ SKUs reduced to 4 bundles (VCF, VVF, vSphere Standard/Foundation)
- **72-core minimum commitment** (since April 2025): small servers can no longer be licensed cost-effectively
- **20% late-renewal penalty**: no grace period
- **Price increase of 150-1,500%** depending on size and product mix
- **End of standalone products**: vSAN and NSX only as part of VCF
- **Collapse of the partner ecosystem**: from 4,500+ partners to ~300 Premier partners
According to a Foundry/CIO.com survey (2025), **56%** of organizations plan to reduce their VMware usage and **71%** are actively looking for on-premises alternatives. Gartner predicts that VMware will lose ~35% of its workloads within 3 years.
#### Three Strategies
| Strategy | Description | Suitable for |
|-----------|-------|------------|
| **Stay** | Accept the new pricing, renew the VCF/VVF subscription | Large organizations with deep integration, where migration costs more than the new licenses |
| **Reduce** | Shrink the VMware footprint, migrate part of the workloads to alternatives, optimize the rest | Medium and large companies with heterogeneous environments |
| **Exit** | Complete migration to an alternative platform | SMEs, organizations facing 3-6× cost increases, greenfield projects |
#### Target Platforms: Comparison
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization |
|-----------|-----------|-------------|-------------------|----------------------------------|
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) |
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30-60% savings compared to VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) |
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) |
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) |
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO |
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP |
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000-60,000 | $0 (Hyper-V Server, free) or Windows Server license | ~$90,000+ (OpenShift) |
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000-200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) |
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) |
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) |
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) |
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator |
#### Migration Tools
| Tool | Source platform | Target platform | Method |
|---------|-------------------|-------------------|--------|
| **Proxmox VMware Import Wizard** | VMware ESXi | Proxmox VE | Web GUI import via NFS/ESXi API. Limitations: snapshots must be removed first, UEFI not supported until Proxmox 8.1 |
| **Nutanix Move** | VMware ESXi, Hyper-V | Nutanix AHV | Virtual appliance, automated migration with minimal downtime, UEFI support, option to retain IP/MAC |
| **Veeam Backup & Replication v12.2+** | VMware ESXi | Proxmox VE | Backup/restore via Veeam, hot migration, Proxmox support since v12.2 |
| **StarWind V2V Converter** | VMware ESXi | Proxmox, Hyper-V, XCP-ng | Free GUI tool, VMDK → QCOW2/raw/VHDX, CLI support, hot migrations |
| **virt-v2v** | VMware ESXi, Xen, Hyper-V | KVM (libvirt) | Open source CLI tool, converts disks and drivers (virtio), suitable for bulk migration |
| **Windows Admin Center VM Conversion Extension** | VMware ESXi | Hyper-V | Microsoft WAC extension, free, GUI-based, bulk migration |
| **Platform9 vJailbreak** | VMware ESXi | OpenStack / KVM | In-place migration (no swing gear), open source |
#### TCO Comparison: Example with 3 Hosts (2× 20C CPU), 50 VMs
| Platform | Year 1 | 3-year total | Note |
|-----------|--------|---------------|----------|
| **VMware VVF** (1-year rate) | $22,800 | $68,400 | 120 cores × $190/core/year |
| **VMware VCF** | $42,000 | $126,000 | 120 cores × $350/core/year |
| **Proxmox VE** (support) | $1,500 | $4,500 | 3× €500/host/year |
| **Nutanix AHV** (average) | ~$18,000 | ~$54,000 | Per node subscription, estimate |
| **Hyper-V** (Windows Server Datacenter) | $12,400 | $37,200 | One-time per-core license, without SA |
| **Hyper-V** (Azure Stack HCI) | ~$14,400 | ~$43,200 | ~$10/core/month × 120 cores × 12 months |
**Real-world example from Spiceworks (2026)**: A user reports that VMware Essentials+ rose from $1,900/year to $14,000/year (VVF), a 7.4× increase.
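
The 3-year totals from the table can be turned into relative savings against a VMware baseline. This is a minimal Python sketch using only the VVF, VCF, Proxmox, and Nutanix rows; the dictionary and function names are illustrative, and the Nutanix figure is the table's own estimate.

```python
# 3-year totals in USD, copied from the TCO table (3 hosts, 120 cores).
THREE_YEAR_TCO_USD = {
    "VMware VVF": 68_400,
    "VMware VCF": 126_000,
    "Proxmox VE": 4_500,
    "Nutanix AHV": 54_000,
}


def savings_vs(baseline: str, tco: dict = THREE_YEAR_TCO_USD) -> dict:
    """Percentage saved relative to the baseline (negative means more expensive)."""
    base = tco[baseline]
    return {
        name: round(100 * (base - cost) / base, 1)
        for name, cost in tco.items()
        if name != baseline
    }


def increase_factor(old_price: float, new_price: float) -> float:
    """How many times more expensive the new price is."""
    return round(new_price / old_price, 1)


print(savings_vs("VMware VVF"))
# {'VMware VCF': -84.2, 'Proxmox VE': 93.4, 'Nutanix AHV': 21.1}
print(increase_factor(1_900, 14_000))  # 7.4
```

License totals are only one part of TCO. Migration effort, retraining, and hardware refresh would have to be added as separate cost lines.
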
#### Decision Framework
```
1. Audit the VMware environment
├─ Host count, core count, utilization
├─ Feature dependency (vSAN, NSX, SRM)
├─ Workload profile (Windows vs Linux, DB, GPU)
└─ Hardware refresh cycle
2. Calculate TCO for VMware renewal (3 years)
├─ VVF vs VCF vs current model
└─ Include audit risk, late renewal penalty
3. Select target platform (1-2 candidates)
├─ Proxmox: lowest TCO, Linux-heavy shops
├─ Nutanix: enterprise HCI, low migration difficulty
├─ Hyper-V: Windows-centric, Azure hybrid
└─ OpenShift: Kubernetes-first, platform engineering
4. Plan migration phases
├─ Wave 1: non-critical (dev/test, 1-2 months)
├─ Wave 2: standard production (3-6 months)
├─ Wave 3: mission-critical (6-12 months)
└─ Coexistence: VMware + target running in parallel
5. Allow 18-48 months for complete exit (Gartner)
```
#### Real-World Case Studies
| Organization | Starting Point | Target | Scale | Result |
|-----------|---------|-----|--------|----------|
| **Stanford University** | VMware (60+ nodes) | Proxmox VE (6 clusters) | 1,500 VMs | Completed 2025, increased automation, lower costs |
| **Michelin** | VMware | Platform9 + OpenStack | Dozens of nodes | Platform engineering team, migration of production workloads |
| **Czech enterprise (50-100 servers)** | VMware | Proxmox VE | ~100 VMs | Annual savings of ~340,000-500,000 CZK on licenses |
#### Timing: Key Deadlines
| Event | Date | Impact |
|---------|-------|-------|
| **Discontinuation of perpetual licenses** | February 2024 | Already done |
| **72-core minimum** | April 2025 | Small server licensing became more expensive |
| **vSphere 7 EOS** | April 2025 | Upgrade to 8.x required |
| **ESXi 8.0 EOS** | October 2027 | Last supported version, migration deadline |
| **Windows Server 2025 Hyper-V** | December 2025 | 64-host cluster, 2,048 vCPU per VM |
| **Proxmox VE 9 + Datacenter Manager** | 2026 | Enterprise features, vCenter alternative |
#### Recommendations
| Scenario | Action |
|--------|------|
| **Small company (< 10 hosts), Linux workloads** | Migrate to Proxmox VE: immediate 100% license savings |
| **Medium company (10-50 hosts), mixed workloads** | Evaluate Nutanix AHV (easy migration) or Proxmox (lower TCO) |
| **Enterprise (50+ hosts), deep VMware integration** | Reduce strategy: optimize existing VMware + migrate selected workloads to OpenShift / Hyper-V |
| **Microsoft shop** | Hyper-V / Azure Stack HCI: native Azure hybrid, no additional hypervisor licenses |
| **Kubernetes-native team** | OpenShift Virtualization / KubeVirt: unify VM and container management |
| **MSP / hosting provider** | Nutanix or OpenStack: multi-tenancy, vCloud Director alternative |
#### Cluster Design
- **Max cluster size**: 64 hosts (vSphere 8/9), 96 hosts (vSphere 8 + enhanced)
- **Datastore limits**: max 256 datastores per host, max 65 TB per VMFS-6 datastore
- **vSAN ready capacity**: recommended max 60-64 hosts per vSAN cluster
- **Fault domains**: division of the cluster into host groups (rack awareness), min 3 fault domains for a stretched cluster
- **Admission control**: reserves resources for HA failover using one of these policies:
  - **Host failures cluster tolerates**: most common (1-4 hosts)
  - **Percentage of cluster resources**: reserve a percentage of CPU/memory
  - **Dedicated failover hosts**: dedicated host(s) for HA
- **Cluster limits (vSphere 8/9)**:
  - 960 VMs per host (vSphere 9 max)
  - 15,000 VMs per cluster (vCenter max)
  - 300 hosts per cluster (vSphere 8/9, hardware vMotion)
### Microsoft Hyper-V Licensing
| Variant | Metric | Price | What it includes |
|----------|---------|------|-------------|
| **Windows Server Standard** | Per core (min 16 licenses/server) + CAL | ~$1,000/core (one-time) + $200/CAL | 2 VM licenses (each with full Windows Server license) |
| **Windows Server Datacenter** | Per core (min 16 licenses/server) + CAL | ~$6,200/core (one-time) + $200/CAL | Unlimited VMs, Storage Spaces Direct, Shielded VMs |
| **Azure Stack HCI** | Per core (monthly) | ~$10-20/core/month (Azure hybrid benefit) | Hyper-V + S2D + Azure management, part of Azure subscription |
| **Hyper-V Server** | Free | $0 | Standalone hypervisor (no management, no GUI, limited support); no longer distributed as of 2025 |
**Important**:
- Windows Server Standard covers 2 VMs per license. To run 3 VMs on a 2-socket server, you need 2× Standard licenses (covering 4 VMs) or Datacenter
- **Azure Hybrid Benefit**: if you have Windows Server with SA (Software Assurance), you can use the licenses in Azure at no additional cost
- **CAL (Client Access License)**: every user or device accessing Windows Server must have a CAL (except Azure Hybrid Benefit)
## Microsoft Hyper-V
| Feature | Hyper-V | Note |
|-----------|---------|----------|
| **Max hosts per cluster** | 64 (Windows Server 2025) | Shared Nothing Live Migration |
| **Max VMs per host** | 1,024 (WS 2022+) | Generation 2 VMs |
| **Max vCPU per VM** | 240 (WS 2022+) | 64-host cluster |
| **Max RAM per VM** | 12 TB (WS 2022+) | Dynamic memory |
| **Live Migration** | SMB, CSV, RDMA | Compressed or RDMA |
| **Storage** | CSV (Cluster Shared Volumes), ReFS | S2D for HCI |
| **Nested Virtualization** | Yes | Intel VT-x / AMD-V |
| **SCVMM** | System Center VMM | Enterprise management, fabric, P2V |
### Hyper-V vs VMware Comparison
| Feature | VMware vSphere | Microsoft Hyper-V |
|-----------|---------------|-------------------|
| **OS** | VMware ESXi (VMkernel) | Windows Server / Hyper-V Server |
| **License** | Per CPU (subscription) | Windows Server license / Datacenter |
| **Storage** | VMFS, NFS, vSAN, HCI | NTFS, ReFS, SMB, S2D |
| **Live Migration** | vMotion (cross-vSwitch, long distance) | Live Migration (SMB/RDMA) |
| **Storage Migration** | Storage vMotion (online) | Shared Nothing (data disk) |
| **Replication** | vSphere Replication | Hyper-V Replica (ASR) |
| **Management** | vCenter, vSphere Client | SCVMM, Hyper-V Manager, Admin Center |
| **Linux support** | Excellent (open-vm-tools) | Good (Linux Integration Services) |
| **TCO** | Higher | Lower (with Windows license) |
## KVM
### Architecture
```
Hardware ──> QEMU (I/O emulation) + KVM (kernel module, virtualization)
libvirt (API + management)
┌───────┼───────────┐
virt-manager virsh openstack/proxmox
```
### Tuning
- **CPU pinning**: `virsh vcpupin vm1 0 2` (vCPU 0 → physical core 2) keeps the vCPU thread on one core, which avoids scheduler migrations between cores and improves cache locality
- **Huge pages**: 2 MB / 1 GB pages instead of 4 KB, reduces TLB misses (VMs with large RAM): `echo 2048 > /proc/sys/vm/nr_hugepages`
- **NUMA affinity**: VM pinned to one NUMA node (minimizes cross-NUMA memory access)
  - `numactl --cpunodebind=0 --membind=0 <command>` (for a process started by hand)
  - `virsh numatune vm1 --nodeset 0`
- **VirtIO**: paravirtualized I/O (virtio-net, virtio-blk, virtio-scsi) for better performance
- **IO threads**: dedicated threads for QEMU I/O emulation
### KVM Tuning Checklist
- Verify HW virtualization: `lscpu | grep Virtualization`
- Load KVM modules: `kvm`, `kvm_intel`/`kvm_amd`, `vfio-pci`
- Optimize storage: raw/LVM (avoid qcow2 for performance-sensitive workloads)
## Storage in Hypervisors
See also: [STORAGE.md](STORAGE.md) for a detailed overview of storage protocols and configurations.
| Type | Description | Protocols |
|-----|-------|-----------|
| **Local storage** | Disks directly in the server | SATA, SAS, NVMe |
| **Shared storage** | SAN / NAS accessible to all hosts | Fibre Channel, iSCSI, NFS, SMB |
| **vSAN / HCI** | Hyperconverged storage (server disks = single pool) | VMware vSAN, Nutanix, StarWind |
| **Software-Defined** | SDS separates storage software from hardware | Ceph, GlusterFS, MinIO |
## HCI Details
| Feature | Nutanix (AOS + AHV) | VMware vSAN | Azure Stack HCI |
|-----------|--------------------|-------------|----------------|
| **Hypervisor** | AHV (KVM fork), ESXi optional | ESXi (required) | Hyper-V |
| **Min. nodes** | 3 | 2 (witness) | 2 (witness) |
| **Max nodes** | 80+ | 64 | 16 (typical) |
| **Replication** | 2 or 3 copies + erasure coding | Mirroring (RAID 1), erasure coding | Mirroring + parity |
| **Deduplication** | Cluster-level (post-process) | Disk-level (capacity tier) | ReFS (real-time) |
| **Compression** | Inline (AOS 6+) | Dedup + compression combined | ReFS |
| **Management** | Prism (web UI) | vCenter + vSAN UI | Windows Admin Center |
| **Licensing** | Per node subscription | Per CPU subscription | Per core subscription |
| **Ecosystem** | Built-in DR, backup, security | Broad ISV ecosystem | Azure integration |
| **Use case** | Enterprise VDI, general VM | VMware-centric shops | Azure hybrid, branch offices |
## Virtualization platforms: comparison
| Capability | VMware vSphere | Microsoft Hyper-V | Proxmox VE | Nutanix AHV |
|------------|---------------|-------------------|------------|-------------|
| Live Migration | vMotion | Live Migration | Live Migration | Live Migration |
| HA | vSphere HA | Hyper-V HA | Proxmox HA | Built-in |
| DRS / load balancing | DRS | SCVMM Dynamic Optimization | HA groups | Built-in |
| Storage vMotion | yes | Live storage migration (WS 2012+) | Move disk (live), ZFS send/recv replication | Built-in |
| Snapshots | yes | yes | yes | yes |
| Backup API | CBT (Changed Block Tracking) | Hyper-V WMI / RCT | Proxmox Backup Server | Native |
| GPU passthrough | vGPU (NVIDIA Grid) | DDA | VFIO passthrough | GPU passthrough |
| Licensing | Per CPU / subscription | Windows Server licence | Open source (free) | Per node subscription |
## OpenStack
- **Distributions**: Red Hat OpenStack, Canonical Charmed OpenStack
- **Services**: Nova (compute), Cinder (block), Neutron (networking), Glance (images), Swift (object)
- **Use case**: Telco and mobile network operators (NFVI, MANO), large private clouds
- **Complexity**: High; deployment and maintenance are both demanding
---
## Hypervisor configuration variants by size and storage type
### Platform choice by use case
| Use case | Primary choice | Alternative | Rationale |
|----------|---------------|-------------|-----------|
| **VMware shop, enterprise** | vSphere 8/9 | Hyper-V | Most extensive ecosystem, vSAN, SRM, broadest ISV support |
| **Microsoft shop, Azure hybrid** | Hyper-V / Azure Stack HCI | vSphere | Windows Server licences already in place, S2D, Azure Arc, native Hyper-V Replica |
| **SME / low budget** | Proxmox VE | XCP-ng / Hyper-V (free) | Open source, built-in Ceph, ZFS, PBS, no licence costs |
| **HCI greenfield** | Nutanix AHV | VMware vSAN | All-in-one, simple management, built-in DR and backup |
| **Hyperscale / telco** | OpenStack (RHOSP) | — | Multi-tenancy, NFVI, MANO, Neutron SDN, Ceph integration |
### Variant A: Small deployment (2-3 hosts, local storage)
For small companies, branch offices, edge, dev/test. No shared storage; HA is handled at the application level or by VM replication.
| Parameter | Proxmox VE | VMware vSphere | Hyper-V |
|-----------|-----------|---------------|---------|
| **CPU** | 1× EPYC 9124-9224 / Xeon 4410Y (12-24C) | 1× EPYC 9124-9224 / Xeon 4410Y | 1× Xeon 4410Y / EPYC 9124 |
| **RAM** | 64-128 GB (DDR5-4800, 1DPC) | 64-128 GB | 64-128 GB |
| **OS disk** | 2× SATA SSD RAID1 (240-480 GB) | 2× SATA SSD RAID1 | 2× SATA SSD RAID1 |
| **VM storage** | ZFS RAID10 (4-6× NVMe/SATA SSD) | VMFS local (4-6× SSD RAID5/10) | ReFS CSV (4-6× SSD RAID10) |
| **Network** | 2× 10/25 GbE LACP | 2× 10/25 GbE LACP + management | 2× 10/25 GbE LACP |
| **Management** | Proxmox web UI (1× node) | vCSA / vCenter (1× appliance) | Windows Admin Center / SCVMM |
| **HA** (needs shared or replicated storage) | Proxmox HA (watchdog, fencing) | vSphere HA (1 host failure) | Hyper-V HA (WS Failover Cluster) |
| **Backup** | Proxmox Backup Server | Veeam B&R (Community) | Windows Server Backup / Veeam |
| **Licence** | Free (support ~€500/host/year) | vSphere Essentials (~$600/3 hosts) | Windows Server Standard (2 VMs) |
**Use case**: Startup, branch office, dev/test, < 200 VMs, no SAN, minimal budget.
**Advantages**: Low cost, simple management. **Disadvantages**: Limited scalability; with purely local storage a host failure makes its VMs unavailable.
### Variant B: Mid-size HCI (3-6 hosts, vSAN / Ceph)
Hyperconverged infrastructure: storage runs on the same hosts as the VMs.
| Parameter | VMware vSAN | Proxmox + Ceph | Nutanix AHV |
|-----------|------------|----------------|-------------|
| **CPU** | 1-2× EPYC 9334-9654 (32-96C) | 1-2× EPYC 9224-9334 (24-32C) | 1-2× EPYC 9334-9654 |
| **RAM** | 256-512 GB | 128-256 GB | 256-512 GB |
| **Cache tier** | 1-2× NVMe cache (write buffer) | — (Ceph uses RAM/OSD) | 1-2× NVMe (oplog) |
| **Capacity tier** | 4-8× SSD (SAS/SATA) | 4-8× NVMe/SSD on HBA (OSD) | 4-6× SSD (extent store) |
| **Network** | 4× 25 GbE (vSAN + VM + mgmt) | 4× 25 GbE (Ceph public + cluster) | 4× 25 GbE (storage + VM) |
| **Fault domain** | Rack awareness (3 racks min) | CRUSH rack level | Rack awareness |
| **Replication** | RAID-1 mirroring (FTT=1) | 3× replication (EC k+m, e.g. 8+3, needs at least k+m hosts) | 2 copies + EC |
| **Dedupe/Compress** | Dedup + compression (capacity) | ZFS / Ceph compression (inline) | Inline compression |
| **HA limit** | 1-3 host failures | 1-2 host failures (replication) | 1-2 host failures |
| **Min. hosts** | 2 + witness | 3 (MON + OSD) | 3 |
**Use case**: Mid-size company, VDI, general virtualization, 50-500 VMs.
**Recommendation**: For vSAN, use at least 4 hosts for FTT=1 with erasure coding. For Ceph, use at least 3 hosts, ideally 5+, and run one OSD per NVMe device on each OSD host for maximum IOPS.
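The replication choices in the table translate directly into usable capacity. The sketch below is a rough estimate under stated assumptions: it ignores metadata overhead, free-space reserves and compression, and the function names are hypothetical.

```python
def usable_replicated(raw_tib: float, copies: int) -> float:
    """Usable capacity when every object is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be >= 1")
    return raw_tib / copies


def usable_erasure_coded(raw_tib: float, k: int, m: int) -> float:
    """Usable capacity with k data chunks and m coding chunks per stripe."""
    if k < 1 or m < 1:
        raise ValueError("k and m must be >= 1")
    return raw_tib * k / (k + m)


def min_failure_domains(k: int, m: int) -> int:
    """An EC k+m layout places each chunk in a different failure domain."""
    return k + m


raw = 6 * 8 * 3.84  # 6 hosts x 8 drives x 3.84 TB, illustrative only
print(round(usable_replicated(raw, 3), 1))        # 3x replication
print(round(usable_erasure_coded(raw, 4, 2), 1))  # EC 4+2
print(min_failure_domains(8, 3))                  # hosts needed for EC 8+3
```

Under these assumptions 3x replication yields one third of raw capacity, while EC 4+2 yields two thirds at the cost of more hosts and more CPU work per write.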
### Variant C: Enterprise FC SAN (6+ hosts)
Classic 3-tier architecture: compute (hosts), storage (SAN) and network are separate.
| Parameter | VMware vSphere | Hyper-V |
|-----------|---------------|---------|
| **CPU** | 2× EPYC 9654-9965 (96-192C per socket) | 2× EPYC 9654-9965 / Xeon 8592+ |
| **RAM** | 512-2048 GB (DDR5) | 512-2048 GB |
| **OS disk** | 2× SATA SSD RAID1 (480 GB) | 2× SATA SSD RAID1 |
| **Storage** | FC SAN LUN (2× FC HBA 32/64G) | FC SAN LUN or CSV over SMB |
| **App network** | 2-4× 25/100 GbE LACP | 2-4× 25/100 GbE LACP |
| **Storage network** | 2× FC 32/64G (multipath) | 2× FC 32/64G or SMB Multichannel |
| **vMotion / Live Migration** | 2× 25 GbE dedicated (vMotion) | 2× 25 GbE dedicated (SMB/RDMA) |
| **Management** | vCenter (VCSA), NSX, Aria | SCVMM, Azure Arc |
| **Cluster max** | 64-96 hosts (vSphere 8/9) | 64 hosts (WS 2025) |
| **Admission control** | 1-4 host failures | Nodes reserve |
| **DRS / load balancing** | DRS (fully automated) | SCVMM Dynamic Optimization |
**Use case**: Enterprise, databases, critical applications, 500-5000 VMs.
**Storage options**: FC SAN (lowest latency), iSCSI (lower CAPEX), NFS (simpler management).
**FC SAN topology**:
```
┌─────────────────────────────────────┐
│ FC Fabric │
│ ┌─────────┐ ┌─────────┐ │
│ │ Switch 1│ │ Switch 2│ │
│ └────┬────┘ └────┬────┘ │
└────────┼─────────────────┼──────────┘
┌─────┴─────┐ ┌─────┴─────┐
┌───┤ FC HBA 1 ├─┐ ┌─┤ FC HBA 2 ├───┐
│ └───────────┘ │ │ └───────────┘ │
┌──┴──┐ ┌──┴──┴──┐ ┌──┴──┐
│Host1│ │Host2 │ │Host3│ ...
└─────┘ └────────┘ └─────┘
```
### Variant D: Hyperscale OpenStack (20+ hosts)
For telco, large private clouds and MANO/NFVI environments.
| Parameter | Red Hat OpenStack | Canonical Charmed OpenStack |
|-----------|-------------------|-----------------------------|
| **Compute** | Nova + KVM | Nova + KVM |
| **Storage** | Ceph (Cinder/RBD) + Swift | Ceph + Swift |
| **Network** | Neutron + OVN/OVS + DPDK | Neutron + OVN/OVS |
| **CPU per host** | 2× EPYC 9654-9965 (96-192C per socket) | 2× EPYC 9654-9965 |
| **RAM per host** | 512-1024 GB | 512-1024 GB |
| **Storage per host** | Ceph OSD (4-12× NVMe/SSD) | Ceph OSD |
| **Network per host** | 4-8× 100 GbE (DPDK/VPP) | 4× 100 GbE |
| **Control plane** | 3-9× controller nodes (HA) | 3-7× controller nodes |
| **Orchestration** | TripleO / OpenStack Kolla | Juju + charms |
| **SDN** | OVN, OpenDaylight | OVN |
| **NFVI ready** | Yes (SR-IOV, NUMA, huge pages) | Yes |
| **Min. size** | 9 nodes (3 ctl + 3 compute + 3 ceph) | 7 nodes |
**Use case**: Telco (5G UPF, MNO), hyperscale private cloud, > 5000 VMs.
### Connectivity summary by platform
| Platform | App / VM network | Storage network | Replication / HA | Management |
|----------|------------------|-----------------|------------------|------------|
| **Proxmox small** | 2× 10/25 GbE LACP | — (local ZFS) | — | 1× 1 GbE |
| **vSAN (3-6)** | 2× 25 GbE LACP | 2× 25 GbE (vSAN) | vSAN traffic | 1× 1 GbE |
| **Proxmox Ceph (3-6)** | 2× 25 GbE | 2× 25 GbE (Ceph public) | 2× 25 GbE (Ceph cluster) | 1× 1 GbE |
| **Nutanix (3-6)** | 2× 25 GbE | Dedicated storage VLAN | Replication traffic | 1× 1 GbE |
| **vSphere FC SAN (6+)** | 2-4× 25/100 GbE LACP | 2× FC 32/64G multipath | 2× 25 GbE (vMotion) | 1× 1 GbE + SAN mgmt |
| **Hyper-V FC SAN (6+)** | 2-4× 25/100 GbE LACP | 2× FC 32/64G or SMB | 2× 25 GbE (Live Migration) | 1× 1 GbE |
| **OpenStack (20+)** | 2-4× 100 GbE | 2× 100 GbE (Ceph) | 2× 100 GbE (OVN) | 1× 1 GbE |
## Sources
References, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended literature
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| Virtualization Essentials (3rd ed., 2023) | Matthew Portnoy | 978-1119481513 | Practical guide to virtualization: from hypervisor basics (Type 1/Type 2) and VM configuration (CPU, memory, storage, networking) to cloud computing and DevOps. "Learning-by-doing" approach with tutorials. The author is a Senior System Engineer at VMware/Splunk. |
| VMware vSphere Design (2nd ed.) | Guthrie, Lowe, Coleman | 978-1119130312 | Comprehensive guide to designing vSphere infrastructure: hardware selection, network layout, security, storage and hypervisors. Describes a design framework, decision analysis and best practices from experienced VMware architects. |
*Last revision: 2026-06-04*

12
INFRASTRUCTURE.en.md Normal file

@@ -0,0 +1,12 @@
# 🏗️ Infrastructure
This file has been split into separate areas:
| Area | File |
|--------|--------|
| 🖥️ Hypervisors and virtualization | [HYPERVISORS.md](HYPERVISORS.md) |
| 🏭 Data centers | [DATACENTERS.md](DATACENTERS.md) |
| 💾 Storage | [STORAGE.md](STORAGE.md) |
| 🔧 Hardware and servers | [HARDWARE.md](HARDWARE.md) |
*Last revision: 2026-06-03*

12
INFRASTRUCTURE.md Normal file

@@ -0,0 +1,12 @@
# 🏗️ Infrastruktura
Tento soubor byl rozdělen do samostatných oblastí:
| Oblast | Soubor |
|--------|--------|
| 🖥️ Hypervisory a virtualizace | [HYPERVISORS.md](HYPERVISORS.md) |
| 🏭 Datová centra | [DATACENTERS.md](DATACENTERS.md) |
| 💾 Storage | [STORAGE.md](STORAGE.md) |
| 🔧 Hardware a servery | [HARDWARE.md](HARDWARE.md) |
*Poslední revize: 2026-06-03*

116
MONGODB.en.md Normal file

@@ -0,0 +1,116 @@
# 🥬 MongoDB
## Overview
MongoDB is the most widespread document-oriented NoSQL database. It stores data as BSON (binary JSON) documents with a flexible schema. It suits rapidly developed applications where the schema changes often or varies between documents.
## Data model
- **Database** → Collection → Document (JSON/BSON)
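- **Document** — key-value fields, nested objects, arrays
- **Flexible schema** — each document can have different fields (possible, but a consistent structure per collection is recommended)
- **ObjectId** — default primary key (12 bytes: 4-byte timestamp + 5-byte random value + 3-byte counter; drivers before MongoDB 3.4 used machine ID + PID in place of the random value)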
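Because the first 4 bytes of an ObjectId are a big-endian Unix timestamp in seconds, the creation time can be recovered without a driver. This is a minimal standard-library sketch; the function name is hypothetical and the sample ID is only an illustration.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Extract the creation time embedded in a 24-character ObjectId string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    seconds = int(oid_hex[:8], 16)  # leading 4 bytes = seconds since the epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(objectid_timestamp("507f1f77bcf86cd799439011").isoformat())
# 2012-10-17T21:13:27+00:00
```

One consequence is that sorting by `_id` roughly sorts by insertion time, with one-second resolution.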
## Architecture
```
mongod (individual node)
├── WiredTiger storage engine (default since 3.2)
│   ├── B-Tree indexes (B-Tree, not LSM)
│   ├── MVCC (snapshot isolation)
│   ├── Compression (zlib, snappy, zstd)
│   └── Cache (WiredTiger internal cache)
├── Replication (replica set)
│   ├── Primary (all writes)
│   └── Secondary (replication, optional reads)
└── Sharding (cluster)
    ├── mongos (router)
    ├── Config servers (metadata)
    └── Shards (replica sets)
```
### Replica set
- Primary node = all writes, secondary = replication (oplog)
- Automatic failover (election among secondaries)
- Up to 50 nodes in a replica set, max 7 voting nodes
- Read preference: primary (default), primaryPreferred, secondary, secondaryPreferred, nearest
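An election needs a strict majority of voting members, which is why odd member counts are preferred. The sketch below shows the arithmetic; the function names are hypothetical and the 7-member cap mirrors the voting limit listed above.

```python
def election_majority(voting_members: int) -> int:
    """Votes required to elect a primary."""
    if not 1 <= voting_members <= 7:
        raise ValueError("a replica set has between 1 and 7 voting members")
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """Voting members that can be lost while a primary can still be elected."""
    return voting_members - election_majority(voting_members)


for n in (3, 4, 5, 7):
    print(n, election_majority(n), tolerated_failures(n))
```

Going from 3 to 4 voting members does not improve fault tolerance: both tolerate a single failure.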
### Sharding
- The shard key determines how documents are distributed across shards
- **Range sharding** — adjacent key values land on the same shard (good for range queries, risk of hot spots)
- **Hashed sharding** — even distribution (good for write throughput, bad for range queries)
- **Zoned sharding** — data placed according to zones (geo-distribution, compliance)
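The hot-spot risk of range sharding shows up with monotonically increasing keys. The sketch below is a deliberately simplified model: MongoDB actually splits data into chunks and rebalances them, and the MD5-based hash here is only an assumed stand-in for its internal hash function.

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def range_shard(key: int, boundaries: list) -> int:
    """Shard index for a key, given sorted split points."""
    return bisect_right(boundaries, key)


def hashed_shard(key: int, shards: int = 4) -> int:
    """Shard index derived from a hash of the key."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % shards


boundaries = [250, 500, 750]  # four ranges: <250, 250-499, 500-749, >=750
new_keys = range(1000, 1100)  # e.g. auto-incrementing order IDs

print(Counter(range_shard(k, boundaries) for k in new_keys))  # one shard takes every insert
print(Counter(hashed_shard(k) for k in new_keys))             # inserts spread across shards
```

The same property that concentrates inserts also keeps a query such as "IDs 1000 to 1099" on a single shard, which is the trade-off described in the list above.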
## Index types
| Type | Description |
|------|-------------|
| **Single field** | Standard B-Tree index |
| **Compound** | Multiple fields in index (order matters) |
| **Multikey** | Index on array field — each value separately |
| **Text** | Full-text search |
| **Geospatial (2d, 2dsphere)** | Geo queries (near, within, intersect) |
| **Hashed** | For hashed sharding |
| **TTL** | Automatic document deletion after expiration |
| **Wildcard** | Index on unknown/irregular fields |
## Aggregation pipeline
MongoDB pipeline framework for data transformations:
```javascript
db.orders.aggregate([
  { $match: { status: "shipped" } },
  { $group: { _id: "$customer_id", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
  { $limit: 10 }
])
```
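To make the stage semantics concrete, the same computation can be written over an in-memory list. This is an illustrative sketch only: it does not use a MongoDB driver, and the sample orders are made up.

```python
def top_customers(orders: list, limit: int = 10) -> list:
    """Mirror $match -> $group/$sum -> $sort -> $limit on a list of dicts."""
    totals = {}
    for order in orders:
        if order["status"] != "shipped":  # $match
            continue
        cid = order["customer_id"]        # $group by customer_id
        totals[cid] = totals.get(cid, 0) + order["amount"]  # $sum
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)  # $sort: -1
    return [{"_id": cid, "total": total} for cid, total in ranked[:limit]]  # $limit


orders = [
    {"customer_id": "a", "status": "shipped", "amount": 10},
    {"customer_id": "b", "status": "shipped", "amount": 30},
    {"customer_id": "a", "status": "pending", "amount": 99},
    {"customer_id": "a", "status": "shipped", "amount": 5},
]
print(top_customers(orders, limit=2))
```

Placing `$match` first matters in the real pipeline as well, since it reduces the number of documents later stages have to process.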
## Recommendations — where MongoDB is better
| Area | MongoDB | Competition | Why MongoDB |
|------|---------|-------------|-------------|
| **Flexible schema** | Schema-less, changes without migration | PostgreSQL (ALTER TABLE + migration) | Rapid development, MVP, frequent model changes |
| **JSON / documents** | Native BSON, nested objects | PostgreSQL (jsonb with its own operators and SQL/JSON path, not MongoDB-style `$` query operators) | Simpler object mapping from code |
| **Horizontal scaling** | Native sharding (mongos + config) | MySQL (Vitess external) | Built-in, simple to set up |
| **Geo-distribution** | Zoned sharding, replica set per region | Cassandra (AP model, different philosophy) | CP from CAP, consistency + distribution |
| **Aggregation** | Aggregation pipeline, $lookup (LEFT JOIN) | PostgreSQL (SQL JOINs, more powerful) | Useful for denormalized data |
| **Development speed** | ORM-like (Mongoose), natural JSON | SQL (schema first, migrations) | Fastest time-to-market |
### When to use MongoDB
- **Rapid development / MVP** — schema evolves frequently, no migrations
- **Catalog data** — products with varying attributes (e-commerce, marketplace)
- **Content management** — diverse content (blog, CMS, headless CMS)
- **Real-time analytics** — aggregations, dashboards, event data
- **IoT / sensor data** — diverse message structures
- **Mobile applications** — JSON documents naturally map to API responses
### When to use something else
- **Financial transactions** → PostgreSQL (ACID, referential integrity)
- **Complex reports / JOINs** → PostgreSQL or ClickHouse
- **Relationship data (friends, follows)** → Neo4j (graph DB)
- **High-throughput writes** → Cassandra (AP model, no master bottleneck)
- **Small data, single server** → SQLite (simpler, no daemon)
## MongoDB licensing
MongoDB changed its license in 2018 from GNU AGPL v3 to **SSPL** (Server Side Public License):
| Variant | License | Price | Conditions |
|---------|---------|-------|------------|
| **MongoDB Community** | SSPL | Free | SSPL: if you offer MongoDB as a managed service, you must release the entire stack (incl. orchestration, monitoring) as open source. Internal use without restrictions |
| **MongoDB Enterprise Advanced** | Commercial | ~$10,000/server/year (Atlas: pay-per-use) | Enterprise features (LDAP, Kerberos, auditing, encryption), 24/7 support |
| **MongoDB Atlas** | Managed | Pay-per-use (~$0.10-5.00/hour depending on instance) | Fully managed, multi-cloud, auto-scaling, backup, monitoring |
**Impact**: The model resembles the licences Redis adopted in 2024 (RSALv2 / SSPLv1). Self-hosted internal use is unrestricted, while cloud providers (AWS, Azure) that offer MongoDB as a managed service must either release their service stack under SSPL or sign a commercial agreement. Alternative: **FerretDB** (open-source layer on top of PostgreSQL that speaks the MongoDB wire protocol).
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
*Last revision: 2026-06-03*

116
MONGODB.md Normal file

@@ -0,0 +1,116 @@
# 🥬 MongoDB
## Přehled
MongoDB je nejrozšířenější document-oriented NoSQL databáze. Ukládá data jako BSON (binární JSON) dokumenty s flexibilním schematem. Vhodná pro aplikace s rychlým vývojem, kde schema často migruje nebo je různorodé.
## Data model
- **Database** → Collection → Document (JSON/BSON)
- **Document** — pole s klíč-hodnota, vnořené objekty, pole
- **Flexibilní schema** — každý dokument může mít jiná pole (ale nedoporučuje se)
- **ObjectId** — default primary key (12 bytes: 4-byte timestamp + 5-byte random value + 3-byte counter)
## Architektura
```
mongod (jednotlivý node)
├── WiredTiger storage engine (výchozí od 3.2)
│ ├── B-Tree indexy (B-Tree, ne LSM)
│ ├── MVCC (snapshot isolation)
│ ├── Compression (zlib, snappy, zstd)
│ └── Cache (WiredTiger internal cache)
├── Replication (replica set)
│ ├── Primary (všechny zápisy)
│ └── Secondary (replikace, volitelné čtení)
└── Sharding (cluster)
├── mongos (router)
├── Config servers (metadata)
└── Shards (replica sets)
```
### Replica set
- Primární node = všechny zápisy, sekundární = replikace (oplog)
- Automatický failover (election mezi sekundáry)
- Až 50 nodeů v replica setu, max 7 voting nodes
- Read preference: primary (default), primaryPreferred, secondary, secondaryPreferred, nearest
### Sharding
- Shard klíč = rozhodující pro distribuci
- **Range sharding** — blízká data na stejném shardu (good for range queries, risk of hot spots)
- **Hashed sharding** — rovnoměrná distribuce (good for write throughput, bad for range queries)
- **Zoned sharding** — data umístěna podle zón (geo-distribuce, compliance)
## Index types
| Typ | Popis |
|-----|-------|
| **Single field** | Standard B-Tree index |
| **Compound** | Více polí v indexu (order matters) |
| **Multikey** | Index na pole (array) — každá hodnota samostatně |
| **Text** | Full-text search |
| **Geospatial (2d, 2dsphere)** | Geo dotazy (near, within, intersect) |
| **Hashed** | Pro hashed sharding |
| **TTL** | Automatické mazání dokumentů po expiraci |
| **Wildcard** | Index na neznámá/nepravidelná pole |
## Aggregation pipeline
MongoDB pipeline framework pro transformace dat:
```javascript
db.orders.aggregate([
{ $match: { status: "shipped" } },
{ $group: { _id: "$customer_id", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } },
{ $limit: 10 }
])
```
## Doporučení — v čem je MongoDB lepší
| Oblast | MongoDB | Konkurence | Proč MongoDB |
|--------|---------|------------|--------------|
| **Flexibilní schema** | Schema-less, změny bez migrace | PostgreSQL (ALTER TABLE + migration) | Rychlý vývoj, MVP, časté změny modelu |
| **JSON / dokumenty** | Nativní BSON, vnořené objekty | PostgreSQL (jsonb, ale chybí $ operators) | Jednodušší mapování objektů z kódu |
| **Horizontal scaling** | Nativní sharding (mongos + config) | MySQL (Vitess externí) | Vestavěný, jednoduchý na setup |
| **Geo-distribuce** | Zoned sharding, replica set per region | Cassandra (AP model, jiná filozofie) | CP z CAP, konzistence + distribuce |
| **Agregace** | Aggregation pipeline, $lookup (LEFT JOIN) | PostgreSQL (SQL JOINy, výkonnější) | Užitečné pro denormalizovaná data |
| **Rychlost developmentu** | ORM-like (Mongoose), JSON přirozený | SQL (schema first, migrace) | Nejrychlejší time-to-market |
### Kdy použít MongoDB
- **Rychlý vývoj / MVP** — schema evolves frequently, žádné migrace
- **Katalogová data** — produkty s různými atributy (e-commerce, marketplace)
- **Content management** — různorodý obsah (blog, CMS, headless CMS)
- **Real-time analytics** — agregace, dashboardy, event data
- **IoT / senzorová data** — různorodé struktury zpráv
- **Mobilní aplikace** — JSON dokumenty přirozeně mapují API response
### Kdy použít něco jiného
- **Finanční transakce** → PostgreSQL (ACID, referenční integrita)
- **Komplexní reporty / JOINy** → PostgreSQL nebo ClickHouse
- **Vztahová data (friends, follows)** → Neo4j (grafová DB)
- **High-throughput zápisů** → Cassandra (AP model, bez master bottlenecku)
- **Malá data, jeden server** → SQLite (jednodušší, žádný daemon)
## MongoDB licensing
MongoDB změnila licenci v roce 2018 z GNU AGPL v3 na **SSPL** (Server Side Public License):
| Varianta | Licence | Cena | Podmínky |
|----------|---------|------|----------|
| **MongoDB Community** | SSPL | Zdarma | SSPL: pokud nabízíte MongoDB jako managed službu, musíte uvolnit celý stack (vč. orchestrace, monitoringu) jako open source. Interní použití bez omezení |
| **MongoDB Enterprise Advanced** | Komerční | ~$10 000/server/rok (Atlas: pay-per-use) | Enterprise funkce (LDAP, Kerberos, auditing, encryption), support 24/7 |
| **MongoDB Atlas** | Managed | Pay-per-use (~$0.10-5.00/hod dle instance) | Plně managed, multi-cloud, auto-scaling, backup, monitoring |
**Dopad**: SSPL je podobný model jako u Redis — pro self-hosted interní použití bez omezení, cloud poskytovatelé (AWS, Azure) nesmí nabízet MongoDB jako managed službu bez komerční dohody. Alternativa: **FerretDB** (open source proxy kompatibilní s MongoDB wire protokolem).
## Zdroje
Odkazy, knihy a standardy: [sources/databases/sources.md](sources/databases/sources.md)
*Poslední revize: 2026-06-03*

502
MONITORING.en.md Normal file

@@ -0,0 +1,502 @@
# 📊 Monitoring and observability
## OpenMetrics standard
OpenMetrics (CNCF sandbox) is the de-facto standard for metric exposition in cloud-native environments:
- Supports text representation and Protocol Buffers
- Foundation for Prometheus exposition format
- Defines the metric types: counter, gauge, histogram, gaugehistogram, summary, stateset, info, unknown
- `_total` suffix for cumulative values, `_bucket` for histograms
- Metadata: HELP, TYPE, UNIT, (timestamp optional)
The standard is developed within [OpenObservability](https://github.com/OpenObservability/OpenMetrics).
## New tools and trends (2024-2026)
| Tool | Description |
|------|-------------|
| **Grafana Sigil** | AI observability for LLM agents (OTel-native) |
| **InfraLens** | eBPF-based, zero-instrumentation network observability |
| **Ingero** | GPU causal observability (eBPF, CUDA tracing) |
| **GreptimeDB** | Unified observability DB — replaces Prometheus + Loki + ES |
| **Netdata** | AI-powered full-stack monitoring, 800+ integrations, edge ML |
## Three pillars of observability
1. **Logs** — timestamped event records, unstructured or structured (ERROR, WARN, INFO)
2. **Metrics** — numerical data over time (latency, error rate, CPU utilization)
3. **Traces** — request tracking across services (distributed tracing)
## SLI / SLO / SLA
| Term | Meaning | Example |
|------|---------|---------|
| **SLI** (Service Level Indicator) | Measured metric | Latency p99 = 250ms |
| **SLO** (Service Level Objective) | Target value | 99.9 % of requests < 300ms |
| **SLA** (Service Level Agreement) | Legal commitment | 99.95 % uptime |
### Error budget
`Error Budget = 100 % - SLO`
- If SLO is 99.9 %, error budget is 0.1 % of time
- While error budget remains, the team can deploy new features
- When exhausted — freeze on deploys, stability is priority
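The budget is easier to reason about as minutes of allowed downtime. This sketch assumes a 30-day rolling window and a time-based SLO; the function names are hypothetical.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of unavailability allowed by the SLO over the window."""
    if not 0 < slo_percent < 100:
        raise ValueError("SLO must be strictly between 0 and 100 percent")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_remaining(slo_percent: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    return 1 - downtime_minutes / error_budget_minutes(slo_percent, window_days)


print(round(error_budget_minutes(99.9), 1))    # 43.2 minutes per 30 days
print(round(budget_remaining(99.9, 21.6), 2))  # 0.5, half of the budget left
```

A negative result from `budget_remaining` corresponds to the deploy freeze described above.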
## Pyramid of metrics — RED vs USE vs 4 Golden Signals
### 4 Golden Signals (Google SRE)
1. **Latency** — request processing time (distinguish success vs error latency)
2. **Traffic** — number of requests / throughput (RPS, QPS, throughput)
3. **Errors** — explicit errors (5xx, 4xx) and implicit (success with wrong result)
4. **Saturation** — how "full" the service is (CPU, memory, queue depth, connection pool)
### USE (for infrastructure)
- **U**tilization — how busy the resource is (% time active)
- **S**aturation — how much is waiting in queue (run queue, I/O wait)
- **E**rrors — errors (dropped packets, disk errors, OOM)
### RED (for services)
- **R**ate — requests per second
- **E**rrors — number of erroneous requests
- **D**uration — latency (distribution, percentiles)
| Methodology | Focus | Typical metrics |
|-------------|-------|-----------------|
| **4 Golden Signals** | Services + infrastructure | Latency, RPS, errors, saturation |
| **USE** | Infrastructure | CPU util, I/O saturation, disk errors |
| **RED** | Microservices | RPS, error rate, p50/p95/p99 latency |
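As an illustration of the RED row, the three numbers can be derived from raw request records. The sketch makes two assumptions that are not in the text: only 5xx responses count as errors, and percentiles use the nearest-rank method.

```python
import math


def red_summary(requests: list, window_seconds: float) -> dict:
    """requests: list of (status_code, duration_seconds) seen in the window."""
    if not requests or window_seconds <= 0:
        raise ValueError("need at least one request and a positive window")
    durations = sorted(duration for _, duration in requests)
    errors = sum(1 for status, _ in requests if status >= 500)

    def percentile(p: int) -> float:
        # Nearest-rank: the smallest value with at least p percent of samples at or below it.
        rank = max(1, math.ceil(p * len(durations) / 100))
        return durations[rank - 1]

    return {
        "rate_rps": len(requests) / window_seconds,
        "error_ratio": errors / len(requests),
        "p50": percentile(50),
        "p99": percentile(99),
    }


sample = [(200, 0.1)] * 98 + [(500, 0.9), (200, 2.0)]
print(red_summary(sample, window_seconds=10))
```

In production these values normally come from histograms in the metrics backend rather than from raw request lists.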
## PromQL examples
| Expression | Description |
|------------|-------------|
| `rate(http_requests_total[5m])` | Requests per second (average over 5 min) |
| `increase(http_requests_total[1h])` | Total increase over 1 hour |
| `sum by (status) (rate(http_requests_total[5m]))` | Requests aggregated by status code |
| `histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))` | p99 latency |
| `avg_over_time(cpu_usage[1h])` | Average CPU utilization over an hour |
| `topk(5, sum(rate(http_requests_total[5m])) by (service))` | Top 5 services by RPS |
| `max_over_time(memory_usage[24h])` | Max memory usage over 24h |
| `rate(node_network_receive_drop_total[5m]) > 0` | Interfaces dropping received packets |
| `(1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])))` | CPU utilization (1 - idle) |
| `rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])` | Average latency |
| `absent(metric)` | Alert when metric is missing |
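The p99 expression in the table estimates a quantile from cumulative bucket counts by linear interpolation. The following is a simplified sketch of that idea for classic histograms, not the Prometheus source code; the bucket values are made up.

```python
import math


def histogram_quantile(q: float, buckets: list) -> float:
    """buckets: (upper_bound, cumulative_count) pairs sorted by bound, ending with +Inf."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least two buckets, the last one being +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # fall back to the highest finite bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.5, buckets))
print(histogram_quantile(0.95, buckets))
```

The estimate can never be more precise than the bucket boundaries, so bucket layout should follow the latency targets being measured.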
## Recording rules
Pre-aggregation of frequently used PromQL queries to reduce query load.
### When to use
- Complex queries used across multiple dashboards
- Queries over raw data with high cardinality
- Frequently queried aggregations (e.g., p99 latency over last month)
### Example
```yaml
groups:
  - name: service_rules
    interval: 1m
    rules:
      - record: job:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job)
      - record: instance:cpu:utilization
        expr: (1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance))
      - record: service:http_latency:p99
        expr: histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service))
```
- **record** — new metric name (convention: `level:metric:aggregation`)
- **interval** — how often the rule evaluates (typically 1-5 min)
## Tools
### Metrics
| Tool | Description |
|------|-------------|
| Prometheus | Pull-based, time-series DB, powerful query language (PromQL) |
| Grafana | Visualization, dashboards, alerting |
| Zabbix | Enterprise monitoring, agent + agentless (SNMP/IPMI/JMX), auto-discovery, trigger-based alerting |
| Datadog | SaaS, APM, logs, metrics in one |
| New Relic | APM, browser monitoring |
| CloudWatch | AWS native |
| Azure Monitor | Azure native |
| Google Cloud Ops | GCP native |
### Logging
| Tool | Description |
|------|-------------|
| ELK Stack | Elasticsearch, Logstash, Kibana |
| Loki | Grafana Loki — lightweight, Prometheus-like |
| Splunk | Enterprise log management |
| Fluentd / Fluent Bit | Log collector and forwarder |
| Vector | High-performance log/metric collector |
### Tracing
| Tool | Description |
|------|-------------|
| Jaeger | Open-source distributed tracing |
| Zipkin | Open-source distributed tracing |
| OpenTelemetry | Standard for instrumentation (logs, metrics, traces) |
| Datadog APM | SaaS tracing |
| AWS X-Ray | AWS tracing |
## OpenTelemetry detail
### Span attributes
```yaml
resource:
  attributes:
    - service.name: "payment-service"
    - service.version: "1.2.3"
    - deployment.environment: "production"
scope:
  name: "io.opentelemetry.payment"
spans:
  - name: "processPayment"
    kind: SPAN_KIND_INTERNAL
    attributes:
      - payment.method: "credit_card"
      - payment.amount: 2499
      - payment.currency: "CZK"
    events:
      - name: "authorization.complete"
        timestamp: 1717428000000000000
```
### Context propagation (W3C TraceContext)
- **`traceparent`** — header carrying trace-id, span-id, trace flags
- Format: `00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01`
- Version (00) | Trace-ID (32 hex) | Span-ID (16 hex) | TraceFlags (01 = sampled)
- **`tracestate`** — vendor-specific key-value data that travels alongside `traceparent` and stays interoperable across providers
- Propagation happens via HTTP headers, gRPC metadata, message queue properties
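The header format above is simple enough to parse and regenerate by hand. The sketch below handles version `00` only and uses hypothetical function names; real services would normally rely on an OpenTelemetry SDK propagator instead.

```python
import re
import secrets
from typing import Optional

_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header: str) -> Optional[dict]:
    """Return the header fields, or None when the header is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if not match:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(parent: dict) -> str:
    """Header for an outgoing call: same trace-id, fresh span-id, same sampled flag."""
    span_id = secrets.token_hex(8)  # 8 random bytes = 16 hex characters
    flags = "01" if parent["sampled"] else "00"
    return f"00-{parent['trace_id']}-{span_id}-{flags}"


incoming = parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
print(incoming)
print(child_traceparent(incoming))
```

Keeping the trace-id and replacing only the span-id is what lets a backend stitch spans from different services into one trace.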
### Sampling
| Type | Description | Use case |
|------|-------------|----------|
| **Head-based** | Sampling decision at trace start (based on ID) | Simple, deterministic |
| **Tail-based** | Decision after trace completion (based on result, latency) | Better sampling, more complex |
- Tail-based sampling: often used for critical traces (5xx, p99+, slow traces)
- Tools: Grafana Tempo (tail-based), Jaeger (head-based), OTel Collector (head + tail)
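The difference between the two strategies is when the decision is made and what it can see. The sketch below is an assumption-laden illustration: the span dictionaries, thresholds and function names are hypothetical and do not reflect any specific collector's configuration.

```python
def head_sample(trace_id: str, ratio: float) -> bool:
    """Decide from the trace-id alone, so every service reaches the same verdict."""
    threshold = int(ratio * 2**64)
    return int(trace_id[-16:], 16) < threshold  # low 64 bits of the ID


def tail_sample(spans: list, latency_threshold_s: float = 1.0, baseline_ratio: float = 0.01) -> bool:
    """Decide after the trace is complete: keep errors, slow traces and a small baseline."""
    has_error = any(span["status"] == "error" for span in spans)
    duration = max(s["end"] for s in spans) - min(s["start"] for s in spans)
    if has_error or duration > latency_threshold_s:
        return True
    return head_sample(spans[0]["trace_id"], baseline_ratio)


fast_ok = [{"trace_id": "f" * 32, "status": "ok", "start": 0.0, "end": 0.2}]
slow_ok = [{"trace_id": "f" * 32, "status": "ok", "start": 0.0, "end": 1.5}]
failed = [{"trace_id": "f" * 32, "status": "error", "start": 0.0, "end": 0.1}]
print(tail_sample(fast_ok), tail_sample(slow_ok), tail_sample(failed))
```

Tail-based sampling has to buffer every span of a trace until the decision is made, which is where its extra complexity and memory cost come from.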
## Alerting
### Principles
- **Alert on symptom, not cause** — "500 errors" instead of "high CPU"
- **Reduce noise** — flapping alerts, alert fatigue
- **Runbook for every alert** — what to do when alert fires
- **Alert severity** — P0 (critical), P1 (high), P2 (medium), P3 (low)
### Alertmanager (Prometheus)
```yaml
route:
  receiver: "team-pager"
  group_by: ["alertname", "cluster"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - match:
        severity: critical
      receiver: "team-pager"
      repeat_interval: 1h
    - match:
        severity: warning
      receiver: "team-slack"
receivers:
  - name: "team-pager"
    pagerduty_configs:
      - routing_key: "<KEY>"
        severity: "{{ .CommonLabels.severity }}"
  - name: "team-slack"
    slack_configs:
      - channel: "#alerts"
        title: "{{ .GroupLabels.alertname }}"
```
**Concepts**:
- **Grouping** — grouping alerts by labels (noise reduction, e.g., all down instances in a cluster)
- **Inhibition** — suppression of less severe alerts when a more severe one exists (e.g., nodedown inhibits pod alerts)
- **Silencing** — temporary alert suppression (matching labels + duration)
- **Routing tree** — hierarchical routing by label match (severity, service, team)
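Grouping and inhibition are both plain label matching. The sketch below models them on dictionaries to show the idea; it is a simplification with hypothetical function names, not Alertmanager's implementation.

```python
from collections import defaultdict


def group_alerts(alerts: list, group_by: list) -> dict:
    """Bucket alerts by the values of the group_by labels (one notification per bucket)."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


def _matches(labels: dict, selector: dict) -> bool:
    return all(labels.get(k) == v for k, v in selector.items())


def apply_inhibition(alerts: list, source: dict, target: dict, equal: list) -> list:
    """Drop target alerts when a source alert with the same `equal` labels is firing."""
    sources = [a["labels"] for a in alerts if _matches(a["labels"], source)]
    kept = []
    for alert in alerts:
        labels = alert["labels"]
        muted = _matches(labels, target) and any(
            all(labels.get(k) == s.get(k) for k in equal) for s in sources
        )
        if not muted:
            kept.append(alert)
    return kept


alerts = [
    {"labels": {"alertname": "NodeDown", "severity": "critical", "cluster": "a", "node": "n1"}},
    {"labels": {"alertname": "PodCrash", "severity": "warning", "cluster": "a", "node": "n1"}},
    {"labels": {"alertname": "PodCrash", "severity": "warning", "cluster": "a", "node": "n2"}},
]
print(list(group_alerts(alerts, ["alertname", "cluster"])))
print(len(apply_inhibition(alerts, {"alertname": "NodeDown"}, {"severity": "warning"}, ["node"])))
```

In the example, the pod alert on the failed node is suppressed while the pod alert on the healthy node still goes out.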
### Incident management
- PagerDuty, Opsgenie, OnCall (Grafana)
- Escalation policies
- On-call rotations
## Structured logging
```json
{
  "timestamp": "2026-06-03T10:30:00Z",
  "level": "ERROR",
  "service": "payment-service",
  "trace_id": "abc123",
  "user_id": "u456",
  "message": "Payment gateway timeout",
  "duration_ms": 1200,
  "error": {
    "type": "TimeoutError",
    "message": "Gateway did not respond in 1000ms"
  }
}
```
### Required fields of structured log
| Field | Description | Example |
|-------|-------------|---------|
| `timestamp` | ISO 8601 / RFC 3339 | `2026-06-03T10:30:00Z` |
| `level` | Log level (RFC 5424) | `ERROR`, `WARN`, `INFO`, `DEBUG` |
| `message` | Human-readable message | `Payment processed` |
| `service` | Service name | `payment-service` |
| `trace_id` | Correlation across services | `abc123def456` |
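One way to emit the required fields is a custom formatter on top of Python's standard `logging` module. This is a minimal sketch: the `JsonFormatter` class is hypothetical, and Python names the level `WARNING` rather than `WARN`.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": created.isoformat().replace("+00:00", "Z"),
            "level": record.levelname,
            "service": self.service,
            # trace_id arrives through the `extra` argument of the logging call.
            "trace_id": getattr(record, "trace_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(entry)


logger = logging.getLogger("payment")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter("payment-service"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.error("Payment gateway timeout", extra={"trace_id": "abc123"})
```

Passing `trace_id` through `extra` keeps call sites short; a middleware could inject it automatically with a `logging.Filter`.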
### RFC 5424 log levels
| Number | Level | Usage |
|--------|-------|-------|
| 0 | EMERG | System unusable |
| 1 | ALERT | Immediate action required |
| 2 | CRIT | Critical error |
| 3 | ERROR | Error (non-critical) |
| 4 | WARN | Warning |
| 5 | NOTICE | Normal but significant event |
| 6 | INFO | Informational message |
| 7 | DEBUG | Debugging (disabled in production) |
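In a syslog message the severity number from the table is combined with a facility code into the PRI value at the start of the line, computed as facility × 8 + severity. The helper names in this short sketch are hypothetical.

```python
SEVERITIES = ["EMERG", "ALERT", "CRIT", "ERROR", "WARN", "NOTICE", "INFO", "DEBUG"]


def syslog_pri(facility: int, level: str) -> int:
    """Combine a facility code (0-23) and a severity name into the PRI value."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    return facility * 8 + SEVERITIES.index(level)


def decode_pri(pri: int) -> tuple:
    """Split a PRI value back into (facility, severity name)."""
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


print(syslog_pri(20, "NOTICE"))  # facility local4 with severity NOTICE
print(decode_pri(34))
```

Because lower numbers are more severe, filtering "at least WARN" means keeping severities 0 through 4.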
### Correlation ID (traceparent)
- Generated at system entry (API gateway, frontend, message consumer)
- Propagated in HTTP header `X-Correlation-ID` / `traceparent`
- Enables linking logs across microservices (→ Grafana Explore, Kibana Discover)
- Implementation: middleware in app, service mesh (Envoy), API gateway
## Distributed tracing detail
### Span kinds
| Kind | Description | Example |
|------|-------------|---------|
| **CLIENT** | Calling downstream service (outbound) | HTTP client calling API |
| **SERVER** | Processing incoming request | HTTP handler |
| **INTERNAL** | Local operation within service | Computation, transformation |
| **PRODUCER** | Sending message to queue | Kafka producer |
| **CONSUMER** | Receiving message from queue | Kafka consumer |
### Trace context chain
```
Trace: abc123
├── Span: /checkout (SERVER, root)
│ ├── Span: validateCart (INTERNAL)
│ ├── Span: POST /orders (CLIENT → payment-service)
│ │ └── Span: /processPayment (SERVER)
│ │ ├── Span: authorizeCard (INTERNAL)
│ │ └── Span: chargeCard (CLIENT → bank-gateway)
│ │ └── Span: /charge (SERVER, external)
│ └── Span: sendConfirmation (PRODUCER → kafka)
│ └── Span: consumeConfirmation (CONSUMER → email-service)
```
- **W3C TraceContext** — standardized cross-service tracing
- **Baggage** — transport of contextual data (tenant, user role) between spans
## Grafana
### Provisioning dashboards as code
```yaml
apiVersion: 1
providers:
  - name: "default"
    orgId: 1
    folder: "Services"
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards
```
Dashboards JSON in git → CI/CD → automatic import into Grafana.
### Variables
- **Query variable** — dynamic values (e.g., list of service names from PromQL: `label_values(up, service)`)
- **Interval variable** — selectable time spans (e.g. `1m`, `5m`, `1h`, auto) for range selectors; the built-in `$__interval` and `$__rate_interval` adapt to the dashboard time range
- **Custom variable** — manual list of values (env: prod, staging, dev)
- **Chained variable** — dependent variable (select namespace → show pods in namespace)
### Annotations
- Drawing events in graphs (deploys, incidents, config changes)
- Sources: Prometheus alerts, Loki logs, GitHub Actions, custom API
- Use case: "Deploy at 14:30 → spike in latency at 14:31 → correlation"
## On-call best practices
### Escalation policies
```
Level 1: Primary on-call (response within 5 min)
└── timeout 15 min
Level 2: Secondary / senior engineer (response within 15 min)
└── timeout 15 min
Level 3: Engineering manager / incident commander
```
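The escalation chain above can be expressed as a small lookup. This is a sketch under the assumption that each timeout starts when the previous level was paged; the names `ESCALATION_TIMEOUTS` and `escalation_level` are hypothetical.

```python
# (level, minutes without acknowledgement before the next level is paged)
ESCALATION_TIMEOUTS = [(1, 15), (2, 15)]
FINAL_LEVEL = 3  # engineering manager / incident commander


def escalation_level(minutes_unacknowledged: float) -> int:
    """Return which level should currently be paged for an unacknowledged alert."""
    remaining = minutes_unacknowledged
    for level, timeout in ESCALATION_TIMEOUTS:
        if remaining < timeout:
            return level
        remaining -= timeout  # this level timed out, move on to the next one
    return FINAL_LEVEL


for minutes in (0, 20, 45):
    print(minutes, "min ->", "level", escalation_level(minutes))
```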
### Incident severity matrix
| Severity | Description | Response | Communication |
|----------|-------------|----------|---------------|
| **P0 (Critical)** | Service completely unavailable, data loss, security breach | Immediate, 24/7 | Status page + Stakeholder update |
| **P1 (High)** | Major functionality degraded, part of users affected | Within 15 min | Slack channel + Team lead |
| **P2 (Medium)** | Non-critical feature broken, workaround exists | Within 1 h | Slack channel |
| **P3 (Low)** | Cosmetic issue, no user impact | Next business day | Jira ticket |
### Postmortem
- **Blameless** — goal is to learn, not blame
- **Structure**: Timeline, detection, root cause, resolution, action items
- **SRE principle**: every incident → postmortem → systemic improvement
- **Tools**: Jira, Incident.io, PagerDuty postmortem, Google Docs
## Logging patterns
### Best practices
- **Dashboard for each level** — executive, service, troubleshooting
- **Synthetic monitoring** — heartbeat checks, browser tests (Playwright, Cypress)
- **APM** — Application Performance Monitoring (database queries, external calls)
- **Anomaly detection** — ML-based outlier detection
- **Retention policy** — raw data short term, aggregations long term
- **Unified log format** — JSON, structured data
## Recommended literature
### Classic books
| Book | Authors | ISBN | Key topics |
|------|---------|------|------------|
| **Site Reliability Engineering** | Beyer, Jones, Petoff, Murphy | 978-1491929124 | How Google runs production systems — SRE principles, error budgets, toil, SLI/SLO |
| **The Site Reliability Workbook** | Beyer, Murphy, Rensin, Kawahara, Thorne | 978-1492029502 | Practical companion to SRE — case studies from Evernote, Home Depot, NY Times; SLO implementation, monitoring, on-call |
| **Observability Engineering** | Majors, Fong-Jones, Miranda | 978-1492076445 | First comprehensive book on observability — structured events, iterative hypothesis verification, core analysis loop; 2nd edition in 2026 (32 new chapters on AI, cost governance) |
### Cloud and monitoring
| Book | Author | ISBN/Year | Topics |
|------|--------|-----------|--------|
| **Cloud Observability in Action** | Michael Hausenblas | Manning, 2023 | Practical guide to observability in cloud-native environments — signal types (logs, metrics, traces, profiles), OTel Collector, SLOs, signal correlation, developer observability; open-source tools |
| **Mastering Prometheus** | William Hegedus | 978-1-80512-566-2 | Advanced Prometheus techniques — TSDB internals, custom service discovery, cardinality, remote storage (VictoriaMetrics, Mimir), SLO-based alerting; author is SRE manager at Akamai and Prometheus/Thanos contributor |
| **Observability with Grafana** | Chapman, Holmes | 978-1-80324-964-3 | Complete guide to LGTM stack (Loki, Grafana, Tempo, Mimir) — OTel instrumentation, LogQL/PromQL/TraceQL, AI/ML alerting, real user monitoring with Faro, Pyroscope profiling, k6 load testing |
| **Hands-On Monitoring and Alerting with Prometheus** | Muhammad Badawy | 978-9349887565 | Practical Prometheus guide — installation, configuration, service discovery, labeling, PromQL, Alertmanager, monitoring Linux, Windows, Docker, databases |
### AI and observability
| Book | Authors | ISBN/Year | Topics |
|------|---------|-----------|--------|
| **Observability in the AI-Native Era** | Lipsig, Grabner, Rati | 978-1-80638-959-9 | Connecting observability with AIOps — ML-based anomaly detection, root-cause analysis, self-healing systems, OTel + Prometheus + Grafana + Dynatrace/Datadog, compliance |
| **Open Source Observability** | Corless, Pawar | O'Reilly, 2025 | Report on disaggregated, modular observability stacks — flexibility, cost efficiency, data autonomy, blueprint for custom solutions from open-source components |
## Detailed tool overview
Extended information on tools from the table above:
### Grafana Sigil
AI observability product from Grafana Labs. OpenTelemetry-native SDK for instrumenting LLM agents:
- **Repository**: `github.com/grafana/sigil-sdk` (Go SDK) + `sigil-app` (Grafana plugin)
- **Features**: tracking conversations, generation, tool usage, cost tracking, quality evaluation
- **Growing problem**: 500M+ conversations, 5M+ agents in production (GrafanaCON 2026)
- **Integration**: automatic connection with Prometheus (metrics), Tempo (traces), AI Observability API
### InfraLens
Zero-instrumentation Kubernetes observability built on eBPF:
- **Repository**: `github.com/Herenn/Infralens` (Apache 2.0, Go)
- **Features**: automatic detection of service-to-service communication, topology visualization, AI-powered documentation
- **Architecture**: eBPF agent + Go backend + React frontend
- **Status**: early-stage (1 star, 10 commits), but eBPF-based observability concept is proven (Grafana Beyla, Cilium Hubble, Pixie)
### Ingero
GPU causal observability agent — first of its kind:
- **Repository**: `github.com/ingero-io/ingero` (Apache 2.0)
- **Features**: eBPF tracing from Linux kernel events through CUDA API to Python source code
- **Overhead**: < 2 %, zero code changes, single binary
- **MCP server**: native Model Context Protocol support — AI assistants can directly query GPU data
- **Use case**: diagnosis of GPU stalls, scheduler preemptions, CUDA memory spikes — causal chains instead of plain metrics
- **Version**: v0.19.0 (2026), active development
### GreptimeDB
Unified observability database — one backend for metrics, logs and traces:
- **Repository**: `github.com/GreptimeTeam/greptimedb` (Apache 2.0, Rust)
- **Architecture**: compute-storage disaggregation, object storage first (S3, GCS, Azure Blob), columnar storage
- **Querying**: SQL + PromQL in a single query, JOIN between metrics and logs possible
- **Drop-in replacement**: Prometheus (PromQL, remote write), Loki (Push API), Elasticsearch (bulk API), Jaeger (Query API)
- **Cost reduction**: up to 50× lower costs compared to traditional solutions
- **Roadmap 2026**: v1.0 GA (Q1 2026), v1.1 to v1.3 (Vector Index, AI Functions, Auto Rollup, adaptive resource management)
- **GreptimeDB Enterprise**: enhanced security, HA, enterprise support
### Netdata
Open-source, real-time monitoring platform for entire infrastructure:
- **Repository**: `github.com/netdata/netdata` (GPLv3+, C; 79k★)
- **Features**: per-second metrics, ML-based anomaly detection, AI-powered troubleshooting, 800+ integrations
- **Zero configuration**: auto-discovery, pre-configured alerts, ready dashboards
- **Architecture**: distributed agent → Netdata Cloud (optional), data stays local
- **Energy efficiency**: according to University of Amsterdam study, the most efficient tool for monitoring Docker systems
- **Netdata Cloud**: free tier (5 nodes), paid from $12/node/month
- **Licensing**: agent GPLv3+, dashboard NCUL1, cloud closed-source
## OpenStack Monitoring
OpenStack provides several services for telemetry and monitoring:
### Ceilometer (Telemetry)
- Metric collection (CPU, memory, network, storage) from compute, network and storage nodes
- Publishing to Gnocchi (time-series DB) or Panko (event storage)
- Notifications via oslo.messaging (RabbitMQ) — pipeline transformations
- Alarming: Aodh — threshold-based alarms, metric combinations
### Monasca
- More modern alternative to Ceilometer (primarily developed for telco use cases)
- Architecture: Monasca API → Log API → Transform → Threshold Engine → Notifier
- Backend: InfluxDB/Gnocchi, Kafka, Elasticsearch
- Supports alerting, notifications, graph dashboards
### Prometheus + OpenStack Exporter
- OpenStack-exporter for Prometheus (exports metrics from Ceilometer / API)
- Service discovery via Prometheus
- Grafana dashboards for visualization
### Masakari (VM High Availability)
- Detection and automatic recovery of VMs on hypervisor failure (host failure)
- Evacuation of instances to healthy compute node
- Integration with Pacemaker for cluster management
## Sources
Links, books and standards: [sources/monitoring/sources.md](sources/monitoring/sources.md)
*Last revision: 2026-06-03*

502
MONITORING.md Normal file

@@ -0,0 +1,502 @@
# 📊 Monitoring and observability
## OpenMetrics standard
OpenMetrics (CNCF sandbox) is the de facto standard for exposing metrics in cloud-native environments:
- Supports both a text representation and Protocol Buffers
- Builds on the Prometheus exposition format
- Specifies: counter, gauge, histogram, summary, gaugehistogram, stateset
- `_total` suffix for cumulative values, `_bucket` for histograms
- Metadata: HELP, TYPE, UNIT (timestamp optional)
The standard is developed under [OpenObservability](https://github.com/OpenObservability/OpenMetrics).
## New tools and trends (2024-2026)
| Tool | Description |
|------|-------------|
| **Grafana Sigil** | AI observability for LLM agents (OTel-native) |
| **InfraLens** | eBPF-based, zero-instrumentation network observability |
| **Ingero** | GPU causal observability (eBPF, CUDA tracing) |
| **GreptimeDB** | Unified observability DB — replaces Prometheus + Loki + ES |
| **Netdata** | AI-powered full-stack monitoring, 800+ integrations, edge ML |
## The three pillars of observability
1. **Logs** — event records, often unstructured (ERROR, WARN, INFO)
2. **Metrics** — numeric data over time (latency, error rate, CPU utilization)
3. **Traces** — following a request across services (distributed tracing)
## SLI / SLO / SLA
| Term | Meaning | Example |
|------|---------|---------|
| **SLI** (Service Level Indicator) | Measured metric | p99 latency = 250 ms |
| **SLO** (Service Level Objective) | Target value | 99.9 % of requests < 300 ms |
| **SLA** (Service Level Agreement) | Contractual commitment | 99.95 % uptime |
### Error budget
`Error Budget = 100 % - SLO`
- If the SLO is 99.9 %, the error budget is 0.1 % of the time
- While error budget remains, the team can deploy new features
- After it is exhausted — deploy freeze; stability is the priority
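The formula can be turned into minutes of allowed downtime. The sketch below assumes a 30-day rolling window and a purely time-based SLO; both function names are hypothetical.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for a time-based SLO over the window."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    budget_fraction = (100.0 - slo_percent) / 100.0  # Error Budget = 100 % - SLO
    return budget_fraction * window_days * 24 * 60


def budget_remaining(slo_percent: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Minutes of budget left; a negative value means the budget is exhausted."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes


print(round(error_budget_minutes(99.9), 1), "minutes of budget per 30 days")
print("deploy freeze:", budget_remaining(99.9, downtime_minutes=50) < 0)
```

With a 99.9 % SLO the 30-day budget is roughly 43 minutes, so a single 50-minute outage would already trigger the freeze described above.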
## Pyramid of metrics — RED vs USE vs 4 Golden Signals
### 4 Golden Signals (Google SRE)
1. **Latency** — time to process a request (distinguish the latency of successful and failed requests)
2. **Traffic** — number of requests / throughput (RPS, QPS)
3. **Errors** — explicit errors (5xx, 4xx) and implicit ones (success with a wrong result)
4. **Saturation** — how "full" the service is (CPU, memory, queue depth, connection pool)
### USE (for infrastructure)
- **U**tilization — how busy the resource is (% of time it is active)
- **S**aturation — how much work is waiting in a queue (run queue, I/O wait)
- **E**rrors — errors (dropped packets, disk errors, OOM)
### RED (for services)
- **R**ate — requests per second
- **E**rrors — number of failed requests
- **D**uration — latency (distribution, percentiles)
| Methodology | Focus | Typical metrics |
|-------------|-------|-----------------|
| **4 Golden Signals** | Services + infrastructure | Latency, RPS, errors, saturation |
| **USE** | Infrastructure | CPU utilization, I/O saturation, disk errors |
| **RED** | Microservices | RPS, error rate, p50/p95/p99 latency |
## PromQL examples
| Expression | Description |
|------------|-------------|
| `rate(http_requests_total[5m])` | Requests per second (averaged over 5 min) |
| `increase(http_requests_total[1h])` | Total increase over 1 hour |
| `sum by (status) (rate(http_requests_total[5m]))` | Requests aggregated by status code |
| `histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))` | p99 latency |
| `avg_over_time(cpu_usage[1h])` | Average CPU usage over an hour |
| `topk(5, sum(rate(http_requests_total[5m])) by (service))` | Top 5 services by RPS |
| `max_over_time(memory_usage[24h])` | Maximum memory usage over 24 h |
| `rate(node_network_receive_drop_total[5m]) > 0` | Network interfaces dropping received packets |
| `(1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])))` | CPU utilization (1 - idle) |
| `rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])` | Average latency |
| `absent(metric)` | Alert when the metric is missing |
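To make the `histogram_quantile` row less opaque, here is a simplified Python sketch of the linear interpolation used for classic (bucketed) histograms. It assumes cumulative bucket counts sorted by upper bound, a final `+Inf` bucket, and a lower bound of 0 for the first bucket; it is an illustration, not the Prometheus source.

```python
import math


def histogram_quantile(q: float, buckets: list) -> float:
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    `buckets` must be sorted by upper bound and end with the +Inf bucket,
    mirroring the `le` label of a *_bucket series.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total  # how many observations lie at or below the quantile
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls into +Inf: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.99, latency_buckets))
```

Because of the even-spread assumption, the result is only as precise as the bucket boundaries, which is why bucket layout matters for p99 alerts.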
## Recording rules
Pre-aggregation of frequently used PromQL queries to reduce query-time load.
### When to use
- Complex queries used on multiple dashboards
- Queries over raw data with high cardinality
- Frequently queried aggregations (e.g., p99 latency over the last month)
### Example
```yaml
groups:
  - name: service_rules
    interval: 1m
    rules:
      - record: job:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job)
      - record: instance:cpu:utilization
        expr: (1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance))
      - record: service:http_latency:p99
        expr: histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service))
```
- **record** — name of the new metric (convention: `level:metric:aggregation`)
- **interval** — how often the rule is evaluated (typically 1-5 min)
## Metrics — tools
### Metrics
| Tool | Description |
|------|-------------|
| Prometheus | Pull-based, time-series DB, powerful query language (PromQL) |
| Grafana | Visualization, dashboards, alerting |
| Zabbix | Enterprise monitoring, agent + agentless (SNMP/IPMI/JMX), auto-discovery, trigger-based alerting |
| Datadog | SaaS; APM, logs and metrics in one |
| New Relic | APM, browser monitoring |
| CloudWatch | AWS native |
| Azure Monitor | Azure native |
| Google Cloud Ops | GCP native |
### Logging
| Tool | Description |
|------|-------------|
| ELK Stack | Elasticsearch, Logstash, Kibana |
| Loki | Grafana Loki — lightweight, Prometheus-like |
| Splunk | Enterprise log management |
| Fluentd / Fluent Bit | Log collector and forwarder |
| Vector | High-performance log/metric collector |
### Tracing
| Tool | Description |
|------|-------------|
| Jaeger | Open-source distributed tracing |
| Zipkin | Open-source distributed tracing |
| OpenTelemetry | Standard for instrumentation (logs, metrics, traces) |
| Datadog APM | SaaS tracing |
| AWS X-Ray | AWS tracing |
## OpenTelemetry detail
### Span attributes
```yaml
resource:
  attributes:
    - service.name: "payment-service"
    - service.version: "1.2.3"
    - deployment.environment: "production"
scope:
  name: "io.opentelemetry.payment"
spans:
  - name: "processPayment"
    kind: SPAN_KIND_INTERNAL
    attributes:
      - payment.method: "credit_card"
      - payment.amount: 2499
      - payment.currency: "CZK"
    events:
      - name: "authorization.complete"
        timestamp: 1717428000000000000
```
### Context propagation (W3C TraceContext)
- **`traceparent`** — header carrying the trace-id, span-id and trace flags
- Format: `00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01`
- Version (00) | Trace-ID (32 hex) | Span-ID (16 hex) | TraceFlags (01 = sampled)
- **`tracestate`** — vendor-specific data, compatible across providers
- Propagation happens via HTTP headers, gRPC metadata, message queue properties
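The header layout above is simple enough to parse by hand. The following sketch handles only the version `00` layout shown in the list; the `TraceParent` type and `parse_traceparent` function are hypothetical names, and a production service would normally rely on an OpenTelemetry propagator instead.

```python
import re
from typing import NamedTuple


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    span_id: str
    sampled: bool


# version (2 hex) - trace-id (32 hex) - span-id (16 hex) - flags (2 hex)
_PATTERN = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(value: str) -> TraceParent:
    """Split a `traceparent` header value into its four fields."""
    match = _PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, span_id, flags = match.groups()
    # All-zero IDs are defined as invalid by the W3C specification.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        raise ValueError("trace-id and span-id must not be all zeros")
    sampled = bool(int(flags, 16) & 0x01)  # lowest flag bit = sampled
    return TraceParent(version, trace_id, span_id, sampled)


tp = parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
print(tp.trace_id, tp.span_id, tp.sampled)
```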
### Sampling
| Type | Description | Use case |
|------|-------------|----------|
| **Head-based** | Sampling decision made at the start of the trace (based on the ID) | Simple, deterministic |
| **Tail-based** | Decision made after the trace completes (based on outcome, latency) | Better-quality sample, more complex |
- Tail-based sampling: often used for critical traces (5xx, p99+, slow traces)
- Tools: Grafana Tempo (tail-based), Jaeger (head-based), OTel Collector (head + tail)
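A head-based decision "based on the ID" can be sketched as a deterministic threshold on part of the trace ID. This is an illustration of the idea rather than the exact algorithm of any particular SDK; `head_sample` is a hypothetical name and the trace ID is assumed to be the 32-character hex form.

```python
def head_sample(trace_id: str, ratio: float) -> bool:
    """Decide at trace start whether to keep the trace.

    Every service that sees the same trace ID reaches the same decision,
    so a trace is either recorded everywhere or nowhere.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    low_64_bits = int(trace_id[-16:], 16)  # treat the low 8 bytes as a number
    return low_64_bits < ratio * 2**64


trace_id = "0af7651916cd43dd8448eb211c80319c"
print(head_sample(trace_id, 0.10), head_sample(trace_id, 1.0))
```

A tail-based sampler cannot be written this way, because it needs the finished trace (status codes, total latency) before it can decide.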
## Alerting
### Principles
- **Alert on symptoms, not causes** — "500 errors" rather than "high CPU"
- **Reduce noise** — flapping alerts, alert fatigue
- **Runbook for every alert** — what to do when the alert fires
- **Alert severity** — P0 (critical), P1 (high), P2 (medium), P3 (low)
### Alertmanager (Prometheus)
```yaml
route:
  receiver: "team-pager"
  group_by: ["alertname", "cluster"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - match:
        severity: critical
      receiver: "team-pager"
      repeat_interval: 1h
    - match:
        severity: warning
      receiver: "team-slack"
receivers:
  - name: "team-pager"
    pagerduty_configs:
      - routing_key: "<KEY>"
        severity: "{{ .CommonLabels.severity }}"
  - name: "team-slack"
    slack_configs:
      - channel: "#alerts"
        title: "{{ .GroupLabels.alertname }}"
```
**Concepts**:
- **Grouping** — grouping alerts by labels (reduces noise, e.g. all down instances in a cluster)
- **Inhibition** — suppressing less severe alerts while a more severe one is firing (e.g. a node-down alert inhibits pod alerts)
- **Silencing** — temporarily muting an alert (matching labels + duration)
- **Routing tree** — hierarchical routing by label match (severity, service, team)
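Grouping is the easiest of these concepts to picture in code. The sketch below mimics what `group_by: ["alertname", "cluster"]` does conceptually; the alert dictionaries and the `group_alerts` helper are assumptions for illustration and not Alertmanager's internal data model.

```python
from collections import defaultdict


def group_alerts(alerts: list, group_by: list) -> dict:
    """Bucket alerts by the values of the `group_by` labels."""
    groups = defaultdict(list)
    for alert in alerts:
        # A missing label is treated as an empty value.
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


alerts = [
    {"labels": {"alertname": "InstanceDown", "cluster": "eu-1", "instance": "a"}},
    {"labels": {"alertname": "InstanceDown", "cluster": "eu-1", "instance": "b"}},
    {"labels": {"alertname": "InstanceDown", "cluster": "us-1", "instance": "c"}},
]

for key, members in group_alerts(alerts, ["alertname", "cluster"]).items():
    print(key, "->", len(members), "alert(s) in one notification")
```

Two down instances in the same cluster end up in a single notification, which is the noise reduction the bullet above refers to.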
### ESM (Event / Incident Management)
- PagerDuty, Opsgenie, OnCall (Grafana)
- Escalation policies
- On-call rotations
## Structured logging
```json
{
  "timestamp": "2026-06-03T10:30:00Z",
  "level": "ERROR",
  "service": "payment-service",
  "trace_id": "abc123",
  "user_id": "u456",
  "message": "Payment gateway timeout",
  "duration_ms": 1200,
  "error": {
    "type": "TimeoutError",
    "message": "Gateway did not respond in 1000ms"
  }
}
```
### Required fields of a structured log
| Field | Description | Example |
|-------|-------------|---------|
| `timestamp` | ISO 8601 / RFC 3339 | `2026-06-03T10:30:00Z` |
| `level` | Log level (RFC 5424) | `ERROR`, `WARN`, `INFO`, `DEBUG` |
| `message` | Human-readable message | `Payment processed` |
| `service` | Service name | `payment-service` |
| `trace_id` | Correlation across services | `abc123def456` |
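One way to emit exactly these five fields is a custom formatter for Python's standard `logging` module. This is a minimal sketch: the `JsonFormatter` class is hypothetical, the service name is passed in by hand, and `trace_id` is expected to arrive through the `extra` argument of the logging call.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object with the required fields."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            # RFC 3339 timestamp in UTC
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            "service": self.service,
            # Set via logger.error(..., extra={"trace_id": ...}); None if absent.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter(service="payment-service"))
logger = logging.getLogger("payment")
logger.addHandler(handler)
logger.error("Payment gateway timeout", extra={"trace_id": "abc123"})
```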
### RFC 5424 log levels
| Number | Level | Usage |
|--------|-------|-------|
| 0 | EMERG | System unusable |
| 1 | ALERT | Immediate action required |
| 2 | CRIT | Critical error |
| 3 | ERROR | Error (non-critical) |
| 4 | WARN | Warning |
| 5 | NOTICE | Normal but significant event |
| 6 | INFO | Informational message |
| 7 | DEBUG | Debugging (disabled in production) |
### Correlation ID (traceparent)
- Generated at system entry (API gateway, frontend, message consumer)
- Propagated in HTTP header `X-Correlation-ID` / `traceparent`
- Enables linking logs across microservices (→ Grafana Explore, Kibana Discover)
- Implementation: middleware in app, service mesh (Envoy), API gateway
## Distributed tracing detail
### Span kinds
| Kind | Description | Example |
|------|-------------|---------|
| **CLIENT** | Calling downstream service (outbound) | HTTP client calling API |
| **SERVER** | Processing incoming request | HTTP handler |
| **INTERNAL** | Local operation within service | Computation, transformation |
| **PRODUCER** | Sending message to queue | Kafka producer |
| **CONSUMER** | Receiving message from queue | Kafka consumer |
### Trace context chain
```
Trace: abc123
├── Span: /checkout (SERVER, root)
│ ├── Span: validateCart (INTERNAL)
│ ├── Span: POST /orders (CLIENT → payment-service)
│ │ └── Span: /processPayment (SERVER)
│ │ ├── Span: authorizeCard (INTERNAL)
│ │ └── Span: chargeCard (CLIENT → bank-gateway)
│ │ └── Span: /charge (SERVER, external)
│ └── Span: sendConfirmation (PRODUCER → kafka)
│ └── Span: consumeConfirmation (CONSUMER → email-service)
```
- **W3C TraceContext** — standardized cross-service tracing
- **Baggage** — transport of contextual data (tenant, user role) between spans
## Grafana
### Provisioning dashboards as code
```yaml
apiVersion: 1
providers:
  - name: "default"
    orgId: 1
    folder: "Services"
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards
```
Dashboard JSON files in git → CI/CD → automatic import into Grafana.
### Variables
- **Query variable** — dynamic values (e.g., list of service names from PromQL: `label_values(up, service)`)
- **Interval variable** — `$__auto_interval`, `$__interval` for variable time range
- **Custom variable** — manual list of values (env: prod, staging, dev)
- **Chained variable** — dependent variable (select namespace → show pods in namespace)
### Annotations
- Drawing events in graphs (deploys, incidents, config changes)
- Sources: Prometheus alerts, Loki logs, GitHub Actions, custom API
- Use case: "Deploy at 14:30 → spike in latency at 14:31 → correlation"
## On-call best practices
### Escalation policies
```
Level 1: Primary on-call (response within 5 min)
└── timeout 15 min
Level 2: Secondary / senior engineer (response within 15 min)
└── timeout 15 min
Level 3: Engineering manager / incident commander
```
### Incident severity matrix
| Severity | Description | Response | Communication |
|----------|-------------|----------|---------------|
| **P0 (Critical)** | Service completely unavailable, data loss, security breach | Immediate, 24/7 | Status page + Stakeholder update |
| **P1 (High)** | Major functionality degraded, part of users affected | Within 15 min | Slack channel + Team lead |
| **P2 (Medium)** | Non-critical feature broken, workaround exists | Within 1 h | Slack channel |
| **P3 (Low)** | Cosmetic issue, no user impact | Next business day | Jira ticket |
### Postmortem
- **Blameless** — goal is to learn, not blame
- **Structure**: Timeline, detection, root cause, resolution, action items
- **SRE principle**: every incident → postmortem → systemic improvement
- **Tools**: Jira, Incident.io, PagerDuty postmortem, Google Docs
## Logging patterns
### Best practices
- **Dashboard for each level** — executive, service, troubleshooting
- **Synthetic monitoring** — heartbeat checks, browser tests (Playwright, Cypress)
- **APM** — Application Performance Monitoring (database queries, external calls)
- **Anomaly detection** — ML-based outlier detection
- **Retention policy** — raw data short term, aggregations long term
- **Unified log format** — JSON, structured data
## Recommended literature
### Classic books
| Book | Authors | ISBN | Key topics |
|------|---------|------|------------|
| **Site Reliability Engineering** | Beyer, Jones, Petoff, Murphy | 978-1491929124 | How Google runs production systems — SRE principles, error budgets, toil, SLI/SLO |
| **The Site Reliability Workbook** | Beyer, Murphy, Rensin, Kawahara, Thorne | 978-1492029502 | Practical companion to SRE — case studies from Evernote, Home Depot, NY Times; SLO implementation, monitoring, on-call |
| **Observability Engineering** | Majors, Fong-Jones, Miranda | 978-1492076445 | First comprehensive book on observability — structured events, iterative hypothesis verification, core analysis loop; 2nd edition in 2026 (32 new chapters on AI, cost governance) |
### Cloud and monitoring
| Book | Author | ISBN/Year | Topics |
|------|--------|-----------|--------|
| **Cloud Observability in Action** | Michael Hausenblas | Manning, 2023 | Practical guide to observability in cloud-native environments — signal types (logs, metrics, traces, profiles), OTel Collector, SLOs, signal correlation, developer observability; open-source tools |
| **Mastering Prometheus** | William Hegedus | 978-1-80512-566-2 | Advanced Prometheus techniques — TSDB internals, custom service discovery, cardinality, remote storage (VictoriaMetrics, Mimir), SLO-based alerting; author is SRE manager at Akamai and Prometheus/Thanos contributor |
| **Observability with Grafana** | Chapman, Holmes | 978-1-80324-964-3 | Complete guide to LGTM stack (Loki, Grafana, Tempo, Mimir) — OTel instrumentation, LogQL/PromQL/TraceQL, AI/ML alerting, real user monitoring with Faro, Pyroscope profiling, k6 load testing |
| **Hands-On Monitoring and Alerting with Prometheus** | Muhammad Badawy | 978-9349887565 | Practical Prometheus guide — installation, configuration, service discovery, labeling, PromQL, Alertmanager, monitoring Linux, Windows, Docker, databases |
### AI and observability
| Book | Authors | ISBN/Year | Topics |
|------|---------|-----------|--------|
| **Observability in the AI-Native Era** | Lipsig, Grabner, Rati | 978-1-80638-959-9 | Connecting observability with AIOps — ML-based anomaly detection, root-cause analysis, self-healing systems, OTel + Prometheus + Grafana + Dynatrace/Datadog, compliance |
| **Open Source Observability** | Corless, Pawar | O'Reilly, 2025 | Report on disaggregated, modular observability stacks — flexibility, cost efficiency, data autonomy, blueprint for custom solutions from open-source components |
## Detailed tool overview
Extended information on tools from the table above:
### Grafana Sigil
AI observability product from Grafana Labs. OpenTelemetry-native SDK for instrumenting LLM agents:
- **Repository**: `github.com/grafana/sigil-sdk` (Go SDK) + `sigil-app` (Grafana plugin)
- **Features**: tracking conversations, generation, tool usage, cost tracking, quality evaluation
- **Growing problem**: 500M+ conversations, 5M+ agents in production (GrafanaCON 2026)
- **Integration**: automatic connection with Prometheus (metrics), Tempo (traces), AI Observability API
### InfraLens
Zero-instrumentation Kubernetes observability built on eBPF:
- **Repository**: `github.com/Herenn/Infralens` (Apache 2.0, Go)
- **Features**: automatic detection of service-to-service communication, topology visualization, AI-powered documentation
- **Architecture**: eBPF agent + Go backend + React frontend
- **Status**: early-stage (1 star, 10 commits), but eBPF-based observability concept is proven (Grafana Beyla, Cilium Hubble, Pixie)
### Ingero
GPU causal observability agent — first of its kind:
- **Repository**: `github.com/ingero-io/ingero` (Apache 2.0)
- **Features**: eBPF tracing from Linux kernel events through CUDA API to Python source code
- **Overhead**: < 2 %, zero code changes, single binary
- **MCP server**: native Model Context Protocol support — AI assistants can directly query GPU data
- **Use case**: diagnosis of GPU stalls, scheduler preemptions, CUDA memory spikes — causal chains instead of plain metrics
- **Version**: v0.19.0 (2026), active development
### GreptimeDB
Unified observability database — one backend for metrics, logs and traces:
- **Repository**: `github.com/GreptimeTeam/greptimedb` (Apache 2.0, Rust)
- **Architecture**: compute-storage disaggregation, object storage first (S3, GCS, Azure Blob), columnar storage
- **Querying**: SQL + PromQL in a single query, JOIN between metrics and logs possible
- **Drop-in replacement**: Prometheus (PromQL, remote write), Loki (Push API), Elasticsearch (bulk API), Jaeger (Query API)
- **Cost reduction**: up to 50× lower costs compared to traditional solutions
- **Roadmap 2026**: v1.0 GA (Q1 2026), v1.1 to v1.3 (Vector Index, AI Functions, Auto Rollup, adaptive resource management)
- **GreptimeDB Enterprise**: enhanced security, HA, enterprise support
### Netdata
Open-source, real-time monitoring platform for entire infrastructure:
- **Repository**: `github.com/netdata/netdata` (GPLv3+, C; 79k★)
- **Features**: per-second metrics, ML-based anomaly detection, AI-powered troubleshooting, 800+ integrations
- **Zero configuration**: auto-discovery, pre-configured alerts, ready dashboards
- **Architecture**: distributed agent → Netdata Cloud (optional), data stays local
- **Energy efficiency**: according to University of Amsterdam study, the most efficient tool for monitoring Docker systems
- **Netdata Cloud**: free tier (5 nodes), paid from $12/node/month
- **Licensing**: agent GPLv3+, dashboard NCUL1, cloud closed-source
## OpenStack Monitoring
OpenStack provides several services for telemetry and monitoring:
### Ceilometer (Telemetry)
- Metric collection (CPU, memory, network, storage) from compute, network and storage nodes
- Publishing to Gnocchi (time-series DB) or Panko (event storage)
- Notifications via oslo.messaging (RabbitMQ) — pipeline transformations
- Alarming: Aodh — threshold-based alarms, metric combinations
### Monasca
- More modern alternative to Ceilometer (primarily developed for telco use cases)
- Architecture: Monasca API → Log API → Transform → Threshold Engine → Notifier
- Backend: InfluxDB/Gnocchi, Kafka, Elasticsearch
- Supports alerting, notifications, graph dashboards
### Prometheus + OpenStack Exporter
- OpenStack-exporter for Prometheus (exports metrics from Ceilometer / API)
- Service discovery via Prometheus
- Grafana dashboards for visualization
### Masakari (VM High Availability)
- Detection and automatic recovery of VMs on hypervisor failure (host failure)
- Evacuation of instances to healthy compute node
- Integration with Pacemaker for cluster management
## Sources
Links, books and standards: [sources/monitoring/sources.md](sources/monitoring/sources.md)
*Last revision: 2026-06-03*

142
MYSQL.en.md Normal file

@@ -0,0 +1,142 @@
# 🐬 MySQL & MariaDB
## Overview
MySQL is the most widespread open-source relational database, especially in web environments (LAMP stack). MariaDB is a fork created after Oracle acquired Sun (and with it MySQL); it remains largely compatible with MySQL and adds its own extensions. Default choice for WordPress, Drupal, Magento, and most PHP applications.
## Architecture (server + storage engine)
Based on *High Performance MySQL* (Schwartz, Zaitsev, Tkachenko):
```text
MySQL Server Layer
├── Connection handling (thread-per-connection)
├── Query parser & optimizer
├── Built-in functions
└── Storage Engine API
├── InnoDB (default, MVCC, ACID)
├── MyISAM (legacy, table-level locks)
├── MEMORY (in-memory, HEAP)
└── ... (others)
```
### InnoDB (default engine since MySQL 5.5+)
- **MVCC** — Multi-Version Concurrency Control (snapshot isolation)
- **REPEATABLE READ** (default) — next-key locking prevents phantom reads
- **Clustered index** — primary key = physical data ordering
- **Buffer pool** — cache of data and indexes in RAM (main performance parameter)
- **Doublewrite buffer** — prevents partial page writes
### Schema design tips
- Prefer smaller data types (MEDIUMINT over INT, TIMESTAMP over DATETIME)
- Use NULL carefully (each nullable column makes indexes and comparisons more complex; prefer NOT NULL where possible)
- Use ENUM only for truly small, stable value lists
- JSON columns in MySQL 8+ — useful for flexible schema, but not for joins
### Deferred join pattern
```sql
-- 1. covering index finds PK
-- 2. only then join to full row
SELECT * FROM users
INNER JOIN (
    SELECT id FROM users
    WHERE status = 'active'
    ORDER BY created_at DESC
    LIMIT 100 OFFSET 1000
) AS tmp USING (id);
```
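The shape of the deferred join can be tried out without a MySQL server. The sketch below uses Python's built-in `sqlite3` purely as a stand-in, so it demonstrates that the rewritten query returns the same rows, not the MySQL performance gain; the table contents, the index name and the added outer `ORDER BY` are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, status TEXT, created_at INTEGER)"
)
# Index covering the filter and sort columns used by the inner query.
conn.execute("CREATE INDEX idx_status_created ON users (status, created_at)")
conn.executemany(
    "INSERT INTO users (id, name, status, created_at) VALUES (?, ?, ?, ?)",
    [(i, f"user{i}", "active" if i % 2 else "inactive", 1000 + i) for i in range(1, 21)],
)

# Naive pagination: full rows are read for every skipped OFFSET row.
naive = conn.execute(
    "SELECT id, name FROM users WHERE status = 'active' "
    "ORDER BY created_at DESC LIMIT 3 OFFSET 2"
).fetchall()

# Deferred join: the inner query pages over IDs only, then full rows are joined.
deferred = conn.execute(
    "SELECT u.id, u.name FROM users AS u "
    "INNER JOIN (SELECT id FROM users WHERE status = 'active' "
    "            ORDER BY created_at DESC LIMIT 3 OFFSET 2) AS tmp "
    "ON tmp.id = u.id "
    "ORDER BY u.created_at DESC"  # outer ORDER BY keeps the page order stable
).fetchall()

print(naive == deferred, deferred)
```

The outer `ORDER BY` is worth keeping in real queries as well, since the order of rows coming out of a join is not guaranteed.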
**Join decomposition**: Sometimes it's better to split a JOIN into several simple queries (better cache utilization, fewer locks, scaling across servers).
**IN() optimization**: MySQL sorts values in the IN() list and uses binary search (O(log n)), unlike OR clauses (O(n)).
## MariaDB differences from MySQL
| Feature | MySQL 8.x | MariaDB 11.x |
|---------|-----------|--------------|
| **Storage engine** | InnoDB (default; MyISAM and MEMORY still available) | InnoDB (default; XtraDB was used up to 10.1) + Aria + MyRocks |
| **JSON** | Native JSON type | JSON alias to LONGTEXT + JSON functions |
| **CTE** | WITH (non-recursive + recursive) | WITH (non-recursive + recursive) |
| **Window functions** | Yes (8.0+) | Yes (10.2+) |
| **Sequence** | No (auto_increment only) | Yes (CREATE SEQUENCE) |
| **Thread pooling** | Enterprise only | Built-in |
| **Galera cluster** | No (natively) | Yes (native synchronous clustering) |
## ProxySQL
ProxySQL is an advanced proxy for MySQL with sophisticated routing:
| Feature | Description |
|---------|-------------|
| **Query routing** | Rules for directing queries (read/write split, sharding) |
| **Connection pooling** | Multiplexing thousands of connections into a small pool |
| **Query cache** | Result caching in memory (TTL, size limit) |
| **Query rewriting** | Rewrite SQL queries in transit |
| **Active monitoring** | Backend outage detection, automatic failover |
## Recommendations — where MySQL is better
| Area | MySQL | Competition | Why MySQL |
|------|-------|------------|-----------|
| **Web applications** | De facto standard for WP, Drupal, Magento | PostgreSQL (fewer CMS plugins) | Broadest support in web hosting providers |
| **Read-heavy (SELECT heavy)** | InnoDB buffer pool, covering index, adaptive hash | PostgreSQL (MVCC overhead on reads) | Cache-efficient, fast point lookups |
| **Replication** | Async replication, Group Replication, InnoDB Cluster | PostgreSQL (streaming replication) | Simpler setup, extensive documentation |
| **Ecosystem** | ProxySQL, Orchestrator, Vitess, PlanetScale | PostgreSQL (fewer tools) | Most tooling for cluster management |
| **JSON in MySQL 8+** | JSON data type, Multi-Value Indexes | PostgreSQL (jsonb, GIN) | Comparable, Multi-Value Index unique |
### When to use MySQL / MariaDB
- **CMS / e-commerce** — WordPress and Magento require MySQL/MariaDB; Drupal and Joomla also support PostgreSQL but are most often deployed on MySQL
- **Read-heavy applications** — InnoDB buffer pool efficiently caches frequently read data
- **Simple replication** — Group Replication / InnoDB Cluster for HA
- **MariaDB for Galera cluster** — synchronous multi-master clustering
- **PHP applications** — native PHP MySQL extensions (mysqli, PDO_MySQL)
## MySQL / MariaDB licensing
### MySQL licensing
| Variant | License | Price | Restrictions |
|---------|---------|-------|-------------|
| **MySQL Community (GPL)** | GPL v2 | $0 | If you distribute an application that contains MySQL (e.g., embedded), you must release the entire application under GPL. Web applications (over network) ≠ distribution — GPL does not apply |
| **MySQL Standard (Commercial)** | Commercial (Oracle) | ~$2,000/server/year | No GPL restrictions, production support, MySQL Enterprise Monitor |
| **MySQL Enterprise** | Commercial (Oracle) | ~$5,000/server/year | All above + MySQL Enterprise Backup, Audit, Firewall, Thread Pool, Encryption |
| **MySQL Cluster CGE** | Commercial (Oracle) | ~$10,000/server/year | Distributed multi-master cluster (NDB), telco-grade |
**When GPL matters**: If you embed MySQL into a commercial product (e.g., desktop application with MySQL library). Web applications communicating over TCP/IP are **not** distribution — GPL does not apply.
### MariaDB licensing
| Variant | License | Price | Restrictions |
|---------|---------|-------|-------------|
| **MariaDB Community** | GPL v2 | $0 | Same as MySQL Community — GPL, but without Oracle licensing risks |
| **MariaDB Enterprise** | Business Source License (BSL) | Subscription (~$2-5k/server/year) | Automatically converts to GPL v2 after 3 years. Includes enterprise features (ColumnStore, Spider, Xpand) |
| **MariaDB SkySQL** | Managed (BSL) | Pay-per-use (~$0.10-1.00/hour) | Fully managed DBaaS |
**Key difference from Oracle MySQL**:
- MariaDB is an independent fork, not controlled by Oracle
- BSL model is more liberal — becomes open source after 3 years
- MariaDB does not require commercial license for enterprise features (in MySQL they are enterprise-only)
### When to use something else
- **Complex queries / CTE / window functions** → PostgreSQL (more advanced optimizer)
- **GIS / geospatial data** → PostgreSQL + PostGIS
- **Consistency > speed** → PostgreSQL (SSI serializable)
- **High-throughput writes** → Cassandra (MySQL master bottleneck)
- **Distributed SQL cluster** → CockroachDB, Vitess (MySQL compatible sharding)
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| High Performance MySQL (3rd ed.) | Schwartz, Zaitsev, Tkachenko | 978-1449314286 | Comprehensive guide to MySQL architecture, optimization, and monitoring |
*Last revision: 2026-06-03*

142
MYSQL.md Normal file

@@ -0,0 +1,142 @@
# 🐬 MySQL & MariaDB
## Overview
MySQL is the most widely used open-source relational database, especially in web environments (LAMP stack). MariaDB is a fork created after the Oracle acquisition, compatible with MySQL and extended with its own features. MySQL is the default choice for WordPress, Drupal, Magento, and most PHP applications.
## Architecture (server + storage engine)
Based on *High Performance MySQL* (Schwartz, Zaitsev, Tkachenko):
```text
MySQL Server Layer
├── Connection handling (thread-per-connection)
├── Query parser & optimizer
├── Built-in functions
└── Storage Engine API
    ├── InnoDB (default, MVCC, ACID)
    ├── MyISAM (legacy, table-level locks)
    ├── MEMORY (in-memory, HEAP)
    └── ... (others)
```
### InnoDB (default engine since MySQL 5.5)
- **MVCC** — Multi-Version Concurrency Control (snapshot isolation)
- **REPEATABLE READ** (default) — next-key locking prevents phantom reads
- **Clustered index** — primary key = physical ordering of the data
- **Buffer pool** — cache of data and indexes in RAM (the main performance parameter)
- **Doublewrite buffer** — protection against partial (torn) page writes
### Schema design tips
- Prefer smaller data types where the value range allows (MEDIUMINT instead of INT, TIMESTAMP instead of DATETIME)
- Use NULL carefully (every nullable column makes indexes more complex)
- Use ENUM only for truly small, stable lists of values
- JSON columns in MySQL 8+ — useful for flexible schemas, but not for joins
### Deferred join pattern
```sql
-- 1. the covering index finds the primary keys
-- 2. only then join back to the full rows
SELECT * FROM users
INNER JOIN (
SELECT id FROM users
WHERE status = 'active'
ORDER BY created_at DESC
LIMIT 100 OFFSET 1000
) AS tmp USING (id)
ORDER BY users.created_at DESC;  -- the outer query needs its own ORDER BY to guarantee row order
```
**Join decomposition**: Sometimes it's better to split a JOIN into several simple queries (better cache utilization, fewer locks, scaling across servers).
**IN() optimization**: MySQL sorts values in the IN() list and uses binary search (O(log n)), unlike OR clauses (O(n)).
## MariaDB differences from MySQL
| Feature | MySQL 8.x | MariaDB 11.x |
|---------|-----------|--------------|
| **Storage engine** | InnoDB (default); MyISAM, MEMORY, and others still available | InnoDB (default; replaced XtraDB in 10.2) + Aria + MyRocks |
| **JSON** | Native JSON type | JSON alias to LONGTEXT + JSON functions |
| **CTE** | WITH (non-recursive + recursive) | WITH (non-recursive + recursive) |
| **Window functions** | Yes (8.0+) | Yes (10.2+) |
| **Sequence** | No (auto_increment only) | Yes (CREATE SEQUENCE) |
| **Thread pooling** | Enterprise only | Built-in |
| **Galera cluster** | No (natively) | Yes (native synchronous clustering) |
## ProxySQL
ProxySQL is an advanced proxy for MySQL with sophisticated routing:
| Feature | Description |
|---------|-------------|
| **Query routing** | Rules for directing queries (read/write split, sharding) |
| **Connection pooling** | Multiplexing thousands of connections into a small pool |
| **Query cache** | Result caching in memory (TTL, size limit) |
| **Query rewriting** | Rewrite SQL queries in transit |
| **Active monitoring** | Backend outage detection, automatic failover |
## Recommendations — where MySQL is better
| Area | MySQL | Competition | Why MySQL |
|------|-------|------------|-----------|
| **Web applications** | De facto standard for WP, Drupal, Magento | PostgreSQL (fewer CMS plugins) | Broadest support in web hosting providers |
| **Read-heavy (SELECT heavy)** | InnoDB buffer pool, covering index, adaptive hash | PostgreSQL (MVCC overhead on reads) | Cache-efficient, fast point lookups |
| **Replication** | Async replication, Group Replication, InnoDB Cluster | PostgreSQL (streaming replication) | Simpler setup, extensive documentation |
| **Ecosystem** | ProxySQL, Orchestrator, Vitess, PlanetScale | PostgreSQL (fewer tools) | Most tooling for cluster management |
| **JSON in MySQL 8+** | JSON data type, Multi-Value Indexes | PostgreSQL (jsonb, GIN) | Comparable, Multi-Value Index unique |
### When to use MySQL / MariaDB
- **CMS / e-commerce** — WordPress and Magento require MySQL/MariaDB; Drupal and Joomla also support PostgreSQL but are most often deployed on MySQL
- **Read-heavy applications** — InnoDB buffer pool efficiently caches frequently read data
- **Simple replication** — Group Replication / InnoDB Cluster for HA
- **MariaDB for Galera cluster** — synchronous multi-master clustering
- **PHP applications** — native PHP MySQL extensions (mysqli, PDO_MySQL)
## MySQL / MariaDB licensing
### MySQL licensing
| Variant | License | Price | Restrictions |
|---------|---------|-------|-------------|
| **MySQL Community (GPL)** | GPL v2 | $0 | If you distribute an application that contains MySQL (e.g., embedded), you must release the entire application under GPL. Web applications (over network) ≠ distribution — GPL does not apply |
| **MySQL Standard (Commercial)** | Commercial (Oracle) | ~$2,000/server/year | No GPL restrictions, production support, MySQL Enterprise Monitor |
| **MySQL Enterprise** | Commercial (Oracle) | ~$5,000/server/year | All above + MySQL Enterprise Backup, Audit, Firewall, Thread Pool, Encryption |
| **MySQL Cluster CGE** | Commercial (Oracle) | ~$10,000/server/year | Distributed multi-master cluster (NDB), telco-grade |
**When GPL matters**: If you embed MySQL into a commercial product (e.g., desktop application with MySQL library). Web applications communicating over TCP/IP are **not** distribution — GPL does not apply.
### MariaDB licensing
| Variant | License | Price | Restrictions |
|---------|---------|-------|-------------|
| **MariaDB Community** | GPL v2 | $0 | Same as MySQL Community — GPL, but without Oracle licensing risks |
| **MariaDB Enterprise** | Business Source License (BSL) | Subscription (~$2-5k/server/year) | Automatically converts to GPL v2 after 3 years. Includes enterprise features (ColumnStore, Spider, Xpand) |
| **MariaDB SkySQL** | Managed (BSL) | Pay-per-use (~$0.10-1.00/hour) | Fully managed DBaaS |
**Key difference from Oracle MySQL**:
- MariaDB is an independent fork, not controlled by Oracle
- BSL model is more liberal — becomes open source after 3 years
- MariaDB does not require commercial license for enterprise features (in MySQL they are enterprise-only)
### When to use something else
- **Complex queries / CTE / window functions** → PostgreSQL (more advanced optimizer)
- **GIS / geospatial data** → PostgreSQL + PostGIS
- **Consistency > speed** → PostgreSQL (SSI serializable)
- **High-throughput writes** → Cassandra (MySQL master bottleneck)
- **Distributed SQL cluster** → CockroachDB, Vitess (MySQL compatible sharding)
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| High Performance MySQL (3rd ed.) | Schwartz, Zaitsev, Tkachenko | 978-1449314286 | Comprehensive guide to MySQL architecture, optimization, and monitoring |
*Last revision: 2026-06-03*

635
NETWORKING.en.md Normal file

@@ -0,0 +1,635 @@
# 🌐 Network Architecture
## Reference Model (TCP/IP)
| Layer | Protocols | Devices |
|-------|-----------|---------|
| Application | HTTP/HTTPS, DNS, SMTP, SSH | — |
| Transport | TCP, UDP | — |
| Network | IP, ICMP, BGP | Routers |
| Link | Ethernet, ARP, VLAN | Switches |
## TCP Detail
### 3-way Handshake
```
Client Server
| |
| SYN (seq=x) |
|──────────────────────────────>|
| |
| SYN+ACK (seq=y, ack=x+1) |
|<──────────────────────────────|
| |
| ACK (seq=x+1, ack=y+1) |
|──────────────────────────────>|
| |
| << established >> |
```
- **SYN** — client sends segment with SYN flag (Synchronize Sequence Number)
- **SYN-ACK** — server responds with its own SYN + acknowledgment of client's seq number
- **ACK** — client acknowledges, connection is established
- TCP Fast Open (TFO) — data in SYN packet for 0-RTT on repeated connections
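
The sequence and acknowledgment numbers in the handshake can be reproduced with a few lines of Python. This is a minimal sketch of the numbering only (no real sockets); `Segment` and `three_way_handshake` are hypothetical names.

```python
from typing import NamedTuple, Optional

class Segment(NamedTuple):
    flags: str
    seq: int
    ack: Optional[int]

def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments for the given initial sequence numbers."""
    modulo = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = Segment("SYN", client_isn, None)
    # The SYN flag consumes one sequence number, hence ack = x + 1.
    syn_ack = Segment("SYN+ACK", server_isn, (client_isn + 1) % modulo)
    ack = Segment("ACK", (client_isn + 1) % modulo, (server_isn + 1) % modulo)
    return [syn, syn_ack, ack]
```

Calling `three_way_handshake(x, y)` yields exactly the `seq`/`ack` pairs shown in the diagram above.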
### Flow Control (Sliding Window)
- **Receiver Window (rwnd)** — how much data the receiver is willing to accept
- **Sliding Window** — sender maintains a window of unacknowledged packets
- Window scaling (RFC 7323, originally RFC 1323) — allows a window of up to 1 GiB (instead of 64 KiB)
- Zero Window — receiver advertises 0, sender stops, persist timer periodically tests
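
The 64 KB and 1 GB limits follow from the 16-bit window field and the maximum scale shift of 14. A short worked calculation, with `effective_window` as a hypothetical helper name:

```python
MAX_UNSCALED_WINDOW = 65_535  # 16-bit window field in the TCP header
MAX_WINDOW_SHIFT = 14         # largest shift count allowed by the window scale option

def effective_window(advertised, shift):
    """Receive window in bytes after applying the negotiated scale factor."""
    if not 0 <= advertised <= MAX_UNSCALED_WINDOW:
        raise ValueError("advertised window must fit in 16 bits")
    if not 0 <= shift <= MAX_WINDOW_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return advertised << shift
```

With the maximum shift, `effective_window(65_535, 14)` gives 1,073,725,440 bytes, which is just under 1 GiB.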
### Congestion Control
| Algorithm | Description | Use Case |
|-----------|-------------|----------|
| **Cubic** | Default in Linux (since kernel 2.6.19), cubic growth function | General networks, default for the Internet |
| **BBR** (Bottleneck Bandwidth and RTT) | Model-based, measures bandwidth and RTT, not packet loss | High-speed networks, YouTube, Google |
| **Reno** | Classic AIMD (Additive Increase Multiplicative Decrease) | Legacy, reference |
| **CDG** (CAIA Delay Gradient) | Delay-based, congestion detection by RTT gradient | Video streaming, real-time |
BBRv2 (2019) and BBRv3 (2023) — add ECN signaling, better coexistence with Cubic, and improved loss handling.
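
The AIMD behavior listed for Reno can be illustrated with a toy simulation. This sketch models congestion avoidance only (no slow start, no timeouts), and the function name and parameters are assumptions chosen for the example.

```python
def aimd(rounds, loss_rounds, initial_cwnd=1.0, increase=1.0, decrease=0.5):
    """Simulate Reno-style congestion avoidance, one step per RTT.

    Returns the congestion window (in segments) at the start of each round.
    """
    cwnd = initial_cwnd
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Multiplicative decrease on loss, never below one segment.
            cwnd = max(1.0, cwnd * decrease)
        else:
            # Additive increase: one segment per RTT.
            cwnd += increase
    return history
```

Plotting the returned list for a run with periodic losses produces the familiar sawtooth pattern.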
## DNS (Domain Name System)
- **Record types**: A, AAAA, CNAME, MX, TXT, NS, SRV, PTR, CAA, DS, DNSKEY, RRSIG, NSEC
- **DNS resolver**: recursive query through hierarchy (root → TLD → authoritative)
- **Anycast DNS** — same IP from multiple locations, routing to the nearest
- **DNS caching** — TTL control, cache poisoning protection (DNSSEC)
- **Cloud DNS** — Route53, Azure DNS, Cloud DNS
### DNS Lookup Flow (Step by Step)
```
1. User enters "api.example.com" in the browser
2. OS stub resolver checks local cache (/etc/hosts, systemd-resolved)
3. If not in cache → query to recursive resolver (ISP / 8.8.8.8 / 1.1.1.1)
4. Resolver checks its cache
5. Not found → resolver walks the hierarchy with iterative queries:
a. Query to root nameserver (.) → returns NS for .com
b. Query to .com TLD nameserver → returns NS for example.com
c. Query to authoritative NS for example.com → returns A record (IP)
6. Resolver stores in cache (TTL), returns IP to client
7. Client establishes TCP connection to the obtained IP
```
The whole process typically takes 10-200 ms (with cache < 1 ms).
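
The lookup flow can be modeled with a toy resolver. Everything here is hypothetical illustration data (server names, the `ROOT` and `SERVERS` tables, the `Resolver` class); real resolvers also handle TTLs, CNAMEs, and failures.

```python
# Hypothetical delegation data: each "server" knows either a referral or an answer.
ROOT = {"com.": "tld-com"}
SERVERS = {
    "tld-com": {"example.com.": "ns-example"},
    "ns-example": {"api.example.com.": "192.0.2.10"},
}

class Resolver:
    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, name):
        """Return the address for `name`, walking root -> TLD -> authoritative on a cache miss."""
        if name in self.cache:
            return self.cache[name]
        labels = name.rstrip(".").split(".")
        tld = labels[-1] + "."
        zone = ".".join(labels[-2:]) + "."
        self.upstream_queries += 1      # a. root refers us to the TLD servers
        tld_server = ROOT[tld]
        self.upstream_queries += 1      # b. TLD refers us to the authoritative server
        auth_server = SERVERS[tld_server][zone]
        self.upstream_queries += 1      # c. authoritative server returns the A record
        address = SERVERS[auth_server][name]
        self.cache[name] = address      # TTL handling is omitted in this sketch
        return address
```

The first lookup costs three upstream queries, while a repeated lookup is answered from the cache, which is why cached responses are so much faster.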
### DNSSEC Detail
- **RRSIG** — digital signature for each RRset (Resource Record Set)
- **DNSKEY** — zone public key (ZSK = Zone Signing Key, KSK = Key Signing Key)
- **DS** (Delegation Signer) — DNSKEY hash passed to the parent zone (chain of trust)
- **NSEC / NSEC3** — authenticated denial of existence (proof that a record does not exist)
- **Chain of trust**: root → .com → example.com (path from trust anchor through DS records)
```
Root DNSKEY (trust anchor) → DS(.com) → .com DNSKEY → DS(example.com) → example.com DNSKEY → example.com RRSIG(A)
```
- Validation: resolver checks signatures across the entire chain up to the trust anchor
### DNS-based Service Discovery
| Mechanism | Description | Example |
|-----------|-------------|---------|
| **SRV record** | Service location (priority, weight, port, target) | `_http._tcp.example.com` |
| **Consul DNS** | Service discovery via DNS interface | `web.service.consul` |
| **CoreDNS** | Kubernetes DNS, plugin-based | `my-svc.my-namespace.svc.cluster.local` |
| **Kubernetes DNS** | Service discovery inside the cluster (kube-dns / CoreDNS) | `svc.cluster.local` |
| **mDNS** (Multicast DNS) | Zero-config, local network (Bonjour/Avahi) | `myprinter.local` |
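
A client consuming SRV records uses the priority and weight fields to choose a target. The sketch below shows a simplified selection (lowest priority first, then a weighted random pick); it is not the exact RFC 2782 algorithm, and `SrvRecord` and `pick_srv_target` are hypothetical names.

```python
import random
from typing import NamedTuple

class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str

def pick_srv_target(records, rng=random):
    """Pick one target: lowest priority first, then weighted random choice."""
    if not records:
        raise ValueError("no SRV records")
    best_priority = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best_priority]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights zero: fall back to a uniform choice.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Records with a higher priority value are only used when no lower-priority target is available; handling that failover is left to the caller in this sketch.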
## Load Balancing
| Type | OSI Layer | Description |
|------|-----------|-------------|
| L4 (NLB) | 4 | TCP/UDP, fast, lower latency |
| L7 (ALB) | 7 | HTTP/HTTPS, path-based routing, sticky sessions |
| Global | DNS | Geo-routing, latency-based, weighted |
### Algorithms
- Round Robin / Weighted RR
- Least Connections
- IP Hash (session persistence)
- Random
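
Three of the algorithms above can be sketched in a few lines each. This is a minimal illustration with hypothetical class and function names; production load balancers add health state, weights, and concurrency control.

```python
import hashlib
import itertools

class RoundRobin:
    """Hand out backends in a fixed rotation."""
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Send each new connection to the backend with the fewest active ones."""
    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        self.active[backend] -= 1

def ip_hash(client_ip, backends):
    """Map a client IP to a stable backend (session persistence)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

Note that a plain modulo hash remaps most clients when the backend list changes; consistent hashing is the usual remedy.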
### Health Check Types
| Type | Description | Suitable For |
|------|-------------|--------------|
| **TCP health check** | TCP handshake to target port | L4 NLB, basic check |
| **HTTP health check** | HTTP GET to URL, expects 200 OK | L7 ALB, web services |
| **HTTPS health check** | HTTP + TLS handshake | Services with TLS termination |
| **gRPC health check** | Calls the standard `grpc.health.v1.Health/Check` RPC and expects status `SERVING` | Microservices, gRPC services |
| **ICMP ping** | Ping to target IP | Basic connectivity |
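
The simplest of these checks, the TCP health check, can be written with the standard library alone. A minimal sketch, with `tcp_health_check` as a hypothetical helper name:

```python
import socket

def tcp_health_check(host, port, timeout=2.0):
    """Return True if a TCP handshake to host:port completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A successful handshake only proves that something is listening on the port; an HTTP or gRPC check is needed to confirm that the application itself is healthy.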
### Connection Draining
- **Connection draining** (AWS) / **Deregistration delay** — when a target is deregistered from the load balancer, the LB stops sending it new requests and waits for existing connections to finish (configurable: 0-3600 s, default 300 s)
- **Slow start** — new target gradually receives more requests (prevents cold cache overload)
### Cross-zone Load Balancing
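- **Enabled**: each LB node distributes traffic across the registered targets in all enabled AZs (even with uneven instance counts per AZ)
- **Disabled**: traffic is split evenly between AZs, and each LB node distributes it only among the targets in its own AZ
- AWS: enabled by default on ALB (no inter-AZ data transfer charge); disabled by default on NLB, where enabling it incurs inter-AZ data transfer charges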
## Firewalls and Security
- **Stateful firewall** — tracks connection state (AWS Security Groups, Azure NSG)
- **Stateless firewall** — ACL (Network ACLs)
- **NGFW** — application layer, IPS/IDS (Palo Alto, Fortinet)
- **WAF** — web application protection (Cloudflare, AWS WAF, Azure WAF)
## Network Segmentation — Security Groups vs Network ACLs
| Property | Security Group (SG) | Network ACL (NACL) |
|----------|---------------------|---------------------|
| **State** | Stateful (automatically allows return traffic) | Stateless (explicit rule required for both directions) |
| **Level** | Instance / ENI | Subnet |
| **Rules** | Allow only | Allow and deny |
| **Evaluation** | All rules evaluated (OR) | Rules from lowest number (first match) |
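| **Default** | All inbound traffic denied, all outbound traffic allowed | Default NACL allows all traffic in both directions; a newly created custom NACL denies all traffic until rules are added |
| **Support** | AWS (Security Groups), GCP (VPC firewall rules), Azure (NSG) | AWS (NACL); GCP firewall rules and Azure NSGs applied at subnet level are the closest equivalents, but they remain stateful |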
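
The difference in rule evaluation can be made concrete with a small model. This sketch matches on destination port only and ignores protocol, CIDR, and direction; `NaclRule`, `nacl_allows`, and `security_group_allows` are hypothetical names.

```python
from typing import NamedTuple

class NaclRule(NamedTuple):
    number: int
    port: int
    action: str  # "allow" or "deny"

def nacl_allows(rules, port):
    """Stateless NACL: the lowest-numbered matching rule decides; implicit deny at the end."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port:
            return rule.action == "allow"
    return False

def security_group_allows(allowed_ports, port):
    """Security group: allow-only rules, any match permits the traffic."""
    return port in allowed_ports
```

In the NACL model a low-numbered deny shadows a later allow for the same port, which cannot happen with allow-only security group rules.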
### Micro-segmentation
- **Zero Trust networking** — each workload has its own security group / NGFW policy
- **Service mesh** — Istio, Linkerd, Consul Connect for L7 micro-segmentation (mTLS, authorization policies)
- **Network policies** — Kubernetes NetworkPolicy for pod-to-pod traffic segmentation
- **Tanzu / NSX** — micro-segmentation at hypervisor level
## VPN
- **Site-to-Site** — IPSec, permanent connection between sites
- **Client-to-Site** — OpenVPN, WireGuard, AnyConnect
- **Cloud VPN** — AWS VPN, Azure VPN Gateway, GCP Cloud VPN
## CDN (Content Delivery Network)
- Edge locations for caching static content
- DDoS protection
- SSL/TLS termination at edge
- Providers: CloudFront, Cloudflare, Akamai, Fastly
## BGP and Routing
- **BGP** — protocol for exchanging routes between AS (Autonomous Systems)
- **ASN** — unique identifier of an Autonomous System (16-bit or 32-bit number)
- **iBGP** — internal BGP (within AS)
- **eBGP** — external BGP (between AS)
### BGP Path Selection Algorithm
BGP router selects a single best path according to the following criteria (in priority order):
1. **WEIGHT** (Cisco-specific) — highest weight (local to router)
2. **LOCAL_PREF** — highest local preference (within AS)
3. **Originate** — prefers route originated by local router
4. **AS_PATH** — shortest AS_PATH length
5. **ORIGIN** — IGP < EGP < INCOMPLETE
6. **MED** (Multi-Exit Discriminator) — lowest MED (with same AS neighbor)
7. **eBGP > iBGP** — prefers external BGP over internal
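8. **IGP metric** — lowest IGP cost to the BGP next hop (a path whose next hop is unreachable is never considered in the first place)
9. **Router ID** — prefers the path from the router with the lowest Router ID (Cisco first prefers the oldest eBGP path)
10. **Neighbor IP** — final tie-breaker: prefers the path from the neighbor with the lowest IP address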
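
The priority ordering maps naturally onto a sort key. The sketch below is simplified: it covers criteria 1 to 7 plus the Router ID tie-break, treats next-hop reachability as a filter, and compares MED across all paths. `Path` and `best_path` are hypothetical names.

```python
from typing import NamedTuple

ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}

class Path(NamedTuple):
    weight: int
    local_pref: int
    locally_originated: bool
    as_path: tuple
    origin: str
    med: int
    is_ebgp: bool
    router_id: str
    next_hop_reachable: bool = True

def _router_id_key(router_id):
    # Compare dotted-quad router IDs numerically, not as strings.
    return tuple(int(octet) for octet in router_id.split("."))

def best_path(paths):
    """Pick the best path among candidates for the same prefix (simplified)."""
    usable = [p for p in paths if p.next_hop_reachable]
    if not usable:
        return None
    return min(
        usable,
        key=lambda p: (
            -p.weight,                    # 1. highest WEIGHT
            -p.local_pref,                # 2. highest LOCAL_PREF
            not p.locally_originated,     # 3. locally originated first
            len(p.as_path),               # 4. shortest AS_PATH
            ORIGIN_RANK[p.origin],        # 5. IGP < EGP < INCOMPLETE
            p.med,                        # 6. lowest MED
            not p.is_ebgp,                # 7. eBGP over iBGP
            _router_id_key(p.router_id),  # tie-break: lowest Router ID
        ),
    )
```

Because tuples compare element by element, a later criterion is consulted only when all earlier ones tie, which mirrors how the decision process works.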
### iBGP Full Mesh vs Route Reflectors
| Aspect | Full Mesh | Route Reflectors |
|--------|-----------|------------------|
| **Number of sessions** | n(n-1)/2 | n-1 (each client peers with one RR) |
| **With 100 routers** | 4,950 sessions | 99 (with 1 RR) |
| **Scaling** | Poor (quadratic) | Linear |
| **Redundancy** | Natural | Requires multi-RR + cluster |
| **Configuration** | Simple logic | RR rules (non-transitive) |
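
The session counts can be checked with a short calculation. The sketch assumes the route reflectors are counted among the routers and are fully meshed with each other; both function names are hypothetical.

```python
def full_mesh_sessions(routers):
    """iBGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2

def route_reflector_sessions(routers, reflectors=1):
    """iBGP sessions when each client peers with every route reflector.

    Reflectors are counted among `routers` and are fully meshed with each other.
    """
    clients = routers - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)
```

Under these assumptions 100 routers need 4,950 sessions in a full mesh, 99 with a single reflector, and 197 with two redundant reflectors.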
BGP knowledge is essential for: cloud interconnects, MPLS L3VPN, SD-WAN, and data center fabrics (VXLAN + BGP EVPN).
## VPC / Virtual Network Architecture
```
Internet ─────┬── Internet Gateway (IGW)
    ┌─────────▼─────────┐
    │   Public Subnet   │
    │  ┌─────────────┐  │
    │  │   ALB/NAT   │  │
    │  └──────┬──────┘  │
    └─────────┼─────────┘
    ┌─────────▼─────────┐
    │  Private Subnet   │
    │  ┌─────────────┐  │
    │  │     App     │  │
    │  └──────┬──────┘  │
    └─────────┼─────────┘
    ┌─────────▼─────────┐
    │    Data Subnet    │
    │  ┌─────────────┐  │
    │  │  Database   │  │
    │  └─────────────┘  │
    └───────────────────┘
```
### VPC Design Patterns
**Three-tier architecture**
- Web tier (public subnets) → ALB
- App tier (private subnets) → auto-scaling
- Data tier (private subnets) → RDS / self-managed DB
- NAT Gateway / Instance in public subnet for outbound traffic from app/data tier
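
One way to plan the address space for such a three-tier layout is to carve equal-sized subnets out of the VPC range, one per tier per availability zone. A minimal sketch using the standard `ipaddress` module; the CIDR values, tier names, and `plan_three_tier` are illustrative assumptions.

```python
import ipaddress

def plan_three_tier(vpc_cidr, az_count, subnet_prefix=24):
    """Allocate one subnet per tier per availability zone from the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = vpc.subnets(new_prefix=subnet_prefix)  # generator of equal-sized subnets
    plan = {}
    for tier in ("public", "app", "data"):
        plan[tier] = [next(subnets) for _ in range(az_count)]
    return plan
```

For `plan_three_tier("10.0.0.0/16", 2)` the public tier receives `10.0.0.0/24` and `10.0.1.0/24`, the app tier the next two subnets, and the data tier the two after that.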
**VPC Peering**
- Direct connection between two VPCs (same or cross-account)
- Transitive peering is **not** supported (A→B, B→C does not imply A→C)
- Use cases: sharing resources (LDAP, monitoring), service endpoints
**Transit Gateway**
- Hub-and-spoke topology, transitive routing
- Supports: VPC, VPN, Direct Connect, peering between TGWs
- Route tables per attachment — environment isolation
- AWS TGW: 50 Gbps per attachment, up to 5000 attachments
**PrivateLink / VPC Endpoint**
- Private access to services without IGW, NAT, VPC peering
- **Interface Endpoint** (ENI in subnet) — for AWS services, SaaS
- **Gateway Endpoint** (S3, DynamoDB) — route table entry, free
- **AWS PrivateLink** — Service Consumer ↔ NLB/ENI ↔ Service Provider
## MTU, Jumbo Frames, PMTUD
| Network | Standard MTU | Jumbo Frames |
|---------|--------------|--------------|
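| Ethernet | 1500 B | 9000 B (AWS: 9001, Azure: 1400→9000) |
| GRE tunnel | 1476 B | — |
| PPPoE | 1492 B | — |
| VLAN (802.1Q) | 1500 B (the 4 B tag enlarges the frame to 1522 B; 1496 B only on equipment that cannot carry the larger frame) | — |
| VXLAN | 1450 B inner over a 1500 B underlay, or a 1550 B underlay for a 1500 B inner MTU (50 B overhead) | 8950 B |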
**PMTUD** (Path MTU Discovery)
- Sets DF (Don't Fragment) bit in IP header
- If path requires fragmentation → ICMP "Fragmentation Needed" (Type 3, Code 4)
- Decreases MTU until packet passes
- Common problem: ICMP blocked by firewall → black hole (TCP connection hangs)
- **Workaround**: MSS clamping (TCP MSS = MTU - 40 for IPv4, MTU - 60 for IPv6)
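
The MSS value used for clamping follows directly from the header sizes. A small worked calculation, assuming headers without options; `tcp_mss` and its parameters are hypothetical names.

```python
IP_HEADER = {4: 20, 6: 40}  # bytes, without options / extension headers
TCP_HEADER = 20             # bytes, without options

def tcp_mss(path_mtu, ip_version=4, tunnel_overhead=0):
    """Largest TCP payload that fits into one packet without fragmentation."""
    return path_mtu - tunnel_overhead - IP_HEADER[ip_version] - TCP_HEADER
```

For a plain 1500 B Ethernet path this gives 1460 B over IPv4 and 1440 B over IPv6; a PPPoE path (1492 B) gives 1452 B, and a 50 B VXLAN overlay on a 1500 B underlay gives 1410 B.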
**Jumbo Frames Use Cases**
- NFS / SMB (NAS)
- iSCSI / NVMe-oF (SAN)
- HPC / MPI workloads
- Data replication (DB, DRBD)
- Amazon EFS, Amazon MSK (Managed Streaming for Apache Kafka)
## Anycast vs Unicast vs Multicast
| Type | Description | Example |
|------|-------------|---------|
| **Unicast** | 1:1 — one source, one destination | Regular TCP/IP traffic |
| **Multicast** | 1:N — one source, group of receivers | IPTV, mDNS, VXLAN BUM traffic |
| **Anycast** | 1:1 from nearest — same IP from multiple locations | DNS (8.8.8.8, 1.1.1.1), Cloudflare |
| **Broadcast** | 1:ALL — all devices on the network | ARP, DHCP (limited to L2 broadcast domain) |
Anycast detail:
- Same IP prefix is announced from multiple locations (BGP)
- Traffic goes to the topologically nearest node (BGP path selection)
- **Advantages**: simple redundancy, DDoS absorption, lower latency
- **Challenges**: connection persistence (TCP), stateful anycast, routing convergence
- **Cloud**: Route53, CloudFront, Cloudflare, Google DNS
## Cloud Networking Resilience (2026)
See also: [CLOUD.md](CLOUD.md) — cloud architecture, multi-AZ, hybrid cloud connectivity.
### Cell-based Architectures
- Isolate fault domain into "cells" (group of AZ + services)
- Each cell independently deployable, own DB, own LB
- Limit blast radius: failure of one cell does not affect others
- Implementation: AWS cell-based architecture, Azure Deployment Stamps pattern
### DNS Resilience
- **Anycast DNS** — same IP from multiple regions, routing to the nearest
- **DNS failover** — health checks automatically remove unavailable endpoints
- **Multi-DNS provider** — Route53 + Cloudflare + UltraDNS to eliminate SPOF
### Traffic Engineering
- **BGP optimization** — AS path prepend, MED, local pref for controlling inbound/outbound traffic
- **Global Load Balancing** — GSLB at DNS level (latency-based, geo-proximity, weighted)
- **AIOps** — ML-based traffic pattern prediction and automatic scaling
### New Trends
- **Path Aware Networking** — applications choose the network path based on current conditions
- **Segment Routing (SR-MPLS / SRv6)** — MPLS simplification, programmable paths
- **Zero Trust Networking** — micro-segmentation, identity-based access, never trust / always verify
## Advanced Topics from Books
### TCP/IP Illustrated (Stevens, ISBN 978-0321336316)
Key architectural principles according to the book:
- **End-to-End Argument** — correctness and completeness of communication can only be ensured at the application level, not at lower layers. The network should be "dumb", end stations "smart".
- **Fate Sharing** — all state necessary to maintain active communication must be stored at the endpoints, not in the network.
- **Layering** — hierarchical layering of protocols per the OSI model; each layer encapsulates the PDU from the higher layer and adds its own header.
- **Multiplexing/Demultiplexing** — protocols at the same layer coexist thanks to identifiers (IP proto, TCP/UDP port).
- **Sliding Window** — efficient link utilization under high latency (window size = bandwidth × RTT).
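
The relation window size = bandwidth × RTT (the bandwidth-delay product) can be checked numerically. A minimal sketch with hypothetical function names; bandwidth is in bits per second, the window in bytes.

```python
def bandwidth_delay_product(bandwidth_bps, rtt_seconds):
    """Bytes that must be in flight to keep the link full."""
    return bandwidth_bps * rtt_seconds / 8

def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput for a given window and RTT."""
    return window_bytes * 8 / rtt_seconds
```

A 1 Gbit/s path with 100 ms RTT needs about 12.5 MB in flight, whereas an unscaled 64 KB window caps the same path at roughly 5.2 Mbit/s.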
The book covers the entire TCP/IP stack from the link layer (Ethernet, ARP, PPP) through IP, ICMP, DHCP, NAT, DNS, UDP, TCP (connection management, timeout, retransmission, congestion control, keepalive) to applications (SNMP, Telnet, FTP, SMTP, NFS, HTTP).
### AI Data Center Network Design (Subramaniam, ISBN 978-0-13-543628-8)
Comprehensive, vendor-agnostic guide to designing network infrastructure for AI clusters.
**Key Concepts:**
- **Rail-Optimized Design (ROD)** — connecting GPUs across racks along "rails", each rail forms an independent network for all-reduce communication. Minimizes latency for synchronous training.
- **Rail-Unified Design (RUD)** — shared network fabric for all GPUs, more flexible but higher demands on load balancing.
- **RoCEv2 (RDMA over Converged Ethernet)** — primary transport for AI clusters: requires lossless fabric (ECN, PFC, DCQCN, SFC, CSIG).
- **Load Balancing for AI** — ECMP is insufficient, requires dynamic/global load balancing (DLB/GLB), flowlet-based rebalancing, per-packet spraying.
- **Topologies** — Clos (3-stage/5-stage), Dragonfly, Torus for scaling to tens of thousands of GPUs.
- **Ultra Ethernet Consortium (UEC)** — new standard for Ethernet in AI clusters (2025+), addresses RoCEv2 limitations.
- **Storage for AI** — NVMe-oF, GPUDirect Storage, parallel file systems for checkpointing and dataset loading.
- **KPIs** — Job Completion Time (JCT), tail latency, fabric utilization, PFC storm detection.
### Cloud Networking and Resilience (Critelli, ISBN 979-8868824357)
Practical guide to building resilient cloud networks (Apress, 2026). The author is EMEA Lead for Networking & Resilience at AWS.
**Layered Approach to Resilience (per OSI model):**
| Layer | Measures |
|-------|----------|
| L1 (Physical) | Redundant connections, diverse fibre paths, DWDM |
| L2 (Link) | MLAG, LACP, spanning-tree fast convergence |
| L3 (Network) | BGP multi-homing, AS path prepend, Anycast |
| L4 (Transport) | Connection draining, slow start, health checks |
| L7 (Application) | DNS failover, global load balancing, cell-based architectures |
**Regulatory Frameworks:** DORA (Digital Operational Resilience Act), NIS2 — require regular resilience testing, chaos engineering, business continuity plans.
**AIOps in Resilience:** ML-based traffic pattern prediction, automatic scaling, proactive fault detection (transition from reactive monitoring to predictive prevention).
### Zero Trust in Resilient Cloud and Network Architectures (Halley et al., ISBN 978-0-13-820460-0)
Cisco Press — practical guide for deploying Zero Trust architectures in hybrid and cloud environments.
**Implementation Framework:**
- **User and Device Trust** — verification of both user and device identity before granting access (SSE — Security Service Edge).
- **Application Access Policies** — granular rules at the application level, not IP addresses.
- **Greenfield vs Brownfield** — new networks built as Zero Trust from the ground up vs. migration of existing infrastructure.
- **Automation** — Terraform, Ansible for provisioning; Meraki, EVPN, Pub/Sub telemetry.
- **Industrial Zero Trust** — extending the concept to OT/ICS environments.
- **Quantum Security** — preparation for post-quantum cryptography in network architectures.
### The Segmentation Blueprint (Kulkarni, ISBN 978-0-13-546236-2)
Cisco Press (2026) — pragmatic guide to network segmentation from VLAN to nanosegmentation.
**Evolution of Segmentation:**
| Generation | Technology | Scope |
|------------|------------|-------|
| Traditional | VLAN, ACL, firewall | Subnet |
| Micro-segmentation | Security Groups, Network Policies | Workload / instance |
| Nanosegmentation | Service mesh (Istio, Linkerd), mTLS | Application / API / process |
**Segmentation Maturity Model:**
1. **Initial** — flat network, no segmentation
2. **Basic** — VLANs, firewall between environments
3. **Defined** — Security Groups, service access policies
4. **Managed** — Micro-segmentation, Network Policies, EVPN
5. **Optimized** — Nanosegmentation, service mesh, Zero Trust, AI-driven policy management
**Key Metric:** Blast radius — how many workloads are compromised when one node is breached. Goal is reduction to a minimum.
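
Blast radius can be estimated as the set of workloads reachable from a breached node over the flows that policy permits. The sketch below uses a breadth-first search over a hypothetical flow map; `blast_radius` and the workload names are assumptions for illustration.

```python
from collections import deque

def blast_radius(allowed_flows, breached):
    """Workloads reachable from a breached workload over permitted flows (BFS)."""
    seen = {breached}
    queue = deque([breached])
    while queue:
        node = queue.popleft()
        for peer in allowed_flows.get(node, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen
```

Comparing the result for a flat network (every workload may reach every other) with a segmented policy shows how much each segmentation step shrinks the set.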
### Segment Routing for SP and Enterprise Networks (Deragisch, ISBN 978-0-13-823101-9)
Cisco Press (2024) — comprehensive guide to Segment Routing for both MPLS and IPv6 data plane.
**SR-MPLS vs SRv6:**
| Property | SR-MPLS | SRv6 |
|----------|---------|------|
| SID length | 20 bit (MPLS label) | 128 bit (IPv6 address) |
| Data plane | MPLS | IPv6 + SRH (Segment Routing Header) |
| Signaling | IGP (IS-IS/OSPF) extensions | IGP + BGP extensions |
| Maturity | Mature, widely deployed | Emerging, standardization complete |
| Use case | SP networks, MPLS migration | Cloud, DC, 5G, end-to-end programmability |
**Advantages of SR over classic MPLS:**
- Elimination of LDP/RSVP-TE (signaling is part of IGP)
- Traffic engineering state moved from nodes to packet headers (source routing)
- Fast reroute (FRR) without additional protocols
- Egress Peer Engineering (EPE) — selection of AS exit point
- Micro-loop avoidance during convergence
**Migration Strategies:** Greenfield (new network), Brownfield (gradual migration from MPLS), "SR in a box" — combination of SR and LDP.
### Understanding and Designing Azure Networking (Stuart, Moreno, 2025)
Practical guide to designing Azure networks by two Microsoft Solution Engineers (former CCIEs).
**Key Topics:**
| Area | Key Services and Concepts |
|------|---------------------------|
| **Topologies** | Hub-and-spoke, Virtual WAN, multi-hub designs, Azure Route Server |
| **Hybrid Connectivity** | ExpressRoute, VPN Gateway, SD-WAN integration |
| **Multi-cloud** | Azure ↔ AWS/GCP, cross-cloud fabrics |
| **Security** | NSG, Azure Firewall, DDoS Protection, WAF, AVNM, ZTNA |
| **DNS & PaaS** | Private Link, Private DNS Zones, Private Resolver, hybrid DNS forwarding |
| **Application Delivery** | Azure Load Balancer, App Gateway, Front Door, Traffic Manager |
| **Monitoring** | Network Watcher, Traffic Analytics, Azure Monitor, Policy-as-code |
**Design Decision Framework:** Gather requirements → analyze constraints → select topology → implement → monitor.
### Mastering Next-Gen Juniper Data Centers (Chatterjee, ISBN 978-0-13-533636-6)
Addison-Wesley (2026) — hands-on guide to EVPN VXLAN fabrics on Juniper devices.
**Key Architectures:**
- **EVPN VXLAN fabric** — multi-tenant overlay networks with BGP EVPN control plane and VXLAN data plane.
- **Multivendor interoperability** — detailed procedures for EVPN across Juniper, Cisco NX-OS, Arista EOS.
- **Multicast in EVPN VXLAN** — intra-subnet and inter-subnet multicast design (IGMP/MLD proxying, PIM, EVPN Type-6/7 routes).
- **Day-2 operations** — Juniper Apstra for automation (Terraform provider), telemetry (gNMI, OpenConfig).
- **Service chaining** — connecting NGFW, load balancers within the fabric.
- **DCI with EVPN** — Over-the-Top (OTT) and Integrated Interconnect (VXLAN stitching, MPLS transit).
**Evolution from previous book (Deploying Juniper Data Centers with EVPN VXLAN, 2024):** Expansion with advanced topics — multicast, interoperability, Apstra Day-2, observability stack.
### Intelligent Cloud Networking: AI-Driven Resource Management (Yadav, ISBN 9364220110)
Intersection of AI/ML and cloud network management (Addition Publishing, 2026).
**AI Applications in Network Management:**
| Area | Technique | Benefit |
|------|-----------|---------|
| **Flow prediction** | LSTM, Transformer | Traffic pattern prediction, proactive scaling |
| **Flow classification** | CNN, RL | Traffic type identification for QoS |
| **Load balancing** | DRL (Deep RL) | Dynamic load distribution, congestion reduction |
| **Resource management** | Q-learning, DQN | Optimization of CPU/memory/network allocation |
| **Routing optimization** | DRL, GNN | Adaptive routing based on current conditions |
| **Congestion control** | ML-based CC | Predictive congestion control (instead of reacting to loss) |
| **Anomaly detection** | Autoencoders, Isolation Forest | Real-time attack and anomaly detection |
| **Blockchain security** | Smart contracts | Decentralized access control, audit trail |
**Technology Trends:**
- **Ultra Ethernet Consortium (UEC)** — next-generation Ethernet for AI, lossless fabric, telemetry, adaptive routing.
- **Path Aware Networking** — applications choose path based on current conditions (latency, loss, cost).
- **Self-optimizing networks** — closed loop: telemetry → AI analysis → automatic action → feedback.
## OpenStack Networking (Neutron)
OpenStack Neutron is an SDN framework for managing virtual networks in a multi-tenant environment. It supports VLAN, VXLAN, and GRE tunnels, security groups, QoS, and LBaaS (Octavia).
### Backends
| Backend | Description | Suitable For |
|---------|-------------|--------------|
| **OVN (Open Virtual Network)** | Native OpenFlow/OVSDB backend; replaces OVS+agent architecture | Production, scalable deployments |
| **OVS (Open vSwitch)** | Classic agent-based backend (neutron-openvswitch-agent) | Small deployments, legacy |
| **Linux Bridge** | Simple backend without OVS | Development, testing, embedded |
| **Hyper-V** | Windows Server backend | Hybrid environments |
### Important Concepts
- **Networks, Subnets, Ports** — basic network objects
- **Routers** — L3 forwarding between tenant networks (DVR for distributed routing)
- **Security Groups** — stateful firewall rules at the port level
- **Floating IPs** — public IPs mapped to instances (1:1 NAT)
- **LBaaS / Octavia** — load balancing as a service (HAProxy, amphora)
- **Trunk ports** — VLAN tagging for instances (parent + subports)
### Performance Tuning
- **DPDK** — userspace packet processing, bypass kernel, lower latency
- **SR-IOV** — passthrough VF to instance, minimal hypervisor overhead
- **NUMA pinning** — vCPU/memory/NIC affinity for compute instances
- **Hardware offload** — OVS TC Flower, ASAP²
### Use Cases
- Multi-tenant cloud (public and private)
- Telco/NFVI (DPDK, SR-IOV, low-latency)
- SDN lab / network function virtualization
## Zero Trust Networking
Zero Trust is a "never trust, always verify" security model — no entity is implicitly trusted, regardless of its location in the network.
### Principles (NIST SP 800-207)
1. **All resources are external** — there is no trusted internal network
2. **Least privilege** — access only to necessary resources
3. **Micro-segmentation** — workload isolation at the individual process/container level
4. **Encrypt everything** — TLS/mTLS for all communication
5. **Continuous verification** — every request is authenticated and authorized
6. **Dynamic policies** — rules change based on context (user, device, location, time)
### Implementation Layers
| Layer | Technology | Description |
|-------|------------|-------------|
| **Identity** | OIDC, SAML, LDAP | User and device authentication |
| **Device** | MDM, UEM, device certificates | Device state verification (compliant, patch level) |
| **Network** | Micro-segmentation, firewall, SDN | Traffic isolation between workloads |
| **Application** | mTLS, service mesh, API gateway | Applications enforce mutual authentication |
| **Data** | Encryption at rest + in transit, DLP | Data protection regardless of location |
| **Analytics** | SIEM, UEBA, AI/ML | Real-time anomaly and threat detection |
### Tools and Platforms
| Category | Tools |
|----------|-------|
| **ZTNA (Zero Trust Network Access)** | Cloudflare Access, Zscaler, Netskope, Palo Alto Prisma |
| **Service Mesh** | Istio, Linkerd, Consul Connect, Cilium |
| **Micro-segmentation** | VMware NSX, Illumio, Guardicore, Akamai |
| **BeyondCorp** | Google BeyondCorp (open-source: BeyondCorp Alliance) |
| **Security Service Edge (SSE)** | SWG, CASB, ZTNA in one (Zscaler, Netskope, Cloudflare) |
### Zero Trust in the Data Center
In a private DC, Zero Trust is deployed via:
- **EVPN VXLAN** — overlay network with tenant isolation
- **Network Policies** (Kubernetes) — per-pod firewall rules
- **Cilium** — eBPF-based L3/L7 policy enforcement
- **WireGuard / IPsec** — encryption between nodes
- **HashiCorp Boundary** — identity-based access to servers without bastion host
## Best Practices
- **Network segmentation** — separate environments (dev/staging/prod), tiers (web/app/db)
- **Least privilege access** — security groups allow only necessary traffic
- **Monitoring** — VPC Flow Logs, netflow, sFlow
- **Redundancy** — multi-AZ, multi-region for critical services
- **Encryption in transit** — TLS everywhere, mTLS for service-to-service
- **DDoS protection** — AWS Shield, Azure DDoS Protection, Cloudflare
- **MTU alignment** — consistent MTU across the entire path, check ICMP blocking for PMTUD
- **IP planning** — RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), avoid overlaps for peering
## Resources
Links, books and standards: [sources/networking/sources.md](sources/networking/sources.md)
## TLS Detail
### TLS 1.3 Handshake (1-RTT)
```
Client Server
| |
| ClientHello (key_share, sig_algs) |
|─────────────────────────────────────────>|
| |
| ServerHello + EncryptedExtensions |
| + Certificate + CertificateVerify |
| + Finished |
|<─────────────────────────────────────────|
| |
| Finished |
|─────────────────────────────────────────>|
| |
| << Application Data >> |
```
- **0-RTT** (early data) — on a resumed connection with a PSK, the client sends application data together with its first message
- 0-RTT risk: early data can be replayed, so it should carry only idempotent requests (a side-effect-free HTTP GET is safe; POST requires replay protection)
- Compared to TLS 1.2: obsolete ciphers removed, AEAD required, faster handshake (2 RTT → 1 RTT)
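As a minimal sketch of how a client can insist on the handshake described above, the standard `ssl` module lets a context refuse anything older than TLS 1.3. The helper name `tls13_only_client_context` is an illustrative assumption, and the sketch assumes a Python build linked against an OpenSSL version with TLS 1.3 support.

```python
import ssl


def tls13_only_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything below TLS 1.3."""
    # create_default_context() enables certificate verification and hostname checking.
    ctx = ssl.create_default_context()
    # Raise the protocol floor; the handshake fails if the server offers only TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = tls13_only_client_context()
```

The context can then be passed to `ctx.wrap_socket(sock, server_hostname=...)`. Whether 0-RTT early data is available depends on the TLS library, and this sketch does not use it.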
### Cipher Suites
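| Suite | Key exchange | Auth | Encryption | MAC | Status |
|-------|-------------|------|-----------|-----|--------|
| `TLS_AES_128_GCM_SHA256` | (EC)DHE or PSK (negotiated separately) | Certificate (RSA/ECDSA/EdDSA) or PSK | AES-128-GCM | AEAD | TLS 1.3, mandatory to implement |
| `TLS_AES_256_GCM_SHA384` | (EC)DHE or PSK (negotiated separately) | Certificate (RSA/ECDSA/EdDSA) or PSK | AES-256-GCM | AEAD | TLS 1.3, higher security margin |
| `TLS_CHACHA20_POLY1305_SHA256` | (EC)DHE or PSK (negotiated separately) | Certificate (RSA/ECDSA/EdDSA) or PSK | ChaCha20-Poly1305 | AEAD | TLS 1.3, mobile / no AES-NI |
| `ECDHE-ECDSA-AES128-GCM-SHA256` | ECDHE | ECDSA | AES-128-GCM | AEAD | TLS 1.2 (PFS) |
| `ECDHE-RSA-AES128-GCM-SHA256` | ECDHE | RSA | AES-128-GCM | AEAD | TLS 1.2 (PFS) |
In TLS 1.3 the suite name covers only the AEAD cipher and the hash; key exchange and authentication are negotiated through separate extensions.
PFS (Perfect Forward Secrecy) — compromise of the server's long-term private key does not allow decryption of previously captured traffic, because each session uses ephemeral (EC)DHE keys.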
### Certificate Chain Validation
```
1. Client receives certificate chain from server
2. Validation:
a. Date: certificate is not expired and is valid (notBefore, notAfter)
b. CRL / OCSP: certificate is not revoked (OCSP stapling to reduce latency)
c. Signature chain: each cert's signature in the chain is verified with the issuer's public key
d. Root CA: the last cert in the chain is trusted (in client's trust store)
3. CN / SAN: domain name in the certificate must match the target domain
```
Typical chain: `Leaf Cert → Intermediate CA → Root CA` (self-signed, in trust store).
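The validation steps above can be sketched with a toy model. The `Cert` record, `hostname_matches`, and `validate_chain` are hypothetical names introduced for illustration; the sketch deliberately skips signature verification and revocation (CRL/OCSP), which a real client must perform through a TLS library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cert:
    """Hypothetical, heavily simplified certificate record."""
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime
    san: tuple = ()


def hostname_matches(pattern: str, host: str) -> bool:
    """Exact match, or a single left-most wildcard label (*.example.com)."""
    if pattern.startswith("*."):
        return "." in host and host.split(".", 1)[1] == pattern[2:]
    return pattern == host


def validate_chain(chain, trust_store, host, now) -> bool:
    """chain is ordered leaf -> intermediate(s) -> root. No crypto is checked here."""
    # Each certificate must name the next one in the chain as its issuer.
    for cert, issuer in zip(chain, chain[1:]):
        if cert.issuer != issuer.subject:
            return False
    # Every certificate must be inside its validity window.
    if any(not (c.not_before <= now <= c.not_after) for c in chain):
        return False
    # The last certificate must be a trusted root.
    if chain[-1].subject not in trust_store:
        return False
    # The leaf must cover the requested host name via SAN.
    return any(hostname_matches(p, host) for p in chain[0].san)


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
end = datetime(2027, 1, 1, tzinfo=timezone.utc)
now = datetime(2026, 6, 1, tzinfo=timezone.utc)
root = Cert("Root CA", "Root CA", start, end)
inter = Cert("Intermediate CA", "Root CA", start, end)
leaf = Cert("api.example.com", "Intermediate CA", start, end, san=("*.example.com",))
ok = validate_chain([leaf, inter, root], {"Root CA"}, "api.example.com", now)
```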
*Last revised: 2026-06-03*

635
NETWORKING.md Normal file

@@ -0,0 +1,635 @@
# 🌐 Network Architecture
## Reference Model (TCP/IP)
| Layer | Protocols | Devices |
|-------|-----------|---------|
| Application | HTTP/HTTPS, DNS, SMTP, SSH | — |
| Transport | TCP, UDP | — |
| Network | IP, ICMP, BGP | Routers |
| Link | Ethernet, ARP, VLAN | Switches |
## TCP detail
### 3-way handshake
```
Client Server
| |
| SYN (seq=x) |
|──────────────────────────────>|
| |
| SYN+ACK (seq=y, ack=x+1) |
|<──────────────────────────────|
| |
| ACK (seq=x+1, ack=y+1) |
|──────────────────────────────>|
| |
| << established >> |
```
- **SYN** — the client sends a segment with the SYN (Synchronize Sequence Numbers) flag set
- **SYN-ACK** — the server replies with its own SYN plus an acknowledgment of the client's sequence number
- **ACK** — the client acknowledges, and the connection is established
- TCP Fast Open (TFO) — data carried in the SYN packet, giving 0-RTT on repeated connections
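The sequence and acknowledgment numbers in the diagram can be traced with a small simulation. This is only a sketch of the arithmetic, not a network implementation; `three_way_handshake` and its tuple layout are assumptions made for illustration.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three segments as (sender, flags, seq, ack); ack is None when unset."""
    mod = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = ("client", "SYN", client_isn, None)
    # The SYN consumes one sequence number, so the server acknowledges x + 1.
    syn_ack = ("server", "SYN+ACK", server_isn, (client_isn + 1) % mod)
    # The client acknowledges the server's SYN the same way (y + 1).
    ack = ("client", "ACK", (client_isn + 1) % mod, (server_isn + 1) % mod)
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
```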
### Flow control (sliding window)
- **Receiver Window (rwnd)** — how much data the receiver is willing to accept
- **Sliding window** — the sender maintains a window of unacknowledged data
- Window scaling (RFC 1323, obsoleted by RFC 7323) — allows a window of up to 1 GB (instead of 64 KB)
- Zero Window — the receiver advertises 0, the sender stops, and a persist timer probes periodically
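The 64 KB and 1 GB figures follow from the 16-bit window field and the maximum scale shift of 14. A short worked calculation, with `effective_window` as an illustrative helper name:

```python
MAX_UNSCALED_WINDOW = 2 ** 16 - 1  # the TCP header carries a 16-bit window field
MAX_SHIFT = 14                     # largest shift count the window scale option allows


def effective_window(advertised: int, shift: int) -> int:
    """Window in bytes after applying the negotiated scale factor (a left shift)."""
    if not 0 <= advertised <= MAX_UNSCALED_WINDOW:
        raise ValueError("advertised window must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return advertised << shift


unscaled = effective_window(MAX_UNSCALED_WINDOW, 0)
largest = effective_window(MAX_UNSCALED_WINDOW, MAX_SHIFT)
```

Here `unscaled` is 65,535 bytes and `largest` is 1,073,725,440 bytes, just under 2^30.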
### Congestion control
| Algorithm | Description | Use case |
|-----------|-------------|----------|
| **Cubic** | Default in Linux (since kernel 2.6.19), cubic growth function | General networks, default on the Internet |
| **BBR** (Bottleneck Bandwidth and RTT) | Model-based; measures bandwidth and RTT rather than reacting to packet loss | High-speed networks, YouTube, Google |
| **Reno** | Classic AIMD (Additive Increase Multiplicative Decrease) | Legacy, reference |
| **CDG** (CAIA Delay Gradient) | Delay-based; detects congestion from the RTT gradient | Video streaming, real-time |
BBRv2 adds ECN signaling, coexistence with Cubic, and better handling of loss.
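The AIMD behavior listed for Reno can be illustrated with a short simulation. This is a simplified sketch of congestion avoidance only (no slow start, no fast recovery); the function name `aimd` and its parameters are assumptions.

```python
def aimd(rounds: int, loss_rounds: set, cwnd: float = 1.0,
         increase: float = 1.0, decrease: float = 0.5) -> list:
    """Return the congestion window (in segments) after each round trip."""
    history = []
    for r in range(rounds):
        if r in loss_rounds:
            # Multiplicative decrease: cut the window when loss is detected.
            cwnd = max(1.0, cwnd * decrease)
        else:
            # Additive increase: grow by a fixed amount per round trip.
            cwnd += increase
        history.append(cwnd)
    return history


trace = aimd(rounds=6, loss_rounds={3})
```

The trace shows the familiar sawtooth: the window climbs linearly, halves at the loss, and climbs again.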
## DNS (Domain Name System)
- **Record types**: A, AAAA, CNAME, MX, TXT, NS, SRV, PTR, CAA, DS, DNSKEY, RRSIG, NSEC
- **DNS resolver**: recursive resolution through the hierarchy (root → TLD → authoritative)
- **Anycast DNS** — the same IP announced from multiple locations, routed to the nearest one
- **DNS caching** — controlled by TTL; DNSSEC protects against cache poisoning
- **Cloud DNS** — Route53, Azure DNS, Cloud DNS
### DNS lookup flow (step by step)
```
1. The user enters "api.example.com" in the browser
2. The OS stub resolver checks local sources (/etc/hosts, systemd-resolved cache)
3. If there is no cached answer → query to the recursive resolver (ISP / 8.8.8.8 / 1.1.1.1)
4. The resolver checks its own cache
5. Not found → the resolver walks the hierarchy with iterative queries:
   a. Query to a root nameserver (.) → returns the NS records for .com
   b. Query to the .com TLD nameserver → returns the NS records for example.com
   c. Query to the authoritative NS for example.com → returns the A record (IP)
6. The resolver caches the answer (for its TTL) and returns the IP to the client
7. The client opens a TCP connection to the returned IP
```
The whole process typically takes 10 to 200 ms (under 1 ms when the answer is cached).
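The lookup flow can be sketched as a toy resolver that follows referrals from the root down and caches the result. The `SERVERS` table, the server labels, and the address `192.0.2.10` are hypothetical illustration data; TTL expiry and real DNS messages are left out.

```python
# Hypothetical zone data: each server either refers to a child zone or answers.
SERVERS = {
    "root": {"referral": {"com.": "tld-com"}},
    "tld-com": {"referral": {"example.com.": "ns-example"}},
    "ns-example": {"answers": {"api.example.com.": "192.0.2.10"}},
}


def resolve(name: str, cache: dict):
    """Return (ip, servers_queried). A cache hit queries no server at all."""
    if name in cache:
        return cache[name], []
    server, path = "root", []
    while True:
        path.append(server)
        zone = SERVERS[server]
        if name in zone.get("answers", {}):
            cache[name] = zone["answers"][name]  # TTL handling omitted
            return cache[name], path
        # Follow the referral whose zone is a suffix of the queried name.
        nxt = next((srv for suffix, srv in zone.get("referral", {}).items()
                    if name.endswith("." + suffix)), None)
        if nxt is None:
            raise LookupError(name)
        server = nxt


cache = {}
first = resolve("api.example.com.", cache)
second = resolve("api.example.com.", cache)
```

The first call visits all three servers; the second is answered from the cache, which is why cached lookups are so much faster.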
### DNSSEC detail
- **RRSIG** — digital signature over each RRset (Resource Record Set)
- **DNSKEY** — public key of the zone (ZSK = Zone Signing Key, KSK = Key Signing Key)
- **DS** (Delegation Signer) — hash of a DNSKEY published in the parent zone (chain of trust)
- **NSEC / NSEC3** — authenticated denial of existence (proof that a record does not exist)
- **Chain of trust**: root → .com → example.com (path from the trust anchor through the DS records)
```
Root DNSKEY (trust anchor) → DS for .com (in root zone) → .com DNSKEY → DS for example.com (in .com zone) → example.com DNSKEY → example.com RRSIG(A)
```
- Validation: the resolver checks the signatures along the entire chain up to the trust anchor
### DNS-based service discovery
| Mechanism | Description | Example |
|-----------|-------------|---------|
| **SRV record** | Service location (priority, weight, port, target) | `_http._tcp.example.com` |
| **Consul DNS** | Service discovery through a DNS interface | `web.service.consul` |
| **CoreDNS** | Kubernetes DNS, plugin-based | `my-svc.my-namespace.svc.cluster.local` |
| **Kubernetes DNS** | Service discovery inside the cluster (kube-dns / CoreDNS) | `svc.cluster.local` |
| **mDNS** (Multicast DNS) | Zero-config, local network (Bonjour/Avahi) | `myprinter.local` |
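How a client might use the priority and weight fields of SRV records can be sketched as follows. This is a simplification of the selection rules (lowest priority first, then a weighted random choice); `pick_srv` and the sample records are assumptions for illustration.

```python
import random


def pick_srv(records, rng=random):
    """records: list of (priority, weight, port, target).

    The lowest priority value wins; within that priority a target is
    picked with probability proportional to its weight.
    """
    best = min(r[0] for r in records)
    candidates = [r for r in records if r[0] == best]
    weights = [r[1] for r in candidates]
    if sum(weights) == 0:
        return rng.choice(candidates)  # all weights zero: pick uniformly
    return rng.choices(candidates, weights=weights, k=1)[0]


records = [
    (10, 60, 8080, "web1.example.com."),
    (10, 40, 8080, "web2.example.com."),
    (20, 0, 8080, "backup.example.com."),  # used only if priority 10 is unavailable
]
chosen = pick_srv(records, random.Random(42))
```

Removing failed targets from `records` before calling the function gives the fallback to the higher priority value.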
## Load Balancing
| Type | Layer (OSI) | Description |
|------|-------------|-------------|
| L4 (NLB) | 4 | TCP/UDP, fast, lower latency |
| L7 (ALB) | 7 | HTTP/HTTPS, path-based routing, sticky sessions |
| Global | DNS | Geo-routing, latency-based, weighted |
### Algorithms
- Round Robin / Weighted RR
- Least Connections
- IP Hash (session persistence)
- Random
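Three of the algorithms listed above fit in a few lines each. This is a minimal sketch with hypothetical backend names; production balancers add health state, weights, and connection tracking.

```python
import hashlib
from itertools import cycle


def round_robin(backends):
    """Endless iterator that hands out backends in turn."""
    return cycle(backends)


def least_connections(active: dict) -> str:
    """active maps backend -> current connection count; pick the least loaded."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, backends: list) -> str:
    """The same client IP always maps to the same backend (session persistence)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


backends = ["app-1", "app-2", "app-3"]
rr = round_robin(backends)
first_four = [next(rr) for _ in range(4)]
```

Note that plain modulo hashing remaps most clients when the backend list changes; consistent hashing is the usual remedy and is not shown here.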
### Health check types
| Type | Description | Suitable for |
|------|-------------|--------------|
| **TCP health check** | TCP handshake to the target port | L4 NLB, basic check |
| **HTTP health check** | HTTP GET to a URL, expects 200 OK | L7 ALB, web services |
| **HTTPS health check** | HTTP + TLS handshake | Services with TLS termination |
| **gRPC health check** | `grpc.health.v1.Health/Check` RPC (gRPC-specific) | Microservices, gRPC services |
| **ICMP ping** | Ping to the target IP | Basic connectivity |
### Connection draining
- **Connection draining** (AWS) / **Deregistration delay** — when a target is removed from the ASG/LB, the load balancer waits for existing connections to finish (configurable: 1-3600 s)
- **Slow start** — a new target receives a gradually increasing share of requests (avoids overloading a cold cache)
### Cross-zone load balancing
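- **Enabled**: the LB distributes traffic evenly across all instances in all AZs (even when the number of instances per AZ differs)
- **Disabled**: traffic is split evenly between the AZs, then between the instances within each AZ
- AWS: enabled by default on ALB at no extra charge; disabled by default on NLB, where enabling it incurs inter-AZ data transfer charges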
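The effect of the two modes is easiest to see with numbers. A small sketch, assuming every AZ has at least one instance; `per_instance_share` and the AZ names are illustrative.

```python
def per_instance_share(instances_per_az: dict, cross_zone: bool) -> dict:
    """Return {az: fraction of total traffic each instance in that AZ receives}."""
    total = sum(instances_per_az.values())
    zones = len(instances_per_az)
    if cross_zone:
        # Every instance is an equal target regardless of its AZ.
        return {az: 1 / total for az in instances_per_az}
    # Each AZ first gets an equal slice, which is then split inside the AZ.
    return {az: (1 / zones) / n for az, n in instances_per_az.items()}


fleet = {"az-a": 2, "az-b": 8}
enabled = per_instance_share(fleet, cross_zone=True)
disabled = per_instance_share(fleet, cross_zone=False)
```

With cross-zone disabled, each of the two instances in `az-a` carries 25% of the traffic while each instance in `az-b` carries 6.25%; with it enabled, every instance carries 10%.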
## Firewalls and Security
- **Stateful firewall** — tracks connection state (AWS Security Groups, Azure NSG)
- **Stateless firewall** — ACLs (Network ACLs)
- **NGFW** — application layer, IPS/IDS (Palo Alto, Fortinet)
- **WAF** — web application protection (Cloudflare, AWS WAF, Azure WAF)
## Network segmentation — Security Groups vs Network ACLs
| Property | Security Group (SG) | Network ACL (NACL) |
|----------|---------------------|--------------------|
| **State** | Stateful (return traffic is allowed automatically) | Stateless (explicit rules required for both directions) |
| **Level** | Instance / ENI | Subnet |
| **Rules** | Allow only | Allow and deny |
| **Evaluation** | All rules are evaluated (OR) | Rules evaluated from the lowest number (first match) |
| **Default** | All inbound traffic denied, all outbound traffic allowed | AWS default NACL allows all traffic; a newly created custom NACL denies all inbound and outbound traffic |
| **Support** | AWS, GCP (firewall rules), Azure (NSG) | AWS (NACL), GCP (firewall rules on a subnet), Azure (NSG) |
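The difference in rule evaluation can be sketched with two small functions. The rule tuple layouts are assumptions made for this example, and connection tracking (the stateful part of a security group) is not modeled.

```python
from ipaddress import ip_address, ip_network


def sg_allows(rules, src: str, port: int) -> bool:
    """Security group: allow-only rules (cidr, low_port, high_port); any match permits."""
    return any(ip_address(src) in ip_network(cidr) and lo <= port <= hi
               for cidr, lo, hi in rules)


def nacl_allows(rules, src: str, port: int) -> bool:
    """NACL: rules are (number, action, cidr, low_port, high_port).

    Rules are checked in ascending number order and the first match decides.
    """
    for _, action, cidr, lo, hi in sorted(rules):
        if ip_address(src) in ip_network(cidr) and lo <= port <= hi:
            return action == "allow"
    return False  # implicit final deny


sg = [("10.0.0.0/8", 443, 443)]
nacl = [
    (200, "allow", "0.0.0.0/0", 0, 65535),
    (100, "deny", "10.1.2.0/24", 22, 22),  # lower number, so it is evaluated first
]
```

Because rule 100 is checked before rule 200, SSH from `10.1.2.0/24` is denied even though a broader allow rule exists.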
### Micro-segmentation
- **Zero Trust networking** — every workload has its own security group / NGFW policy
- **Service mesh** — Istio, Linkerd, Consul Connect for L7 micro-segmentation (mTLS, authorization policies)
- **Network policies** — Kubernetes NetworkPolicy for segmenting pod-to-pod traffic
- **Tanzu / NSX** — micro-segmentation at the hypervisor level
## VPN
- **Site-to-Site** — IPsec, a permanent connection between sites
- **Client-to-Site** — OpenVPN, WireGuard, AnyConnect
- **Cloud VPN** — AWS VPN, Azure VPN Gateway, GCP Cloud VPN
## CDN (Content Delivery Network)
- Edge locations for caching static content
- DDoS protection
- SSL/TLS termination at the edge
- Providers: CloudFront, Cloudflare, Akamai, Fastly
## BGP and Routing
- **BGP** — protocol for exchanging routes between ASes (autonomous systems)
- **ASN** — unique identifier of an autonomous system
- **iBGP** — internal BGP (within an AS)
- **eBGP** — external BGP (between ASes)
### BGP path selection algorithm
A BGP router selects a single best path using the following criteria, in priority order. Only paths whose next hop is reachable are considered at all:
1. **WEIGHT** (Cisco-specific) — highest weight (local to the router)
2. **LOCAL_PREF** — highest local preference (within the AS)
3. **Locally originated** — prefers a route originated by the local router
4. **AS_PATH** — shortest AS_PATH length
5. **ORIGIN** — lowest origin type (IGP preferred over EGP, EGP over INCOMPLETE)
6. **MED** (Multi-Exit Discriminator) — lowest MED (compared when the neighboring AS is the same)
7. **eBGP > iBGP** — prefers external BGP over internal
8. **IGP metric to next hop** — prefers the path with the lowest IGP cost to the BGP next hop
9. **Router ID** — prefers the path from the neighbor with the lowest Router ID
10. **Neighbor IP** — prefers the path from the neighbor with the lowest IP address
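The decision process maps naturally onto a sort key. The sketch below follows the commonly documented Cisco order, using IGP metric to the next hop, then router ID, then neighbor address as the final tie-breakers. The `Path` fields are assumptions, MED is compared unconditionally for simplicity, and next-hop reachability is assumed to have been checked beforehand.

```python
from dataclasses import dataclass
from ipaddress import ip_address

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Path:
    """Hypothetical subset of BGP path attributes used for comparison."""
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"
    neighbor_ip: str = "0.0.0.0"


def preference_key(p: Path):
    """Smaller tuple = more preferred; 'highest wins' attributes are negated."""
    return (-p.weight, -p.local_pref, not p.locally_originated, len(p.as_path),
            ORIGIN_RANK[p.origin], p.med, not p.ebgp, p.igp_metric,
            int(ip_address(p.router_id)), int(ip_address(p.neighbor_ip)))


def best_path(paths):
    return min(paths, key=preference_key)


long_but_preferred = Path(local_pref=200, as_path=(64500, 64501, 64502))
short = Path(local_pref=100, as_path=(64500,))
winner = best_path([short, long_but_preferred])
```

`winner` is the path with the higher LOCAL_PREF even though its AS_PATH is longer, because LOCAL_PREF is evaluated earlier.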
### iBGP full mesh vs Route Reflectors
| Aspect | Full mesh | Route reflectors |
|--------|-----------|-----------------|
| **Number of sessions** | n(n-1)/2 | n-1 (each client peers with the RR) |
| **With 100 routers** | 4,950 sessions | 99 (with 1 RR) |
| **Scaling** | Poor (quadratic) | Linear |
| **Redundancy** | Inherent | Requires multiple RRs + cluster IDs |
| **Configuration** | Simple logic | RR reflection rules (client vs non-client) |
BGP knowledge is needed for: cloud interconnects, MPLS L3VPN, SD-WAN, data center fabrics (VXLAN + BGP EVPN)
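The session counts can be checked with a few lines of arithmetic. The sketch counts the route reflectors as part of the n routers and assumes the reflectors are fully meshed among themselves; both function names are illustrative.

```python
def full_mesh_sessions(n: int) -> int:
    """Every router peers with every other router."""
    return n * (n - 1) // 2


def route_reflector_sessions(n: int, reflectors: int = 1) -> int:
    """Each client peers with every RR; the RRs peer with each other."""
    clients = n - reflectors
    return clients * reflectors + reflectors * (reflectors - 1) // 2


mesh = full_mesh_sessions(100)
single_rr = route_reflector_sessions(100, reflectors=1)
redundant_rr = route_reflector_sessions(100, reflectors=2)
```

For 100 routers this gives 4,950 sessions in a full mesh, 99 with a single reflector, and 197 with a redundant pair.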
## VPC / Virtual Network Architecture
```
Internet ──┬── Internet Gateway (IGW)
┌──────▼──────┐
│ Public Subnet │
│ ┌──────────┐ │
│ │ ALB/NAT │ │
│ └────┬─────┘ │
└───────┼────────┘
┌───────▼────────┐
│ Private Subnet │
│ ┌──────────┐ │
│ │ App │ │
│ └────┬─────┘ │
└───────┼─────────┘
┌───────▼─────────┐
│ Data Subnet │
│ ┌────────────┐ │
│ │ Database │ │
│ └────────────┘ │
└──────────────────┘
```
### VPC design patterns
**Three-tier architecture**
- Web tier (public subnets) → ALB
- App tier (private subnets) → auto-scaling
- Data tier (private subnets) → RDS / self-managed DB
- NAT Gateway / NAT instance in a public subnet for outbound traffic from the app/data tiers
**VPC Peering**
- Direct connection between two VPCs (same account or cross-account)
- Transitive peering is **not** supported (A→B, B→C does not imply A→C)
- Use cases: sharing resources (LDAP, monitoring), service endpoints
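Two checks that usually come up when planning peering are address overlap and the lack of transitivity. A minimal sketch using the standard `ipaddress` module; the VPC labels and CIDR ranges are hypothetical.

```python
from ipaddress import ip_network


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Peered networks must not have overlapping address ranges."""
    return not ip_network(cidr_a).overlaps(ip_network(cidr_b))


def reachable(peerings, src: str, dst: str) -> bool:
    """Peering is not transitive: only a direct peering link gives reachability."""
    return frozenset((src, dst)) in {frozenset(p) for p in peerings}


peerings = [("A", "B"), ("B", "C")]
a_to_b = reachable(peerings, "A", "B")
a_to_c = reachable(peerings, "A", "C")
```

`a_to_c` is false even though both VPCs are peered with B, which is the case a Transit Gateway is meant to solve.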
**Transit Gateway**
- Hub-and-spoke topology, transitive routing
- Supports: VPC, VPN, Direct Connect, peering between TGWs
- Route tables per attachment — isolation of environments
- AWS TGW: 50 Gbps per attachment, up to 5000 attachments
**PrivateLink / VPC Endpoint**
- Private access to services without an IGW, NAT, or VPC peering
- **Interface Endpoint** (ENI in a subnet) — for AWS services, SaaS
- **Gateway Endpoint** (S3, DynamoDB) — route table entry, free of charge
- **AWS PrivateLink** — Service Consumer ↔ NLB/ENI ↔ Service Provider
## MTU, jumbo frames, PMTUD
| Network | Standard MTU | Jumbo frames |
|---------|--------------|--------------|
| Ethernet | 1500 B | 9000 B typical (AWS: 9001, Azure: 1400→9000) |
| GRE tunnel | 1476 B | — |
| PPPoE | 1492 B | — |
| VLAN (802.1Q) | 1496 B (only where the 4 B tag must fit into a standard-size frame; most switches keep 1500 B) | — |
| VXLAN | N/A (an inner MTU of 1500 B needs an underlay MTU of 1550 B) | 8950 B inner (on a 9000 B underlay) |
**PMTUD** (Path MTU Discovery)
- The sender sets the DF (Don't Fragment) bit in the IP header
- If a hop on the path would have to fragment → it returns ICMP "Fragmentation Needed" (Type 3, Code 4)
- The sender lowers its path MTU until the packet gets through
- Common problem: ICMP blocked by a firewall → black hole (TCP connection hangs)
- **Workaround**: MSS clamping (TCP MSS = MTU - 40 for IPv4)
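The MSS and VXLAN figures in this section follow from fixed header sizes. A short worked calculation, assuming option-free IPv4/IPv6 and TCP headers and a VXLAN underlay over IPv4 without a VLAN tag; the function names are illustrative.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
TCP_HEADER = 20
# Outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet header (14).
VXLAN_OVERHEAD = 50


def tcp_mss(mtu: int, ipv6: bool = False) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - (IPV6_HEADER if ipv6 else IPV4_HEADER) - TCP_HEADER


def vxlan_inner_mtu(underlay_mtu: int) -> int:
    """MTU available to the tenant (inner) packet on a VXLAN overlay."""
    return underlay_mtu - VXLAN_OVERHEAD


mss_ethernet = tcp_mss(1500)
mss_gre = tcp_mss(1476)
inner_jumbo = vxlan_inner_mtu(9000)
```

MSS clamping on a GRE tunnel would therefore rewrite the advertised MSS to 1436 for IPv4 traffic.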
**Jumbo frames use cases**
- NFS / SMB (NAS)
- iSCSI / NVMe-oF (SAN)
- HPC / MPI workloads
- Data replication (DB, DRBD)
- Amazon EFS, Amazon MSK (Managed Streaming for Apache Kafka)
## Anycast vs Unicast vs Multicast
| Type | Description | Example |
|------|-------------|---------|
| **Unicast** | 1:1 — one source, one destination | Ordinary TCP/IP traffic |
| **Multicast** | 1:N — one source, a group of receivers | IPTV, mDNS, VXLAN BUM traffic |
| **Anycast** | 1:1 to the nearest — the same IP announced from multiple locations | DNS (8.8.8.8, 1.1.1.1), Cloudflare |
| **Broadcast** | 1:all — every device on the network | ARP, DHCP (limited to the L2 broadcast domain) |
Anycast detail:
- The same IP prefix is announced from multiple locations (BGP)
- Traffic goes to the topologically nearest node (BGP path selection)
- **Advantages**: simple redundancy, DDoS absorption, lower latency
- **Challenges**: connection persistence (TCP), stateful anycast, routing convergence
- **Cloud**: Route53, CloudFront, Cloudflare, Google DNS
## Cloud networking resilience (2026)
See also: [CLOUD.md](CLOUD.md) — cloud architecture, multi-AZ, hybrid cloud connectivity.
### Cell-based architectures
- Fault domains are isolated into "cells" (a group of AZs + services)
- Each cell is independently deployable, with its own DB and its own LB
- Limits the blast radius: the failure of one cell does not affect the others
- Implementations: AWS Cell-based architecture, Azure STAG (Scale Tier Availability Group)
### DNS resilience
- **Anycast DNS** — the same IP announced from multiple regions, routed to the nearest one
- **DNS failover** — health checks automatically remove unavailable endpoints
- **Multi-DNS provider** — Route53 + Cloudflare + UltraDNS to eliminate a SPOF
### Traffic engineering
- **BGP optimization** — AS path prepend, MED, local pref to steer inbound/outbound traffic
- **Global Load Balancing** — GSLB at the DNS level (latency-based, geo-proximity, weighted)
- **AIOps** — ML-based prediction of traffic patterns and automatic scaling
### New trends
- **Path Aware Networking** — the application chooses its path through the network based on current conditions
- **Segment Routing (SR-MPLS / SRv6)** — simplified MPLS, programmable paths
- **Zero Trust Networking** — micro-segmentation, identity-based access, never trust / always verify
## Extended Topics from Books
### TCP/IP Illustrated (Stevens, ISBN 978-0321336316)
Key architectural principles from the book:
- **End-to-End Argument** — correctness and completeness of communication can be guaranteed only at the application level, not by lower layers. The network should be "dumb", the end hosts "smart".
- **Fate Sharing** — all state needed to keep a communication active must be stored at the endpoints, not in the network.
- **Layering** — hierarchical layering of protocols as in the OSI model; each layer encapsulates the PDU from the layer above and adds its own header.
- **Multiplexing/Demultiplexing** — protocols at the same layer coexist thanks to identifiers (IP protocol number, TCP/UDP port).
- **Sliding window** — efficient link utilization under high latency (window size = bandwidth × RTT).
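The relation window size = bandwidth × RTT (the bandwidth-delay product) can be worked through with integers. The function names and the sample link (1 Gbit/s, 100 ms RTT) are illustrative assumptions.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep the link full."""
    return bandwidth_bps * rtt_ms // 8000  # bits * ms -> bytes


def max_throughput_bps(window_bytes: int, rtt_ms: int) -> int:
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 * 1000 // rtt_ms


needed_window = bdp_bytes(1_000_000_000, 100)
unscaled_limit = max_throughput_bps(65_535, 100)
```

A 1 Gbit/s path with 100 ms RTT needs about 12.5 MB in flight, while an unscaled 64 KB window caps the same path at roughly 5.2 Mbit/s.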
The book covers the entire TCP/IP stack, from the link layer (Ethernet, ARP, PPP) through IP, ICMP, DHCP, NAT, DNS, UDP, and TCP (connection management, timeout, retransmission, congestion control, keepalive) up to applications (SNMP, Telnet, FTP, SMTP, NFS, HTTP).
### AI Data Center Network Design (Subramaniam, ISBN 978-0-13-543628-8)
A comprehensive, vendor-agnostic guide to designing network infrastructure for AI clusters.
**Key concepts:**
- **Rail-Optimized Design (ROD)** — GPUs are interconnected across racks along "rails"; each rail forms an independent network for all-reduce communication. Minimizes latency for synchronous training.
- **Rail-Unified Design (RUD)** — a shared network fabric for all GPUs; more flexible, but more demanding on load balancing.
- **RoCEv2 (RDMA over Converged Ethernet)** — the primary transport for AI clusters; requires a lossless fabric (ECN, PFC, DCQCN, SFC, CSIG).
- **Load balancing for AI** — ECMP is not enough; dynamic/global load balancing (DLB/GLB), flowlet-based rebalancing, or per-packet spraying is needed.
- **Topologies** — Clos (3-stage/5-stage), Dragonfly, Torus for scaling to tens of thousands of GPUs.
- **Ultra Ethernet Consortium (UEC)** — a new standard for Ethernet in AI clusters (2025+) that addresses the limitations of RoCEv2.
- **Storage for AI** — NVMe-oF, GPUDirect Storage, parallel file systems for checkpointing and dataset loading.
- **KPIs** — Job Completion Time (JCT), tail latency, fabric utilization, PFC storm detection.
### Cloud Networking and Resilience (Critelli, ISBN 979-8868824357)
A practical guide to building resilient cloud networks (Apress, 2026). The author is EMEA Lead for Networking & Resilience at AWS.
**Layered approach to resilience (following the OSI model):**
| Layer | Measures |
|-------|----------|
| L1 (Physical) | Redundant uplinks, diverse fibre paths, DWDM |
| L2 (Data link) | MLAG, LACP, fast spanning-tree convergence |
| L3 (Network) | BGP multi-homing, AS path prepend, Anycast |
| L4 (Transport) | Connection draining, slow start, health checks |
| L7 (Application) | DNS failover, global load balancing, cell-based architectures |
**Regulatory frameworks:** DORA (Digital Operational Resilience Act), NIS2 — require regular resilience testing, chaos engineering, and business continuity plans.
**AIOps in resilience:** ML-based prediction of traffic patterns, automatic scaling, proactive fault detection (a shift from reactive monitoring to predictive prevention).
### Zero Trust in Resilient Cloud and Network Architectures (Halley et al., ISBN 978-0-13-820460-0)
Cisco Press — a practical handbook for deploying Zero Trust architectures in hybrid and cloud environments.
**Implementation framework:**
- **User and Device Trust** — the identity of both user and device is verified before access is granted (SSE — Security Service Edge).
- **Application Access Policies** — granular rules at the application level rather than by IP address.
- **Greenfield vs Brownfield** — new networks built as Zero Trust from the ground up vs. migration of existing infrastructure.
- **Automation** — Terraform, Ansible for provisioning; Meraki, EVPN, Pub/Sub telemetry.
- **Industrial Zero Trust** — extension of the concept to OT/ICS environments.
- **Quantum security** — preparing network architectures for post-quantum cryptography.
### The Segmentation Blueprint (Kulkarni, ISBN 978-0-13-546236-2)
Cisco Press (2026) — a pragmatic guide to network segmentation, from VLANs to nano-segmentation.
**Evolution of segmentation:**
| Generation | Technology | Scope |
|------------|------------|-------|
| Traditional | VLAN, ACL, firewall | Subnet |
| Micro-segmentation | Security Groups, Network Policies | Workload / instance |
| Nano-segmentation | Service mesh (Istio, Linkerd), mTLS | Application / API / process |
**Segmentation maturity model:**
1. **Initial** — flat network, no segmentation
2. **Basic** — VLANs, firewall between environments
3. **Defined** — Security Groups, service access policies
4. **Managed** — Micro-segmentation, Network Policies, EVPN
5. **Optimized** — Nano-segmentation, service mesh, Zero Trust, AI-driven policy management
**Key metric:** Blast radius — how many workloads are at risk when a single node is compromised. The goal is to reduce it to a minimum.
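One way to estimate blast radius is to treat the allowed flows as a directed graph and count what a compromised node can reach. This is a sketch under that assumption; the workload names and the `allowed` mapping are hypothetical.

```python
from collections import deque


def blast_radius(allowed: dict, start: str) -> set:
    """allowed maps a workload to the set of workloads it may connect to.

    Returns every workload reachable from `start`, directly or through
    intermediate hops (breadth-first search).
    """
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in allowed.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


flat = {"web": {"app", "db"}, "app": {"web", "db"}, "db": {"web", "app"}}
segmented = {"web": {"app"}, "app": {"db"}, "db": set()}
flat_db = blast_radius(flat, "db")
segmented_db = blast_radius(segmented, "db")
```

In the flat network a compromised database can reach every other workload; with tiered policies it can reach none.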
### Segment Routing for SP and Enterprise Networks (Deragisch, ISBN 978-0-13-823101-9)
Cisco Press (2024) — a comprehensive guide to Segment Routing for both the MPLS and IPv6 data planes.
**SR-MPLS vs SRv6:**
| Property | SR-MPLS | SRv6 |
|----------|---------|------|
| SID length | 20 bit (MPLS label) | 128 bit (IPv6 address) |
| Data plane | MPLS | IPv6 + SRH (Segment Routing Header) |
| Signaling | IGP (IS-IS/OSPF) extensions | IGP + BGP extensions |
| Maturity | Mature, widely deployed | Emerging, standardization completed |
| Use case | SP networks, MPLS migration | Cloud, DC, 5G, end-to-end programmability |
**Advantages of SR over classic MPLS:**
- Eliminates LDP/RSVP-TE (signaling is part of the IGP)
- Traffic engineering state moves from the nodes into packet headers (source routing)
- Fast reroute (FRR) without additional protocols
- Egress Peer Engineering (EPE) — selection of the AS exit point
- Micro-loop avoidance during convergence
**Migration strategies:** Greenfield (new network), Brownfield (gradual migration from MPLS), "SR in a box" — a combination of SR and LDP.
### Understanding and Designing Azure Networking (Stuart, Moreno, 2025)
A practical guide to Azure network design by two Microsoft Solution Engineers (former CCIEs).
**Key topics:**
| Area | Key services and concepts |
|------|---------------------------|
| **Topology** | Hub-and-spoke, Virtual WAN, multi-hub designs, Azure Route Server |
| **Hybrid connectivity** | ExpressRoute, VPN Gateway, SD-WAN integration |
| **Multi-cloud** | Azure ↔ AWS/GCP, cross-cloud fabrics |
| **Security** | NSG, Azure Firewall, DDoS Protection, WAF, AVNM, ZTNA |
| **DNS & PaaS** | Private Link, Private DNS Zones, Private Resolver, hybrid DNS forwarding |
| **Application delivery** | Azure Load Balancer, App Gateway, Front Door, Traffic Manager |
| **Monitoring** | Network Watcher, Traffic Analytics, Azure Monitor, Policy-as-code |
**Design decision framework:** Gather requirements → analyze constraints → choose a topology → implement → monitor.
### Mastering Next-Gen Juniper Data Centers (Chatterjee, ISBN 978-0-13-533636-6)
Addison-Wesley (2026) — hands-on guide to EVPN VXLAN fabrics on Juniper devices.
**Key Architectures:**
- **EVPN VXLAN fabric** — multi-tenant overlay networks with BGP EVPN control plane and VXLAN data plane.
- **Multivendor interoperability** — detailed procedures for EVPN across Juniper, Cisco NX-OS, Arista EOS.
- **Multicast in EVPN VXLAN** — intra-subnet and inter-subnet multicast design (IGMP/MLD proxying, PIM, EVPN Type-6/7 routes).
- **Day-2 operations** — Juniper Apstra for automation (Terraform provider), telemetry (gNMI, OpenConfig).
- **Service chaining** — connecting NGFW, load balancers within the fabric.
- **DCI with EVPN** — Over-the-Top (OTT) and Integrated Interconnect (VXLAN stitching, MPLS transit).
**Evolution from previous book (Deploying Juniper Data Centers with EVPN VXLAN, 2024):** Expansion with advanced topics — multicast, interoperability, Apstra Day-2, observability stack.
### Intelligent Cloud Networking: AI-Driven Resource Management (Yadav, ISBN 9364220110)
Intersection of AI/ML and cloud network management (Addition Publishing, 2026).
**AI Applications in Network Management:**
| Area | Technique | Benefit |
|------|-----------|---------|
| **Flow prediction** | LSTM, Transformer | Traffic pattern prediction, proactive scaling |
| **Flow classification** | CNN, RL | Traffic type identification for QoS |
| **Load balancing** | DRL (Deep RL) | Dynamic load distribution, congestion reduction |
| **Resource management** | Q-learning, DQN | Optimization of CPU/memory/network allocation |
| **Routing optimization** | DRL, GNN | Adaptive routing based on current conditions |
| **Congestion control** | ML-based CC | Predictive congestion control (instead of reacting to loss) |
| **Anomaly detection** | Autoencoders, Isolation Forest | Real-time attack and anomaly detection |
| **Blockchain security** | Smart contracts | Decentralized access control, audit trail |
**Technology Trends:**
- **Ultra Ethernet Consortium (UEC)** — next-generation Ethernet for AI, lossless fabric, telemetry, adaptive routing.
- **Path Aware Networking** — applications choose path based on current conditions (latency, loss, cost).
- **Self-optimizing networks** — closed loop: telemetry → AI analysis → automatic action → feedback.
## OpenStack Networking (Neutron)
OpenStack Neutron is an SDN framework for managing virtual networks in a multi-tenant environment. It supports VLAN, VXLAN, and GRE tunnels, security groups, QoS, and LBaaS (Octavia).
### Backends
| Backend | Description | Suitable For |
|---------|-------------|--------------|
| **OVN (Open Virtual Network)** | Native OpenFlow/OVSDB backend; replaces OVS+agent architecture | Production, scalable deployments |
| **OVS (Open vSwitch)** | Classic agent-based backend (neutron-openvswitch-agent) | Small deployments, legacy |
| **Linux Bridge** | Simple backend without OVS | Development, testing, embedded |
| **Hyper-V** | Windows Server backend | Hybrid environments |
### Important Concepts
- **Networks, Subnets, Ports** — basic network objects
- **Routers** — L3 forwarding between tenant networks (DVR for distributed routing)
- **Security Groups** — stateful firewall rules at the port level
- **Floating IPs** — public IPs mapped to instances (1:1 NAT)
- **LBaaS / Octavia** — load balancing as a service (HAProxy, amphora)
- **Trunk ports** — VLAN tagging for instances (parent + subports)
### Performance tuning
- **DPDK** — userspace packet processing, bypass kernel, lower latency
- **SR-IOV** — passthrough VF to instance, minimal hypervisor overhead
- **NUMA pinning** — vCPU/memory/NIC affinity for compute instances
- **Hardware offload** — OVS TC Flower, ASAP²
### Use cases
- Multi-tenant cloud (public and private)
- Telco/NFVI (DPDK, SR-IOV, low-latency)
- SDN lab / network function virtualization
## Zero Trust Networking
Zero Trust is a "never trust, always verify" security model — no entity is implicitly trusted, regardless of its location in the network.
### Principles (NIST SP 800-207)
1. **All resources are external** — there is no trusted internal network
2. **Least privilege** — access only to necessary resources
3. **Micro-segmentation** — workload isolation at the individual process/container level
4. **Encrypt everything** — TLS/mTLS for all communication
5. **Continuous verification** — every request is authenticated and authorized
6. **Dynamic policies** — rules change based on context (user, device, location, time)
### Implementation Layers
| Layer | Technology | Description |
|-------|------------|-------------|
| **Identity** | OIDC, SAML, LDAP | User and device authentication |
| **Device** | MDM, UEM, device certificates | Device state verification (compliant, patch level) |
| **Network** | Micro-segmentation, firewall, SDN | Traffic isolation between workloads |
| **Application** | mTLS, service mesh, API gateway | Applications enforce mutual authentication |
| **Data** | Encryption at rest + in transit, DLP | Data protection regardless of location |
| **Analytics** | SIEM, UEBA, AI/ML | Real-time anomaly and threat detection |
### Tools and Platforms
| Category | Tools |
|----------|-------|
| **ZTNA (Zero Trust Network Access)** | Cloudflare Access, Zscaler, Netskope, Palo Alto Prisma |
| **Service Mesh** | Istio, Linkerd, Consul Connect, Cilium |
| **Micro-segmentation** | VMware NSX, Illumio, Guardicore, Akamai |
| **BeyondCorp** | Google BeyondCorp (open-source: BeyondCorp Alliance) |
| **Security Service Edge (SSE)** | SWG, CASB, ZTNA in one (Zscaler, Netskope, Cloudflare) |
### Zero Trust in the Data Center
In a private DC, Zero Trust is deployed via:
- **EVPN VXLAN** — overlay network with tenant isolation
- **Network Policies** (Kubernetes) — per-pod firewall rules
- **Cilium** — eBPF-based L3/L7 policy enforcement
- **WireGuard / IPsec** — encryption between nodes
- **HashiCorp Boundary** — identity-based access to servers without bastion host
## Best practices
- **Network segmentation**: separate environments (dev/staging/prod) and tiers (web/app/db)
- **Least privilege access**: security groups allow only the traffic that is needed
- **Monitoring**: VPC Flow Logs, NetFlow, sFlow
- **Redundancy**: multi-AZ, multi-region for critical services
- **Encryption in transit**: TLS everywhere, mTLS for service-to-service
- **DDoS protection**: AWS Shield, Azure DDoS Protection, Cloudflare
- **MTU alignment**: consistent MTU along the entire path; check that ICMP is not blocked, since PMTUD depends on it
- **IP planning**: RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16); avoid overlapping ranges so peering stays possible
## Sources
Links, books, and standards: [sources/networking/sources.md](sources/networking/sources.md)
## TLS detail
### TLS 1.3 handshake (1-RTT)
```
Client Server
| |
| ClientHello (key_share, sig_algs) |
|─────────────────────────────────────────>|
| |
| ServerHello + EncryptedExtensions |
| + Certificate + CertificateVerify |
| + Finished |
|<─────────────────────────────────────────|
| |
| Finished |
|─────────────────────────────────────────>|
| |
| << Application Data >> |
```
- **0-RTT** (early data): the client sends data with its first flight (on a resumed connection using a PSK)
- Risk of 0-RTT: replay attacks (idempotent requests such as HTTP GET are generally safe; POST needs replay protection)
- Compared with TLS 1.2: obsolete ciphers removed, AEAD required, faster handshake (2 RTT → 1 RTT)
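On the client side, requiring TLS 1.3 can be sketched with Python's standard `ssl` module. This is a minimal sketch that assumes the interpreter is linked against an OpenSSL build with TLS 1.3 support; the helper name `tls13_client_context` is made up for the example. As far as I know, the standard module offers no API for sending 0-RTT early data, so the replay risk above does not arise with this client.

```python
import ssl


def tls13_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything below TLS 1.3."""
    ctx = ssl.create_default_context()            # certificate and hostname checks on
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # reject TLS 1.2 and older
    return ctx


ctx = tls13_client_context()
```

The context can then be passed to `ctx.wrap_socket(sock, server_hostname=...)` or to an HTTP client that accepts an `SSLContext`.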
### Cipher suites
| Suite | Key exchange | Auth | Encryption | MAC | Status |
|-------|-------------|------|-----------|-----|--------|
| `TLS_AES_128_GCM_SHA256` | (EC)DHE (negotiated separately) | Certificate signature (negotiated separately) | AES-128-GCM | AEAD | TLS 1.3, mandatory to implement |
| `TLS_AES_256_GCM_SHA384` | (EC)DHE (negotiated separately) | Certificate signature (negotiated separately) | AES-256-GCM | AEAD | TLS 1.3, higher security margin |
| `TLS_CHACHA20_POLY1305_SHA256` | (EC)DHE (negotiated separately) | Certificate signature (negotiated separately) | ChaCha20-Poly1305 | AEAD | TLS 1.3, mobile / no AES-NI |
| `ECDHE-ECDSA-AES128-GCM-SHA256` | ECDHE | ECDSA | AES-128-GCM | AEAD | TLS 1.2 (PFS) |
| `ECDHE-RSA-AES128-GCM-SHA256` | ECDHE | RSA | AES-128-GCM | AEAD | TLS 1.2 (PFS) |
PFS (Perfect Forward Secrecy): if the server's long-term private key is compromised, previously captured traffic still cannot be decrypted, because each session uses ephemeral (EC)DHE keys.
### Certificate chain validation
```
1. The client receives the certificate chain from the server
2. Validation:
   a. Date: the certificate is within its validity window (notBefore, notAfter)
   b. CRL / OCSP: the certificate has not been revoked (OCSP stapling reduces latency)
   c. Signature chain: each certificate's signature is verified with its issuer's public key
   d. Root CA: the last certificate in the chain is trusted (present in the client's trust store)
3. SAN (CN only as a legacy fallback): the domain name in the certificate must match the target domain
```
Typical chain: `Leaf Cert → Intermediate CA → Root CA` (the root is self-signed and lives in the trust store).
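Two of the validation steps, the validity window (2a) and the name match (3), are simple enough to sketch in plain Python. This is an illustration only: the function names are hypothetical, the wildcard rule is a simplified reading of common practice (a `*` covers exactly one left-most label), and real clients should rely on their TLS library for signature and revocation checks.

```python
from datetime import datetime, timezone


def within_validity(not_before, not_after, now=None):
    """Step 2a: the current time must fall inside [notBefore, notAfter]."""
    now = now or datetime.now(timezone.utc)
    return not_before <= now <= not_after


def san_matches(hostname, san_dns_names):
    """Step 3: exact match, or a wildcard covering exactly one left-most label."""
    host = hostname.lower().rstrip(".").split(".")
    for name in san_dns_names:
        san = name.lower().rstrip(".").split(".")
        if len(san) != len(host):
            continue                      # a wildcard never spans several labels
        if san[0] == "*":
            if len(san) >= 3 and san[1:] == host[1:]:
                return True
        elif san == host:
            return True
    return False
```

For instance, `san_matches("api.example.com", ["*.example.com"])` succeeds, while `a.b.example.com` and the bare `example.com` do not match that wildcard.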
*Last revision: 2026-06-03*

207
ORACLE.en.md Normal file

@@ -0,0 +1,207 @@
# 🏛️ Oracle Database
## Overview
Oracle Database is a proprietary relational database with the broadest range of enterprise features — RAC clustering, Active Data Guard, partitioning, advanced compression, in-memory options, and Oracle Exadata integration. Dominant in the corporate world, finance, telecommunications, and mainframe ecosystem.
## Architecture
### Oracle instance + database
```
Oracle Instance (memory + processes)
├── System Global Area (SGA)
│ ├── Database Buffer Cache
│ ├── Shared Pool (library cache, dictionary cache)
│ ├── Redo Log Buffer
│ ├── Java Pool
│ ├── Large Pool (backup, parallel)
│ └── In-Memory Column Store (option)
├── Program Global Area (PGA) — per session
└── Background processes
├── PMON (process monitor)
├── SMON (system monitor)
├── DBWn (database writer)
├── LGWR (log writer)
├── CKPT (checkpoint)
├── ARCn (archiver)
└── MMON (manageability monitor)
```
### Multitenant architecture (12c+)
```
Container Database (CDB)
├── Root (CDB$ROOT) — metadata, system objects
├── Seed (PDB$SEED) — template for new PDBs
├── Pluggable Database 1 (PDB1) — application A
├── Pluggable Database 2 (PDB2) — application B
└── Pluggable Database 3 (PDB3) — application C
```
Each PDB looks like a separate database but shares SGA and background processes. Advantage: higher density, simpler patching (CDB level), resource management per PDB.
### Oracle RAC (Real Application Clusters)
Multi-instance architecture — multiple servers access the same storage:
```text
Node 1 ─── Oracle ASM ─── Shared Storage (SAN/NFS)
Node 2 ─── Oracle ASM ─── Shared Storage (SAN/NFS)
Node 3 ─── Oracle ASM ─── Shared Storage (SAN/NFS)
Cache Fusion (private interconnect)
```
- Up to 64 nodes in a cluster
- **Cache Fusion** — transfer of dirty blocks between instances via private interconnect (RAC-specific)
- **ASM** (Automatic Storage Management) — clustered filesystem + volume manager
- **Service** — workload routing (primary, report, batch)
### Oracle Data Guard
| Mode | Protection | Latency | Use case |
|------|-----------|---------|----------|
| **Maximum Protection** | Zero data loss (sync) | Highest | Critical systems |
| **Maximum Availability** | Zero data loss (sync, fallback to async) | High | Enterprise standard |
| **Maximum Performance** | Async | Lowest | Remote DR |
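- **Active Data Guard**: standby open for reads (reporting, backup); requires a license
- **Far Sync**: the primary ships redo synchronously to a nearby Far Sync instance, which forwards it asynchronously to the remote standby (a compromise between zero data loss and commit latency)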
### Oracle Exadata
Hardware+software platform for Oracle DB:
| Component | Description |
|-----------|-------------|
| **Database Servers** | x86 (Xeon), 2-8× per rack, NVMe, 1.5-6 TB RAM |
| **Storage Servers** | Total capacity up to 2.7 PB raw per rack |
| **Smart Scan** | Predicate filtering at the storage layer (instead of DB server) |
| **Smart Flash Cache** | Multiple caching layers (RAM, Flash, disk) |
| **RDMA over Converged Ethernet** | Low latency between DB and storage servers |
Suitable for: largest OLTP, data warehousing, consolidation.
## Key enterprise features
| Feature | Description | Competition |
|---------|-------------|-------------|
| **RAC** | Shared-everything cluster up to 64 nodes | MSSQL AlwaysOn FCI (active-passive; 2 nodes in Standard Edition) |
| **Active Data Guard** | Standby for reads, far sync, automatic failover | MSSQL AlwaysOn AG, PostgreSQL streaming |
| **Partitioning** | Range, List, Hash, Composite, interval, reference | PostgreSQL (declarative partitioning 10+) |
| **Advanced Compression** | Columnar, HCC (Exadata), OLTP compression | InnoDB page compression, PG TOAST |
| **In-Memory** | Column store in SGA for real-time analytics | PG (no native), MSSQL (columnstore index) |
| **Advanced Security** | TDE, data redaction, VPD, audit vault, database firewall | PG (pgcrypto, pgaudit), MSSQL (TDE, Always Encrypted) |
| **Flashback** | Querying historical data (Flashback Query, Table, Database) | PG (temporal tables via extension), MSSQL (system-versioned) |
| **Sharding** | System-managed, composite, user-defined | MongoDB (native), Vitess (MySQL), Citus (PG) |
| **ASM** | Clustered filesystem + volume manager | VMware VMFS, Windows CSV |
## Oracle licensing detail
### Editions
| Edition | Metric | Price (indicative) | Limitations |
|---------|--------|-------------------|-------------|
| **Oracle Database Standard Edition 2 (SE2)** | Per socket (no core factor) | ~$17,500/socket | Max 16 CPU threads (per server), max 2 sockets, no RAC since 19c (Standard Edition High Availability as the failover alternative), no partitioning, in-memory, or compression options |
| **Oracle Database Enterprise Edition (EE)** | Per core (core factor 0.5) | ~$47,500/core | No hardware limits; RAC, partitioning, in-memory, compression, and Advanced Security are available, but each is a separately licensed **option** |
| **Oracle Database Enterprise Edition (RAC)** | Per core (EE + RAC option) | ~$47,500 + $23,000/core | EE + RAC clustering |
### Optional licenses (options) — EE only
| Option | Price (indicative / core) | Use case |
|--------|--------------------------|----------|
| **Real Application Clusters (RAC)** | ~$23,000 | Multi-instance cluster |
| **Active Data Guard** | ~$10,000 | Standby for reads, far sync, automatic failover |
| **Partitioning** | ~$11,500 | Range, list, hash, interval, reference, system |
| **Advanced Compression** | ~$11,500 | OLTP compression, HCC (Exadata), JSON compression |
| **Advanced Security** | ~$15,000 | TDE, data redaction, database firewall |
| **In-Memory Database** | ~$23,000 | Column store in SGA for real-time analytics |
| **Database Vault** | ~$5,750 | Separation of duties, multi-tenancy security |
| **Multitenant (EE)** | Free up to 3 PDBs (since 19c) | CDB/PDB: up to 3 user-created PDBs per CDB without a license; more require the Multitenant option (~$17,500) |
| **Spatial / Graph** | ~$5,750 | Geospatial data, property graph |
| **Label Security** | ~$5,750 | Row-level security with classifications |
### Oracle Cloud (OCI) licensing
| Service | Model | Price | Note |
|---------|-------|-------|------|
| **OCI Base Database (RDS-like)** | BYOL or License Included | ~$1-5/hour (BYOL cheaper) | Single instance or RAC, automatic backup, patching |
| **OCI Exadata Database Service** | BYOL or License Included | ~$5-30/hour (depending on shape) | Exadata X9M/X10M in OCI, elastic, full Exadata features |
| **OCI Autonomous Database** | Per CPU (ECPU) | ~$0.50-3.00/ECPU/hour | Auto-tuning, auto-scaling, auto-patching |
| **BYOL (Bring Your Own License)** | Own Oracle license in OCI | Infrastructure only | Can use existing perpetual license, including support |
### RAC sizing — license cost
```text
4-node RAC, each node 2× EPYC 9654 (96C) = 192 cores per node
Core factor 0.5 → 96 Oracle licenses per node
4 × 96 = 384 Oracle EE licenses
EE: 384 × $47,500 = $18,240,000
RAC option: 384 × $23,000 = $8,832,000
Support 22% annually: ($18,240,000 + $8,832,000) × 0.22 = $5,955,840/year
Tip: For RAC, consider smaller CPUs (e.g., 64C instead of 96C) — license cost often exceeds hardware cost.
```
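The same arithmetic can be wrapped in a small helper to compare CPU choices. This is a sketch using the indicative prices and the 0.5 core factor from this section as defaults; the function name and its parameters are made up for the example, and real quotes depend on discounts and on which options are licensed.

```python
import math


def rac_license_cost(nodes, cores_per_node, core_factor=0.5,
                     ee_price=47_500, rac_price=23_000, support_pct=22):
    """License and first-year support cost for EE plus the RAC option."""
    licenses = nodes * math.ceil(cores_per_node * core_factor)
    ee = licenses * ee_price
    rac = licenses * rac_price
    support = (ee + rac) * support_pct // 100   # per year, integer arithmetic
    return {
        "licenses": licenses,
        "ee": ee,
        "rac": rac,
        "support_per_year": support,
        "first_year_total": ee + rac + support,
    }


big = rac_license_cost(nodes=4, cores_per_node=192)    # 2x 96C per node
small = rac_license_cost(nodes=4, cores_per_node=128)  # 2x 64C per node
```

With these inputs the 96-core variant comes to 384 licenses and the 64-core variant to 256, which is the saving the tip above refers to.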
### Oracle vs PostgreSQL — comparison
| Area | Oracle | PostgreSQL |
|------|--------|------------|
| **License** | Proprietary (per core, ~$17.5k-47.5k/core + 22% support annually) | Open source (PostgreSQL license, MIT-like) |
| **RAC clustering** | Native, shared-everything | None (Citus = shared-nothing) |
| **Multitenant** | CDB/PDB architecture | None (schemas per tenant) |
| **Parallel execution** | Mature (auto DOP, parallel index scan) | Good (parallel seq/index scan, join) |
| **Storage management** | ASM (integrated) | OS volume / LVM |
| **Materialized views** | With refresh on commit, query rewrite | No query rewrite |
| **Partitioning** | 40+ options (interval, referential, system) | Declarative (range, list, hash since 10+) |
| **In-memory** | Columnar in SGA | Not native |
| **Standby usage** | Active Data Guard (read-only, license) | Hot standby (read-only, free) |
| **Cloud** | OCI (Oracle Cloud), AWS RDS, Azure | All clouds (native) |
## Recommendations — where Oracle is better
| Area | Oracle | Competition | Why Oracle |
|------|--------|-------------|------------|
| **License cost (4-node RAC, 384 licenses / 768 cores)** | ~$33M (1st year incl. support, EE + RAC only) | PostgreSQL: $0 | Not an Oracle advantage: 22% annual support comes on top of the license fee |
| **Vendor lock-in** | High (GoldenGate migration difficult, PL/SQL specific) | PostgreSQL: none | Not an Oracle advantage either: MySQL and PG have migration tools from Oracle (ora2pg, AWS DMS) |
| **Enterprise OLTP** | RAC + ASM, zero-downtime patching | MSSQL (FCI is active-passive; 2 nodes in Standard Edition) | Shared-everything cluster, transparent failover |
| **Finance / Banking** | Audit Vault, Database Vault, TDE, VPD | PG (pgaudit, row-level security) | Compliance certifications (SOX, PCI, GDPR) |
| **Consolidation** | Multitenant (CDB/PDB): hundreds of DBs on 1 instance | PG (schema per tenant) | Lower overhead, simpler management |
| **Data Warehouse** | Exadata Smart Scan, parallel execution, in-memory | ClickHouse (specialized) | Hybrid workload (OLTP + DW in one DB) |
| **High-end hardware** | Exadata engineered system | PG (runs on anything) | Full-stack HW+SW optimization |
| **Partitioning** | Range of options (reference, interval, system) | PG (basic) | 10+ years lead in implementation |
| **Flashback / recovery** | Flashback Database, Table, Query — any point in time | PG (PITR, point-in-time) | Faster, more granular recovery |
| **Ecosystem** | OEM, Data Pump, SQL Developer, Toad, GoldenGate | PG (pgAdmin, pg_dump, Patroni) | Decades of enterprise tooling |
### When to use Oracle
- **Critical OLTP systems**: banking, payment processing, trading
- **Enterprise consolidation**: hundreds of DBs on one RAC cluster (multitenant)
- **Regulated environments**: finance, healthcare, government (audit, security, compliance)
- **Oracle ecosystem**: E-Business Suite, PeopleSoft, Siebel, JD Edwards
- **Exadata customers**: maximum performance for hybrid workload (OLTP + DW)
- **GoldenGate replication**: heterogeneous replication (Oracle → Kafka, Oracle → PostgreSQL)
- **Cloud migration**: OCI, AWS RDS for Oracle, Oracle Database@Azure
### When to use something else
- **Startup / SME** → PostgreSQL (free, sufficient performance, no vendor lock-in)
- **Web / LAMP stack** → MySQL (simpler, cheaper, broad support)
- **Cloud-native** → Aurora, CockroachDB (architecture for cloud, not port of on-prem to cloud)
- **Need only SQL** → PostgreSQL (Oracle overhead not worth it)
- **Horizontal write scaling** → Cassandra (in RAC every node accepts writes, but write-heavy workloads contend on Cache Fusion block transfers and the shared storage)
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| Oracle Database 23ai New Features | Oracle Corporation | — | Official guide to new features — AI Vector Search, JSON Relational Duality, property graphs, schema privileges |
| Expert Oracle Database Architecture (3rd ed.) | Thomas Kyte, Darl Kuhn | 978-1484249602 | Comprehensive explanation of Oracle architecture, from storage to RAC and Data Guard |
*Last revision: 2026-06-03*

207
ORACLE.md Normal file

@@ -0,0 +1,207 @@
# 🏛️ Oracle Database
## Overview
Oracle Database is a proprietary relational database with the broadest range of enterprise features: RAC clustering, Active Data Guard, partitioning, advanced compression, in-memory options, and Oracle Exadata integration. Dominant in the corporate world, finance, telecommunications, and mainframe ecosystem.
## Architecture
### Oracle instance + database
```
Oracle Instance (memory + processes)
├── System Global Area (SGA)
│ ├── Database Buffer Cache
│ ├── Shared Pool (library cache, dictionary cache)
│ ├── Redo Log Buffer
│ ├── Java Pool
│ ├── Large Pool (backup, parallel)
│ └── In-Memory Column Store (option)
├── Program Global Area (PGA) — per session
└── Background processes
├── PMON (process monitor)
├── SMON (system monitor)
├── DBWn (database writer)
├── LGWR (log writer)
├── CKPT (checkpoint)
├── ARCn (archiver)
└── MMON (manageability monitor)
```
### Multitenant architecture (12c+)
```
Container Database (CDB)
├── Root (CDB$ROOT) - metadata, system objects
├── Seed (PDB$SEED) - template for new PDBs
├── Pluggable Database 1 (PDB1) - application A
├── Pluggable Database 2 (PDB2) - application B
└── Pluggable Database 3 (PDB3) - application C
```
Each PDB looks like a separate database but shares the SGA and background processes. Advantage: higher density, simpler patching (at CDB level), resource management per PDB.
### Oracle RAC (Real Application Clusters)
Multi-instance architecture in which multiple servers access the same storage:
```text
Node 1 ─── Oracle ASM ─── Shared Storage (SAN/NFS)
Node 2 ─── Oracle ASM ─── Shared Storage (SAN/NFS)
Node 3 ─── Oracle ASM ─── Shared Storage (SAN/NFS)
Cache Fusion (private interconnect)
```
- Up to 64 nodes in a cluster
- **Cache Fusion**: transfer of dirty blocks between instances via the private interconnect (RAC-specific)
- **ASM** (Automatic Storage Management): clustered filesystem + volume manager
- **Service**: workload routing (primary, report, batch)
### Oracle Data Guard
| Mode | Protection | Latency | Use case |
|------|-----------|---------|----------|
| **Maximum Protection** | Zero data loss (sync) | Highest | Critical systems |
| **Maximum Availability** | Zero data loss (sync, fallback to async) | High | Enterprise standard |
| **Maximum Performance** | Async | Lowest | Remote DR |
- **Active Data Guard**: standby open for reads (reporting, backup); requires a license
- **Far Sync**: the primary ships redo synchronously to a nearby Far Sync instance, which forwards it asynchronously to the remote standby (a compromise between zero data loss and commit latency)
### Oracle Exadata
Hardware+software platform for Oracle DB:
| Component | Description |
|-----------|-------------|
| **Database Servers** | x86 (Xeon), 2-8× per rack, NVMe, 1.5-6 TB RAM |
| **Storage Servers** | Total capacity up to 2.7 PB raw per rack |
| **Smart Scan** | Predicate filtering at the storage layer (instead of on the DB server) |
| **Smart Flash Cache** | Multiple caching layers (RAM, Flash, disk) |
| **RDMA over Converged Ethernet** | Low latency between DB and storage servers |
Suitable for: largest OLTP, data warehousing, consolidation.
## Key enterprise features
| Feature | Description | Competition |
|---------|-------------|-------------|
| **RAC** | Shared-everything cluster up to 64 nodes | MSSQL AlwaysOn FCI (active-passive; 2 nodes in Standard Edition) |
| **Active Data Guard** | Standby for reads, far sync, automatic failover | MSSQL AlwaysOn AG, PostgreSQL streaming |
| **Partitioning** | Range, List, Hash, Composite, interval, reference | PostgreSQL (declarative partitioning 10+) |
| **Advanced Compression** | Columnar, HCC (Exadata), OLTP compression | InnoDB page compression, PG TOAST |
| **In-Memory** | Column store in SGA for real-time analytics | PG (no native), MSSQL (columnstore index) |
| **Advanced Security** | TDE, data redaction, VPD, audit vault, database firewall | PG (pgcrypto, pgaudit), MSSQL (TDE, Always Encrypted) |
| **Flashback** | Querying historical data (Flashback Query, Table, Database) | PG (temporal tables via extension), MSSQL (system-versioned) |
| **Sharding** | System-managed, composite, user-defined | MongoDB (native), Vitess (MySQL), Citus (PG) |
| **ASM** | Clustered filesystem + volume manager | VMware VMFS, Windows CSV |
## Oracle licensing detail
### Editions
| Edition | Metric | Price (indicative) | Limitations |
|---------|--------|-------------------|-------------|
| **Oracle Database Standard Edition 2 (SE2)** | Per socket (no core factor) | ~$17,500/socket | Max 16 CPU threads (per server), max 2 sockets, no RAC since 19c (Standard Edition High Availability as the failover alternative), no partitioning, in-memory, or compression options |
| **Oracle Database Enterprise Edition (EE)** | Per core (core factor 0.5) | ~$47,500/core | No hardware limits; RAC, partitioning, in-memory, compression, and Advanced Security are available, but each is a separately licensed **option** |
| **Oracle Database Enterprise Edition (RAC)** | Per core (EE + RAC option) | ~$47,500 + $23,000/core | EE + RAC clustering |
### Optional licenses (options), EE only
| Option | Price (indicative / core) | Use case |
|--------|--------------------------|----------|
| **Real Application Clusters (RAC)** | ~$23,000 | Multi-instance cluster |
| **Active Data Guard** | ~$10,000 | Standby for reads, far sync, automatic failover |
| **Partitioning** | ~$11,500 | Range, list, hash, interval, reference, system |
| **Advanced Compression** | ~$11,500 | OLTP compression, HCC (Exadata), JSON compression |
| **Advanced Security** | ~$15,000 | TDE, data redaction, database firewall |
| **In-Memory Database** | ~$23,000 | Column store in SGA for real-time analytics |
| **Database Vault** | ~$5,750 | Separation of duties, multi-tenancy security |
| **Multitenant (EE)** | Free up to 3 PDBs (since 19c) | CDB/PDB: up to 3 user-created PDBs per CDB without a license; more require the Multitenant option (~$17,500) |
| **Spatial / Graph** | ~$5,750 | Geospatial data, property graph |
| **Label Security** | ~$5,750 | Row-level security with classifications |
### Oracle Cloud (OCI) licensing
| Service | Model | Price | Note |
|---------|-------|-------|------|
| **OCI Base Database (RDS-like)** | BYOL or License Included | ~$1-5/hour (BYOL cheaper) | Single instance or RAC, automatic backup, patching |
| **OCI Exadata Database Service** | BYOL or License Included | ~$5-30/hour (depending on shape) | Exadata X9M/X10M in OCI, elastic, full Exadata features |
| **OCI Autonomous Database** | Per CPU (ECPU) | ~$0.50-3.00/ECPU/hour | Auto-tuning, auto-scaling, auto-patching |
| **BYOL (Bring Your Own License)** | Own Oracle license in OCI | Infrastructure only | Can use an existing perpetual license, including support |
### RAC sizing: license cost
```text
4-node RAC, each node 2× EPYC 9654 (96C) = 192 cores per node
Core factor 0.5 → 96 Oracle licenses per node
4 × 96 = 384 Oracle EE licenses
EE: 384 × $47,500 = $18,240,000
RAC option: 384 × $23,000 = $8,832,000
Support 22% annually: ($18,240,000 + $8,832,000) × 0.22 = $5,955,840/year
Tip: For RAC, consider smaller CPUs (e.g., 64C instead of 96C); license cost often exceeds hardware cost.
```
### Oracle vs PostgreSQL: comparison
| Area | Oracle | PostgreSQL |
|------|--------|------------|
| **License** | Proprietary (per core, ~$17.5k-47.5k/core + 22% support annually) | Open source (PostgreSQL license, MIT-like) |
| **RAC clustering** | Native, shared-everything | None (Citus = shared-nothing) |
| **Multitenant** | CDB/PDB architecture | None (schemas per tenant) |
| **Parallel execution** | Mature (auto DOP, parallel index scan) | Good (parallel seq/index scan, join) |
| **Storage management** | ASM (integrated) | OS volume / LVM |
| **Materialized views** | With refresh on commit, query rewrite | No query rewrite |
| **Partitioning** | 40+ options (interval, referential, system) | Declarative (range, list, hash since 10+) |
| **In-memory** | Columnar in SGA | Not native |
| **Standby usage** | Active Data Guard (read-only, license) | Hot standby (read-only, free) |
| **Cloud** | OCI (Oracle Cloud), AWS RDS, Azure | All clouds (native) |
## Recommendations: where Oracle is better
| Area | Oracle | Competition | Why Oracle |
|------|--------|-------------|------------|
| **License cost (4-node RAC, 384 licenses / 768 cores)** | ~$33M (1st year incl. support, EE + RAC only) | PostgreSQL: $0 | Not an Oracle advantage: 22% annual support comes on top of the license fee |
| **Vendor lock-in** | High (GoldenGate migration difficult, PL/SQL specific) | PostgreSQL: none | Not an Oracle advantage either: MySQL and PG have migration tools from Oracle (ora2pg, AWS DMS) |
| **Enterprise OLTP** | RAC + ASM, zero-downtime patching | MSSQL (FCI is active-passive; 2 nodes in Standard Edition) | Shared-everything cluster, transparent failover |
| **Finance / Banking** | Audit Vault, Database Vault, TDE, VPD | PG (pgaudit, row-level security) | Compliance certifications (SOX, PCI, GDPR) |
| **Consolidation** | Multitenant (CDB/PDB): hundreds of DBs on 1 instance | PG (schema per tenant) | Lower overhead, simpler management |
| **Data Warehouse** | Exadata Smart Scan, parallel execution, in-memory | ClickHouse (specialized) | Hybrid workload (OLTP + DW in one DB) |
| **High-end hardware** | Exadata engineered system | PG (runs on anything) | Full-stack HW+SW optimization |
| **Partitioning** | Range of options (reference, interval, system) | PG (basic) | 10+ years lead in implementation |
| **Flashback / recovery** | Flashback Database, Table, Query to any point in time | PG (PITR, point-in-time) | Faster, more granular recovery |
| **Ecosystem** | OEM, Data Pump, SQL Developer, Toad, GoldenGate | PG (pgAdmin, pg_dump, Patroni) | Decades of enterprise tooling |
### When to use Oracle
- **Critical OLTP systems**: banking, payment processing, trading
- **Enterprise consolidation**: hundreds of DBs on one RAC cluster (multitenant)
- **Regulated environments**: finance, healthcare, government (audit, security, compliance)
- **Oracle ecosystem**: E-Business Suite, PeopleSoft, Siebel, JD Edwards
- **Exadata customers**: maximum performance for hybrid workload (OLTP + DW)
- **GoldenGate replication**: heterogeneous replication (Oracle → Kafka, Oracle → PostgreSQL)
- **Cloud migration**: OCI, AWS RDS for Oracle, Oracle Database@Azure
### When to use something else
- **Startup / SME** → PostgreSQL (free, sufficient performance, no vendor lock-in)
- **Web / LAMP stack** → MySQL (simpler, cheaper, broad support)
- **Cloud-native** → Aurora, CockroachDB (architecture designed for the cloud, not an on-prem port)
- **Need only SQL** → PostgreSQL (Oracle overhead is not worth it)
- **Horizontal write scaling** → Cassandra (in RAC every node accepts writes, but write-heavy workloads contend on Cache Fusion block transfers and the shared storage)
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| Oracle Database 23ai New Features | Oracle Corporation | - | Official guide to new features: AI Vector Search, JSON Relational Duality, property graphs, schema privileges |
| Expert Oracle Database Architecture (3rd ed.) | Thomas Kyte, Darl Kuhn | 978-1484249602 | Comprehensive explanation of Oracle architecture, from storage to RAC and Data Guard |
*Last revision: 2026-06-03*

178
POSTGRESQL.en.md Normal file

@@ -0,0 +1,178 @@
# 🐘 PostgreSQL
## Overview
PostgreSQL is the most advanced open-source relational database with emphasis on extensibility, SQL standards, and reliability. Developed under the PostgreSQL name since 1996 (it grew out of the POSTGRES project started at Berkeley in 1986), with a strong community and an active release cycle (a major version every year).
## Architecture
### Process model
```text
Postmaster (supervisor)
├── Backend process (1 per connection)
├── WAL writer
├── Checkpointer
├── Autovacuum launcher
├── Stats collector (removed in PostgreSQL 15; statistics now live in shared memory)
├── Logical replication launcher
└── Archiver (WAL archiving)
```
Each connection = its own OS process (not thread). Advantage: isolation, stability. Disadvantage: higher memory footprint with thousands of connections → connection pooler required (PgBouncer).
### MVCC (Multi-Version Concurrency Control)
Each transaction sees a snapshot of data from the moment it started. Old row versions (tuples) remain in the table:
- INSERT creates a new tuple with `xmin = current_xid`
- DELETE marks tuple with `xmax = current_xid` (doesn't disappear immediately)
- UPDATE = DELETE old + INSERT new
- VACUUM physically removes dead tuples that are no longer visible to any active snapshot
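The `xmin`/`xmax` rules above can be illustrated with a toy visibility check. This is a deliberately simplified sketch: a snapshot is modeled as just the set of transaction IDs committed when it was taken, and the class and function names are made up. Real PostgreSQL visibility also handles in-progress and aborted transactions, hint bits, and command IDs.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class RowVersion:
    value: str
    xmin: int                    # transaction that created this version
    xmax: Optional[int] = None   # transaction that deleted/replaced it, if any


def visible(row: RowVersion, committed: Set[int]) -> bool:
    """Visible if the creator committed in our snapshot and no committed deleter exists."""
    return row.xmin in committed and (row.xmax is None or row.xmax not in committed)


# xid 100 inserts a row; xid 105 updates it (marks the old version, adds a new one).
old = RowVersion("price=10", xmin=100, xmax=105)
new = RowVersion("price=12", xmin=105)

snapshot_before_update = {100}       # xid 105 had not committed yet
snapshot_after_update = {100, 105}

# The old version becomes removable by VACUUM once no active snapshot can see it.
removable = not any(visible(old, s) for s in [snapshot_after_update])
```

A transaction holding `snapshot_before_update` keeps seeing the old version, which is why long-running transactions hold back VACUUM.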
### VACUUM and autovacuum
| Parameter | Description | Default |
|-----------|-------------|---------|
| `autovacuum_vacuum_threshold` | Min. dead rows to trigger vacuum | 50 |
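| `autovacuum_vacuum_scale_factor` | Fraction of table rows added to the threshold | 0.2 (20%) |
| `autovacuum_analyze_threshold` | Min. changed rows for ANALYZE | 50 |
| `autovacuum_vacuum_cost_limit` | Caps the I/O cost of vacuum per round (throttles load) | -1 (falls back to `vacuum_cost_limit` = 200) |
| `autovacuum_naptime` | Interval between autovacuum runs on a database | 1 min |
| `deadlock_timeout` | Wait on a lock before deadlock detection runs (a lock setting, not an autovacuum one) | 1 s |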
**Signs of insufficient vacuum**: table growth (bloat), degraded index scan performance, XID wraparound hazard.
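The two threshold parameters combine into one trigger condition: autovacuum processes a table once its dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × number of rows`. A small sketch of that formula, with hypothetical function names and the defaults from the table:

```python
def autovacuum_threshold(reltuples, base_threshold=50, scale_factor=0.2):
    """Dead-tuple count above which autovacuum picks up the table."""
    return base_threshold + scale_factor * reltuples


def needs_vacuum(dead_tuples, reltuples, **settings):
    """True once dead tuples exceed the computed threshold."""
    return dead_tuples > autovacuum_threshold(reltuples, **settings)
```

For a table of a million rows the default threshold is roughly 200,050 dead tuples, which is why large, hot tables often get a lower per-table scale factor (for example 0.01).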
### WAL (Write-Ahead Log)
Append-only log of all changes for crash recovery and replication:
```conf
wal_level = replica # or logical
archive_mode = on
archive_command = 'aws s3 cp %p s3://backups/pg-wal/%f'
```
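**PITR (Point-In-Time Recovery)**:
1. Restore the base backup (taken with `pg_basebackup`)
2. Set `restore_command` and the recovery target, e.g. `recovery_target_time = '2026-06-03 10:30:00 UTC'`
3. Start the server in recovery mode; it replays the WAL archives up to the target time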
### Replication slots
- **Physical**: guarantees the primary does not remove WAL until the replica has consumed it
- **Logical**: for logical replication (selective tables, data transformation)
- **Risk**: if the replica is down, the slot keeps retaining WAL on the primary until the disk fills up; `max_slot_wal_keep_size` (PostgreSQL 13+) can cap this
- Monitoring: `pg_replication_slots`, `pg_stat_replication`
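For monitoring, the amount of WAL a slot is holding back is the distance between the current WAL position and the slot's `restart_lsn`. An LSN is printed as two hexadecimal numbers, `high/low`, forming a 64-bit byte position. The helper names below are hypothetical; on the server side `pg_wal_lsn_diff()` does the same subtraction.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """WAL bytes the primary must keep for a slot that is at restart_lsn."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)
```

An alert on this value growing steadily catches a stalled replica well before the disk fills.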
### Configuration
Main files (per Obe & Hsu):
- `postgresql.conf` — memory, network, logging, storage
- `pg_hba.conf` — access privileges
- `pg_ident.conf` — OS user to PostgreSQL role mapping
### AI-Ready PostgreSQL 18
(Kumar, Linster, 2026) — PostgreSQL 18 as a unified platform for transactions, analytics, and AI:
| Area | Technique |
|------|-----------|
| Vectors | pgvector — embeddings directly in table rows |
| Hybrid pattern | Semantic recall → SQL filtering |
| LLM integration | PostgreSQL + MCP (Model Context Protocol) |
| Embedding pipeline | Batch and stream embedding generation |
**Hybrid query**:
```sql
-- '[0.1, 0.3, ...]' stands for the query embedding; <-> is pgvector's L2 distance
SELECT p.*
FROM products p
JOIN product_embeddings pe ON p.id = pe.product_id
WHERE pe.embedding <-> '[0.1, 0.3, ...]' < 0.8
AND p.in_stock = true
AND p.price < 100.00
ORDER BY pe.embedding <-> '[0.1, 0.3, ...]'
LIMIT 10;
```
### Extensions
| Extension | Purpose |
|-----------|---------|
| pgvector | Vector search for AI/embeddings |
| PostGIS | Geographic data, spatial queries |
| pg_stat_statements | Query performance monitoring |
| pg_duckdb | Analytical queries (DuckDB engine inside PG) |
| pg_search | Full-text and hybrid search |
| pg_cron | DB job scheduling |
| citus | Horizontal scaling (sharding) |
| timescaledb | Time-series optimization |
| pgaudit | Audit logging |
## Connection pooling
| Pooler | Type | Protocol |
|--------|------|----------|
| PgBouncer | Proxy (transaction/session) | PostgreSQL wire |
| Odyssey | Proxy (multithreaded) | PostgreSQL wire |
| pgpool-II | Proxy (replication, load balancing) | PostgreSQL wire |
| RDS Proxy | Managed proxy (AWS) | PostgreSQL wire |
**PgBouncer modes**:
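- **Session pooling**: a server connection is held for the entire client session → little saving over direct connections
- **Transaction pooling**: the server connection is returned to the pool after each transaction → more efficient, but session state (`SET`, advisory locks, `LISTEN`) does not carry over between transactions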
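The idea behind transaction pooling can be shown with a toy pool in which many clients borrow a few server connections for the duration of one transaction each. This is an illustration of the concept only, not how PgBouncer is implemented; the class name and the string stand-ins for connections are made up.

```python
import queue
from contextlib import contextmanager


class TransactionPool:
    """Hands out a server connection per transaction instead of per client session."""

    def __init__(self, connections):
        self._idle = queue.Queue()
        for conn in connections:
            self._idle.put(conn)

    @contextmanager
    def transaction(self):
        conn = self._idle.get()       # blocks while all server connections are busy
        try:
            yield conn
        finally:
            self._idle.put(conn)      # released at transaction end, not at disconnect


pool = TransactionPool(["server-conn-1", "server-conn-2"])
used = []
for client in range(5):               # five clients share two server connections
    with pool.transaction() as conn:
        used.append(conn)
```

Because a client may get a different server connection for its next transaction, anything stored on the session is lost, which is the statelessness requirement noted above.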
## Recommendations — where PostgreSQL is better
| Area | PostgreSQL | Competition | Why PG |
|------|-----------|-------------|--------|
| **Extensibility** | Extensions, custom types, operators, index methods | MySQL limited | Can add anything from vectors to full-text in DB |
| **SQL standard** | Closest to ANSI SQL | MySQL deviations (GROUP BY, ALTER TABLE) | Portability, fewer surprises |
| **Geospatial data** | PostGIS (gold standard GIS) | MySQL GIS (limited) | Only real open-source choice for GIS |
| **Consistency** | SSI serializable, foreign keys, CHECK, exclusion constraints | MySQL: MyISAM has no FKs or transactions; InnoDB enforces CHECK only since 8.0.16 and has no exclusion constraints | Suitable for financial and critical systems |
| **Concurrent read/write** | MVCC without reader/writer blocking | MySQL: MyISAM uses table-level locks; InnoDB also offers MVCC | Clear advantage over MyISAM, roughly on par with InnoDB |
| **AI/vectors** | pgvector natively in DB | Separate vector DB (increased latency) | Hybrid queries in single SQL |
| **License** | PostgreSQL license (MIT-like) | MySQL dual license (Oracle) | No vendor lock-in |
### When to use PostgreSQL
- **Enterprise applications** — require ACID, referential integrity, complex transactions
- **Geographic systems** — GIS, map applications, location services
- **Financial systems** — accounting, banking, compliance (audit logging, SSI)
- **AI / RAG applications** — hybrid vector + relational queries in one DB
- **Analytics on relational data** — pg_duckdb, materialized views, window functions
- **Multi-tenant applications** — row-level security, schemas per tenant
## PostgreSQL licensing
| Variant | License | Price | Restrictions |
|---------|---------|-------|-------------|
| **PostgreSQL** | PostgreSQL license (MIT-like) | $0 | None — can use, modify, distribute in commercial products. No "commercial license" needed |
| **Amazon Aurora PostgreSQL** | Proprietary (AWS) | ~$0.10-1.00/hour | AWS managed, PostgreSQL compatible. AWS may use PG code thanks to PostgreSQL license |
| **YugabyteDB** | Apache 2.0 | $0 (core) | PostgreSQL compatible distributed SQL, built on PG query layer |
| **TimescaleDB** | Apache 2.0 (community) / Timescale License (enterprise) | $0 (community) | Time-series extensions for PostgreSQL. Enterprise: tiered storage, compression, multi-node |
**Key point**: The PostgreSQL license is one of the most liberal: it allows cloud providers (AWS, GCP, Azure) to offer PostgreSQL as a managed service without restrictions. This differs from MongoDB (SSPL) and Redis (RSALv2), and it is a major reason PostgreSQL is available as a managed service on all major clouds.
**Impact on choice**: No license fees and no license-driven vendor lock-in. From a licensing standpoint PostgreSQL is a low-risk choice; operational costs (administration, HA, backups) still apply.
### When to use something else
- **Simple web / blog** → SQLite (lighter in embedded scenarios)
- **High-throughput key-value** → Redis (order of magnitude lower latency)
- **Time-series at massive scale** → TimescaleDB, InfluxDB
- **Globally distributed data** → CockroachDB, Spanner
- **Full-text search primarily** → Elasticsearch
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| PostgreSQL: Up and Running (3rd ed.) | Regina Obe, Leo Hsu | 978-1491962935 | Practical guide to administration, configuration, and extensions |
| AI-Ready PostgreSQL 18 | Kumar, Linster | — | PostgreSQL as unified platform for AI workloads |
*Last revision: 2026-06-03*

178
POSTGRESQL.md Normal file
View File

@@ -0,0 +1,178 @@
# 🐘 PostgreSQL
## Overview
PostgreSQL bills itself as the most advanced open-source relational database, with an emphasis on extensibility, SQL standards compliance, and reliability. It has been developed under this name since 1996, has a strong community, and follows an active release cycle (one major version per year).
## Architecture
### Process model
```text
Postmaster (supervisor)
├── Backend process (1 per connection)
├── WAL writer
├── Checkpointer
├── Autovacuum launcher
├── Stats collector (up to PostgreSQL 14; from 15 on, statistics live in shared memory)
├── Logical replication launcher
└── Archiver (WAL archiving)
```
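Each connection gets its own OS process (not a thread). Advantage: isolation and stability. Disadvantage: a higher memory footprint with thousands of connections, which makes a connection pooler (PgBouncer) necessary.
### MVCC (Multi-Version Concurrency Control)
Every query runs against a snapshot of the data. Under the default READ COMMITTED level the snapshot is taken at the start of each statement; under REPEATABLE READ and SERIALIZABLE it is taken at the first statement of the transaction. Old row versions (tuples) stay in the table:
- INSERT creates a new tuple with `xmin = current_xid`
- DELETE marks the tuple with `xmax = current_xid` (it does not disappear immediately)
- UPDATE = DELETE old + INSERT new
- VACUUM reclaims tuples that are no longer visible to the oldest active snapshot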
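The `xmin`/`xmax` rules above can be illustrated with a toy model. This is a deliberately simplified sketch, not PostgreSQL's actual visibility algorithm: it assumes every transaction with an XID lower than the snapshot has committed, and the names `RowVersion`, `is_visible`, and `visible_values` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RowVersion:
    value: str
    xmin: int                   # XID of the transaction that inserted this version
    xmax: Optional[int] = None  # XID of the transaction that deleted/updated it


def is_visible(row: RowVersion, snapshot_xid: int) -> bool:
    """Toy rule: every XID below snapshot_xid counts as committed."""
    inserted = row.xmin < snapshot_xid
    deleted = row.xmax is not None and row.xmax < snapshot_xid
    return inserted and not deleted


# An UPDATE by transaction 105: the old version gets xmax, the new one gets xmin.
old = RowVersion("price=10", xmin=100, xmax=105)
new = RowVersion("price=12", xmin=105)


def visible_values(snapshot_xid: int) -> list:
    return [r.value for r in (old, new) if is_visible(r, snapshot_xid)]


print(visible_values(103))  # snapshot taken before the UPDATE
print(visible_values(110))  # snapshot taken after the UPDATE
```

Both versions coexist in the table until VACUUM removes the old one, which is why long-running snapshots delay cleanup.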
### VACUUM and autovacuum
| Parameter | Description | Default |
|-----------|-------------|---------|
| `autovacuum_vacuum_threshold` | Minimum number of dead rows before a vacuum is triggered | 50 |
| `autovacuum_vacuum_scale_factor` | Fraction of the table size added to the threshold | 0.2 (20 %) |
| `autovacuum_analyze_threshold` | Minimum number of changed rows before ANALYZE | 50 |
| `autovacuum_vacuum_cost_limit` | Limits vacuum I/O (load prevention) | -1, which falls back to `vacuum_cost_limit` (200) |
| `autovacuum_naptime` | Interval between checks | 1 min |
| `deadlock_timeout` | Wait before deadlock detection runs (a lock setting, not an autovacuum one) | 1 s |
**Symptoms of insufficient vacuuming**: table growth (bloat), slower index scans, risk of XID wraparound.
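How the two vacuum parameters combine is easy to miss: autovacuum processes a table once its dead rows exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`. A minimal sketch of that arithmetic follows; the function name and the sample table sizes are illustrative.

```python
def autovacuum_trigger(reltuples: int,
                       threshold: int = 50,
                       scale_factor: float = 0.2) -> float:
    """Number of dead rows after which autovacuum processes the table."""
    return threshold + scale_factor * reltuples


for rows in (1_000, 1_000_000, 100_000_000):
    default = autovacuum_trigger(rows)
    tuned = autovacuum_trigger(rows, scale_factor=0.01)  # example per-table override
    print(f"{rows:>11,} rows: default {default:>12,.0f} dead rows, "
          f"scale_factor=0.01 {tuned:>10,.0f}")
```

With the defaults, a table of 100 million rows accumulates about 20 million dead rows before it is vacuumed, which is why large tables are often given a lower per-table scale factor.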
### WAL (Write-Ahead Log)
Append-only log of all changes, used for crash recovery and replication:
```conf
wal_level = replica # or logical
archive_mode = on
archive_command = 'aws s3 cp %p s3://backups/pg-wal/%f'
```
**PITR (Point-In-Time Recovery)**:
1. Restore a base backup (taken with `pg_basebackup`)
2. Set the recovery target, e.g. `recovery_target_time = '2026-06-03 10:30:00 UTC'`
3. Start the server so that it replays the archived WAL up to the target time
### Replication slots
- **Physical**: guarantees that the primary does not remove WAL until the replica has consumed it
- **Logical**: for logical replication (selected tables, data transformation)
- **Risk**: if a replica goes down, retained WAL keeps growing on the primary's disk (disk full); `max_slot_wal_keep_size` (PostgreSQL 13+) caps the amount retained
- Monitoring: `pg_replication_slots`, `pg_stat_replication`
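When monitoring slots, the useful number is how much WAL a slot is holding back. An LSN such as `16/B374D848` is a 64-bit position written as two hexadecimal halves, so the retained volume is a plain subtraction. The sketch below assumes the two values were read from `pg_current_wal_lsn()` and the `restart_lsn` column of `pg_replication_slots`; inside the database, `pg_wal_lsn_diff()` does the same job. The helper names are hypothetical.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN ('hi/lo', both hexadecimal) to a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """WAL volume the primary must keep for a slot, in bytes."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


# Example values only: a slot that is 16 MiB behind the current write position.
lag = retained_wal_bytes("1/0", "0/FF000000")
print(f"slot retains {lag / 1024**2:.0f} MiB of WAL")
```

Alerting on this value catches a stalled replica before the primary's disk fills up.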
### Configuration
Main files (per Obe & Hsu):
- `postgresql.conf`: memory, network, logging, storage
- `pg_hba.conf`: client authentication and access rules
- `pg_ident.conf`: mapping of OS users to PostgreSQL roles
### AI-Ready PostgreSQL 18
(Kumar, Linster, 2026) PostgreSQL 18 as a unified platform for transactions, analytics, and AI:
| Area | Technique |
|------|-----------|
| Vectors | pgvector: embeddings stored directly in table rows |
| Hybrid pattern | Semantic recall → SQL filtering |
| LLM integration | PostgreSQL + MCP (Model Context Protocol) |
| Embedding pipeline | Batch and stream generation of embeddings |
**Hybrid query**:
```sql
SELECT p.*
FROM products p
JOIN product_embeddings pe ON p.id = pe.product_id
WHERE pe.embedding <-> '[0.1, 0.3, ...]' < 0.8
AND p.in_stock = true
AND p.price < 100.00
ORDER BY pe.embedding <-> '[0.1, 0.3, ...]'
LIMIT 10;
```
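To make the semantics of the query concrete: relational predicates narrow the candidates, pgvector's `<->` operator computes the Euclidean (L2) distance, and the result is ordered by that distance. The following pure-Python sketch mirrors that logic on made-up data. The product rows, the two-dimensional vectors, and the function name are illustrative assumptions; a real system would run the SQL above against an indexed table.

```python
import math

# Made-up sample data standing in for products joined with product_embeddings.
products = [
    {"id": 1, "in_stock": True,  "price": 40.0,  "embedding": [0.1, 0.3]},
    {"id": 2, "in_stock": True,  "price": 150.0, "embedding": [0.1, 0.3]},
    {"id": 3, "in_stock": False, "price": 20.0,  "embedding": [0.2, 0.3]},
    {"id": 4, "in_stock": True,  "price": 60.0,  "embedding": [0.5, 0.7]},
    {"id": 5, "in_stock": True,  "price": 10.0,  "embedding": [2.0, 2.0]},
]


def hybrid_search(query, max_distance=0.8, max_price=100.0, limit=10):
    """Relational filter, then L2-distance cutoff and ranking."""
    hits = [
        (math.dist(p["embedding"], query), p["id"])   # same metric as <->
        for p in products
        if p["in_stock"] and p["price"] < max_price   # relational predicates
    ]
    ranked = sorted(hit for hit in hits if hit[0] < max_distance)
    return [product_id for _, product_id in ranked][:limit]


print(hybrid_search([0.1, 0.3]))
```

Product 2 is dropped by the price filter, product 3 by the stock filter, and product 5 by the distance cutoff.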
### Extensions
| Extension | Purpose |
|-----------|---------|
| pgvector | Vector search for AI/embeddings |
| PostGIS | Geographic data, spatial queries |
| pg_stat_statements | Query performance monitoring |
| pg_duckdb | Analytical queries (DuckDB engine inside PG) |
| pg_search | Full-text and hybrid search |
| pg_cron | Job scheduling inside the DB |
| citus | Horizontal scaling (sharding) |
| timescaledb | Time-series optimization |
| pgaudit | Audit logging |
## Connection pooling
| Pooler | Type | Protocol |
|--------|------|----------|
| PgBouncer | Proxy (transaction/session) | PostgreSQL wire |
| Odyssey | Proxy (multithreaded) | PostgreSQL wire |
| pgpool-II | Proxy (replication, load balancing) | PostgreSQL wire |
| RDS Proxy | Managed proxy (AWS) | PostgreSQL wire |
**PgBouncer modes**:
- **Session pooling**: the server connection is held for the whole client session → overhead
- **Transaction pooling**: the server connection is returned to the pool after each transaction → more efficient (requires that the application keeps no session state between transactions)
## Recommendations: where PostgreSQL is stronger
| Area | PostgreSQL | Competition | Why PG |
|------|-----------|-------------|--------|
| **Extensibility** | Extensions, custom types, operators, index methods | MySQL: limited | Anything from vectors to full-text search can be added inside the DB |
| **SQL standard** | Closest to ANSI SQL | MySQL deviations (GROUP BY, ALTER TABLE) | Portability, fewer surprises |
| **Geospatial data** | PostGIS (the gold standard for GIS) | MySQL GIS (limited) | The de facto open-source choice for GIS |
| **Consistency** | SSI serializable, foreign keys, CHECK, exclusion constraints | MySQL MyISAM has no foreign keys or transactions; InnoDB implements SERIALIZABLE with locking rather than SSI | Suitable for financial and critical systems |
| **Concurrent read/write** | MVCC: readers do not block writers and writers do not block readers | Older MySQL versions defaulted to MyISAM, whose table-level locks make readers and writers block each other (InnoDB uses MVCC as well) | Better read scalability |
| **AI/vectors** | pgvector natively in DB | Separate vector DB (increased latency) | Hybrid queries in a single SQL statement |
| **License** | PostgreSQL license (MIT-like) | MySQL dual license (Oracle) | No vendor lock-in |
### When to use PostgreSQL
- **Enterprise applications**: require ACID, referential integrity, complex transactions
- **Geographic systems**: GIS, map applications, location services
- **Financial systems**: accounting, banking, compliance (audit logging, SSI)
- **AI / RAG applications**: hybrid vector + relational queries in one DB
- **Analytics on relational data**: pg_duckdb, materialized views, window functions
- **Multi-tenant applications**: row-level security, schemas per tenant
## PostgreSQL licensing
| Variant | License | Price | Restrictions |
|---------|---------|-------|-------------|
| **PostgreSQL** | PostgreSQL license (MIT-like) | $0 | None: can be used, modified, and distributed in commercial products. No "commercial license" needed |
| **Amazon Aurora PostgreSQL** | Proprietary (AWS) | ~$0.10-1.00/hour | AWS managed, PostgreSQL compatible. AWS may use PG code thanks to the PostgreSQL license |
| **YugabyteDB** | Apache 2.0 | $0 (core) | PostgreSQL-compatible distributed SQL, built on the PG query layer |
| **TimescaleDB** | Apache 2.0 (community) / Timescale License (enterprise) | $0 (community) | Time-series extensions for PostgreSQL. Enterprise: tiered storage, compression, multi-node |
**Key point**: The PostgreSQL license is one of the most liberal: it allows cloud providers (AWS, GCP, Azure) to offer PostgreSQL as a managed service without restrictions. This is different from MongoDB (SSPL) and Redis (RSALv2). It is one reason why all major cloud providers offer managed PostgreSQL.
**Impact on choice**: No license fees, no license risk, and no licensing lock-in to a single vendor. From a licensing standpoint, PostgreSQL is a low-risk choice for any project.
### When to use something else
- **Simple web / blog** → SQLite (lighter in embedded scenarios)
- **High-throughput key-value** → Redis (order of magnitude lower latency)
- **Time-series at massive scale** → TimescaleDB, InfluxDB
- **Globally distributed data** → CockroachDB, Spanner
- **Full-text search primarily** → Elasticsearch
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| PostgreSQL: Up and Running (3rd ed.) | Regina Obe, Leo Hsu | 978-1491962935 | Practical guide to administration, configuration, and extensions |
| AI-Ready PostgreSQL 18 | Kumar, Linster | — | PostgreSQL as unified platform for AI workloads |
*Last revision: 2026-06-03*

228
PROVISIONING.en.md Normal file
View File

@@ -0,0 +1,228 @@
# 📦 Provisioning — boot, installation, server management
## Network boot (PXE / iPXE)
### PXE boot flow
```
1. Server power-on → PXE ROM in NIC / UEFI
2. DHCP Broadcast → DHCP server offers IP + next-server (TFTP) + boot file
3. TFTP downloads pxelinux.0 (BIOS) / bootx64.efi (UEFI)
4. Loads configuration (pxelinux.cfg/default or MAC/IP-based)
5. Downloads kernel + initrd via TFTP/HTTP (iPXE)
6. Kernel boot → automated installation (Kickstart / Preseed / AutoYaST)
```
### DHCP configuration (ISC DHCP)
```
subnet 10.0.0.0 netmask 255.255.255.0 {
  next-server 10.0.0.10;                  # TFTP server
  filename "ipxe.efi";                    # Boot file (UEFI)
  option domain-name-servers 10.0.0.10;
  option routers 10.0.0.1;
}
```
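A single static `filename` has two practical limits: BIOS and UEFI clients need different binaries, and an iPXE client that is handed `ipxe.efi` again will chainload itself in a loop. DHCP servers usually handle both by inspecting the client architecture (DHCP option 93, RFC 4578) and the user class, which iPXE sets to `iPXE`. The sketch below only models that decision in Python. The function name and the script URL are hypothetical; `undionly.kpxe` is iPXE's usual BIOS chainload image.

```python
BIOS_ARCHES = {0}         # option 93 value 0: x86 PC with legacy BIOS
UEFI_X64_ARCHES = {7, 9}  # values 7 and 9 are both seen for 64-bit UEFI


def boot_file(arch: int, user_class: str = "") -> str:
    """Pick the boot file a DHCP server would offer to a PXE client."""
    if user_class == "iPXE":
        # Second DHCP round: iPXE is already running, so hand it a script.
        return "http://10.0.0.10/boot.ipxe"  # hypothetical script URL
    if arch in UEFI_X64_ARCHES:
        return "ipxe.efi"
    if arch in BIOS_ARCHES:
        return "undionly.kpxe"
    raise ValueError(f"unsupported client architecture: {arch}")


print(boot_file(7))          # UEFI firmware, first request
print(boot_file(7, "iPXE"))  # same machine after iPXE has loaded
print(boot_file(0))          # legacy BIOS client
```

In ISC DHCP the same decision is typically written with `if`/`else` conditions on `option user-class` and the client architecture option.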
### iPXE (modern PXE replacement)
- HTTP instead of TFTP (faster, more reliable)
- HTTPS support (Image verification, secure boot)
- iSCSI boot, FCoE boot
- Scriptable: `chain http://boot.example.com/script.ipxe`
- Embedded: iPXE ROM flashed directly into NIC
### PXE vs iPXE comparison
| Feature | PXE | iPXE |
|---------|-----|------|
| Protocol | TFTP (slow, 512B/block) | HTTP/HTTPS/iSCSI |
| Encryption | No | HTTPS, TLS |
| Scripting | Menu only | Full scripting engine |
| Debugging | Limited | Built-in shell |
| UEFI/BIOS | Both | Both |
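Why the 512-byte block size makes TFTP slow: classic TFTP (RFC 1350) is lock-step, so every data block must be acknowledged before the next one is sent, and each block costs one network round trip. A rough estimate of the transfer time follows. It ignores packet loss and the windowsize extension; the file size and round-trip time are example values, and the function names are illustrative.

```python
def tftp_blocks(file_bytes: int, block_size: int = 512) -> int:
    """Blocks needed; the transfer ends with a block shorter than block_size."""
    return file_bytes // block_size + 1


def tftp_transfer_seconds(file_bytes: int, rtt_seconds: float,
                          block_size: int = 512) -> float:
    """Lock-step estimate: one round trip per data block."""
    return tftp_blocks(file_bytes, block_size) * rtt_seconds


initrd = 64 * 1024 * 1024  # example: 64 MiB initrd
rtt = 0.001                # example: 1 ms round trip

for size in (512, 1428):   # default block size vs. a negotiated larger one
    seconds = tftp_transfer_seconds(initrd, rtt, size)
    print(f"block size {size:>4} B: about {seconds:.0f} s")
```

HTTP over TCP keeps many segments in flight per round trip, which is the main reason iPXE downloads of kernels and initrds are so much faster.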
## Automated installation
### Kickstart (RHEL/Alma/Rocky)
```
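# Minimal kickstart for RHEL 9
text
url --url="http://10.0.0.10/install/rhel9"
lang en_US.UTF-8
keyboard us
# --utc is the current spelling; --isUtc is deprecated in recent releases
timezone Europe/Prague --utc
rootpw --iscrypted $6$...
# Unattended partitioning: wipes all disks, adjust before real use
zerombr
clearpart --all --initlabel
autopart
reboot
%packages
@^minimal-environment
vim
net-tools
%end
%post
echo "node001" > /etc/hostname
%end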
```
### Preseed (Debian/Ubuntu)
```
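d-i debian-installer/locale string en_US.UTF-8
d-i keyboard-configuration/xkb-keymap us
d-i netcfg/choose_interface select auto
d-i netcfg/get_hostname string node001
d-i clock-setup/utc boolean true
d-i time/zone string Europe/Prague
d-i partman-auto/method string regular
d-i partman-auto/choose_recipe select atomic
# Confirm partitioning without prompting (wipes the target disk)
d-i partman-partitioning/confirm_write_new_label boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
d-i partman/confirm_nooverwrite boolean true
d-i passwd/root-login boolean true
# Plaintext placeholder; prefer passwd/root-password-crypted with a crypt(3) hash
d-i passwd/root-password password securepass
d-i passwd/root-password-again password securepass
d-i pkgsel/include string openssh-server vim
d-i finish-install/reboot_in_progress note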
```
## Metal as a Service
### MAAS (Canonical)
- **Discovery**: DHCP → PXE boot → hardware detection (CPU, RAM, disk, MAC)
- **Commissioning**: node goes through commissioning, stores inventory in DB
- **Deploy**: OS image (Ubuntu, RHEL, ESXi) written to disk → reboot
- **Integration**: Juju, OpenStack, Kubernetes (Charmed Kubernetes)
- **Networking**: VLAN, subnet, DNS/DHCP management, BGP peering
### Digital Rebar / RackN
- **Provisioning**: workflow-based (stages: discovery → firmware → OS → config)
- **Multi-cloud**: bare metal + cloud + edge
- **Template**: templates for OS deployment (RHEL, Ubuntu, VMware)
- **API**: full REST API, Terraform provider
## Management API — Redfish
### DMTF Standard
A RESTful API over HTTPS with JSON payloads, standardized by DMTF as the successor to IPMI.
| Endpoint | Purpose |
|----------|---------|
| `/redfish/v1/Systems/` | Server management (power, boot, inventory) |
| `/redfish/v1/Chassis/` | Physical hardware (PSU, fan, temp, sensors) |
| `/redfish/v1/Managers/` | BMC (iLO, iDRAC, XClarity) |
| `/redfish/v1/UpdateService/` | Firmware updates |
| `/redfish/v1/EventService/` | Event subscription (webhook) |
### Redfish examples
```
# Resource IDs are vendor-specific: "1" here, e.g. "System.Embedded.1" on Dell iDRAC
# Power on server
POST /redfish/v1/Systems/1/Actions/ComputerSystem.Reset
Body: {"ResetType": "On"}
# Set boot override (one-shot PXE)
PATCH /redfish/v1/Systems/1
Body: {"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}
# Get sensor data (newer schema versions expose ThermalSubsystem instead of Thermal)
GET /redfish/v1/Chassis/1/Thermal
→ {"Temperatures": [{"Name": "CPU1", "ReadingCelsius": 45}], "Fans": [...]}
```
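The requests above translate directly into code. The following sketch builds them with Python's standard library only and does not send anything. The BMC address is hypothetical, the system ID `1` is taken from the examples, and the helper name `redfish_request` is an assumption. A real client would also need authentication (HTTP Basic or a Redfish session token) and TLS verification settings appropriate for the BMC's certificate.

```python
import json
import urllib.request

BMC = "https://bmc.example.com"  # hypothetical BMC address


def redfish_request(path, method="GET", body=None):
    """Build (but do not send) a Redfish request with a JSON body."""
    data = json.dumps(body).encode() if body is not None else None
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BMC + path, data=data,
                                  method=method, headers=headers)


power_on = redfish_request(
    "/redfish/v1/Systems/1/Actions/ComputerSystem.Reset",
    method="POST",
    body={"ResetType": "On"},
)
pxe_once = redfish_request(
    "/redfish/v1/Systems/1",
    method="PATCH",
    body={"Boot": {"BootSourceOverrideTarget": "Pxe",
                   "BootSourceOverrideEnabled": "Once"}},
)
thermal = redfish_request("/redfish/v1/Chassis/1/Thermal")

for req in (power_on, pxe_once, thermal):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would perform the call; setting the one-shot PXE override and then resetting the system is the usual way to trigger a reinstall.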
### IPMI (legacy)
- Port 623/UDP (RMCP)
- `ipmitool power on/off/status`
- `ipmitool sensor list`
- `ipmitool chassis bootdev pxe`
- Serial over LAN: `ipmitool sol activate`
## Terraform for provisioning
```hcl
# Terraform provider for VMware vSphere
provider "vsphere" {
  user           = var.vsphere_user
  password       = var.vsphere_password
  vsphere_server = var.vsphere_server
}

# The data.vsphere_* lookups referenced below must be declared elsewhere
resource "vsphere_virtual_machine" "web" {
  count            = 2 # required for count.index; example value
  name             = "web-${count.index}"
  resource_pool_id = data.vsphere_resource_pool.pool.id
  datastore_id     = data.vsphere_datastore.ds.id
  num_cpus         = 4
  memory           = 16384 # MB
  guest_id         = "rhel9_64Guest"

  network_interface {
    network_id = data.vsphere_network.net.id
  }

  disk {
    label = "os"
    size  = 80 # GB
  }
}
```
More in [CICD.md](CICD.md#infrastructure-as-code-iac).
## Firmware management
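- **BIOS/UEFI settings**: profile update during provisioning (Redfish `PATCH /redfish/v1/Systems/1/Bios/Settings`, applied on the next reboot)
- **Firmware updates**: Redfish UpdateService, SUU (Dell Server Update Utility), SUM (HPE Smart Update Manager), SUM (Supermicro Update Manager)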
- **Lifecycle Controller** (Dell LC): integrated OS for firmware management
- **Baseline management**: maintain consistent firmware versions across fleet
- **Boot: UEFI vs Legacy BIOS**:
  - **UEFI**: Secure Boot, GPT, larger disks, faster boot
  - **Legacy BIOS**: MBR, compatibility, 2 TB boot disk limit
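The 2 TB limit follows from the MBR format itself: partition start and length are stored as 32-bit sector counts. A short worked calculation, with an illustrative function name:

```python
def mbr_max_bytes(sector_size: int = 512) -> int:
    """Largest size addressable with MBR's 32-bit sector fields."""
    return 2**32 * sector_size


for sector in (512, 4096):
    size = mbr_max_bytes(sector)
    print(f"{sector:>4}-byte sectors: {size / 2**40:.0f} TiB "
          f"({size / 10**12:.1f} TB)")
```

With the common 512-byte logical sectors the ceiling is 2 TiB (about 2.2 TB). GPT uses 64-bit addresses, which is why UEFI boot with GPT is the practical choice for larger boot disks.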
## Configuration management (post-provisioning)
| Tool | Language | Push/Pull | Use case |
|------|----------|-----------|----------|
| **Ansible** | YAML | Push (SSH) | General config management, ad-hoc |
| **Puppet** | Ruby DSL | Pull (agent) | State management, enterprise |
| **Chef** | Ruby DSL | Pull (agent) | Compliance, infrastructure automation |
| **SaltStack** | YAML/Python | Both (salt-minion) | High-speed config, event-driven |
More in [CICD.md](CICD.md).
## OpenStack Provisioning
OpenStack offers several methods for provisioning infrastructure:
### Deployment tools
| Tool | Description | Use case |
|------|-------------|----------|
| **TripleO (OpenStack on OpenStack)** | Deploy OpenStack using bare metal (Ironic) + Heat orchestration; retired upstream | Existing Red Hat OSP (director) installations |
| **Kolla (Ansible + Docker)** | Containerized OpenStack services, Ansible orchestration | Production, flexible |
| **Kolla-Kubernetes** | OpenStack on Kubernetes (retired upstream; OpenStack-Helm covers this niche) | Kubernetes-native, edge |
| **Charmed OpenStack (Juju)** | Canonical, Juju charms for OpenStack | Ubuntu, hybrid cloud |
| **OpenStack Charms** | Juju charms for individual services | Fine-grained deployment |
| **DevStack** | Fast development deployment | Dev/test, learning |
| **OpenStack-Ansible** | Ansible playbooks for OpenStack (OSA) | Legacy, AIO |
### Ironic (Bare Metal Provisioning)
- OpenStack service for managing and provisioning bare metal servers
- Supports PXE, iPXE, Redfish, IPMI
- Concepts: **Node** (HW), **Port** (MAC), **Driver** (HW type)
- Lifecycle (provision states): enroll → manageable → available → active, driven by the verbs `manage`, `inspect` (optional), `provide`, and `deploy`
- Integration with Nova: Nova runs instances on bare metal via Ironic
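The node lifecycle is a state machine in which each verb is only valid in certain states. A simplified sketch is shown below. It models only the stable states and leaves out the transient ones Ironic passes through (such as verifying, inspecting, cleaning, and deploying) as well as failure states. The function names are hypothetical; the verbs follow the `baremetal node` CLI.

```python
# (current state, verb) -> resulting stable state
TRANSITIONS = {
    ("enroll", "manage"): "manageable",
    ("manageable", "inspect"): "manageable",  # hardware introspection
    ("manageable", "provide"): "available",   # cleaning, then ready for Nova
    ("available", "deploy"): "active",
    ("active", "undeploy"): "available",      # tear down the instance
}


def apply_verb(state: str, verb: str) -> str:
    try:
        return TRANSITIONS[(state, verb)]
    except KeyError:
        raise ValueError(
            f"verb {verb!r} is not allowed in state {state!r}") from None


def run(verbs, state="enroll"):
    """Apply a sequence of verbs, starting from a freshly enrolled node."""
    for verb in verbs:
        state = apply_verb(state, verb)
    return state


print(run(["manage", "inspect", "provide", "deploy"]))
```

Modelling the transitions explicitly makes it obvious why, for example, a node cannot be deployed straight from `enroll`: it has to be managed and provided first.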
### Glance (Image Management)
- Image catalog for VM images and ISO
- Supported formats: raw, qcow2, vmdk, vhd, iso
- Image caching on compute node (for faster boot)
- Multi-backend: file, Ceph RBD, Swift, NFS
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-03*

228
PROVISIONING.md Normal file
View File

@@ -0,0 +1,228 @@
# 📦 Provisioning — boot, installation, server management
## Network boot (PXE / iPXE)
### PXE boot flow
```
1. Server power-on → PXE ROM in NIC / UEFI
2. DHCP Broadcast → DHCP server offers IP + next-server (TFTP) + boot file
3. TFTP downloads pxelinux.0 (BIOS) / bootx64.efi (UEFI)
4. Loads configuration (pxelinux.cfg/default or MAC/IP-based)
5. Downloads kernel + initrd via TFTP/HTTP (iPXE)
6. Kernel boot → automated installation (Kickstart / Preseed / AutoYaST)
```
### DHCP configuration (ISC DHCP)
```
subnet 10.0.0.0 netmask 255.255.255.0 {
  next-server 10.0.0.10;                  # TFTP server
  filename "ipxe.efi";                    # Boot file (UEFI)
  option domain-name-servers 10.0.0.10;
  option routers 10.0.0.1;
}
```
### iPXE (modern PXE replacement)
- HTTP instead of TFTP (faster, more reliable)
- HTTPS support (image verification, Secure Boot)
- iSCSI boot, FCoE boot
- Scriptable: `chain http://boot.example.com/script.ipxe`
- Embedded: iPXE ROM flashed directly into NIC
### PXE vs iPXE comparison
| Feature | PXE | iPXE |
|---------|-----|------|
| Protocol | TFTP (slow, 512B/block) | HTTP/HTTPS/iSCSI |
| Encryption | No | HTTPS, TLS |
| Scripting | Menu only | Full scripting engine |
| Debugging | Limited | Built-in shell |
| UEFI/BIOS | Both | Both |
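## Automated installation
### Kickstart (RHEL/Alma/Rocky)
```
# Minimal kickstart for RHEL 9
text
url --url="http://10.0.0.10/install/rhel9"
lang en_US.UTF-8
keyboard us
# --utc is the current spelling; --isUtc is deprecated in recent releases
timezone Europe/Prague --utc
rootpw --iscrypted $6$...
# Unattended partitioning: wipes all disks, adjust before real use
zerombr
clearpart --all --initlabel
autopart
reboot
%packages
@^minimal-environment
vim
net-tools
%end
%post
echo "node001" > /etc/hostname
%end
```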
### Preseed (Debian/Ubuntu)
```
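d-i debian-installer/locale string en_US.UTF-8
d-i keyboard-configuration/xkb-keymap us
d-i netcfg/choose_interface select auto
d-i netcfg/get_hostname string node001
d-i clock-setup/utc boolean true
d-i time/zone string Europe/Prague
d-i partman-auto/method string regular
d-i partman-auto/choose_recipe select atomic
# Confirm partitioning without prompting (wipes the target disk)
d-i partman-partitioning/confirm_write_new_label boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
d-i partman/confirm_nooverwrite boolean true
d-i passwd/root-login boolean true
# Plaintext placeholder; prefer passwd/root-password-crypted with a crypt(3) hash
d-i passwd/root-password password securepass
d-i passwd/root-password-again password securepass
d-i pkgsel/include string openssh-server vim
d-i finish-install/reboot_in_progress note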
```
## Metal as a Service
### MAAS (Canonical)
- **Discovery**: DHCP → PXE boot → hardware detection (CPU, RAM, disk, MAC)
- **Commissioning**: node goes through commissioning, stores inventory in DB
- **Deploy**: OS image (Ubuntu, RHEL, ESXi) written to disk → reboot
- **Integrace**: Juju, OpenStack, Kubernetes (Charmed Kubernetes)
- **Networking**: VLAN, subnet, DNS/DHCP management, BGP peering
### Digital Rebar / RackN
- **Provisioning**: workflow-based (stages: discovery → firmware → OS → config)
- **Multi-cloud**: bare metal + cloud + edge
- **Template**: templates for OS deployment (RHEL, Ubuntu, VMware)
- **API**: full REST API, Terraform provider
## Management API — Redfish
### DMTF Standard
A RESTful API over HTTPS with JSON payloads, standardized by DMTF as the successor to IPMI.
| Endpoint | Purpose |
|----------|---------|
| `/redfish/v1/Systems/` | Server management (power, boot, inventory) |
| `/redfish/v1/Chassis/` | Physical hardware (PSU, fan, temp, sensors) |
| `/redfish/v1/Managers/` | BMC (iLO, iDRAC, XClarity) |
| `/redfish/v1/UpdateService/` | Firmware updates |
| `/redfish/v1/EventService/` | Event subscription (webhook) |
### Redfish examples
```
# Resource IDs are vendor-specific: "1" here, e.g. "System.Embedded.1" on Dell iDRAC
# Power on server
POST /redfish/v1/Systems/1/Actions/ComputerSystem.Reset
Body: {"ResetType": "On"}
# Set boot override (one-shot PXE)
PATCH /redfish/v1/Systems/1
Body: {"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}
# Get sensor data (newer schema versions expose ThermalSubsystem instead of Thermal)
GET /redfish/v1/Chassis/1/Thermal
→ {"Temperatures": [{"Name": "CPU1", "ReadingCelsius": 45}], "Fans": [...]}
```
### IPMI (legacy)
- Port 623/UDP (RMCP)
- `ipmitool power on/off/status`
- `ipmitool sensor list`
- `ipmitool chassis bootdev pxe`
- Serial over LAN: `ipmitool sol activate`
## Terraform for provisioning
```hcl
# Terraform provider for VMware vSphere
provider "vsphere" {
  user           = var.vsphere_user
  password       = var.vsphere_password
  vsphere_server = var.vsphere_server
}

# The data.vsphere_* lookups referenced below must be declared elsewhere
resource "vsphere_virtual_machine" "web" {
  count            = 2 # required for count.index; example value
  name             = "web-${count.index}"
  resource_pool_id = data.vsphere_resource_pool.pool.id
  datastore_id     = data.vsphere_datastore.ds.id
  num_cpus         = 4
  memory           = 16384 # MB
  guest_id         = "rhel9_64Guest"

  network_interface {
    network_id = data.vsphere_network.net.id
  }

  disk {
    label = "os"
    size  = 80 # GB
  }
}
```
More in [CICD.md](CICD.md#infrastructure-as-code-iac).
## Firmware management
- **BIOS/UEFI settings**: profile update during provisioning (Redfish `PATCH /redfish/v1/Systems/1/Bios/Settings`, applied on the next reboot)
- **Firmware updates**: Redfish UpdateService, SUU (Dell Server Update Utility), SUM (HPE Smart Update Manager), SUM (Supermicro Update Manager)
- **Lifecycle Controller** (Dell LC): integrated OS for firmware management
- **Baseline management**: maintain consistent firmware versions across the fleet
- **Boot: UEFI vs Legacy BIOS**:
  - **UEFI**: Secure Boot, GPT, larger disks, faster boot
  - **Legacy BIOS**: MBR, compatibility, 2 TB boot disk limit
## Configuration management (post-provisioning)
| Tool | Language | Push/Pull | Use case |
|------|----------|-----------|----------|
| **Ansible** | YAML | Push (SSH) | General config management, ad-hoc |
| **Puppet** | Ruby DSL | Pull (agent) | State management, enterprise |
| **Chef** | Ruby DSL | Pull (agent) | Compliance, infrastructure automation |
| **SaltStack** | YAML/Python | Both (salt-minion) | High-speed config, event-driven |
More in [CICD.md](CICD.md).
## OpenStack Provisioning
OpenStack offers several methods for provisioning infrastructure:
### Deployment tools
| Tool | Description | Use case |
|------|-------------|----------|
| **TripleO (OpenStack on OpenStack)** | Deploy OpenStack using bare metal (Ironic) + Heat orchestration; retired upstream | Existing Red Hat OSP (director) installations |
| **Kolla (Ansible + Docker)** | Containerized OpenStack services, Ansible orchestration | Production, flexible |
| **Kolla-Kubernetes** | OpenStack on Kubernetes (retired upstream; OpenStack-Helm covers this niche) | Kubernetes-native, edge |
| **Charmed OpenStack (Juju)** | Canonical, Juju charms for OpenStack | Ubuntu, hybrid cloud |
| **OpenStack Charms** | Juju charms for individual services | Fine-grained deployment |
| **DevStack** | Fast development deployment | Dev/test, learning |
| **OpenStack-Ansible** | Ansible playbooks for OpenStack (OSA) | Legacy, AIO |
### Ironic (Bare Metal Provisioning)
- OpenStack service for managing and provisioning bare metal servers
- Supports PXE, iPXE, Redfish, IPMI
- Concepts: **Node** (HW), **Port** (MAC), **Driver** (HW type)
- Lifecycle (provision states): enroll → manageable → available → active, driven by the verbs `manage`, `inspect` (optional), `provide`, and `deploy`
- Integration with Nova: Nova runs instances on bare metal via Ironic
### Glance (Image Management)
- Image catalog for VM images and ISO
- Supported formats: raw, qcow2, vmdk, vhd, iso
- Image caching on compute node (for faster boot)
- Multi-backend: file, Ceph RBD, Swift, NFS
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-03*

190
README.en.md Normal file
View File

@@ -0,0 +1,190 @@
# 🏗️ Infrastructure Architecture — Knowledge Base
Comprehensive overview of topics, principles, and best practices for infrastructure design and operations.
Bilingual: Czech (`.md`) and English (`.en.md`).
---
## Topic Map — Relationships Between Areas
```
                  ┌─────────────┐
                  │    CLOUD    │
                  │ (IaaS/PaaS) │
                  └──────┬──────┘
          ┌──────────────┼──────────────┐
          ▼              ▼              ▼
     ┌──────────┐   ┌──────────┐ ┌──────────────┐
     │NETWORKING│   │ STORAGE  │ │  DATABASES   │
     │(L2-L7,   │   │(SAN/NAS/ │ │ (SQL/NoSQL/  │
     │ Zero Tr.)│   │ Ceph/SDS)│ │  Vector)     │
     └────┬─────┘   └────┬─────┘ └──────┬───────┘
          │              │              │
          ▼              ▼              ▼
     ┌─────────────────────────────────────┐
     │             DATACENTERS             │
     │   (Tier, power, cooling, layout)    │
     └────────────┬────────────────────────┘
     ┌────────────┼────────────┬───────────────┐
     ▼            ▼            ▼               ▼
┌──────────┐ ┌──────────┐ ┌────────┐ ┌──────────────┐
│SERVER-HW │ │SERVER-   │ │  GPU   │ │ PROVISIONING │
│(CPU,RAM, │ │CONFIG    │ │(NVIDIA/│ │ (PXE, Ironic,│
│ PCIe,BMC)│ │(BIOS,    │ │ AMD)   │ │  Terraform)  │
└──────────┘ │ NUMA)    │ └────────┘ └──────────────┘
             └──────────┘
┌──────────┐ ┌──────────┐ ┌────────┐
│HYPERVISOR│ │ MONITOR  │ │  CICD  │
│(VMware,  │ │(Prom,    │ │(GitOps,│
│ KVM, ...)│ │ Grafana) │ │ IaC)   │
└──────────┘ └──────────┘ └────────┘
```
---
## Navigation — Czech (`.md`)
| Area | File | Description | Related to |
|------|------|-------------|------------|
| ☁️ Cloud architecture | [CLOUD.md](CLOUD.md) | AWS/Azure/GCP, hybrid cloud, multi-cloud | GPU, NETWORKING |
| 🌐 Network architecture | [NETWORKING.md](NETWORKING.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
| 📊 Monitoring & observability | [MONITORING.md](MONITORING.md) | Prometheus, Grafana, OTel, logging, alerting | — |
| 🔄 CI/CD & DevOps | [CICD.md](CICD.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
| 🗄️ Database architecture | [DATABASES.md](DATABASES.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VEKTOROVE-DB, DATABAZOVE-ENGINY |
| 🖥️ Hypervisors | [HYPERVISORS.md](HYPERVISORS.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
| 🏭 Data centers | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC services | MONITORING |
| 💾 Storage | [STORAGE.md](STORAGE.md) | SAN/NAS/object, RAID, SDS, Ceph, OpenStack Cinder/Swift/Manila | — |
| 🔌 Server connectivity | [CONNECTIVITY.md](CONNECTIVITY.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
| 🔧 Server hardware | [SERVER-HW.md](SERVER-HW.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
| 🎮 GPU | [GPU.md](GPU.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
| ⚙️ Server config | [SERVER-CONFIG.md](SERVER-CONFIG.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
| 📦 Provisioning | [PROVISIONING.md](PROVISIONING.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
| 📋 Legacy index | [HARDWARE.md](HARDWARE.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
| 📋 Legacy infra | [INFRASTRUCTURE.md](INFRASTRUCTURE.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
| 📋 Review workflow | [REVIEW.md](REVIEW.md) | Review and content control process | — |
| 📝 ADR template | [templates/ADR.md](templates/ADR.md) | Architecture Decision Record template | — |
### Detailed DB files
| File | Description |
|------|-------------|
| [POSTGRESQL.md](POSTGRESQL.md) | PostgreSQL — architecture, replication, tuning |
| [MYSQL.md](MYSQL.md) | MySQL & MariaDB |
| [ORACLE.md](ORACLE.md) | Oracle Database — RAC, Data Guard, tuning |
| [MONGODB.md](MONGODB.md) | MongoDB — document DB, sharding, replica sets |
| [REDIS.md](REDIS.md) | Redis — cache, session store, streams |
| [CASSANDRA.md](CASSANDRA.md) | Cassandra & ScyllaDB — wide-column, NoSQL |
| [VEKTOROVE-DB.md](VEKTOROVE-DB.md) | Vector databases — Pinecone, Qdrant, Milvus, pgvector |
| [DATABAZOVE-ENGINY.md](DATABAZOVE-ENGINY.md) | Common DB concepts — transactions, indexes, locking |
---
## Navigation — English (`.en.md`)
| Area | File | Description | Related to |
|------|------|-------------|------------|
| ☁️ Cloud architecture | [CLOUD.en.md](CLOUD.en.md) | AWS/Azure/GCP, hybrid cloud, multi-cloud | GPU, NETWORKING |
| 🌐 Network architecture | [NETWORKING.en.md](NETWORKING.en.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
| 📊 Monitoring & observability | [MONITORING.en.md](MONITORING.en.md) | Prometheus, Grafana, OTel, logging, alerting | — |
| 🔄 CI/CD & DevOps | [CICD.en.md](CICD.en.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
| 🗄️ Database architecture | [DATABASES.en.md](DATABASES.en.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VECTOR-DBS, DATABASE-ENGINES |
| 🖥️ Hypervisors | [HYPERVISORS.en.md](HYPERVISORS.en.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services | MONITORING |
| 💾 Storage | [STORAGE.en.md](STORAGE.en.md) | SAN/NAS/object, RAID, SDS, Ceph | — |
| 🔌 Server connectivity | [CONNECTIVITY.en.md](CONNECTIVITY.en.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
| 🔧 Server hardware | [SERVER-HW.en.md](SERVER-HW.en.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
| 🎮 GPU | [GPU.en.md](GPU.en.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
| ⚙️ Server config | [SERVER-CONFIG.en.md](SERVER-CONFIG.en.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
| 📦 Provisioning | [PROVISIONING.en.md](PROVISIONING.en.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
| 📋 Legacy index | [HARDWARE.en.md](HARDWARE.en.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
| 📋 Legacy infra | [INFRASTRUCTURE.en.md](INFRASTRUCTURE.en.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
| 📋 Review workflow | [REVIEW.en.md](REVIEW.en.md) | Review and content control process | — |
| 📝 ADR template | [templates/ADR.en.md](templates/ADR.en.md) | Architecture Decision Record template | — |
### Detailed DB files
| File | Description |
|------|-------------|
| [POSTGRESQL.en.md](POSTGRESQL.en.md) | PostgreSQL — architecture, replication, tuning |
| [MYSQL.en.md](MYSQL.en.md) | MySQL & MariaDB |
| [ORACLE.en.md](ORACLE.en.md) | Oracle Database — RAC, Data Guard, tuning |
| [MONGODB.en.md](MONGODB.en.md) | MongoDB — document DB, sharding, replica sets |
| [REDIS.en.md](REDIS.en.md) | Redis — cache, session store, streams |
| [CASSANDRA.en.md](CASSANDRA.en.md) | Cassandra & ScyllaDB — wide-column, NoSQL |
| [VECTOR-DBS.en.md](VECTOR-DBS.en.md) | Vector databases — Pinecone, Qdrant, Milvus, pgvector |
| [DATABASE-ENGINES.en.md](DATABASE-ENGINES.en.md) | Common DB concepts — transactions, indexes, locking |
---
## Case Studies
| File | Description |
|------|-------------|
| [case-studies/proxmox-demo/README.md](case-studies/proxmox-demo/README.md) | Proxmox VE demo cluster — design (CZ) |
| [case-studies/proxmox-demo/README.en.md](case-studies/proxmox-demo/README.en.md) | Proxmox VE demo cluster — design (EN) |
---
## Cross-Reference Matrix
| File | References |
|------|------------|
| `CLOUD.md` / `CLOUD.en.md` | [`GPU.md`](GPU.md), [`NETWORKING.md`](NETWORKING.md), [`sources/cloud/sources.md`](sources/cloud/sources.md) |
| `NETWORKING.md` / `NETWORKING.en.md` | [`CLOUD.md`](CLOUD.md), [`sources/networking/sources.md`](sources/networking/sources.md) |
| `DATACENTERS.md` / `DATACENTERS.en.md` | [`MONITORING.md`](MONITORING.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `MONITORING.md` / `MONITORING.en.md` | [`sources/monitoring/sources.md`](sources/monitoring/sources.md) |
| `CICD.md` / `CICD.en.md` | [`sources/cicd/sources.md`](sources/cicd/sources.md) |
| `PROVISIONING.md` / `PROVISIONING.en.md` | [`CICD.md`](CICD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `STORAGE.md` / `STORAGE.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `GPU.md` / `GPU.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `SERVER-HW.md` / `SERVER-HW.en.md` | [`CONNECTIVITY.md`](CONNECTIVITY.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `SERVER-CONFIG.md` / `SERVER-CONFIG.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `CONNECTIVITY.md` / `CONNECTIVITY.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `HYPERVISORS.md` / `HYPERVISORS.en.md` | [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `DATABASES.md` / `DATABASES.en.md` | [`POSTGRESQL.md`](POSTGRESQL.md), [`MYSQL.md`](MYSQL.md), [`ORACLE.md`](ORACLE.md), [`MONGODB.md`](MONGODB.md), [`REDIS.md`](REDIS.md), [`CASSANDRA.md`](CASSANDRA.md), [`VEKTOROVE-DB.md`](VEKTOROVE-DB.md), [`DATABAZOVE-ENGINY.md`](DATABAZOVE-ENGINY.md), [`sources/databases/sources.md`](sources/databases/sources.md) |
| `HARDWARE.md` / `HARDWARE.en.md` | [`SERVER-HW.md`](SERVER-HW.md), [`GPU.md`](GPU.md), [`SERVER-CONFIG.md`](SERVER-CONFIG.md), [`PROVISIONING.md`](PROVISIONING.md) |
| `INFRASTRUCTURE.md` / `INFRASTRUCTURE.en.md` | [`HYPERVISORS.md`](HYPERVISORS.md), [`DATACENTERS.md`](DATACENTERS.md), [`STORAGE.md`](STORAGE.md), [`HARDWARE.md`](HARDWARE.md) |
---
## Sources
Raw reference data (documentation, books, standards) by area:
| Area | Czech | English |
|------|-------|---------|
| ☁️ Cloud | [`sources/cloud/sources.md`](sources/cloud/sources.md) | [`sources/cloud/sources.en.md`](sources/cloud/sources.en.md) |
| 🌐 Networking | [`sources/networking/sources.md`](sources/networking/sources.md) | [`sources/networking/sources.en.md`](sources/networking/sources.en.md) |
| 📊 Monitoring | [`sources/monitoring/sources.md`](sources/monitoring/sources.md) | [`sources/monitoring/sources.en.md`](sources/monitoring/sources.en.md) |
| 🔄 CI/CD | [`sources/cicd/sources.md`](sources/cicd/sources.md) | [`sources/cicd/sources.en.md`](sources/cicd/sources.en.md) |
| 🗄️ Databases | [`sources/databases/sources.md`](sources/databases/sources.md) | [`sources/databases/sources.en.md`](sources/databases/sources.en.md) |
| 🏗️ Infrastructure | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
---
## KB Agents
| Agent | Description |
|-------|-------------|
| [`kb-research`](.opencode/agents/kb-research.md) | Processes [todo] items — research on new topics |
| [`kb-source-scout`](.opencode/agents/kb-source-scout.md) | Finds new sources and adds them to sources/ |
| [`kb-reviewer`](.opencode/agents/kb-reviewer.md) | Audits consistency, links, duplications, formatting |
| [`kb-index`](.opencode/agents/kb-index.md) | **Maintains this index** — scans files, extracts cross-references, validates links |
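
The link-validation step that `kb-index` performs can be sketched roughly as follows. This is a minimal illustration, not the agent's actual implementation: the function names `extract_links` and `broken_links` and the regular expression are assumptions made for the example.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [`STORAGE.md`](STORAGE.md).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def extract_links(markdown: str) -> list:
    """Return the targets of all inline links, in document order."""
    return LINK_RE.findall(markdown)


def broken_links(markdown: str, root: Path) -> list:
    """Return relative link targets that do not exist under `root`."""
    missing = []
    for target in extract_links(markdown):
        # Skip external URLs (scheme:...) and in-page anchors.
        if re.match(r"[a-z][a-z0-9+.-]*:", target, re.I) or target.startswith("#"):
            continue
        path = target.split("#", 1)[0]
        if not (root / path).exists():
            missing.append(target)
    return missing


row = "| `GPU.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |"
print(extract_links(row))  # ['sources/infrastructure/sources.md']
```

Running `broken_links(text, Path("."))` from the repository root would list every relative target in the index that no longer resolves to a file.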
---
## Principles
| Czech | English |
|-------|---------|
| **Dostupnost** — SLA, redundance, failover, multi-AZ | **Availability** — SLA, redundancy, failover, multi-AZ |
| **Škálovatelnost** — horizontální vs. vertikální, auto-scaling | **Scalability** — horizontal vs. vertical, auto-scaling |
| **Bezpečnost** — defense in depth, least privilege, zero trust | **Security** — defense in depth, least privilege, zero trust |
| **Náklady** — FinOps, right-sizing, reserved instances | **Cost** — FinOps, right-sizing, reserved instances |
| **Operability** — observabilita, automation, dokumentace | **Operability** — observability, automation, documentation |
---
*This index is automatically maintained by the `kb-index` agent. Last updated: 2026-06-11.*

190
README.md Normal file

@@ -0,0 +1,190 @@
# 🏗️ Infrastructure Architecture — Knowledge Base
Komplexní přehled témat, principů a best practices pro návrh a provoz infrastruktury.
Bilingual: Czech (`.md`) and English (`.en.md`).
---
## Topic Map — Vztahy mezi oblastmi
```
                 ┌─────────────┐
                 │    CLOUD    │
                 │ (IaaS/PaaS) │
                 └──────┬──────┘
         ┌──────────────┼──────────────┐
         ▼              ▼              ▼
    ┌──────────┐  ┌──────────┐  ┌──────────────┐
    │NETWORKING│  │ STORAGE  │  │  DATABASES   │
    │(L2-L7,   │  │(SAN/NAS/ │  │ (SQL/NOSQL/  │
    │ Zero Tr.)│  │ Ceph/SDS)│  │  Vector)     │
    └────┬─────┘  └─────┬────┘  └──────┬───────┘
         │              │              │
         ▼              ▼              ▼
     ┌─────────────────────────────────────┐
     │             DATACENTERS             │
     │   (Tier, power, cooling, layout)    │
     └────────────┬────────────────────────┘
     ┌────────────┼────────────┬───────────────┐
     ▼            ▼            ▼               ▼
┌──────────┐ ┌──────────┐ ┌────────┐   ┌──────────────┐
│SERVER-HW │ │SERVER-   │ │  GPU   │   │ PROVISIONING │
│(CPU,RAM, │ │CONFIG    │ │(NVIDIA/│   │ (PXE, Ironic │
│ PCIe,BM) │ │(BIOS,    │ │ AMD)   │   │  Terraform)  │
└──────────┘ │ NUMA)    │ └────────┘   └──────────────┘
             └──────────┘
┌──────────┐ ┌──────────┐ ┌────────┐
│HYPERVISOR│ │ MONITOR  │ │  CICD  │
│(VMware,  │ │(Prom,    │ │(GitOps,│
│ KVM, ...)│ │ Grafana) │ │ IaC)   │
└──────────┘ └──────────┘ └────────┘
```
---
## Navigace — Czech (`.md`)
| Oblast | Soubor | Popis | Propojeno s |
|--------|--------|-------|-------------|
| ☁️ Cloud architektura | [CLOUD.md](CLOUD.md) | AWS/Azure/GCP, hybrid cloud, multi-cloud, well-architected framework | GPU, NETWORKING |
| 🌐 Síťová architektura | [NETWORKING.md](NETWORKING.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
| 📊 Monitoring a observabilita | [MONITORING.md](MONITORING.md) | Prometheus, Grafana, OTel, logging, alerting, SLO | — |
| 🔄 CI/CD a DevOps | [CICD.md](CICD.md) | Pipelines, GitOps, IaC (Terraform), deployment strategie | — |
| 🗄️ Databázová architektura | [DATABASES.md](DATABASES.md) | Klasifikace, sharding, replikace, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VEKTOROVE-DB, DATABAZOVE-ENGINY |
| 🖥️ Hypervisory | [HYPERVISORS.md](HYPERVISORS.md) | VMware, Hyper-V, KVM, Proxmox, migrace | STORAGE, SERVER-HW |
| 🏭 Datová centra | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC služby | MONITORING |
| 💾 Storage | [STORAGE.md](STORAGE.md) | SAN/NAS/object, RAID, SDS, Ceph, OpenStack Cinder/Swift/Manila | — |
| 🔌 Server connectivity | [CONNECTIVITY.md](CONNECTIVITY.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
| 🔧 Server hardware | [SERVER-HW.md](SERVER-HW.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
| 🎮 GPU | [GPU.md](GPU.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
| ⚙️ Server config | [SERVER-CONFIG.md](SERVER-CONFIG.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
| 📦 Provisioning | [PROVISIONING.md](PROVISIONING.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
| 📋 Původní rozcestník | [HARDWARE.md](HARDWARE.md) | Legacy index → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
| 📋 Původní infrastruktura | [INFRASTRUCTURE.md](INFRASTRUCTURE.md) | Legacy index → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
| 📋 Review workflow | [REVIEW.md](REVIEW.md) | Proces oponentury a kontroly obsahu | — |
| 📝 ADR template | [templates/ADR.md](templates/ADR.md) | Architecture Decision Record template | — |
### Detailní DB soubory
| Soubor | Popis |
|--------|-------|
| [POSTGRESQL.md](POSTGRESQL.md) | PostgreSQL — architektura, replikace, tuning |
| [MYSQL.md](MYSQL.md) | MySQL & MariaDB |
| [ORACLE.md](ORACLE.md) | Oracle Database — RAC, Data Guard, tuning |
| [MONGODB.md](MONGODB.md) | MongoDB — document DB, sharding, replica sets |
| [REDIS.md](REDIS.md) | Redis — cache, session store, streamy |
| [CASSANDRA.md](CASSANDRA.md) | Cassandra & ScyllaDB — wide-column, NoSQL |
| [VEKTOROVE-DB.md](VEKTOROVE-DB.md) | Vector databáze — Pinecone, Qdrant, Milvus, pgvector |
| [DATABAZOVE-ENGINY.md](DATABAZOVE-ENGINY.md) | Společné koncepty napříč DB — transakce, indexy, locking |
---
## Navigation — English (`.en.md`)
| Area | File | Description | Related to |
|------|------|-------------|------------|
| ☁️ Cloud architecture | [CLOUD.en.md](CLOUD.en.md) | AWS/Azure/GCP, hybrid cloud, multi-cloud | GPU, NETWORKING |
| 🌐 Network architecture | [NETWORKING.en.md](NETWORKING.en.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
| 📊 Monitoring & observability | [MONITORING.en.md](MONITORING.en.md) | Prometheus, Grafana, OTel, logging, alerting | — |
| 🔄 CI/CD & DevOps | [CICD.en.md](CICD.en.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
| 🗄️ Database architecture | [DATABASES.en.md](DATABASES.en.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VECTOR-DBS, DATABASE-ENGINES |
| 🖥️ Hypervisors | [HYPERVISORS.en.md](HYPERVISORS.en.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services | MONITORING |
| 💾 Storage | [STORAGE.en.md](STORAGE.en.md) | SAN/NAS/object, RAID, SDS, Ceph | — |
| 🔌 Server connectivity | [CONNECTIVITY.en.md](CONNECTIVITY.en.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
| 🔧 Server hardware | [SERVER-HW.en.md](SERVER-HW.en.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
| 🎮 GPU | [GPU.en.md](GPU.en.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
| ⚙️ Server config | [SERVER-CONFIG.en.md](SERVER-CONFIG.en.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
| 📦 Provisioning | [PROVISIONING.en.md](PROVISIONING.en.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
| 📋 Legacy index | [HARDWARE.en.md](HARDWARE.en.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
| 📋 Legacy infra | [INFRASTRUCTURE.en.md](INFRASTRUCTURE.en.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
| 📋 Review workflow | [REVIEW.en.md](REVIEW.en.md) | Review and content control process | — |
| 📝 ADR template | [templates/ADR.en.md](templates/ADR.en.md) | Architecture Decision Record template | — |
### Detailed DB files
| File | Description |
|------|-------------|
| [POSTGRESQL.en.md](POSTGRESQL.en.md) | PostgreSQL — architecture, replication, tuning |
| [MYSQL.en.md](MYSQL.en.md) | MySQL & MariaDB |
| [ORACLE.en.md](ORACLE.en.md) | Oracle Database — RAC, Data Guard, tuning |
| [MONGODB.en.md](MONGODB.en.md) | MongoDB — document DB, sharding, replica sets |
| [REDIS.en.md](REDIS.en.md) | Redis — cache, session store, streams |
| [CASSANDRA.en.md](CASSANDRA.en.md) | Cassandra & ScyllaDB — wide-column, NoSQL |
| [VECTOR-DBS.en.md](VECTOR-DBS.en.md) | Vector databases — Pinecone, Qdrant, Milvus, pgvector |
| [DATABASE-ENGINES.en.md](DATABASE-ENGINES.en.md) | Common DB concepts — transactions, indexes, locking |
---
## Case Studies
| File | Popis / Description |
|------|-------------------|
| [case-studies/proxmox-demo/README.md](case-studies/proxmox-demo/README.md) | Proxmox VE demo cluster — návrh (CZ) |
| [case-studies/proxmox-demo/README.en.md](case-studies/proxmox-demo/README.en.md) | Proxmox VE demo cluster — design (EN) |
---
## Cross-Reference Matrix
| Soubor (File) | Odkazuje na (References) |
|---------------|--------------------------|
| `CLOUD.md` / `CLOUD.en.md` | [`GPU.md`](GPU.md), [`NETWORKING.md`](NETWORKING.md), [`sources/cloud/sources.md`](sources/cloud/sources.md) |
| `NETWORKING.md` / `NETWORKING.en.md` | [`CLOUD.md`](CLOUD.md), [`sources/networking/sources.md`](sources/networking/sources.md) |
| `DATACENTERS.md` / `DATACENTERS.en.md` | [`MONITORING.md`](MONITORING.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `MONITORING.md` / `MONITORING.en.md` | [`sources/monitoring/sources.md`](sources/monitoring/sources.md) |
| `CICD.md` / `CICD.en.md` | [`sources/cicd/sources.md`](sources/cicd/sources.md) |
| `PROVISIONING.md` / `PROVISIONING.en.md` | [`CICD.md`](CICD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `STORAGE.md` / `STORAGE.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `GPU.md` / `GPU.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `SERVER-HW.md` / `SERVER-HW.en.md` | [`CONNECTIVITY.md`](CONNECTIVITY.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `SERVER-CONFIG.md` / `SERVER-CONFIG.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `CONNECTIVITY.md` / `CONNECTIVITY.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `HYPERVISORS.md` / `HYPERVISORS.en.md` | [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `DATABASES.md` / `DATABASES.en.md` | [`POSTGRESQL.md`](POSTGRESQL.md), [`MYSQL.md`](MYSQL.md), [`ORACLE.md`](ORACLE.md), [`MONGODB.md`](MONGODB.md), [`REDIS.md`](REDIS.md), [`CASSANDRA.md`](CASSANDRA.md), [`VEKTOROVE-DB.md`](VEKTOROVE-DB.md), [`DATABAZOVE-ENGINY.md`](DATABAZOVE-ENGINY.md), [`sources/databases/sources.md`](sources/databases/sources.md) |
| `HARDWARE.md` / `HARDWARE.en.md` | [`SERVER-HW.md`](SERVER-HW.md), [`GPU.md`](GPU.md), [`SERVER-CONFIG.md`](SERVER-CONFIG.md), [`PROVISIONING.md`](PROVISIONING.md) |
| `INFRASTRUCTURE.md` / `INFRASTRUCTURE.en.md` | [`HYPERVISORS.md`](HYPERVISORS.md), [`DATACENTERS.md`](DATACENTERS.md), [`STORAGE.md`](STORAGE.md), [`HARDWARE.md`](HARDWARE.md) |
---
## Zdroje / Sources
Raw referenční data (dokumentace, knihy, standardy) podle oblastí:
| Oblast | Czech | English |
|--------|-------|---------|
| ☁️ Cloud | [`sources/cloud/sources.md`](sources/cloud/sources.md) | [`sources/cloud/sources.en.md`](sources/cloud/sources.en.md) |
| 🌐 Networking | [`sources/networking/sources.md`](sources/networking/sources.md) | [`sources/networking/sources.en.md`](sources/networking/sources.en.md) |
| 📊 Monitoring | [`sources/monitoring/sources.md`](sources/monitoring/sources.md) | [`sources/monitoring/sources.en.md`](sources/monitoring/sources.en.md) |
| 🔄 CI/CD | [`sources/cicd/sources.md`](sources/cicd/sources.md) | [`sources/cicd/sources.en.md`](sources/cicd/sources.en.md) |
| 🗄️ Databases | [`sources/databases/sources.md`](sources/databases/sources.md) | [`sources/databases/sources.en.md`](sources/databases/sources.en.md) |
| 🏗️ Infrastructure | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
---
## KB Agents
| Agent | Popis / Description |
|-------|-------------------|
| [`kb-research`](.opencode/agents/kb-research.md) | Zpracovává [todo] položky — rešerše nových témat |
| [`kb-source-scout`](.opencode/agents/kb-source-scout.md) | Vyhledává nové zdroje a přidává je do sources/ |
| [`kb-reviewer`](.opencode/agents/kb-reviewer.md) | Audit konzistence, odkazů, duplicit, formátování |
| [`kb-index`](.opencode/agents/kb-index.md) | **Udržuje tento rozcestník** — scanuje soubory, extrahuje křížové reference, validuje odkazy |
---
## Principy / Principles
| Principy (CZ) | Principles (EN) |
|---------------|-----------------|
| **Dostupnost** — SLA, redundance, failover, multi-AZ | **Availability** — SLA, redundancy, failover, multi-AZ |
| **Škálovatelnost** — horizontální vs. vertikální, auto-scaling | **Scalability** — horizontal vs. vertical, auto-scaling |
| **Bezpečnost** — defense in depth, least privilege, zero trust | **Security** — defense in depth, least privilege, zero trust |
| **Náklady** — FinOps, right-sizing, reserved instances | **Cost** — FinOps, right-sizing, reserved instances |
| **Operability** — observabilita, automation, dokumentace | **Operability** — observability, automation, documentation |
---
*Rozcestník je automaticky udržován agentem `kb-index`. Poslední aktualizace: 2026-06-11.*

119
REDIS.en.md Normal file

@@ -0,0 +1,119 @@
# 🔴 Redis
## Overview
Redis is an in-memory key-value store with advanced data structures, used primarily as a cache, session store, message broker, and real-time database. It keeps the dataset in RAM and can optionally persist it to disk (RDB/AOF).
## Data structures
| Structure | Description | Use case |
|-----------|-------------|----------|
| **String** | Binary string (max 512 MB) | Cache values, session tokens, counters |
| **Hash** | Map field-value | User profile, cached object |
| **List** | Linked list (push/pop on both ends) | Queue (RPUSH/LPOP), log stream |
| **Set** | Unique values (unordered) | Tags, deduplication, memberships |
| **Sorted Set** | Unique + score (sorted) | Leaderboards, rate limiting, timeouts |
| **Bitmap** | Bit field | Feature flags, daily active users |
| **HyperLogLog** | Approximate cardinality (at most 12 KB per key, counts up to 2^64 items) | Unique visitors (standard error 0.81%) |
| **Stream** | Append-only log (Kafka-like) | Event store, messaging |
| **Geospatial** | Geo-indexing (GEOADD, GEOSEARCH) | Location queries, proximity search |
| **JSON** | JSON document (RedisJSON module) | Document structures |
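
The HyperLogLog figures in the table can be checked with a little arithmetic. The sketch below assumes Redis's dense encoding of 16384 registers at 6 bits each and the usual HyperLogLog error estimate of 1.04 / sqrt(m); the constant names are chosen for this example only.

```python
import math

REGISTERS = 16384        # 2**14 registers in the dense encoding
BITS_PER_REGISTER = 6

# Memory used by the registers alone.
size_bytes = REGISTERS * BITS_PER_REGISTER // 8

# Standard error of the HyperLogLog estimator for m registers.
standard_error = 1.04 / math.sqrt(REGISTERS)

print(size_bytes)       # 12288 bytes, i.e. 12 KB
print(standard_error)   # roughly 0.0081, i.e. about 0.81 %
```

The memory cost is fixed regardless of how many items are counted, which is why a HyperLogLog suits unique-visitor counters with very large cardinalities.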
## Eviction policies
| Policy | Description | Use case |
|--------|-------------|----------|
| **noeviction** | Error on write when full | Transactional data, must not lose |
| **allkeys-lru** | LRU on all keys | General cache, standard |
| **allkeys-lfu** | LFU on all keys | Frequently accessed data |
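| **volatile-lru** | LRU on keys with TTL | Cache with expiration |
| **volatile-lfu** | LFU on keys with TTL | Frequently accessed data with expiration |
| **volatile-ttl** | Closest to expiration | Short-lived data |
| **allkeys-random** | Random among all keys | Testing |
| **volatile-random** | Random among keys with TTL | Testing |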
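
To make the LRU idea concrete, here is a small in-process sketch of least-recently-used eviction. It is a simplified model only: the `LRUCache` class is hypothetical, it bounds the cache by entry count rather than by `maxmemory` bytes, and it tracks exact recency, whereas Redis approximates LRU by sampling keys.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny exact-LRU cache, bounded by number of entries."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used key


cache = LRUCache(max_entries=2)
cache.set("a", "1")
cache.set("b", "2")
cache.get("a")          # touching "a" makes "b" the eviction candidate
cache.set("c", "3")     # cache is full, so "b" is evicted
print(cache.get("b"))   # None
```

An LFU policy would differ only in the eviction criterion: it would track access frequency instead of recency.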
## Redis Cluster vs Sentinel
| Feature | Redis Sentinel | Redis Cluster |
|---------|---------------|---------------|
| **Scaling** | Read replicas (master + replica) | Data sharding (16384 hash slots) |
| **Auto-failover** | Yes (Sentinel) | Yes (gossip-based) |
| **Multi-key ops** | Yes (transactions on master) | Limited (same hash slot) |
| **Client communication** | Clients ask Sentinel for the current master address | Cluster nodes redirect (MOVED/ASK) |
| **Minimum nodes** | Master + Replica + 3 Sentinel | 3 masters (each with replica) |
| **Use case** | High availability, single shard | Multi-shard, horizontal scaling |
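
The mapping of keys to the 16384 hash slots, and the "same hash slot" restriction on multi-key operations, can be sketched as below. The function name `key_hash_slot` is made up for this example; the algorithm follows the Redis Cluster rule of CRC16 modulo 16384, where only the text inside the first non-empty `{...}` hash tag is hashed.

```python
import binascii


def key_hash_slot(key: bytes) -> int:
    """Map a key to one of the 16384 Redis Cluster hash slots."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # non-empty hash tag between the braces
            key = key[start + 1:end]
    # binascii.crc_hqx with initial value 0 is CRC-16/XMODEM.
    return binascii.crc_hqx(key, 0) % 16384


# Keys sharing a hash tag land in the same slot, so multi-key
# operations on them are allowed in a cluster.
print(key_hash_slot(b"{user:1}:profile") == key_hash_slot(b"{user:1}:cart"))  # True
```

Designing key names around hash tags is the usual way to keep related keys together when transactions or multi-key commands are needed in Cluster mode.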
## Persistence
| Method | Description | RTO | RPO | Use case |
|--------|-------------|-----|-----|----------|
| **RDB** (Redis Database) | Periodic snapshot to dump.rdb | Minutes | Last snapshot | Cache, loss tolerated |
| **AOF** (Append-Only File) | Append-only log of all write operations | Seconds | Up to 1 s (`appendfsync everysec`) | Only minimal data loss acceptable |
| **RDB + AOF** | Combination | Seconds | 1 s | Recommended for production |
## Modules (RediSearch, RedisJSON, RedisGraph)
Redis is extensible via modules:
- **RediSearch** — full-text search, facets, prefix/suffix search
- **RedisJSON** — JSON path queries, document manipulation
- **RedisGraph** — graph DB (based on Cypher; end-of-life announced in 2023, support ended in January 2025)
- **RedisTimeSeries** — time-series with downsampling, retention policies
- **RedisBloom** — Bloom filters, Cuckoo filters, Top-K, Count-Min Sketch
## Memcached vs Redis
| Feature | Redis | Memcached |
|---------|-------|-----------|
| **Data structures** | String, Hash, List, Set, Sorted Set, Stream, JSON | String only |
| **Persistence** | RDB + AOF | None (purely in-memory) |
| **Replication** | Master-replica, Cluster | None built in |
| **Threading** | Single-threaded command execution | Multi-threaded |
| **Eviction** | 8 policies (LRU, LFU, random, TTL) | LRU only |
| **Lua scripting** | Yes (EVAL) | No |
| **Transactions** | Yes (MULTI/EXEC) | No |
| **Pub/Sub** | Yes | No |
| **Streaming** | Yes (Stream) | No |
## Recommendations — where Redis is better
| Area | Redis | Competition | Why Redis |
|------|-------|-------------|-----------|
| **Cache (in-memory)** | < 1 ms latency, 8 eviction policies | Memcached (strings only, LRU eviction) | Richer data types, persistence, cluster |
| **Session store** | Hash + TTL, Cluster for HA | DynamoDB (higher latency) | Simpler, faster, native expiration |
| **Rate limiting** | Sorted Set (sliding window counter) | Application-level counters in a database (complex) | Atomic operations, built-in logic |
| **Leaderboard / scoring** | Sorted Set (ZADD, ZRANK, ZREVRANGE) | SQL (ORDER BY + COUNT = expensive) | O(log N) rank and insert |
| **Message queue** | List/Stream (RPUSH+BLPOP) | Kafka (heavy, JVM) | Lightweight, no separate broker cluster to operate |
| **Real-time analytics** | HyperLogLog + Bitmap + Stream | ClickHouse (heavy analytics) | Real-time aggregation, small RAM |
| **Geolocation** | GEOADD, GEOSEARCH, GEODIST | PostGIS (heavier, disk-based) | In-memory, ideal for real-time |
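
The sliding-window rate limiter from the table can be modelled in plain Python to show the logic. This is an in-process sketch, not Redis client code: the `SlidingWindowLimiter` class is hypothetical, and each step is annotated with the sorted-set command it stands in for. Against a real server the three steps would need to run atomically, for example in MULTI/EXEC or a Lua script.

```python
import bisect


class SlidingWindowLimiter:
    """In-process model of a sorted-set sliding-window rate limiter."""

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        self._hits = {}  # client id -> sorted list of request timestamps

    def allow(self, client: str, now: float) -> bool:
        hits = self._hits.setdefault(client, [])
        # ZREMRANGEBYSCORE: drop timestamps that fell out of the window.
        cutoff = now - self.window_s
        del hits[:bisect.bisect_right(hits, cutoff)]
        # ZCARD: count what is left and compare with the limit.
        if len(hits) >= self.limit:
            return False
        # ZADD: record this request with its timestamp as the score.
        bisect.insort(hits, now)
        return True


limiter = SlidingWindowLimiter(limit=2, window_s=10.0)
print(limiter.allow("client-a", 0.0))   # True
print(limiter.allow("client-a", 1.0))   # True
print(limiter.allow("client-a", 2.0))   # False, two hits already in the window
print(limiter.allow("client-a", 10.5))  # True, the hit at 0.0 has expired
```

Using the timestamp as the score is what lets a single range deletion expire old requests, which is the property that makes sorted sets a good fit here.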
### When to use Redis
- **Cache for API** — response cache, DB query cache, session cache
- **Session management** — distributed sessions across servers
- **Rate limiting** — API gateway, per-user/per-IP limits
- **Leaderboards / rankings** — real-time scoring
- **Message broker** — task queue (RQ, Celery with Redis), pub/sub notifications
- **Real-time analytics** — counting uniques, metrics, dashboards
- **Geo-proximity** — "find nearest branch" in < 1 ms
### When to use something else
- **Persistent data with SQL queries** → PostgreSQL or MySQL
- **Large volumes > RAM** → a disk-backed database; Memcached and Dragonfly are also in-memory, but they use multiple cores more effectively (multi-threaded)
- **Long-term message queue** → Kafka, RabbitMQ (disk-based persistence)
- **Document DB** → MongoDB (persistent, complex queries)
## Redis licensing
Redis underwent a major license change in 2024:
| Period | License | Conditions |
|--------|---------|------------|
| **Until March 2024** | BSD 3-clause (open source) | Completely free use, including managed services |
| **Since March 2024** (Redis 7.4) | RSALv2 + SSPL (dual license) | SSPL: if you offer Redis as a managed service, you must release the entire stack as open source. RSALv2: restrictions on cloud operators |
| **Since May 2025** (Redis 8) | RSALv2, SSPL, or AGPLv3 | AGPLv3 added as an OSI-approved open source option alongside the two existing licenses |
| **Valkey (fork, Linux Foundation)** | BSD 3-clause | Fully open source fork of Redis 7.2, supported by Linux Foundation, AWS, Google, Oracle |
**Impact**: Cloud providers cannot offer Redis 7.4+ as a managed service under RSALv2/SSPL without a commercial agreement. AWS ElastiCache and Google Memorystore now offer **Valkey**, while Azure Cache for Redis continues under an agreement between Microsoft and Redis. Nothing changes for self-hosted Redis: RSALv2/SSPL does not restrict internal use.
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
*Last revision: 2026-06-03*

119
REDIS.md Normal file

@@ -0,0 +1,119 @@
# 🔴 Redis
## Přehled
Redis je in-memory key-value store s pokročilými datovými strukturami, používaný primárně jako cache, session store, message broker a real-time databáze. Běží v RAM s možností persistence na disk (RDB/AOF).
## Data structures
| Struktura | Popis | Use case |
|-----------|-------|----------|
| **String** | Binární string (max 512 MB) | Cache hodnoty, session tokeny, counters |
| **Hash** | Map field-value | Uživatelský profil, objekt v cache |
| **List** | Linked list (push/pop na oba konce) | Queue (RPUSH/LPOP), log stream |
| **Set** | Unikátní hodnoty (unordered) | Tags, deduplikace, memberships |
| **Sorted Set** | Unikátní + score (řazení) | Leaderboardy, rate limiting, timeouts |
| **Bitmap** | Bitové pole | Feature flagy, daily active users |
| **HyperLogLog** | Approximate cardinality (max. 12 KB na klíč, až 2^64 položek) | Unikátní návštěvníci (standardní chyba 0,81 %) |
| **Stream** | Append-only log (Kafka-like) | Event store, messaging |
| **Geospatial** | Geo-indexing (GEOADD, GEOSEARCH) | Lokalitní dotazy, proximity search |
| **JSON** | JSON dokument (RedisJSON modul) | Dokumentové struktury |
## Eviction policies
| Policy | Popis | Use case |
|--------|-------|----------|
| **noeviction** | Chyba při zápisu když je plno | Transakční data, neztrácet |
| **allkeys-lru** | LRU na všechny klíče | Obecná cache, standard |
| **allkeys-lfu** | LFU na všechny klíče | Často přistupovaná data |
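| **volatile-lru** | LRU na klíče s TTL | Cache s expirací |
| **volatile-lfu** | LFU na klíče s TTL | Často přistupovaná data s expirací |
| **volatile-ttl** | Nejblíž k expiraci | Krátkodobá data |
| **allkeys-random** | Náhodný výběr ze všech klíčů | Testování |
| **volatile-random** | Náhodný výběr z klíčů s TTL | Testování |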
## Redis Cluster vs Sentinel
| Vlastnost | Redis Sentinel | Redis Cluster |
|-----------|---------------|---------------|
| **Škálování** | Read replicas (master + replica) | Data sharding (16384 hash slotů) |
| **Auto-failover** | Ano (Sentinel) | Ano (gossip-based) |
| **Multi-key ops** | Ano (transactiony na masteru) | Omezené (stejný hash slot) |
| **Client komunikace** | Klient se na adresu aktuálního masteru ptá Sentinelu | Cluster nodes redirect (MOVED/ASK) |
| **Minimální uzly** | Master + Replica + 3 Sentinel | 3 masters (každý s replikou) |
| **Use case** | Vysoká dostupnost, single shard | Multi-shard, horizontální škálování |
## Persistence
| Metoda | Popis | RTO | RPO | Use case |
|--------|-------|-----|-----|----------|
| **RDB** (Redis Database) | Periodický snapshot do dump.rdb | Minuty | Poslední snapshot | Cache, ztráta tolerována |
| **AOF** (Append-Only File) | Append-only log všech write operací | Sekundy | Až 1 s (`appendfsync everysec`) | Přijatelná jen minimální ztráta dat |
| **RDB + AOF** | Kombinace | Sekundy | 1 s | Doporučeno pro produkci |
## Moduly (RediSearch, RedisJSON, RedisGraph)
Redis je rozšiřitelný pomocí modulů:
- **RediSearch** — full-text search, facety, prefix/suffix vyhledávání
- **RedisJSON** — JSON path dotazy, manipulace dokumentů
- **RedisGraph** — grafová DB (na bázi Cypher; end-of-life oznámen v roce 2023, podpora skončila v lednu 2025)
- **RedisTimeSeries** — time-series s downsamplingem, retention politikami
- **RedisBloom** — Bloom filtry, Cuckoo filtry, Top-K, Count-Min Sketch
## Memcached vs Redis
| Vlastnost | Redis | Memcached |
|-----------|-------|-----------|
| **Data structures** | String, Hash, List, Set, Sorted Set, Stream, JSON | Pouze String |
| **Persistence** | RDB + AOF | Žádná (čistě in-memory) |
| **Replication** | Master-replica, Cluster | Žádná vestavěná |
| **Threading** | Jednovláknové zpracování příkazů | Multi-threaded |
| **Eviction** | 8 politik (LRU, LFU, random, TTL) | Pouze LRU |
| **Lua scripting** | Ano (EVAL) | Ne |
| **Transakce** | Ano (MULTI/EXEC) | Ne |
| **Pub/Sub** | Ano | Ne |
| **Streaming** | Ano (Stream) | Ne |
## Doporučení — v čem je Redis lepší
| Oblast | Redis | Konkurence | Proč Redis |
|--------|-------|------------|------------|
| **Cache (in-memory)** | < 1 ms latence, 8 eviction politik | Memcached (pouze stringy, LRU eviction) | Bohatší datové typy, persistence, cluster |
| **Session store** | Hash + TTL, Cluster pro HA | DynamoDB (vyšší latence) | Jednodušší, rychlejší, nativní expirace |
| **Rate limiting** | Sorted Set (sliding window counter) | Aplikace v DB (složité) | Atomic operace, vestavěná logika |
| **Leaderboard / scoring** | Sorted Set (ZADD, ZRANK, ZREVRANGE) | SQL (ORDER BY + COUNT = expensive) | O(log N) rank i insert |
| **Message queue** | List/Stream (RPUSH+BLPOP) | Kafka (těžká, JVM) | Lehká, bez samostatného broker clusteru |
| **Real-time analytics** | HyperLogLog + Bitmap + Stream | ClickHouse (těžká analytika) | Agregace v reálném čase, malá RAM |
| **Geolokace** | GEOADD, GEOSEARCH, GEODIST | PostGIS (těžší, disk-based) | In-memory, ideální pro real-time |
### Kdy použít Redis
- **Cache pro API** — response cache, DB query cache, session cache
- **Session management** — distribuované session napříč servery
- **Rate limiting** — API gateway, per-user/per-IP limity
- **Leaderboardy / žebříčky** — real-time skórování
- **Message broker** — fronta úloh (RQ, Celery s Redis), pub/sub notifikace
- **Real-time analytics** — počítání unikátů, metrik, dashboardů
- **Geoproximity** — "najdi nejbližší pobočku" v < 1 ms
### Kdy použít něco jiného
- **Trvalá data s SQL dotazy** → PostgreSQL nebo MySQL
- **Velké objemy > RAM** → disková databáze; Memcached a Dragonfly jsou také in-memory, ale lépe využijí více jader (multi-threaded)
- **Dlouhodobá fronta zpráv** → Kafka, RabbitMQ (disk-based persistence)
- **Dokumentová DB** → MongoDB (persistentní, komplexní dotazy)
## Redis licensing
Redis prošel zásadní změnou licence v roce 2024:
| Období | Licence | Podmínky |
|--------|---------|----------|
| **Do března 2024** | BSD 3-clause (open source) | Zcela volné použití, včetně managed služeb |
| **Od března 2024** (Redis 7.4) | RSALv2 + SSPL (dual license) | SSPL: pokud nabízíte Redis jako managed službu, musíte uvolnit celý stack jako open source. RSALv2: omezení na cloud provozovatele |
| **Od května 2025** (Redis 8) | RSALv2, SSPL nebo AGPLv3 | AGPLv3 přidána jako open source varianta schválená OSI vedle obou stávajících licencí |
| **Valkey (fork, Linux Foundation)** | BSD 3-clause | Plně open source fork Redis 7.2, podpora od Linux Foundation, AWS, Google, Oracle |
**Dopad**: Cloudoví poskytovatelé nemohou nabízet Redis 7.4+ jako managed službu pod RSALv2/SSPL bez komerční dohody. AWS ElastiCache a Google Memorystore nyní nabízejí **Valkey**, Azure Cache for Redis pokračuje na základě dohody mezi Microsoftem a Redis. Pro self-hosted Redis se nic nemění: RSALv2/SSPL neomezuje interní použití.
## Zdroje
Odkazy, knihy a standardy: [sources/databases/sources.md](sources/databases/sources.md)
*Poslední revize: 2026-06-03*

89
REVIEW.en.md Normal file

@@ -0,0 +1,89 @@
# 📋 Review workflow — Review and content control
## Process
```
Draft ──→ Self-review ──→ Peer review ──→ Approval ──→ Merged
  ↑                                                    │
  └────────────── Feedback loop ───────────────────────┘
```
## Phases
### 1. Draft
- Author creates new content / edits existing
- Mark files as `[draft]` in the commit message note
- Goal: capture ideas, structure, and facts
### 2. Self-review (author)
- [ ] Is the content **understandable**? Would a junior understand it?
- [ ] Are the **facts correct**? Verify against sources / official documentation
- [ ] Are **sources** cited? (links in `sources/`)
- [ ] Is the **structure consistent** with the rest of the KB?
- [ ] Are **abbreviations** explained?
- [ ] **Spelling and grammar**
- [ ] Is the **tone** appropriate: factual, without subjective opinions?
- [ ] Does it contain **actionable best practices**, not just theory?
### 3. Peer review (colleague / reviewer)
- Author requests a review (PR / issue / @mention)
- Reviewer checks:
- **Technical accuracy** — are data and concepts valid?
- **Completeness** — is anything important missing?
- **Impartiality** — does it avoid favoring one vendor without reason?
- **Currency** — is any information outdated?
**Review template:**
```
## Review: [file name]
### Technical accuracy
- [ ] Facts are correct
- [ ] Recommendations are appropriate
- [ ] Cited sources are relevant
### Structure and form
- [ ] Logical structure
- [ ] Consistent formatting
- [ ] Language is understandable
### Comments
- [ ] [comment 1]
- [ ] [comment 2]
### Verdict
- [ ] Approved
- [ ] Approved with reservations (see comments)
- [ ] Rejected (reason: …)
```
### 4. Approval
- Approves: author + at least 1 peer reviewer
- After approval, content is considered `[done]`
- Changes after approval require a new review cycle
### 5. Merged / Published
- Content is considered current and trustworthy
- A `[done]` mark on a source in `sources/` confirms that the source has been processed
## File states
| Status | Meaning |
|--------|--------|
| `[draft]` | In progress, not yet reviewed |
| `[in-review]` | Peer review in progress |
| `[done]` | Approved, current |
| `[outdated]` | Outdated, awaiting revision |
| `[deprecated]` | Replaced by another document |
## Regular revision
- **Quarterly** — check currency of the entire KB
- **Trigger** — new tool version, architecture change, EOL technology
- Each file should have a **last revision date** in its footer
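
The quarterly check could be automated along these lines. This is only a sketch under stated assumptions: it expects a footer of the form `Last revision: YYYY-MM-DD`, the 92-day threshold is one reading of "quarterly", and the function names are invented for the example.

```python
import re
from datetime import date

FOOTER_RE = re.compile(r"Last revision:\s*(\d{4}-\d{2}-\d{2})")
MAX_AGE_DAYS = 92  # assumption: "quarterly" taken as roughly three months


def revision_date(text: str):
    """Return the last footer date found in the text, or None if absent."""
    matches = FOOTER_RE.findall(text)
    return date.fromisoformat(matches[-1]) if matches else None


def needs_revision(text: str, today: date) -> bool:
    """True when the footer is missing or older than MAX_AGE_DAYS."""
    revised = revision_date(text)
    return revised is None or (today - revised).days > MAX_AGE_DAYS


footer = "*Last revision: 2026-06-03*"
print(needs_revision(footer, date(2026, 6, 11)))   # False
print(needs_revision(footer, date(2026, 12, 1)))   # True
```

Files flagged this way would be candidates for the `[outdated]` status until someone reviews them.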

89
REVIEW.md Normal file

@@ -0,0 +1,89 @@
# 📋 Review workflow — Oponentura a kontrola obsahu
## Proces
```
Draft ──→ Self-review ──→ Peer review ──→ Approval ──→ Merged
  ↑                                                    │
  └────────────── Feedback loop ───────────────────────┘
```
## Fáze
### 1. Draft
- Autor vytvoří nový obsah / upraví existující
- Označí soubory jako `[draft]` v poznámce commit message
- Cíl: zachytit myšlenky, strukturu a fakta
### 2. Self-review (autor)
- [ ] Je obsah **srozumitelný**? Pochopí to junior?
- [ ] Jsou **fakta správná**? Ověřit proti sources / oficiální dokumentaci
- [ ] Jsou uvedeny **zdroje**? (odkazy v `sources/`)
- [ ] Je **struktura konzistentní** se zbytkem KB?
- [ ] Jsou **zkratky** vysvětleny?
- [ ] **Pravopis a gramatika**
- [ ] Odpovídá **tón** — faktický, bez subjektivních názorů
- [ ] Obsahuje **actionable best practices**, nejen teorii
### 3. Peer review (kolega / oponent)
- Autor zažádá o review (PR / issue / @mention)
- Oponent kontroluje:
- **Odborná správnost** — jsou data a koncepty validní?
- **Úplnost** — není něco důležitého vynecháno?
- **Nestrannost** — neupřednostňuje jeden vendor bezdůvodně?
- **Aktuálnost** — nejsou informace zastaralé?
**Review template:**
```
## Review: [název souboru]
### Odborná správnost
- [ ] Fakta jsou správná
- [ ] Doporučení jsou vhodná
- [ ] Uvedené zdroje jsou relevantní
### Struktura a forma
- [ ] Logické členění
- [ ] Konzistentní formátování
- [ ] Jazyk je srozumitelný
### Připomínky
- [ ] [připomínka 1]
- [ ] [připomínka 2]
### Verdikt
- [ ] Schvaluji
- [ ] Schvaluji s výhradami (viz připomínky)
- [ ] Zamítnuto (důvod: …)
```
### 4. Approval
- Schvaluje: autor + minimálně 1 peer reviewer
- Po schválení se obsah považuje za `[done]`
- Změny po schválení vyžadují nový review cyklus
### 5. Merged / Published
- Obsah je považován za aktuální a důvěryhodný
- Pokud je zdroj označen `[done]` v `sources/`, je to potvrzení zpracování
## Stavy souborů
| Status | Význam |
|--------|--------|
| `[draft]` | Rozpracováno, neprošlo review |
| `[in-review]` | Probíhá peer review |
| `[done]` | Schváleno, aktuální |
| `[outdated]` | Zastaralé, čeká na revizi |
| `[deprecated]` | Nahrazeno jiným dokumentem |
## Pravidelná revize
- **Kvartálně** — kontrola aktuálnosti celé KB
- **Trigger** — nová verze nástroje, změna architektury, EOL technologie
- Každý soubor by měl mít v patičce **datum poslední revize**

757
SERVER-CONFIG.en.md Normal file

@@ -0,0 +1,757 @@
# ⚙️ Server configuration — best practices by workload
## General BIOS/UEFI settings
| Setting | Recommendation | Rationale |
|-----------|-----------|------------|
| **Boot mode** | UEFI | Secure Boot, GPT, larger disks |
| **Power profile** | Performance / OS Control | Max performance, C-States disabled |
| **Hyper-Threading** | Enabled | Up to ~30 % more throughput on multithreaded workloads (workload-dependent) |
| **Virtualization** | Enabled (VT-x/AMD-V) | Required for hypervisor, containers |
| **SR-IOV** | Enabled | GPU, NIC passthrough |
| **NUMA** | Enabled | NUMA-aware scheduling |
| **ACPI** | Enabled | Power management, OS-level |
| **Secure Boot** | Enabled | Secure boot chain |
| **TPM** | Enabled | Measured boot, key storage |
---
## 1. Database servers
### CPU Selection
| DB type | CPU preference | Rationale |
|--------|---------------|------------|
| **OLTP** (PostgreSQL, MySQL) | High clock, moderate cores | Low latency per transaction, limited parallelism |
| **OLAP** (ClickHouse, Snowflake) | Many cores, AVX-512 | Columnstore, high parallelism |
| **In-memory** (Redis, Memcached) | High clock, low cache latency | Single-threaded (Redis), RAM bandwidth |
| **Document** (MongoDB) | Balance (clock × cores) | Mixed workload |
| **Distributed** (Cassandra, Scylla) | Many cores, high cache | Shard-per-core (Scylla), compaction |
| **Oracle OLTP** | High clock, moderate cores, core-factor aware | CPU license cost (core factor 0.5 for AMD EPYC and Intel Xeon) |
| **Oracle OLAP / DW** | Many cores, large SGA, in-memory option | Parallel query, Exadata Smart Scan, compression |
### Oracle CPU licensing — core factor
Oracle licenses per core with a correction factor depending on the processor. Factor 0.5 means 2 cores = 1 Oracle license.
| Processor | Core factor | 64 physical cores → Oracle licenses |
|----------|-------------|--------------------------------------|
| AMD EPYC (all series) | 0.5 | 32 |
| Intel Xeon (Scalable) | 0.5 | 32 |
| IBM POWER | 1.0 | 64 |
| ARM (Ampere Altra) | 0.25 | 16 |
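**Impact on CPU selection**: EPYC and Xeon share the 0.5 factor, so license cost depends only on the number of cores. For a fixed license budget, pick the SKU with the highest per-core performance (for example a high-clock F-series EPYC) rather than the highest core count.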
### Configuration by company size and storage type
#### Variant A: Small company — local NVMe RAID
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1× EPYC 9124/9224 or Intel Xeon 4410Y (8-16C) | 1 socket, high clock |
| **RAM** | 64-256 GB (8-16 GB/core) | DDR5-4800, 1DPC |
| **OS disk** | 2× SATA/SAS SSD, RAID 1 (240-480 GB) | For OS + binaries |
| **Data disk** | 4-6× NVMe (U.2/E3.S), RAID 10 | Local data, no sharing |
| **WAL disk** | 2× NVMe RAID 1 (400-800 GB) | PostgreSQL only |
| **Network** | 2× 25 GbE (LACP) | Application traffic + management |
| **Form factor** | 1U or 2U | Single node, no cluster |
| **Storage backend** | Local RAID controller (PERC/Broadcom) | HW RAID 10 or SW RAID (mdadm) |
| **HA** | Application manages failover (patroni, repmgr, orchestrator) | Standby node on failure |
**Use case**: Startup, branch office, dev/test, < 500 users, single database server, low availability requirements.
#### Variant B: Medium company — local NVMe + asynchronous replication
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2× EPYC 9334/9374F or Intel Xeon 5418Y (16-24C) | 1-2 socket, balanced |
| **RAM** | 128-512 GB (8-16 GB/core) | DDR5-4800/5600, 1DPC |
| **OS disk** | 2× NVMe RAID 1 (2× 480 GB) | OS + binaries |
| **Data disk** | 6-8× NVMe, RAID 10 | Local NVMe, 3-6 TB usable |
| **WAL disk** | 2× NVMe RAID 1 (2× 800 GB) | Separate from data |
| **Network** | 2× 25 GbE (app) + 2× 25 GbE (replication) | Application and replication networks separated |
| **Form factor** | 2U | Primary + replica node |
| **Storage backend** | SW RAID (mdadm) or HW RAID (PERC H965) | Write-back cache with BBU |
| **HA** | Patroni / repmgr / MySQL InnoDB Cluster | Asynchronous replication to 1-2 standby |
**Use case**: E-commerce, medium SaaS, 500-5000 users, RPO < 1 min, RTO < 5 min.
#### Variant C: Large company — FC SAN (enterprise)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 or Xeon 8592+/6980P (48-128C) | 2 socket, max cores, large cache |
| **RAM** | 512 GB - 2 TB (8-16 GB/core) | DDR5, 2DPC (speed penalty), 12 channels (EPYC) |
| **OS disk** | 2× SATA SSD RAID 1 (2× 480 GB) | OS only, data on SAN |
| **Data + WAL** | LUNs from FC SAN | Hitachi VSP / Dell PowerMax / Pure //X |
| **HBA** | 2× dual-port FC HBA (32/64 Gb) | Multipath (active-active), FC-NVMe |
| **Network** | 2× 25/100 GbE (app) + 2× 32/64 Gb FC (storage) | App and storage networks separated |
| **Form factor** | 2U | 2-8 node cluster (RAC, AlwaysOn AG) |
| **Storage backend** | FC SAN — LUN per database | Thin provisioning, RAID on SAN, snapshots |
| **HA** | Oracle RAC / SQL Server AOAG / PostgreSQL Patroni | Synchronous replication, FC multipath |
**SAN advantages**: Centralized management, snapshots, cloning, disaster recovery (SRDF/Metro), separate storage network, higher availability.
**Disadvantages**: Higher latency compared to local NVMe (~50-200 µs over SAN vs ~10 µs local NVMe), higher CAPEX, vendor lock-in.
#### Variant D: Large company — Ceph / SDS backend
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9334/9654 (16-32C) | Fewer cores than SAN variant — part of CPU goes to Ceph client |
| **RAM** | 256-512 GB | Less RAM — Ceph client cache is not as effective as local buffer |
| **OS disk** | 2× SATA SSD RAID 1 (2× 480 GB) | OS |
| **Network** | 2× 25/100 GbE (app) + 2× 25/100 GbE (Ceph public) | App and Ceph traffic over Ethernet |
| **HBA** | Storage HBA in IT/HBA mode (no RAID) | For Ceph OSD node, not DB node |
| **Form factor** | 2U | DB node + separate Ceph OSD node |
| **Storage backend** | RBD (RADOS Block Device) over Ceph | 3× replication or erasure coding |
| **HA** | Application + Ceph inherent HA | Ceph self-healing, auto-rebalance |
**Ceph advantages**: No vendor lock-in, horizontal scaling, unified platform for block/file/object, lower CAPEX.
**Disadvantages**: Higher latency and CPU overhead (Ceph client → network → OSD), variable performance, more complex troubleshooting.
#### Variant E: Cloud — RDS / CloudSQL / Azure SQL
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **Compute** | AWS RDS (db.r7g/r8g), Azure SQL (GP/BC/Hyperscale) | Managed service, no OS access |
| **Storage** | EBS gp3 / io2, Azure Premium SSD v2, Cloud SQL SSD | Automatic scaling, PITR, multi-AZ |
| **Network** | Security Group, Private Link, VPC peering | No HBA, no SAN — everything over Ethernet |
| **HA** | Multi-AZ (synchronous), read replicas | Managed failover, RTO < 60 s |
| **Backup** | Automated, PITR (7-35 days) | No management required |
**Use case**: No on-prem hardware, elastic scaling, pay-per-use, lower operational overhead.
**Disadvantages**: Higher long-term costs, data residency, network latency, limited customization.
### Variant comparison
| Aspect | Local NVMe (small) | Local NVMe (medium) | FC SAN | Ceph | Cloud |
|--------|---------------------|----------------------|--------|------|-------|
| **Latency** | ~10 µs | ~10 µs | ~50-200 µs | ~100-500 µs | ~100-1000 µs |
| **Scaling** | Vertical | Vertical | Horizontal | Horizontal | Elastic |
| **CAPEX** | Low | Medium | High | Medium | None (OPEX) |
| **Operational overhead** | Low | Low | High (SAN admin) | Medium | None |
| **HA** | Application | Patroni/Cluster | RAC/AOAG | Ceph HA | Managed |
| **RPO** | 1-5 min | < 1 min | < 10 s | < 30 s | < 60 s |
| **RTO** | 5-15 min | < 5 min | < 2 min | < 5 min | < 60 s |
| **Number of servers** | 1-2 | 2-4 | 4-16 | 6-20+ | 0 (managed) |
| **Company** | Startup/SME | SME/Enterprise | Enterprise | Enterprise | Any |
### PostgreSQL parameter matrix by storage type
| Parameter | Local NVMe | FC SAN | Ceph RBD |
|----------|-----------|--------|----------|
| `random_page_cost` | 1.1 | 1.5-2.0 | 2.0-3.0 |
| `effective_io_concurrency` | 300 | 100-200 | 50-100 |
| `synchronous_commit` | off (NVMe cache; recent commits can be lost on crash) | on (SAN cache) | off (Ceph cache; recent commits can be lost on crash) |
| `full_page_writes` | on | on | on (even over Ceph) |
### Storage layout by backend type
**Local NVMe (small/medium):**
```
Mount point  FS    RAID        Disk          Purpose
/            ext4  1 (mirror)  2× SATA SSD   OS
/data        xfs   10          4-8× NVMe     Data
/wal         xfs   1 (mirror)  2× NVMe       WAL (PG)
```
**FC SAN (enterprise):**
```
Mount point / device  FS    Backing device         Purpose
/                     ext4  local RAID 1 (2× SSD)  OS
/dev/sdb              xfs   FC LUN 1 (500 GB)      WAL (PG)
/dev/sdc              xfs   FC LUN 2 (2 TB)        Data
/dev/sdd              xfs   FC LUN 3 (2 TB)        Indexes (separate)
```
**Ceph RBD:**
```
Mount point / device  FS    Backing device         Purpose
/                     ext4  local RAID 1 (2× SSD)  OS
/dev/rbd0             xfs   rbd datastore-01       Data + WAL (Ceph RBD)
```
### Kernel tuning by variant
**Local NVMe:**
```
vm.dirty_ratio = 30
vm.dirty_background_ratio = 5
```
**FC SAN:**
```
# SAN storage — higher latency, less aggressive flush
vm.dirty_ratio = 20
vm.dirty_background_ratio = 3
# Defer writes (SAN cache); comments need their own line in sysctl.conf
vm.dirty_expire_centisecs = 3000
```
**Ceph RBD:**
```
# Ceph RBD — network storage, optimize for RBD cache
vm.dirty_ratio = 15
vm.dirty_background_ratio = 2
# RBD cache settings
# rbd cache = true (client-side)
# rbd cache size = 256-512 MB
```
### Database-specific tuning
| Parameter | PostgreSQL | MySQL | Oracle | MongoDB |
|----------|-----------|-------|--------|---------|
| **Cache** | `shared_buffers` 25 % RAM | `innodb_buffer_pool_size` 70-80 % RAM | `SGA_TARGET` 60-80 % RAM | `WiredTiger cache` 50-80 % RAM |
| **OS cache** | `effective_cache_size` 75 % RAM | OS cache + InnoDB | OS cache (double buffering risk with large SGA) | OS cache |
| **Write buffer** | `wal_buffers` 64-256 MB | `innodb_log_file_size` 1-4 GB | Redo log (2-4 groups, 200 MB-4 GB) | WiredTiger log |
| **Connections** | `max_connections` 50-500 | `max_connections` 100-500 | `processes` 200-2000 | maxIncomingConnections |
| **I/O** | `effective_io_concurrency` 200 | `innodb_io_capacity` 2000 | `db_file_multiblock_read_count` 128 | WiredTiger eviction |
| **Huge pages** | `huge_pages = try` | `large-pages = ON` | `use_large_pages = only` (mandatory) | `transparent_hugepage=never` (kernel boot parameter, disables THP) |
| **Parallel query** | `max_parallel_workers` 4-8 | `innodb_parallel_read_threads` 4 | `parallel_degree_policy = auto` — up to 64 | — |
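The cache percentages in the table can be turned into starting values with a few lines of arithmetic. The sketch below is only an illustration: the function name `db_memory_plan` is hypothetical, it uses the lower bound of the 70-80 % MySQL range, and it rounds down to whole gigabytes.

```python
def db_memory_plan(ram_gb: float) -> dict[str, str]:
    """Derive starting cache settings from total RAM, using the table's percentages."""
    return {
        # PostgreSQL: 25 % of RAM for shared_buffers, 75 % as the planner hint
        "shared_buffers": f"{int(ram_gb * 0.25)}GB",
        "effective_cache_size": f"{int(ram_gb * 0.75)}GB",
        # MySQL: 70-80 % of RAM; the lower bound is an assumption made here
        "innodb_buffer_pool_size": f"{int(ram_gb * 0.70)}G",
    }


plan = db_memory_plan(256)
for name, value in plan.items():
    print(f"{name} = {value}")
```

These are starting points for a dedicated database host; on a shared host the percentages should be applied to the RAM actually reserved for the database.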
### Connectivity by variant
| Variant | App network | Storage network | Replication | Management |
|----------|---------|-------------|-----------|------------|
| **Local (small)** | 2× 25 GbE LACP | — | 2× 25 GbE (same) | iDRAC/iLO |
| **Local (medium)** | 2× 25 GbE LACP | — | 2× 25 GbE dedicated | iDRAC/iLO |
| **FC SAN** | 2× 25/100 GbE | 2× 32/64 Gb FC (multipath) | FC replication | iDRAC/iLO + SAN mgmt |
| **Ceph** | 2× 25/100 GbE | 2× 25/100 GbE (public net) | 2× 25/100 GbE (cluster net) | iDRAC/iLO + Ceph mgmt |
| **Cloud** | Elastic IP / Private Link | — | — | AWS Console / API |
| **Oracle Standalone** | 2× 25 GbE LACP | ASM (2× 25 GbE or FC 32G) | Data Guard 2× 25 GbE | iLO + ASM mgmt |
| **Oracle RAC** | 2-4× 25/100 GbE | 2× 64 Gb FC (multipath) | Cache Fusion interconnect | iLO + SAN mgmt |
| **Oracle Exadata** | 4-8× 100 GbE RoCE | NVMe over Fabric | RDMA interconnect | Exadata CLI + OEDA |
### Oracle-specific configuration
#### Oracle ASM — diskgroup layout
Oracle ASM (Automatic Storage Management) replaces traditional filesystem + volume manager:
| Diskgroup | Redundancy | Disks | Purpose |
|-----------|-----------|-------|-------|
| **DATA** | Normal (2× mirror) | 4-12× FC LUN/NVMe | Data files, temp files, control files |
| **FRA** (Fast Recovery Area) | Normal (2× mirror) | 2-6× FC LUN/NVMe | Archive logs, backup, flashback logs |
| **REDO** | High (3× mirror) | 2-4× FC LUN/NVMe | Online redo log groups (I/O critical) |
| **SPFILE** | Normal | 2× small LUN | Server parameter file |
**ASM striping**: Coarse (1 MB) for regular data, Fine (128 KB) for redo logs (lower write latency).
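Because ASM mirrors at the diskgroup level, usable capacity is roughly the raw capacity divided by the number of copies. A minimal sketch, assuming identical disks and ignoring the free space ASM needs to re-mirror after a disk failure; the helper name `asm_usable_tb` and the `external` entry (redundancy left to the array) are additions for illustration, not part of the layout above.

```python
# Copies of each extent per ASM redundancy level
ASM_COPIES = {"external": 1, "normal": 2, "high": 3}


def asm_usable_tb(disks: int, disk_tb: float, redundancy: str) -> float:
    """Raw capacity divided by mirror copies; rebuild headroom is not subtracted."""
    return disks * disk_tb / ASM_COPIES[redundancy]


data_tb = asm_usable_tb(8, 3.84, "normal")   # DATA: 8 disks, 2-way mirror
redo_tb = asm_usable_tb(3, 0.8, "high")      # REDO: 3 disks, 3-way mirror
print(f"DATA usable: {data_tb:.2f} TB, REDO usable: {redo_tb:.2f} TB")
```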
#### Variant O1: Standalone Oracle (small/medium, single instance)
| Parameter | Small (< 500 users) | Medium (500-2000 users) |
|----------|---------------------|------------------------|
| **CPU** | 1-2× EPYC 9124-9224 / Xeon 4410Y (8-16C) | 2× EPYC 9334-9374F / Xeon 5418Y (16-24C) |
| **RAM (SGA + PGA)** | 64-128 GB (SGA 70 %, PGA 30 %) | 128-512 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (vm.nr_hugepages) — mandatory for SGA | Yes |
| **OS disk** | 2× SATA SSD RAID 1 (240 GB) | 2× NVMe RAID 1 (480 GB) |
| **DATA + FRA** | 4-6× NVMe, ASM normal redundancy | 6-8× NVMe or FC LUN, ASM normal |
| **REDO** | 2-4× NVMe (separate from DATA), ASM high | 4× FC LUN (separate), ASM high |
| **Archive log** | Local FRA | FC LUN (FRA diskgroup) |
| **Network (app)** | 2× 25 GbE LACP | 2-4× 25/100 GbE LACP |
| **Network (storage)** | — (local NVMe) | 2× FC 32G multipath |
| **Network (Data Guard)** | — | 2× 25 GbE dedicated |
| **DB version** | Oracle SE2 (max 16 threads) | Oracle EE (unlimited) |
**Use case**: Dev/test, small production DBs, branch offices. SE2 runs only on servers with at most 2 sockets, uses at most 16 CPU threads per instance, and lacks the parallel execution features of EE.
#### Variant O2: Oracle Data Guard (medium/large, HA + DR)
Primary + standby in active-passive mode, Active Data Guard possible for reporting.
| Parameter | Recommendation |
|----------|-----------|
| **CPU** | 2× EPYC 9654-9965 / Xeon 8592+ (32-64C) |
| **RAM** | 256-1024 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (50-80 % RAM allocated for SGA) |
| **OS disk** | 2× NVMe RAID 1 (480 GB) |
| **Storage** | FC SAN LUN (DATA + FRA + REDO separate) or NVMe + ASM |
| **HBA** | 2× dual-port FC 32/64 Gb (multipath active-active) |
| **App network** | 2-4× 25/100 GbE LACP |
| **Storage network** | 2× FC 32/64 Gb multipath |
| **Data Guard network** | 2× 25/100 GbE dedicated (sync or async) |
| **Data Guard mode** | Maximum Availability (sync, fallback to async) — RPO = 0 |
| **Topology** | 1 primary + 1-2 standby (physical), far sync for geo-DR |
| **Active Data Guard** | Standby open for read (reporting, backup) — requires ADG license |
**Data Guard latency**:
```text
Synchronous (Maximum Availability):
  Primary COMMIT → LGWR flush REDO → sync over network → Standby LGWR → ACK → ~1-5 ms
  RPO = 0, impact on write latency

Asynchronous (Maximum Performance):
  Primary COMMIT → LGWR flush REDO → async to standby buffer → ~0.1-1 ms
  RPO = a few seconds, negligible write impact
```
**Network requirements for Data Guard sync**:
- RTT < 2 ms for synchronous mode (recommended < 1 ms)
- Min. 10 GbE, recommended 25 GbE (throughput = REDO rate × 2)
- REDO rate: OLTP ~50-500 MB/s, batch ~500-2000 MB/s
- At a REDO rate of 500 MB/s (4 Gbit/s), a 25 GbE link runs at ~16 % utilization, or ~32 % with the 2× throughput headroom
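The link-sizing rule above can be checked with a short calculation. This sketch assumes decimal units (1 MB/s = 8 Mbit/s) and ignores protocol overhead; the function name `redo_link_utilization` and the `headroom` parameter are illustrative, with `headroom=2.0` standing in for the "REDO rate × 2" rule.

```python
def redo_link_utilization(redo_mb_s: float, link_gbit: float, headroom: float = 1.0) -> float:
    """Fraction of the link consumed by redo transport (no protocol overhead)."""
    redo_gbit = redo_mb_s * 8 / 1000  # MB/s -> Gbit/s
    return redo_gbit * headroom / link_gbit


for link in (10, 25):
    raw = redo_link_utilization(500, link)
    padded = redo_link_utilization(500, link, headroom=2.0)
    print(f"{link} GbE: {raw:.0%} raw, {padded:.0%} with 2x headroom")
```

With 500 MB/s of redo, a 10 GbE link is already at 80 % once the headroom is applied, which is why 25 GbE is the recommended choice.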
#### Variant O3: Oracle RAC (large, enterprise)
Multi-instance cluster with shared storage and Cache Fusion.
| Parameter | Recommendation |
|----------|-----------|
| **Number of nodes** | 2-4 (typical), max 64 (RAC cluster) |
| **CPU per node** | 2× EPYC 9654-9965 / Xeon 8592+ (32-64C) |
| **RAM per node** | 512-2048 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (1 GB pages if RAM > 512 GB) |
| **Storage** | FC SAN — shared LUNs (ASM normal/high redundancy) |
| **HBA** | 2× dual-port FC 64 Gb (multipath, active-active) |
| **App network** | 2-4× 25/100 GbE LACP (VIP, SCAN listener) |
| **Storage network** | 2-4× FC 64 Gb (multipath per node) |
| **Cache Fusion interconnect** | 2× 100 GbE (RoCE v2 or InfiniBand) — dedicated |
| **RAC interconnect latency** | < 5 µs (recommended), max < 10 µs |
| **ASM** | Normal redundancy (2-way mirror) |
| **Oracle Clusterware** | Voting disk (3× 1 GB LUN), OCR (3× 500 MB LUN) |
| **Service** | OLTP_service, REPORT_service, BATCH_service |
**Cache Fusion — critical interconnect**:
```
Node A (DB instance) ←──→ Node B (DB instance)
        │                         │
        └────────── ASM ──────────┘
          FC SAN (shared storage)

Cache Fusion traffic: dirty block transfer between instances
  → Latency < 5 µs, otherwise RAC scaling degrades
  → Capacity: 2× 100 GbE, dedicated switch or InfiniBand HDR100
  → Recommended MTU: 9000 (jumbo frames)
```
**RAC sizing by transaction count**:
| TPS | Nodes | CPU per node | RAM per node | Interconnect |
|-----|------|-------------|-------------|-------------|
| < 10 000 | 2 | 16-24C | 256 GB | 2× 25 GbE |
| 10 000 - 50 000 | 2-4 | 32-48C | 512 GB | 2× 100 GbE RoCE |
| 50 000 - 200 000 | 4-8 | 48-64C | 1024 GB | 2× 100 GbE RoCE / InfiniBand |
| > 200 000 | 8+ | 64-128C | 2048 GB | InfiniBand HDR100/HDR200 |
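The sizing table can be used as a simple lookup. In this sketch the tiers are copied from the table, while the function name `rac_tier` and the handling of boundary values (a TPS figure exactly on a boundary falls into the larger tier) are assumptions, since the table leaves the boundaries open.

```python
# (upper TPS bound, nodes, cores per node, RAM GB per node, interconnect)
RAC_TIERS = [
    (10_000, "2", "16-24C", 256, "2x 25 GbE"),
    (50_000, "2-4", "32-48C", 512, "2x 100 GbE RoCE"),
    (200_000, "4-8", "48-64C", 1024, "2x 100 GbE RoCE / InfiniBand"),
    (float("inf"), "8+", "64-128C", 2048, "InfiniBand HDR100/HDR200"),
]


def rac_tier(tps: int) -> tuple:
    """Return the first tier whose upper bound lies above the TPS figure."""
    for tier in RAC_TIERS:
        if tps < tier[0]:
            return tier
    raise ValueError("TPS must be a finite number")


_, nodes, cores, ram_gb, interconnect = rac_tier(30_000)
print(f"{nodes} nodes, {cores}, {ram_gb} GB RAM per node, {interconnect}")
```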
**RAC sizing — license cost calculation**:
```text
Example: 4-node RAC, each node 2× EPYC 9654 (96C) = 192 cores per node
Core factor 0.5 → 96 Oracle licenses per node
4 × 96 = 384 Oracle EE licenses
At ~$47.5k/license → ~$18.2M (licenses only, without 22 % annual support)
```
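The same calculation as a small script, useful for comparing node counts or CPU models. It is a sketch under the figures used in the example (core factor 0.5, ~$47.5k list price, 22 % annual support); the function name `oracle_ee_licenses` is hypothetical, and real quotes depend on the current Oracle price list and any negotiated discount.

```python
import math


def oracle_ee_licenses(cores: int, core_factor: float) -> int:
    """Processor licenses = physical cores x core factor, fractions rounded up."""
    return math.ceil(cores * core_factor)


nodes = 4
cores_per_node = 2 * 96        # 2x EPYC 9654 per node
list_price_usd = 47_500        # per license, figure taken from the example above

per_node = oracle_ee_licenses(cores_per_node, 0.5)
total_licenses = nodes * per_node
license_cost = total_licenses * list_price_usd
annual_support = license_cost * 0.22

print(f"{total_licenses} licenses, ${license_cost / 1e6:.2f}M, support ${annual_support / 1e6:.2f}M/year")
```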
#### Variant O4: Oracle Exadata (hyperscale)
Engineered system — optimal for hybrid workload (OLTP + DW).
| Parameter | X9M / X10M | Use case |
|----------|-----------|----------|
| **Database servers** | 2-8× (Xeon on X9M, AMD EPYC on X10M; 1.5-6 TB RAM, NVMe) | Compute |
| **Storage servers** | 3-18× (NVMe + HDD, Smart Scan) | Predicate offloading |
| **Smart Scan** | Filtering at storage layer | Less data over network, higher throughput |
| **RoCE interconnect** | 100 GbE (RDMA) | Low latency, high bandwidth |
| **In-Memory Column Store** | Optional license | Real-time analytics without ETL |
| **HCC (Hybrid Columnar Compression)** | Compression in storage servers | Up to 10-15× compression for DW |
| **Rack power** | ~15-30 kW (full rack) | Higher density |
**When to choose Exadata over standalone RAC**:
- OLTP > 50 000 TPS
- Consolidation needed (multiple DBs on one cluster)
- Smart Scan significantly accelerates reporting on production data
- HCC for storage savings on DW workloads
---
## 2. Hypervisor host (ESXi / KVM / Hyper-V)
### Configuration by size and storage type
#### Variant A: Small company — local storage (2-3 hosts)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1× EPYC 9224/9254 or Xeon 4410Y/5418Y (12-24C) | 1 socket, enough cores for VM density |
| **RAM** | 128-256 GB (4-8 GB/core) | DDR5, 1DPC |
| **OS disk** | 2× SATA SSD RAID 1 (2× 240-480 GB) | ESXi / Proxmox / Hyper-V boot |
| **VM storage** | 4-6× SATA/SAS SSD, RAID 5/6 or 10 | Local RAID, 4-12 TB usable |
| **Network** | 2-4× 10/25 GbE (LACP) | Shared for everything (management + VM + storage) |
| **Hypervisor** | VMware vSphere Standard / Proxmox VE / Hyper-V | Basic license, no enterprise features |
| **Storage backend** | Local RAID controller (PERC H755, Broadcom 9560) | HW RAID with cache, write-back |
| **HA** | VMware HA / Proxmox HA | Restart VM on another host on failure |
| **Backup** | Veeam B&R Free / PBS (Proxmox Backup Server) | Local or USB disk |
**Use case**: Small office, branch office, dev/test, < 10 VMs, low budget, simple management.
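**Limitations**: Without shared storage, live migration has to copy the VM disks between hosts, and HA can restart a VM on another host only if its disks are replicated there (for example ZFS replication on Proxmox). A host failure always causes an outage: HA restarts VMs, it does not fail over seamlessly.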
#### Variant B: Medium company — vSAN / Ceph (3-6 hosts)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2× EPYC 9334/9654 or Xeon 5418Y/8592+ (16-32C) | 1-2 socket |
| **RAM** | 256-512 GB (4-8 GB/core) | DDR5, 2DPC (minimal penalty) |
| **OS disk** | 2× SATA SSD RAID 1 or 2× M.2 (BOSS-S1 for SATA, BOSS-N1 for NVMe) | Separate from VM storage |
| **Cache tier** | 1-2× NVMe (vSAN caching / Ceph WAL+DB) | For write performance |
| **Capacity tier** | 4-8× SATA/SAS SSD or HDD (vSAN capacity / Ceph OSD) | HDD for capacity, SSD for performance |
| **Network** | 4× 25/100 GbE — 2× VM + mgmt, 2× storage (vSAN/Ceph) | Separate storage network, RDMA (RoCE v2) |
| **Hypervisor** | VMware vSAN / Proxmox Ceph / StarWind HCI | HCI license (vSAN ~$2.5k/Core) |
| **Storage backend** | vSAN OSA/ESA or Ceph (RADOS) | Distributed storage, auto-rebalance |
| **HA** | vSphere HA + vSAN / Proxmox HA + Ceph | vMotion, DRS, automated failover |
| **Failover** | N+1 (one host as reserve) | vSAN needs min. 3 hosts; 4 allow data to be rebuilt after a host failure |
**Pure Ceph variant (Proxmox / OpenStack)**:
```
Proxmox node (3-6×):
├── CPU: 1× EPYC 9224-9334 (12-24C)
├── RAM: 128-256 GB
├── OS: 2× SATA SSD RAID 1
├── Ceph OSD: 4-8× NVMe/SATA SSD (RAW, HBA mode)
├── Network: 2× 25 GbE (public) + 2× 25 GbE (cluster)
└── Storage: Ceph 3× replication, CRUSH host failure domain
```
**VMware vSAN variant (4-6 hosts)**:
```
vSAN node (4-6×):
├── CPU: 1-2× EPYC/Xeon (16-32C)
├── RAM: 256-512 GB
├── OS: 2× M.2 SSD (BOSS) or SD card (deprecated)
├── vSAN cache: 1-2× NVMe (write buffer, OSA only; ESA has no cache tier)
├── vSAN capacity: 4-8× NVMe (vSAN ESA) or SATA SSD/HDD (vSAN OSA)
├── Network: 2× 25/100 GbE (VM) + 2× 25 GbE (vSAN)
└── Storage: vSAN ESA (all-NVMe) or OSA (hybrid or all-flash)
```
**Use case**: SME, enterprise division, 10-100 VMs, need for vMotion, DRS, HA, simple storage management.
#### Variant C: Large company — FC SAN (6+ hosts)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 or Xeon 8592+/6980P (32-64C) | 2 socket, max VM density |
| **RAM** | 512 GB - 2 TB (4-8 GB/core) | DDR5, 2DPC |
| **OS disk** | 2× SATA SSD RAID 1 or SD card (vSphere) | Boot, image storage |
| **VM storage** | LUNs from FC SAN — VMFS / NFS datastores | Hitachi, Dell, Pure, HPE storage |
| **HBA** | 2× dual-port FC HBA 32/64 Gb | Multipath, FC-NVMe |
| **Network** | 4-8× 25/100 GbE — split by traffic type | Management, VM, vMotion, FT separated |
| **Hypervisor** | VMware vSphere Enterprise+ / Hyper-V DC | Enterprise license, DRS, HA, FT |
| **Storage backend** | FC SAN: VMFS 6 datastores, VVols | Thin provisioning, storage DRS, array snapshots |
| **HA** | vSphere HA + DRS + vCenter | vMotion, DRS, FT, SRM for DR |
| **Failover** | N+1 or admission control (CPU/RAM reserve) | Reserved capacity for HA failover |
**Use case**: Enterprise, 100+ VMs, mix of DB and applications, centralized storage management, enterprise SLA.
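The N+1 failover rule translates into a capacity reserve that shrinks as the cluster grows. A minimal sketch, assuming identical hosts; the function names `ha_reserve` and `usable_ram_gb` are illustrative and do not correspond to any vSphere API.

```python
def ha_reserve(hosts: int, tolerated_failures: int = 1) -> float:
    """Fraction of cluster CPU/RAM to keep free so surviving hosts can absorb the load."""
    if not 0 < tolerated_failures < hosts:
        raise ValueError("at least one host must survive")
    return tolerated_failures / hosts


def usable_ram_gb(hosts: int, ram_per_host_gb: int, tolerated_failures: int = 1) -> int:
    """RAM that can be committed to VMs while still tolerating the given failures."""
    return (hosts - tolerated_failures) * ram_per_host_gb


for n in (3, 6, 12):
    print(f"{n} hosts: reserve {ha_reserve(n):.1%}, usable {usable_ram_gb(n, 1024)} GB")
```

A 3-host cluster gives up a third of its capacity to N+1, a 12-host cluster only about 8 %, which is one reason larger clusters are cheaper per VM.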
#### Variant D: Hyperscale — Ceph / SDS (20+ hosts)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 (64-128C) | 2 socket, compute optimal |
| **RAM** | 512 GB - 1 TB (2-4 GB/core) | Low overcommit ratio for consistency |
| **OS disk** | 2× M.2 NVMe RAID 1 (BOSS) | Boot |
| **Network** | 4-8× 100 GbE (compute + storage) | Separate OVN/OVS for SDN, VXLAN tunneling |
| **Hypervisor** | OpenStack (Nova) / OpenShift (KubeVirt) | Open source, API-driven, multi-tenant |
| **Storage backend** | Ceph (RADOS, RBD, RGW, CephFS) | Unified storage, erasure coding (8+3) |
| **Orchestration** | OpenStack / Kubernetes | Infrastructure-as-Code, autoscaling |
| **HA** | OpenStack HA / Kubernetes HA | Self-healing, auto-rebalance |
**Use case**: Cloud provider, hyperscale, 500+ VMs, multi-tenant, maximum automation.
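The choice between 3× replication and 8+3 erasure coding mainly changes how much of the raw capacity is usable. The sketch below shows the arithmetic only; the function names are illustrative, and it ignores the free-space headroom a Ceph cluster needs for rebalancing and recovery.

```python
def replicated_usable(raw_tb: float, copies: int = 3) -> float:
    """Usable capacity when every object is stored `copies` times."""
    return raw_tb / copies


def erasure_coded_usable(raw_tb: float, k: int = 8, m: int = 3) -> float:
    """k data chunks + m coding chunks: the usable share is k / (k + m)."""
    return raw_tb * k / (k + m)


raw = 1100.0  # TB of raw OSD capacity, an arbitrary example figure
print(f"3x replication: {replicated_usable(raw):.1f} TB usable")
print(f"EC 8+3:         {erasure_coded_usable(raw):.1f} TB usable")
```

Erasure coding 8+3 yields about 73 % of raw capacity against 33 % for triple replication, at the price of more CPU work and higher latency for small writes.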
### Hypervisor variant comparison
| Aspect | Local (small) | vSAN/Ceph (medium) | FC SAN (large) | Ceph hyperscale |
|--------|---------------|---------------------|----------------|-----------------|
| **Storage** | Local RAID | vSAN / Ceph (HCI) | FC SAN (centralized) | Ceph (distributed) |
| **Number of hosts** | 2-3 | 3-6 | 6-50+ | 20+ |
| **VM latency** | ~10 µs (local) | ~100-500 µs | ~200 µs (SAN) | ~500-2000 µs |
| **CAPEX/host** | Low | Medium | High | Medium |
| **CAPEX storage** | Low | None (part of hosts) | High (SAN array) | None (part of hosts) |
| **Management** | Simple (per host) | vCenter / Proxmox | vCenter + SAN mgmt | OpenStack / K8s |
| **vMotion** | Limited (no shared storage, disks must be copied) | Yes (vSAN / Ceph RBD) | Yes (FC LUN) | Yes (Ceph RBD) |
| **DRS** | No | Yes (vSphere) | Yes (vSphere) | OpenStack scheduler |
| **Scaling** | Vertical | Horizontal (add host) | Horizontal (host + SAN) | Horizontal |
### Network design by variant
#### Small (local storage)
| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated port (iLO/iDRAC) |
| VM + Storage | All | 2-4× 10/25 GbE | LACP | Shared, VLAN tagging |
```
┌──────────────────────────────────────────┐
│                   Host                   │
│ ┌───────┐ ┌────────────────────────────┐ │
│ │  iLO  │ │        NIC1    NIC2        │ │
│ │ 1 GbE │ │       [LACP] 25 GbE        │ │
│ └───────┘ └─────────────┬──────────────┘ │
└─────────────────────────┼────────────────┘
                    ┌─────┴─────┐
                    │  Switch   │
                    └───────────┘
```
#### Medium (vSAN / Ceph)
| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated iLO/iDRAC |
| VM | VM | 2× 25/100 GbE | LACP | VM traffic, migration |
| Storage | vSAN/Ceph | 2× 25/100 GbE | LACP or RDMA | Separate, Jumbo frames (MTU 9000) |
```
┌─────────────────────────────────────────────┐
│                    Host                     │
│ ┌───────┐ ┌────────────┐ ┌────────────────┐ │
│ │  iLO  │ │ NIC1  NIC2 │ │   NIC3  NIC4   │ │
│ │ 1 GbE │ │ VM traffic │ │ Storage (vSAN) │ │
│ └───────┘ └────────────┘ └────────────────┘ │
└─────────────────────────────────────────────┘
```
#### Large (FC SAN)
| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated |
| VM | VM | 2-4× 25/100 GbE | LACP | VM traffic |
| vMotion | vMotion | 2× 25 GbE | Dedicated | Multi-NIC vMotion |
| FT | FT | 2× 10/25 GbE | Dedicated | Low latency |
| Storage | — | 2× 32/64 Gb FC | Multipath | FC SAN |
```
┌───────────────────────────────────────────────┐
│                     Host                      │
│ ┌───────┐ ┌───────────────┐ ┌──────┐ ┌──────┐ │
│ │  iLO  │ │    NIC1-4     │ │ HBA1 │ │ HBA2 │ │
│ │ 1 GbE │ │ VM+vMotion+FT │ │ 32Gb │ │ 32Gb │ │
│ └───────┘ └───────┬───────┘ └──┬───┘ └──┬───┘ │
└───────────────────┼────────────┼────────┼─────┘
                    │            └───┬────┘
              ┌─────┴─────┐   ┌──────┴──────┐
              │ Ethernet  │   │  FC Switch  │
              │  Switch   │   │  (Brocade/  │
              │           │   │   Cisco)    │
              └───────────┘   └─────────────┘
```
### BIOS for hypervisor — all variants
| Setting | Value | Rationale |
|-----------|---------|------------|
| Hyper-Threading | Enabled | Higher VM density |
| Virtualization Technology | Enabled | VT-x/AMD-V |
| VT-d / IOMMU | Enabled | Passthrough, SR-IOV |
| Power Management | Performance / OS | Minimize VM exit latency |
| C-States | Disabled | Lower VM exit latency (important for real-time VMs) |
| NUMA | Enabled | NUMA-aware VM placement |
| SR-IOV | Enabled | NIC/GPU virtualization |
| Adjacent Sector Prefetch | Enabled (Intel) | Better sequential reads |
| DCU Streamer / IP Prefetcher | Enabled | HW prefetch for VM workload |
| Patrol Scrub | Disabled (vSAN/Ceph) | Can cause latency spikes with SDS |
### Hypervisor selection by variant
| Criterion | VMware vSphere | Proxmox VE | Hyper-V | OpenStack |
|-----------|---------------|------------|---------|-----------|
| **Size** | SME - Enterprise | SME | SME - Enterprise | Hyperscale |
| **Storage** | vSAN, SAN, NFS | Ceph, ZFS, NFS | Storage Spaces, SAN | Ceph, manila |
| **License** | ~$1-5k/core | Free (support ~$500/host) | Part of Windows Server | Open source |
| **Familiarity** | Highest | Medium | Windows admin | Low |
| **Automation** | Terraform, Ansible, PowerCLI | Ansible, Terraform, PBS | PowerShell, SCVMM | Terraform, Heat, Ansible |
| **Ecosystem** | Broadest (Veeam, Zerto, SRM) | Growing (PBS, remote migration) | Windows ecosystem | Open source (Kolla, TripleO) |
---
## 3. Kubernetes node
### Node profiles
| Role | CPU | RAM | Storage | Network | Use case |
|------|-----|-----|---------|---------|----------|
| **General purpose** | 16-32 cores | 64-128 GB | 1× NVMe OS + 1× NVMe local | 2× 25 GbE | Web, API, microservices |
| **Memory optimized** | 32-64 cores | 256-512 GB | 1× NVMe OS + 2× NVMe local | 2× 25 GbE | In-memory cache, DB |
| **Compute optimized** | 64-128 cores | 128-256 GB | 1× NVMe OS | 2× 25/100 GbE | Batch, CI/CD |
| **GPU node** | 32-64 cores | 512-1024 GB | 1× NVMe OS + 4-8× NVMe local | 4× 100 GbE | AI/ML training, inference |
| **Storage node** | 16-32 cores | 64-128 GB | 4-12× NVMe/SATA (Ceph/Longhorn) | 4× 25 GbE | SDS, persistent volumes |
### Kernel tuning
```
# /etc/sysctl.d/99-kubernetes.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.ipv4.conf.all.forwarding = 1
# Connection tracking (for NodePort, Service)
net.netfilter.nf_conntrack_max = 2097152
net.netfilter.nf_conntrack_tcp_timeout_established = 86400
# File watchers (for kubelet, containerd)
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288
# Memory management
vm.swappiness = 0
# Allow overcommit (CRI-O, containerd)
vm.overcommit_memory = 1
vm.panic_on_oom = 0
kernel.panic = 10
kernel.panic_on_oops = 1
```
### Container storage
| Type | Recommendation | Note |
|-----|-----------|----------|
| **OS disk** | RAID 1 (2× NVMe) | Ext4/XFS, 100-200 GB |
| **Container runtime image** | RAID 1 (2× NVMe) | /var/lib/containerd, 200-500 GB |
| **Local PV** | Single NVMe | Raw device, no RAID |
| **Rook/Ceph OSD** | Raw NVMe/SATA | HBA/IT mode, no RAID |
| **Longhorn** | Raw NVMe/SATA | Ext4/XFS per volume |
---
## 4. Storage server (Ceph / MinIO / NAS)
### Ceph OSD node
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2 cores per OSD | Up to 12 OSD per node (24 cores) |
| **RAM** | 4-8 GB per OSD + OS | BlueStore cache, 16-64 GB min |
| **Network** | 2× 25/100 GbE | Public + Cluster network |
| **Storage** | 10-12× NVMe/SATA SSD OSD | HBA/IT mode, no RAID |
| **OS disk** | 2× SATA SSD RAID 1 | OS, Ceph MON/MGR |
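The per-OSD rules of thumb in the table scale linearly with the number of OSDs in a node. A rough sketch of that arithmetic follows; the function name `osd_node_sizing` is hypothetical, and the 16 GB set aside for the OS and co-located daemons is an assumption based on the lower end of the "16-64 GB min" note.

```python
def osd_node_sizing(osds: int, os_ram_gb: int = 16) -> dict[str, tuple[int, int]]:
    """(min, max) CPU cores and RAM in GB for one OSD node."""
    return {
        "cores": (osds * 1, osds * 2),                                # 1-2 cores per OSD
        "ram_gb": (osds * 4 + os_ram_gb, osds * 8 + os_ram_gb),       # 4-8 GB per OSD + OS
    }


sizing = osd_node_sizing(12)
print(f"12 OSDs: {sizing['cores'][0]}-{sizing['cores'][1]} cores, "
      f"{sizing['ram_gb'][0]}-{sizing['ram_gb'][1]} GB RAM")
```

For NVMe OSDs the upper end of both ranges is the safer starting point, since fast devices make the OSD daemon itself the bottleneck.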
**BIOS for Ceph:**
- SATA/NVMe: AHCI/NVMe mode (not RAID)
- C-States: Disabled (lower OSD latency)
- NUMA: Enabled
- Power: Performance
### MinIO node
| Component | Recommendation |
|-----------|-----------|
| **CPU** | 8-16 cores (32+ for erasure coding) |
| **RAM** | 32-64 GB + 1 GB per 1 TB storage |
| **Storage** | 4-16× NVMe (direct, no RAID) |
| **Network** | 2× 25/100 GbE |
| **OS** | Ubuntu / RHEL, XFS (for data) |
### NAS (TrueNAS / FreeNAS)
- **ZFS**: RAID-Z1/Z2/Z3, compression (lz4, zstd), dedup
- **ARC cache**: 1 GB per 1 TB storage (max 64 GB)
- **L2ARC**: NVMe cache (optional, read-heavy)
- **SLOG**: NVDIMM / Optane (sync write, ZIL)
- **Network**: 2-4× 10/25 GbE LACP
---
## 5. Web / API servers
| Parameter | Recommendation |
|----------|-----------|
| **CPU** | High clock, 8-32 cores |
| **RAM** | 32-128 GB |
| **Storage** | 2× NVMe RAID 1 (OS + app) |
| **OS** | Ubuntu / RHEL, optimized kernel |
| **Network** | 2× 10/25 GbE (bonding) |
**Kernel tuning:**
```
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.core.netdev_max_backlog = 65535
```
---
## Quick decision tree — server selection by workload, size and storage
```mermaid
flowchart TD
W["What workload?"] --> DB["Database"]
W --> HV["Virtualization"]
W --> K8s["Kubernetes"]
W --> AI["AI/ML"]
W --> ST["Storage server"]
W --> WEB["Web / API"]
DB --> DBS{"Company size"}
DBS -->|"< 500"| DB1["1× EPYC 8-16C, 64-256 GB<br/>NVMe RAID10, 2× 25GbE"]
DBS -->|"500-5000"| DB2{"Storage"}
DB2 -->|"Local"| DB2L["1-2× EPYC 16-24C, 128-512 GB<br/>NVMe RAID10, 4× 25GbE"]
DB2 -->|"Ceph"| DB2C["2× EPYC 16-32C, 256-512 GB<br/>RBD, 4× 25/100GbE"]
DBS -->|"Enterprise"| DB3{"Storage"}
DB3 -->|"FC SAN"| DB3F["2× EPYC 48-128C, 512-2048 GB<br/>SAN LUN + 2× FC 32/64G"]
DB3 -->|"Ceph"| DB3C["2× EPYC 32-64C, 256-512 GB<br/>RBD, 4× 100GbE"]
DBS -->|"Cloud"| DBC["RDS/Azure SQL/CloudSQL<br/>Managed, Multi-AZ"]
DB --> ORACLE{"Oracle architecture?"}
ORACLE -->|"Standalone"| ORA1["1-2× EPYC 8-24C<br/>64-512 GB, ASM local/FC<br/>2× 25GbE + FC 32G"]
ORACLE -->|"Data Guard"| ORA2["2× EPYC 32-64C<br/>256-1024 GB, FC SAN<br/>2× 25/100GbE + 2× FC 64G<br/>2× 25GbE (DG sync)"]
ORACLE -->|"RAC 2-4 nodes"| ORA3["Per node: 2× EPYC 32-64C<br/>512-2048 GB, FC SAN<br/>2× 100GbE (app)<br/>2× FC 64G (storage)<br/>2× 100GbE RoCE (interconnect)"]
ORACLE -->|"Exadata"| ORA4["Engineered system<br/>2-8 DB servers + 3-18 storage<br/>RoCE 100GbE, Smart Scan<br/>15-30 kW/rack"]
HV --> HVS{"Number of hosts"}
HVS -->|"2-3"| HV1["1× EPYC 12-24C, 128-256 GB<br/>RAID5/6 SSD, 2-4× 10/25GbE"]
HVS -->|"3-6"| HV2{"HCI"}
HV2 -->|"vSAN"| HV2V["1-2× EPYC 16-32C, 256-512 GB<br/>NVMe cache + SSD, 4× 25GbE"]
HV2 -->|"Ceph"| HV2C["1× EPYC 12-24C, 128-256 GB<br/>4-8× HBA NVMe/SSD, 4× 25GbE"]
HVS -->|"6+"| HV3["2× EPYC 32-64C, 512-2048 GB<br/>FC SAN 32/64G, 4-8× 25/100GbE"]
HVS -->|"20+"| HV4["2× EPYC 64-128C, 512-1024 GB<br/>OpenStack + Ceph, 4-8× 100GbE"]
K8s --> K8T{"Node type"}
K8T -->|"General"| K8G["16-32C, 64-128 GB<br/>2× NVMe, 2× 25GbE"]
K8T -->|"Memory"| K8M["32-64C, 256-512 GB<br/>3× NVMe, 2× 25GbE"]
K8T -->|"GPU"| K8U["32-64C, 512-1024 GB<br/>6-10× NVMe, H100/B200, 4× 100GbE"]
K8T -->|"Storage"| K8S["16-32C, 64-128 GB<br/>6-14× HBA NVMe, 4× 25GbE"]
AI --> AIT{"Purpose"}
AIT -->|"Training"| AITR["GPU H100/B200, NVLink<br/>InfiniBand 400Gb/s, liquid cooling"]
AIT -->|"Inference"| AIIR["A100/H200, MIG<br/>PCIe 5.0, 2× 100GbE"]
ST --> STT{"Type"}
STT -->|"Ceph OSD"| STC["EPYC (PCIe lanes)<br/>4-8 GB/OSD, HBA, 2× 25/100GbE"]
STT -->|"MinIO"| STM["EPYC 8-16C, 32-64 GB<br/>4-16× NVMe direct, 2× 25/100GbE"]
STT -->|"NAS (ZFS)"| STN["EPYC 16-32C, 64-128 GB<br/>RAID-Z, SLOG NVMe, 2-4× 10/25GbE"]
WEB --> WEBE["EPYC high clock, 8-32C<br/>32-128 GB, 2× NVMe RAID1, 2× 10/25GbE"]
```
### Connectivity summary by platform
| Platform | App / VM network | Storage network | Replication / Cluster | Management |
|-----------|-------------|-------------|---------------------|------------|
| **DB local (small)** | 2× 25 GbE LACP | — | 2× 25 GbE (shared) | 1× 1 GbE (iLO) |
| **DB local (medium)** | 2× 25/100 GbE LACP | — | 2× 25 GbE dedicated | 1× 1 GbE (iLO) |
| **DB FC SAN** | 2× 25/100 GbE LACP | 2× 32/64 Gb FC multipath | FC replication | 1× 1 GbE (iLO) + SAN mgmt |
| **DB Ceph** | 2× 25/100 GbE | 2× 25/100 GbE (Ceph public) | 2× 25/100 GbE (Ceph cluster) | 1× 1 GbE (iLO) |
| **Hypervisor local** | 2-4× 10/25 GbE LACP | — (local) | — | 1× 1 GbE (iLO) |
| **Hypervisor vSAN** | 2× 25/100 GbE LACP | 2× 25/100 GbE (vSAN) | vSAN traffic | 1× 1 GbE (iLO) |
| **Hypervisor FC SAN** | 2-4× 25/100 GbE LACP | 2× 32/64 Gb FC multipath | 2× 25 GbE (vMotion) | 1× 1 GbE (iLO) |
| **Hypervisor Ceph** | 2× 25/100 GbE LACP | 2× 25/100 GbE (Ceph) | 2× 25 GbE (migration) | 1× 1 GbE (iLO) |
| **Kubernetes** | 2× 25/100 GbE | 2× 25/100 GbE (Ceph/Longhorn) | 2× 25/100 GbE (K8s cluster) | 1× 1 GbE (BMC) |
| **Web/API** | 2× 10/25 GbE LACP | — | — | 1× 1 GbE (BMC) |
| **Oracle Standalone** | 2× 25 GbE LACP | 2× FC 32G or NVMe local | Data Guard 2× 25 GbE | 1× 1 GbE (iLO) + ASM mgmt |
| **Oracle Data Guard** | 2× 25/100 GbE LACP | 2× FC 64G multipath | 2× 25 GbE (DG sync) | 1× 1 GbE (iLO) + SAN mgmt |
| **Oracle RAC** | 2× 100 GbE LACP (VIP/SCAN) | 2× FC 64G multipath | 2× 100 GbE RoCE (Cache Fusion) | 1× 1 GbE (iLO) + Clusterware |
| **Oracle Exadata** | 4-8× 100 GbE RoCE | NVMe over Fabric | RDMA interconnect | Exadata CLI + OEDA |
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-03*

757
SERVER-CONFIG.md Normal file

@@ -0,0 +1,757 @@
# ⚙️ Server configuration: best practices by workload
## General BIOS/UEFI settings
| Setting | Recommendation | Rationale |
|-----------|-----------|------------|
| **Boot mode** | UEFI | Secure Boot, GPT, larger disks |
| **Power profile** | Performance / OS Control | Max performance, C-States disabled |
| **Hyper-Threading** | Enabled | +30-50 % throughput for multi-threaded workloads |
| **Virtualization** | Enabled (VT-x/AMD-V) | Required for hypervisors and containers |
| **SR-IOV** | Enabled | GPU, NIC passthrough |
| **NUMA** | Enabled | NUMA-aware scheduling |
| **ACPI** | Enabled | Power management, OS-level |
| **Secure Boot** | Enabled | Secure boot chain |
| **TPM** | Enabled | Measured boot, key storage |
---
## 1. Database servers
### CPU selection
| DB type | CPU preference | Rationale |
|--------|---------------|------------|
| **OLTP** (PostgreSQL, MySQL) | High clock, moderate cores | Low per-transaction latency, limited parallelism |
| **OLAP** (ClickHouse, Snowflake) | Many cores, AVX-512 | Columnstore, high parallelism |
| **In-memory** (Redis, Memcached) | High clock, low cache latency | Single-threaded (Redis), RAM bandwidth |
| **Document** (MongoDB) | Balance (clock × cores) | Mixed workload |
| **Distributed** (Cassandra, Scylla) | Many cores, high cache | Shard-per-core (Scylla), compaction |
| **Oracle OLTP** | High clock, moderate cores, core-factor aware | CPU license cost (core factor 0.5 for both AMD EPYC and Intel Xeon) |
| **Oracle OLAP / DW** | Many cores, large SGA, in-memory option | Parallel query, Exadata Smart Scan, compression |
### Oracle CPU licensing: core factor
Oracle licenses per core, with a correction factor that depends on the processor. A factor of 0.5 means that 2 cores = 1 Oracle license.
| Processor | Core factor | 64 physical cores → Oracle licenses |
|----------|-------------|--------------------------------------|
| AMD EPYC (all series) | 0.5 | 32 |
| Intel Xeon (Scalable) | 0.5 | 32 |
| IBM POWER | 1.0 | 64 |
| ARM (Ampere Altra) | 0.25 | 16 |
**Impact on CPU selection**: The factor is the same for EPYC and Xeon, so license cost scales with the physical core count alone. Prefer the CPU with the highest per-core performance (for example a high-clock SKU), so that every licensed core delivers as much compute as possible.
### Configuration by company size and storage type
#### Variant A: Small company, local NVMe RAID
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1× EPYC 9124/9224 or Intel Xeon 4410Y (8-16C) | 1 socket, high clock |
| **RAM** | 64-256 GB (8-16 GB/core) | DDR5-4800, 1DPC |
| **OS disk** | 2× SATA/SAS SSD, RAID 1 (240-480 GB) | OS + binaries |
| **Data disk** | 4-6× NVMe (U.2/E3.S), RAID 10 | Local data, no sharing |
| **WAL disk** | 2× NVMe RAID 1 (400-800 GB) | PostgreSQL only |
| **Network** | 2× 25 GbE (LACP) | Application traffic + management |
| **Form factor** | 1U or 2U | Single node, no cluster |
| **Storage backend** | Local RAID controller (PERC/Broadcom) | HW RAID 10 or SW RAID (mdadm) |
| **HA** | Application-managed failover (Patroni, repmgr, Orchestrator) | Standby node on failure |
**Use case**: Startup, branch office, dev/test, < 500 users, a single database server, low availability requirements.
#### Variant B: Mid-size company, local NVMe + asynchronous replication
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2× EPYC 9334/9374F or Intel Xeon 5418Y (16-24C) | 1-2 sockets, balanced |
| **RAM** | 128-512 GB (8-16 GB/core) | DDR5-4800/5600, 1DPC |
| **OS disk** | 2× NVMe RAID 1 (2× 480 GB) | OS + binaries |
| **Data disk** | 6-8× NVMe, RAID 10 | Local NVMe, 3-6 TB usable |
| **WAL disk** | 2× NVMe RAID 1 (2× 800 GB) | Separate from data |
| **Network** | 2× 25 GbE (app) + 2× 25 GbE (replication) | Application and replication networks separated |
| **Form factor** | 2U | Primary + replica node |
| **Storage backend** | SW RAID (mdadm) or HW RAID (PERC H965) | Write-back cache with BBU |
| **HA** | Patroni / repmgr / MySQL InnoDB Cluster | Asynchronous replication to 1-2 standbys |
**Use case**: E-commerce, mid-size SaaS, 500-5000 users, RPO < 1 min, RTO < 5 min.
#### Variant C: Large company, FC SAN (enterprise)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 or Xeon 8592+/6980P (48-128C) | 2 sockets, max cores, large cache |
| **RAM** | 512 GB - 2 TB (8-16 GB/core) | DDR5, 2DPC (speed penalty), 12 channels (EPYC) |
| **OS disk** | 2× SATA SSD RAID 1 (2× 480 GB) | OS only, data on SAN |
| **Data + WAL** | LUNs from FC SAN | Hitachi VSP / Dell PowerMax / Pure //X |
| **HBA** | 2× dual-port FC HBA (32/64 Gb) | Multipath (active-active), FC-NVMe |
| **Network** | 2× 25/100 GbE (app) + 2× 32/64 Gb FC (storage) | App and storage networks separated |
| **Form factor** | 2U | 2-8 node cluster (RAC, AlwaysOn AG) |
| **Storage backend** | FC SAN, one LUN per database | Thin provisioning, RAID on the SAN, snapshots |
| **HA** | Oracle RAC / SQL Server AOAG / PostgreSQL Patroni | Synchronous replication, FC multipath |
**SAN advantages**: Central management, snapshots, cloning, disaster recovery (SRDF/Metro), separate storage network, higher availability.
**Disadvantages**: Higher latency than local NVMe (~50-200 µs over SAN vs ~10 µs local NVMe), higher CAPEX, vendor lock-in.
#### Variant D: Large company, Ceph / SDS backend
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9334/9654 (16-32C) | Fewer cores than the SAN variant; part of the CPU goes to the Ceph client |
| **RAM** | 256-512 GB | Less RAM; the Ceph client cache is less effective than a local buffer |
| **OS disk** | 2× SATA SSD RAID 1 (2× 480 GB) | OS |
| **Network** | 2× 25/100 GbE (app) + 2× 25/100 GbE (Ceph public) | App and Ceph traffic both over Ethernet |
| **HBA** | Storage HBA in IT/HBA mode (no RAID) | For the Ceph OSD nodes, not the DB node |
| **Form factor** | 2U | DB node + separate Ceph OSD nodes |
| **Storage backend** | RBD (RADOS Block Device) on Ceph | 3× replication or erasure coding |
| **HA** | Application + Ceph built-in HA | Ceph self-healing, auto-rebalance |
**Ceph advantages**: No vendor lock-in, horizontal scaling, a single platform for block/file/object, lower CAPEX.
**Disadvantages**: Higher latency and CPU overhead (Ceph client → network → OSD), variable performance, more complex troubleshooting.
#### Variant E: Cloud (RDS / CloudSQL / Azure SQL)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **Compute** | AWS RDS (db.r7g/r8g), Azure SQL (GP/BC/Hyperscale) | Managed service, no OS access |
| **Storage** | EBS gp3 / io2, Azure Premium SSD v2, Cloud SQL SSD | Automatic scaling, PITR, multi-AZ |
| **Network** | Security Group, Private Link, VPC peering | No HBA, no SAN; everything over Ethernet |
| **HA** | Multi-AZ (synchronous), read replicas | Managed failover, RTO < 60 s |
| **Backup** | Automated, PITR (7-35 days) | No management needed |
**Use case**: No on-prem hardware, elastic scaling, pay-per-use, lower operational overhead.
**Disadvantages**: Higher long-term cost, data residency, network latency, limited customization.
### Variant comparison
| Aspect | Local NVMe (small) | Local NVMe (medium) | FC SAN | Ceph | Cloud |
|--------|---------------------|----------------------|--------|------|-------|
| **Latency** | ~10 µs | ~10 µs | ~50-200 µs | ~100-500 µs | ~100-1000 µs |
| **Scaling** | Vertical | Vertical | Horizontal | Horizontal | Elastic |
| **CAPEX** | Low | Medium | High | Medium | None (OPEX) |
| **Operational overhead** | Low | Low | High (SAN admin) | Medium | None |
| **HA** | Application | Patroni/Cluster | RAC/AOAG | Ceph HA | Managed |
| **RPO** | 1-5 min | < 1 min | < 10 s | < 30 s | < 60 s |
| **RTO** | 5-15 min | < 5 min | < 2 min | < 5 min | < 60 s |
| **Number of servers** | 1-2 | 2-4 | 4-16 | 6-20+ | 0 (managed) |
| **Company** | Startup/SME | SME/Enterprise | Enterprise | Enterprise | Any |
### PostgreSQL parameter matrix by storage type
| Parameter | Local NVMe | FC SAN | Ceph RBD |
|----------|-----------|--------|----------|
| `random_page_cost` | 1.1 | 1.5-2.0 | 2.0-3.0 |
| `effective_io_concurrency` | 300 | 100-200 | 50-100 |
| `synchronous_commit` | off (NVMe cache) | on (SAN cache) | off (Ceph cache) |
| `full_page_writes` | on | on | on (also on Ceph) |
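The matrix above can be turned into a small generator for `postgresql.conf` fragments. This is only a sketch: the profile names and the `render_pg_conf` helper are hypothetical, and where the table gives a range, one end of it is picked as an assumed starting point. `synchronous_commit` is left out on purpose, because it is a durability decision rather than a pure storage tuning knob.

```python
# Starting values taken from the storage-type matrix; picking one end of each
# range is an assumption and should be validated with a benchmark.
PG_STORAGE_PROFILES = {
    "local_nvme": {"random_page_cost": 1.1, "effective_io_concurrency": 300},
    "fc_san": {"random_page_cost": 1.5, "effective_io_concurrency": 200},
    "ceph_rbd": {"random_page_cost": 2.0, "effective_io_concurrency": 100},
}


def render_pg_conf(storage: str) -> str:
    """Return postgresql.conf lines for one storage backend (hypothetical helper)."""
    profile = PG_STORAGE_PROFILES[storage]
    lines = [f"{key} = {value}" for key, value in profile.items()]
    # full_page_writes stays on for every backend, as in the matrix.
    lines.append("full_page_writes = on")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_pg_conf("ceph_rbd"))
```

Keeping the values in one dictionary makes it easy to review them next to the table when either changes.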
### Storage layout by backend type
**Local NVMe (small/medium):**
```
Mount point FS RAID Disk Purpose
/ ext4 1 (mirror) 2× SATA SSD OS
/data xfs 10 4-8× NVMe Data
/wal xfs 1 (mirror) 2× NVMe WAL (PG)
```
**FC SAN (enterprise):**
```
Mount point / device FS Backing storage Purpose
/ ext4 local RAID 1 (2× SSD) OS
/dev/sdb xfs FC LUN 1 (500 GB) WAL (PG)
/dev/sdc xfs FC LUN 2 (2 TB) Data
/dev/sdd xfs FC LUN 3 (2 TB) Indexes (separate)
```
**Ceph RBD:**
```
Mount point FS Ceph device Purpose
/ ext4 local RAID 1 (2× SSD) OS
/dev/rbd0 xfs rbd datastore-01 Data + WAL (Ceph RBD)
```
### Kernel tuning by variant
**Local NVMe:**
```
vm.dirty_ratio = 30
vm.dirty_background_ratio = 5
```
**FC SAN:**
```
# SAN storage: higher latency, so start writeback earlier and in smaller batches
vm.dirty_ratio = 20
vm.dirty_background_ratio = 3
# Dirty pages become eligible for writeback after 30 s (the SAN cache absorbs bursts)
vm.dirty_expire_centisecs = 3000
```
**Ceph RBD:**
```
# Ceph RBD: network storage, tune for the RBD cache
vm.dirty_ratio = 15
vm.dirty_background_ratio = 2
# RBD cache settings
# rbd cache = true (client-side)
# rbd cache size = 256-512 MB
```
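To get a feel for what these ratios mean on a concrete host, the sketch below converts them into absolute amounts of dirty page cache. It assumes the percentages apply to total RAM; the kernel actually applies them to available memory, so the results are upper bounds. The function name is hypothetical.

```python
def dirty_thresholds_gib(ram_gib: float, dirty_ratio: int,
                         dirty_background_ratio: int) -> tuple[float, float]:
    """Return (background_writeback_start, writer_blocking_limit) in GiB.

    Simplification: uses total RAM, so the values are upper bounds.
    """
    background = ram_gib * dirty_background_ratio / 100
    blocking = ram_gib * dirty_ratio / 100
    return background, blocking


if __name__ == "__main__":
    # 256 GiB host with the three profiles listed above
    for name, ratio, bg in [("local NVMe", 30, 5), ("FC SAN", 20, 3), ("Ceph RBD", 15, 2)]:
        start, limit = dirty_thresholds_gib(256, ratio, bg)
        print(f"{name}: writeback starts at {start:.1f} GiB, writers block at {limit:.1f} GiB")
```

The lower the ratios, the less unwritten data is exposed to a slow or remote backend at any moment.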
### Database-specific tuning
| Parameter | PostgreSQL | MySQL | Oracle | MongoDB |
|----------|-----------|-------|--------|---------|
| **Cache** | `shared_buffers` 25 % RAM | `innodb_buffer_pool_size` 70-80 % RAM | `SGA_TARGET` 60-80 % RAM | `WiredTiger cache` 50-80 % RAM |
| **OS cache** | `effective_cache_size` 75 % RAM | OS cache + InnoDB | OS cache (double buffering risk with a large SGA) | OS cache |
| **Write buffer** | `wal_buffers` 64-256 MB | `innodb_log_file_size` 1-4 GB | Redo log (2-4 groups, 200 MB-4 GB) | WiredTiger log |
| **Connections** | `max_connections` 50-500 | `max_connections` 100-500 | `processes` 200-2000 | maxIncomingConnections |
| **I/O** | `effective_io_concurrency` 200 | `innodb_io_capacity` 2000 | `db_file_multiblock_read_count` 128 | WiredTiger eviction |
| **Huge pages** | `huge_pages = try` | `large-pages = ON` | `use_large_pages = only` (mandatory) | transparent_hugepages=never |
| **Parallel query** | `max_parallel_workers` 4-8 | `innodb_parallel_read_threads` 4 | `parallel_degree_policy = auto`, up to 64 | — |
### Connectivity per variant
| Variant | App network | Storage network | Replication | Management |
|----------|---------|-------------|-----------|------------|
| **Local (small)** | 2× 25 GbE LACP | — | 2× 25 GbE (shared) | iDRAC/iLO |
| **Local (medium)** | 2× 25 GbE LACP | — | 2× 25 GbE dedicated | iDRAC/iLO |
| **FC SAN** | 2× 25/100 GbE | 2× 32/64 Gb FC (multipath) | FC replication | iDRAC/iLO + SAN mgmt |
| **Ceph** | 2× 25/100 GbE | 2× 25/100 GbE (public net) | 2× 25/100 GbE (cluster net) | iDRAC/iLO + Ceph mgmt |
| **Cloud** | Elastic IP / Private Link | — | — | AWS Console / API |
| **Oracle Standalone** | 2× 25 GbE LACP | ASM (2× 25 GbE or FC 32G) | Data Guard 2× 25 GbE | iLO + ASM mgmt |
| **Oracle RAC** | 2-4× 25/100 GbE | 2× 64 Gb FC (multipath) | Cache Fusion interconnect | iLO + SAN mgmt |
| **Oracle Exadata** | 4-8× 100 GbE RoCE | NVMe over Fabric | RDMA interconnect | Exadata CLI + OEDA |
### Oracle-specific configuration
#### Oracle ASM: diskgroup layout
Oracle ASM (Automatic Storage Management) replaces the traditional filesystem + volume manager:
| Diskgroup | Redundancy | Disks | Purpose |
|-----------|-----------|-------|-------|
| **DATA** | Normal (2× mirror) | 4-12× FC LUN/NVMe | Data files, temp files, control files |
| **FRA** (Fast Recovery Area) | Normal (2× mirror) | 2-6× FC LUN/NVMe | Archive logs, backup, flashback logs |
| **REDO** | High (3× mirror) | 2-4× FC LUN/NVMe | Online redo log groups (I/O critical) |
| **SPFILE** | Normal | 2× small LUN | Server parameter file |
**ASM striping**: Coarse (1 MB) for regular data, fine (128 KB) for redo logs (lower write latency).
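The redundancy column translates directly into usable capacity. A minimal sketch, assuming equally sized disks and ignoring the free space ASM needs to re-mirror data after a disk failure; the function and dictionary names are hypothetical.

```python
# Number of copies ASM keeps of each extent per redundancy level.
ASM_MIRROR_COPIES = {"external": 1, "normal": 2, "high": 3}


def asm_usable_tb(disk_count: int, disk_size_tb: float, redundancy: str) -> float:
    """Usable capacity of a diskgroup, before any rebalance reserve."""
    raw_tb = disk_count * disk_size_tb
    return raw_tb / ASM_MIRROR_COPIES[redundancy]


if __name__ == "__main__":
    print(asm_usable_tb(8, 2.0, "normal"))  # DATA on 8× 2 TB LUNs
    print(asm_usable_tb(4, 0.5, "high"))    # REDO on 4× 0.5 TB LUNs
```

In practice a further share of the result has to stay free, so treat the numbers as an upper bound when sizing LUNs.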
#### Variant O1: Standalone Oracle (small/medium, single instance)
| Parameter | Small (< 500 users) | Medium (500-2000 users) |
|----------|---------------------|------------------------|
| **CPU** | 1-2× EPYC 9124-9224 / Xeon 4410Y (8-16C) | 2× EPYC 9334-9374F / Xeon 5418Y (16-24C) |
| **RAM (SGA + PGA)** | 64-128 GB (SGA 70 %, PGA 30 %) | 128-512 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (vm.nr_hugepages), mandatory for the SGA | Yes |
| **OS disk** | 2× SATA SSD RAID 1 (240 GB) | 2× NVMe RAID 1 (480 GB) |
| **DATA + FRA** | 4-6× NVMe, ASM normal redundancy | 6-8× NVMe or FC LUN, ASM normal |
| **REDO** | 2-4× NVMe (separate from DATA), ASM high | 4× FC LUN (separate), ASM high |
| **Archive log** | Local FRA | FC LUN (FRA diskgroup) |
| **Network (app)** | 2× 25 GbE LACP | 2-4× 25/100 GbE LACP |
| **Network (storage)** | — (local NVMe) | 2× FC 32G multipath |
| **Network (Data Guard)** | — | 2× 25 GbE dedicated |
| **DB edition** | Oracle SE2 (max 16 threads) | Oracle EE (unlimited) |
**Use case**: Dev/test, small production DBs, branch offices. An SE2 license caps the instance at 16 CPU threads and offers no parallel execution (an EE feature).
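The `vm.nr_hugepages` value follows from the SGA size. The sketch below assumes the default 2 MiB huge page size on x86-64 Linux and adds a small headroom of a few pages; both the headroom value and the function name are assumptions for illustration.

```python
import math

HUGEPAGE_SIZE_MIB = 2  # default huge page size on x86-64 Linux


def nr_hugepages_for_sga(sga_gib: float, headroom_pages: int = 16) -> int:
    """Number of 2 MiB huge pages needed to back an SGA of the given size."""
    pages = math.ceil(sga_gib * 1024 / HUGEPAGE_SIZE_MIB)
    # Headroom is an assumed safety margin, not an Oracle requirement.
    return pages + headroom_pages


if __name__ == "__main__":
    # Small profile: 128 GB RAM with 50 % of it used as SGA
    print(f"vm.nr_hugepages = {nr_hugepages_for_sga(64)}")
```

With `use_large_pages = only` the instance will not start if fewer pages are available, so it is safer to size slightly high than slightly low.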
#### Variant O2: Oracle Data Guard (medium/large, HA + DR)
Primary + standby in active-passive mode, with optional Active Data Guard for reporting.
| Parameter | Recommendation |
|----------|-----------|
| **CPU** | 2× EPYC 9654-9965 / Xeon 8592+ (32-64C) |
| **RAM** | 256-1024 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (50-80 % of RAM allocated to the SGA) |
| **OS disk** | 2× NVMe RAID 1 (480 GB) |
| **Storage** | FC SAN LUNs (DATA, FRA and REDO separated) or NVMe + ASM |
| **HBA** | 2× dual-port FC 32/64 Gb (multipath active-active) |
| **App network** | 2-4× 25/100 GbE LACP |
| **Storage network** | 2× FC 32/64 Gb multipath |
| **Data Guard network** | 2× 25/100 GbE dedicated (sync or async) |
| **Data Guard mode** | Maximum Availability (sync, falls back to async), RPO = 0 |
| **Topology** | 1 primary + 1-2 standbys (physical), far sync for geo-DR |
| **Active Data Guard** | Standby open read-only (reporting, backup); requires an ADG license |
**Data Guard latency**:
```text
Synchronous (Maximum Availability):
Primary COMMIT → LGWR flushes REDO → sync over the network → standby LGWR → ACK → ~1-5 ms
RPO = 0, adds latency to writes
Asynchronous (Maximum Performance):
Primary COMMIT → LGWR flushes REDO → async into the standby buffer → ~0.1-1 ms
RPO = a few seconds, negligible impact on writes
```
**Network requirements for Data Guard sync**:
- RTT < 2 ms for synchronous mode (< 1 ms recommended)
- Min. 10 GbE, 25 GbE recommended (throughput = REDO rate × 2)
- REDO rate: OLTP ~50-500 MB/s, batch ~500-2000 MB/s
- At a REDO rate of 500 MB/s on 25 GbE (~3125 MB/s) → ~16 % link utilization, ~32 % with the 2× headroom
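The utilization figure in the last bullet can be reproduced with a few lines. This is a rough sketch that uses decimal units and ignores protocol overhead; the function name and the `headroom` parameter are hypothetical.

```python
def link_utilization(redo_mb_per_s: float, link_gbit: float,
                     headroom: float = 1.0) -> float:
    """Fraction of the link consumed by redo transport (0.0-1.0+)."""
    link_mb_per_s = link_gbit * 1000 / 8  # Gbit/s to MB/s, decimal units
    return redo_mb_per_s * headroom / link_mb_per_s


if __name__ == "__main__":
    raw = link_utilization(500, 25)
    padded = link_utilization(500, 25, headroom=2.0)
    print(f"raw: {raw:.0%}, with 2x rule: {padded:.0%}")
```

For 500 MB/s on 25 GbE the raw ratio is 0.16, and 0.32 once the "REDO rate × 2" rule is applied. The same redo rate on 10 GbE would already use 40 % of the link before any headroom.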
#### Variant O3: Oracle RAC (large, enterprise)
Multi-instance cluster with shared storage and Cache Fusion.
| Parameter | Recommendation |
|----------|-----------|
| **Number of nodes** | 2-4 (typical), max 64 (RAC cluster) |
| **CPU per node** | 2× EPYC 9654-9965 / Xeon 8592+ (32-64C) |
| **RAM per node** | 512-2048 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (1 GB pages if RAM > 512 GB) |
| **Storage** | FC SAN, shared LUNs (ASM normal/high redundancy) |
| **HBA** | 2× dual-port FC 64 Gb (multipath, active-active) |
| **App network** | 2-4× 25/100 GbE LACP (VIP, SCAN listener) |
| **Storage network** | 2-4× FC 64 Gb (multipath per node) |
| **Cache Fusion interconnect** | 2× 100 GbE (RoCE v2 or InfiniBand), dedicated |
| **RAC interconnect latency** | < 5 µs (recommended), max < 10 µs |
| **ASM** | Normal redundancy (2-way mirror) |
| **Oracle Clusterware** | Voting disk (3× 1 GB LUN), OCR (3× 500 MB LUN) |
| **Service** | OLTP_service, REPORT_service, BATCH_service |
**Cache Fusion: the critical interconnect**:
```
Node A (DB instance) ←──→ Node B (DB instance)
│ │
└──────── ASM ───────────┘
FC SAN (shared storage)
Cache Fusion traffic: dirty block transfer between instances
→ Latency < 5 µs, otherwise RAC scaling degrades
→ Capacity: 2× 100 GbE, dedicated switch or InfiniBand HDR100
→ Recommended MTU: 9000 (jumbo frames)
```
**RAC sizing by transaction rate**:
| TPS | Nodes | CPU per node | RAM per node | Interconnect |
|-----|------|-------------|-------------|-------------|
| < 10 000 | 2 | 16-24C | 256 GB | 2× 25 GbE |
| 10 000 - 50 000 | 2-4 | 32-48C | 512 GB | 2× 100 GbE RoCE |
| 50 000 - 200 000 | 4-8 | 48-64C | 1024 GB | 2× 100 GbE RoCE / InfiniBand |
| > 200 000 | 8+ | 64-128C | 2048 GB | InfiniBand HDR100/HDR200 |
**RAC sizing: license cost calculation**:
```text
Example: 4-node RAC, each node 2× EPYC 9654 (96C) = 192 cores per node
Core factor 0.5 → 96 Oracle licenses per node
4 × 96 = 384 Oracle EE licenses
At ~$47.5k/license → ~$18.2M (licenses only, excluding support at 22 % per year)
```
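The same arithmetic as a reusable sketch. The list price and the 22 % support rate are the figures quoted in the example above; the function name is hypothetical, and rounding fractional licenses up is an assumption. Separately licensed options (such as RAC itself) are not included.

```python
import math


def oracle_license_cost(nodes: int, sockets_per_node: int, cores_per_socket: int,
                        core_factor: float = 0.5,
                        price_per_license: float = 47_500,
                        support_rate: float = 0.22) -> tuple[int, float, float]:
    """Return (licenses, one-off license cost, yearly support cost)."""
    cores = nodes * sockets_per_node * cores_per_socket
    # Assumption: fractional license counts are rounded up.
    licenses = math.ceil(cores * core_factor)
    license_cost = licenses * price_per_license
    return licenses, license_cost, license_cost * support_rate


if __name__ == "__main__":
    licenses, cost, support = oracle_license_cost(4, 2, 96)
    print(f"{licenses} licenses, ${cost / 1e6:.2f}M + ${support / 1e6:.2f}M/year support")
```

Re-running it with 32-core sockets shows how strongly the core count drives the total: the same 4-node cluster drops to 128 licenses.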
#### Variant O4: Oracle Exadata (hyperscale)
Engineered system, best suited to hybrid workloads (OLTP + DW).
| Parameter | X9M / X10M | Use case |
|----------|-----------|----------|
| **Database servers** | 2-8× (Xeon on X9M, EPYC on X10M, 1.5-6 TB RAM, NVMe) | Compute |
| **Storage servers** | 3-18× (NVMe + HDD, Smart Scan) | Predicate offloading |
| **Smart Scan** | Filtering at the storage layer | Less data over the network, higher throughput |
| **RoCE interconnect** | 100 GbE (RDMA) | Low latency, high bandwidth |
| **In-Memory Column Store** | Optional license | Real-time analytics without ETL |
| **HCC (Hybrid Columnar Compression)** | Compression in the storage servers | Up to 10-15× compression for DW |
| **Rack power** | ~15-30 kW (full rack) | Higher density |
**When to choose Exadata over standalone RAC**:
- OLTP > 50 000 TPS
- Need for consolidation (multiple DBs on one cluster)
- Smart Scan significantly speeds up reporting on production data
- HCC to save storage for DW workloads
---
## 2. Hypervisor host (ESXi / KVM / Hyper-V)
### Configuration by size and storage type
#### Variant A: Small company, local storage (2-3 hosts)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1× EPYC 9224/9254 or Xeon 4410Y/5418Y (12-24C) | 1 socket, enough cores for VM density |
| **RAM** | 128-256 GB (4-8 GB/core) | DDR5, 1DPC |
| **OS disk** | 2× SATA SSD RAID 1 (2× 240-480 GB) | ESXi / Proxmox / Hyper-V boot |
| **VM storage** | 4-6× SATA/SAS SSD, RAID 5/6 or 10 | Local RAID, 4-12 TB usable |
| **Network** | 2-4× 10/25 GbE (LACP) | Shared for everything (management + VM + storage) |
| **Hypervisor** | VMware vSphere Standard / Proxmox VE / Hyper-V | Basic license, no enterprise features |
| **Storage backend** | Local RAID controller (PERC H755, Broadcom 9560) | HW RAID with write-back cache |
| **HA** | VMware HA / Proxmox HA | VM restart on another host after a failure |
| **Backup** | Veeam B&R Free / PBS (Proxmox Backup Server) | Local or USB disk |
**Use case**: Small office, branch office, dev/test, < 10 VMs, low budget, simple administration.
**Limitations**: No vMotion without shared storage; a host failure causes an outage (HA restart, not seamless).
#### Variant B: Mid-size company, vSAN / Ceph (3-6 hosts)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2× EPYC 9334/9654 or Xeon 5418Y/8592+ (16-32C) | 1-2 sockets |
| **RAM** | 256-512 GB (4-8 GB/core) | DDR5, 2DPC (minimal penalty) |
| **OS disk** | 2× SATA SSD RAID 1 or 2× M.2 on a BOSS card (BOSS-S1 SATA / BOSS-N1 NVMe) | Separate from VM storage |
| **Cache tier** | 1-2× NVMe (vSAN caching / Ceph WAL+DB) | For write performance |
| **Capacity tier** | 4-8× SATA/SAS SSD or HDD (vSAN capacity / Ceph OSD) | HDD for capacity, SSD for performance |
| **Network** | 4× 25/100 GbE: 2× VM + mgmt, 2× storage (vSAN/Ceph) | Separate storage network, RDMA (RoCE v2) |
| **Hypervisor** | VMware vSAN / Proxmox Ceph / StarWind HCI | HCI license (vSAN ~$2.5k/Core) |
| **Storage backend** | vSAN OSA/ESA or Ceph (RADOS) | Distributed storage, auto-rebalance |
| **HA** | vSphere HA + vSAN / Proxmox HA + Ceph | vMotion, DRS, automated failover |
| **Failover** | N+1 (one host as spare) | vSAN: min. 3 hosts, 4 recommended so data can be rebuilt after a host failure |
**Pure Ceph variant (Proxmox / OpenStack)**:
```
Proxmox node (3-6×):
├── CPU: 1× EPYC 9224-9334 (12-24C)
├── RAM: 128-256 GB
├── OS: 2× SATA SSD RAID 1
├── Ceph OSD: 4-8× NVMe/SATA SSD (RAW, HBA mode)
├── Network: 2× 25 GbE (public) + 2× 25 GbE (cluster)
└── Storage: Ceph 3× replication, CRUSH host failure domain
```
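Usable capacity of such a cluster depends on the protection scheme. The sketch below compares 3× replication with an erasure-coded profile such as k=8, m=3; the 0.85 fill limit is an assumed planning margin, and the function name is hypothetical.

```python
from typing import Optional, Tuple


def ceph_usable_tb(nodes: int, osds_per_node: int, osd_size_tb: float,
                   replicas: int = 3,
                   ec: Optional[Tuple[int, int]] = None,
                   fill_limit: float = 0.85) -> float:
    """Usable capacity in TB for a replicated or erasure-coded pool."""
    raw_tb = nodes * osds_per_node * osd_size_tb
    if ec is not None:
        k, m = ec  # k data chunks + m coding chunks
        protected_tb = raw_tb * k / (k + m)
    else:
        protected_tb = raw_tb / replicas
    # Assumption: plan to stay below fill_limit so rebalancing has room.
    return protected_tb * fill_limit


if __name__ == "__main__":
    # 4 nodes × 6 OSDs × 3.84 TB
    print(f"3x replication: {ceph_usable_tb(4, 6, 3.84):.1f} TB")
    print(f"EC 8+3:         {ceph_usable_tb(4, 6, 3.84, ec=(8, 3)):.1f} TB")
```

Note that an 8+3 profile with a host failure domain needs at least 11 hosts, so for a 3-6 node cluster replication is the realistic choice and the EC line is only a comparison.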
**VMware vSAN variant (4-6 hosts)**:
```
vSAN node (4-6×):
├── CPU: 1-2× EPYC/Xeon (16-32C)
├── RAM: 256-512 GB
├── OS: 2× M.2 on BOSS (BOSS-S1 SATA / BOSS-N1 NVMe) or SD card (deprecated)
├── vSAN cache: 1-2× NVMe (write buffer, OSA only)
├── vSAN capacity: 4-8× NVMe (vSAN ESA) or SATA SSD / HDD (vSAN OSA)
├── Network: 2× 25/100 GbE (VM) + 2× 25 GbE (vSAN)
└── Storage: vSAN ESA (all-NVMe) or OSA (hybrid or all-flash)
```
**Use case**: SME, enterprise divisions, 10-100 VMs, need for vMotion, DRS, HA, simple storage management.
#### Variant C: Large company, FC SAN (6+ hosts)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 or Xeon 8592+/6980P (32-64C) | 2 sockets, max VM density |
| **RAM** | 512 GB - 2 TB (4-8 GB/core) | DDR5, 2DPC |
| **OS disk** | 2× SATA SSD RAID 1 or SD card (vSphere) | Boot, image storage |
| **VM storage** | LUNs from FC SAN, VMFS / NFS datastores | Hitachi, Dell, Pure, HPE storage |
| **HBA** | 2× dual-port FC HBA 32/64 Gb | Multipath, FC-NVMe |
| **Network** | 4-8× 25/100 GbE, split by traffic type | Management, VM, vMotion and FT separated |
| **Hypervisor** | VMware vSphere Enterprise+ / Hyper-V DC | Enterprise license, DRS, HA, FT |
| **Storage backend** | FC SAN, VMFS 6 datastores, vVols | Thin provisioning, Storage DRS, array snapshots |
| **HA** | vSphere HA + DRS + vCenter | vMotion, DRS, FT, SRM for DR |
| **Failover** | N+1 or admission control (CPU/RAM reserve) | Capacity reserved for HA failover |
**Use case**: Enterprise, 100+ VMs, mix of DBs and applications, centralized storage management, enterprise SLA.
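The N+1 rule can be expressed as a percentage of cluster capacity to reserve, which is the form percentage-based admission control expects. A minimal sketch, assuming equally sized hosts; the function name is hypothetical.

```python
def failover_reserve_percent(hosts: int, tolerated_failures: int = 1) -> float:
    """Share of cluster CPU/RAM to keep free so failed hosts' VMs can restart."""
    if hosts < 2 or tolerated_failures >= hosts:
        raise ValueError("need more hosts than tolerated failures")
    return 100 * tolerated_failures / hosts


if __name__ == "__main__":
    for n in (3, 6, 12):
        print(f"{n} hosts, N+1: reserve {failover_reserve_percent(n):.1f} %")
```

The reserve shrinks as the cluster grows, which is one reason larger clusters are cheaper per VM than 2-3 host setups.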
#### Variant D: Hyperscale, Ceph / SDS (20+ hosts)
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 (64-128C) | 2 sockets, compute-optimized |
| **RAM** | 512 GB - 1 TB (2-4 GB/core) | Low overcommit ratio for consistency |
| **OS disk** | 2× M.2 NVMe RAID 1 (BOSS) | Boot |
| **Network** | 4-8× 100 GbE (compute + storage) | Separate OVN/OVS for SDN, VXLAN tunneling |
| **Hypervisor** | OpenStack (Nova) / OpenShift (KubeVirt) | Open source, API-driven, multi-tenant |
| **Storage backend** | Ceph (RADOS, RBD, RGW, CephFS) | Unified storage, erasure coding (8+3) |
| **Orchestration** | OpenStack / Kubernetes | Infrastructure-as-Code, autoscaling |
| **HA** | OpenStack HA / Kubernetes HA | Self-healing, auto-rebalance |
**Use case**: Cloud provider, hyperscale, 500+ VMs, multi-tenant, maximum automation.
### Hypervisor variant comparison
| Aspect | Local (small) | vSAN/Ceph (medium) | FC SAN (large) | Ceph hyperscale |
|--------|---------------|---------------------|----------------|-----------------|
| **Storage** | Local RAID | vSAN / Ceph (HCI) | FC SAN (centralized) | Ceph (distributed) |
| **Number of hosts** | 2-3 | 3-6 | 6-50+ | 20+ |
| **VM latency** | ~10 µs (local) | ~100-500 µs | ~200 µs (SAN) | ~500-2000 µs |
| **CAPEX/host** | Low | Medium | High | Medium |
| **CAPEX storage** | Low | None (part of the hosts) | High (SAN array) | None (part of the hosts) |
| **Management** | Simple (per host) | vCenter / Proxmox | vCenter + SAN mgmt | OpenStack / K8s |
| **vMotion** | No (no shared storage) | Yes (vSAN / Ceph RBD) | Yes (FC LUN) | Yes (Ceph RBD) |
| **DRS** | No | Yes (vSphere) | Yes (vSphere) | OpenStack scheduler |
| **Scaling** | Vertical | Horizontal (add a host) | Horizontal (host + SAN) | Horizontal |
### Network design by variant
#### Small (local storage)
| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated port (iLO/iDRAC) |
| VM + Storage | All | 2-4× 10/25 GbE | LACP | Shared, VLAN tagging |
```
┌──────────────────────────────────────────┐
│ Host │
│ ┌──────┐ ┌─────────────────────────────┐│
│ │ iLO │ │ NIC1 NIC2 ││
│ │ 1 GbE │ │ [LACP] 25 GbE ││
│ └──────┘ └──────────┬──────────────────┘│
└──────────────────────┼───────────────────┘
┌─────┴─────┐
│ Switch │
└───────────┘
```
#### Medium (vSAN / Ceph)
| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated iLO/iDRAC |
| VM | VM | 2× 25/100 GbE | LACP | VM traffic, migration |
| Storage | vSAN/Ceph | 2× 25/100 GbE | LACP or RDMA | Separate, jumbo frames (MTU 9000) |
```
┌──────────────────────────────────────────┐
│ Host │
│ ┌──────┐ ┌──────────┐ ┌───────────────┐│
│ │ iLO │ │ NIC1 NIC2│ │ NIC3 NIC4 ││
│ │ 1 GbE │ │ VM traffic│ │ Storage (vSAN)││
│ └──────┘ └──────────┘ └───────────────┘│
└──────────────────────────────────────────┘
```
#### Large (FC SAN)
| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated |
| VM | VM | 2-4× 25/100 GbE | LACP | VM traffic |
| vMotion | vMotion | 2× 25 GbE | Dedicated | Multi-NIC vMotion |
| FT | FT | 2× 10/25 GbE | Dedicated | Low latency |
| Storage | — | 2× 32/64 Gb FC | Multipath | FC SAN |
```
┌──────────────────────────────────────────────┐
│ Host │
│ ┌──────┐ ┌────────────┐ ┌────┐ ┌─────────┐│
│ │ iLO │ │ NIC1-4 │ │HBA1│ │ HBA2 ││
│ │ 1 GbE │ │ VM+vMotion+FT│ │32Gb│ │ 32Gb ││
│ └──────┘ └────────────┘ └─┬──┘ └──┬──────┘│
└────────────────────────────┼───────┼───────┘
│ │
┌───────┴───┐ ┌─┴────────┐
│ Ethernet │ │ FC Switch │
│ Switch │ │ (Brocade/ │
│ │ │ Cisco) │
└───────────┘ └──────────┘
```
### BIOS for hypervisors (all variants)
| Setting | Value | Rationale |
|-----------|---------|------------|
| Hyper-Threading | Enabled | Higher VM density |
| Virtualization Technology | Enabled | VT-x/AMD-V |
| VT-d / IOMMU | Enabled | Passthrough, SR-IOV |
| Power Management | Performance / OS | Minimizes VM exit latency |
| C-States | Disabled | Lower VM exit latency (important for real-time VMs) |
| NUMA | Enabled | NUMA-aware VM placement |
| SR-IOV | Enabled | NIC/GPU virtualization |
| Adjacent Sector Prefetch | Enabled (Intel) | Better sequential reads |
| DCU Streamer / IP Prefetcher | Enabled | HW prefetch for VM workloads |
| Patrol Scrub | Disabled (vSAN/Ceph) | Can cause latency spikes with SDS |
### Hypervisor selection by variant
| Criterion | VMware vSphere | Proxmox VE | Hyper-V | OpenStack |
|-----------|---------------|------------|---------|-----------|
| **Size** | SME - Enterprise | SME | SME - Enterprise | Hyperscale |
| **Storage** | vSAN, SAN, NFS | Ceph, ZFS, NFS | Storage Spaces, SAN | Ceph, Manila |
| **License** | ~$1-5k/core | Free (support ~$500/host) | Included in Windows Server | Open source |
| **Familiarity** | Highest | Medium | Windows admins | Low |
| **Automation** | Terraform, Ansible, PowerCLI | Ansible, Terraform, PBS | PowerShell, SCVMM | Terraform, Heat, Ansible |
| **Ecosystem** | Broadest (Veeam, Zerto, SRM) | Growing (PBS, remote migration) | Windows ecosystem | Open source (Kolla, TripleO) |
---
## 3. Kubernetes node
### Node profiles
| Role | CPU | RAM | Storage | Network | Use case |
|------|-----|-----|---------|---------|----------|
| **General purpose** | 16-32 cores | 64-128 GB | 1× NVMe OS + 1× NVMe local | 2× 25 GbE | Web, API, microservices |
| **Memory optimized** | 32-64 cores | 256-512 GB | 1× NVMe OS + 2× NVMe local | 2× 25 GbE | In-memory cache, DB |
| **Compute optimized** | 64-128 cores | 128-256 GB | 1× NVMe OS | not specified | Batch, CI/CD |
| **GPU node** | 32-64 cores | 512-1024 GB | 1× NVMe OS + 4-8× NVMe local | 4× 100 GbE | AI/ML training, inference |
| **Storage node** | 16-32 cores | 64-128 GB | 4-12× NVMe/SATA (Ceph/Longhorn) | 4× 25 GbE | SDS, persistent volumes |
### Kernel tuning
```
# /etc/sysctl.d/99-kubernetes.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.ipv4.conf.all.forwarding = 1
# Connection tracking (pro NodePort, Service)
net.netfilter.nf_conntrack_max = 2097152
net.netfilter.nf_conntrack_tcp_timeout_established = 86400
# File watchers (pro kubelet, containerd)
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288
# Memory management
vm.swappiness = 0
# Allow overcommit (CRI-O, containerd)
vm.overcommit_memory = 1
vm.panic_on_oom = 0
kernel.panic = 10
kernel.panic_on_oops = 1
```
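Files under `/etc/sysctl.d/` treat only whole lines starting with `#` or `;` as comments, so a trailing `# ...` after a value is a common source of settings that fail to apply. A small linter can catch this before deployment. The sketch below is a hypothetical helper, not part of any sysctl tooling.

```python
def parse_sysctl_conf(text: str) -> dict[str, str]:
    """Parse sysctl.d-style text into {key: value}, rejecting inline comments."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or whole-line comment
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key = value line: {raw!r}")
        value = value.strip()
        if "#" in value:
            raise ValueError(f"inline comment in value: {raw!r}")
        settings[key.strip()] = value
    return settings


if __name__ == "__main__":
    sample = "# forwarding\nnet.ipv4.ip_forward = 1\nvm.swappiness = 0\n"
    print(parse_sysctl_conf(sample))
```

Running it in CI over every file in the repository gives an early failure instead of a silently ignored kernel parameter.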
### Container storage
| Type | Recommendation | Note |
|-----|-----------|----------|
| **OS disk** | RAID 1 (2× NVMe) | Ext4/XFS, 100-200 GB |
| **Container runtime image** | RAID 1 (2× NVMe) | /var/lib/containerd, 200-500 GB |
| **Local PV** | Single NVMe | Raw device, no RAID |
| **Rook/Ceph OSD** | Raw NVMe/SATA | HBA/IT mode, no RAID |
| **Longhorn** | Raw NVMe/SATA | Ext4/XFS per volume |
---
## 4. Storage server (Ceph / MinIO / NAS)
### Ceph OSD node
| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2 cores per OSD | Up to 12 OSDs per node (24 cores) |
| **RAM** | 4-8 GB per OSD + OS | BlueStore cache, 16-64 GB min |
| **Network** | 2× 25/100 GbE | Public + Cluster network |
| **Storage** | 10-12× NVMe/SATA SSD OSD | HBA/IT mode, no RAID |
| **OS disk** | 2× SATA SSD RAID 1 | OS, Ceph MON/MGR |
**BIOS for Ceph:**
- SATA/NVMe: AHCI/NVMe mode (not RAID)
- C-States: Disabled (lower OSD latency)
- NUMA: Enabled
- Power: Performance
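The per-OSD rules from the OSD node table can be applied to a planned OSD count as follows. The 16 GB reserved for the OS and Ceph daemons is an assumption, and the function name is hypothetical.

```python
def ceph_osd_node_sizing(osd_count: int, os_reserve_gb: int = 16) -> dict[str, tuple[int, int]]:
    """Return (min, max) CPU cores and RAM in GB for one OSD node."""
    return {
        # 1-2 cores per OSD
        "cpu_cores": (osd_count * 1, osd_count * 2),
        # 4-8 GB per OSD plus an assumed OS/daemon reserve
        "ram_gb": (osd_count * 4 + os_reserve_gb, osd_count * 8 + os_reserve_gb),
    }


if __name__ == "__main__":
    print(ceph_osd_node_sizing(12))
```

For 12 OSDs this gives 12-24 cores and 64-112 GB of RAM, which lines up with the "up to 12 OSDs per node (24 cores)" note in the table.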
### MinIO node
| Component | Recommendation |
|-----------|-----------|
| **CPU** | 8-16 cores (32+ for erasure coding) |
| **RAM** | 32-64 GB + 1 GB per 1 TB storage |
| **Storage** | 4-16× NVMe (direct, no RAID) |
| **Network** | 2× 25/100 GbE |
| **OS** | Ubuntu / RHEL, XFS (for data) |
### NAS (TrueNAS / FreeNAS)
- **ZFS**: RAID-Z1/Z2/Z3, compression (lz4, zstd), dedup
- **ARC cache**: 1 GB per 1 TB storage (max 64 GB)
- **L2ARC**: NVMe cache (optional, read-heavy)
- **SLOG**: NVDIMM / Optane (sync write, ZIL)
- **Network**: 2-4× 10/25 GbE LACP
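The ARC rule of thumb above (1 GB per 1 TB, capped at 64 GB) is simple enough to express directly. A small sketch, where `arc_target_gb` is a hypothetical name and the cap is exposed as a parameter so it can be adjusted:

```python
def arc_target_gb(pool_tb: float, cap_gb: float = 64) -> float:
    """ARC size suggested by the 1 GB per 1 TB rule, limited to cap_gb."""
    if pool_tb < 0:
        raise ValueError("pool_tb must not be negative")
    return min(pool_tb * 1.0, cap_gb)


for pool in (40, 200):
    print(f"{pool} TB pool -> {arc_target_gb(pool):.0f} GB ARC")
```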
---
## 5. Web / API servers
| Parameter | Recommendation |
|----------|-----------|
| **CPU** | High clock, 8-32 cores |
| **RAM** | 32-128 GB |
| **Storage** | 2× NVMe RAID 1 (OS + app) |
| **OS** | Ubuntu / RHEL, optimized kernel |
| **Network** | 2× 10/25 GbE (bonding) |
**Kernel tuning:**
```
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.core.netdev_max_backlog = 65535
```
---
## Quick decision tree: server selection by workload, size, and storage
```mermaid
flowchart TD
W["Which workload?"] --> DB["Database"]
W --> HV["Virtualization"]
W --> K8s["Kubernetes"]
W --> AI["AI/ML"]
W --> ST["Storage server"]
W --> WEB["Web / API"]
DB --> DBS{"Company size"}
DBS -->|"< 500"| DB1["1× EPYC 8-16C, 64-256 GB<br/>NVMe RAID10, 2× 25GbE"]
DBS -->|"500-5000"| DB2{"Storage"}
DB2 -->|"Local"| DB2L["1-2× EPYC 16-24C, 128-512 GB<br/>NVMe RAID10, 4× 25GbE"]
DB2 -->|"Ceph"| DB2C["2× EPYC 16-32C, 256-512 GB<br/>RBD, 4× 25/100GbE"]
DBS -->|"Enterprise"| DB3{"Storage"}
DB3 -->|"FC SAN"| DB3F["2× EPYC 48-128C, 512-2048 GB<br/>SAN LUN + 2× FC 32/64G"]
DB3 -->|"Ceph"| DB3C["2× EPYC 32-64C, 256-512 GB<br/>RBD, 4× 100GbE"]
DBS -->|"Cloud"| DBC["RDS/Azure SQL/CloudSQL<br/>Managed, Multi-AZ"]
DB --> ORACLE{"Oracle architecture?"}
ORACLE -->|"Standalone"| ORA1["1-2× EPYC 8-24C<br/>64-512 GB, ASM local/FC<br/>2× 25GbE + FC 32G"]
ORACLE -->|"Data Guard"| ORA2["2× EPYC 32-64C<br/>256-1024 GB, FC SAN<br/>2× 25/100GbE + 2× FC 64G<br/>2× 25GbE (DG sync)"]
ORACLE -->|"RAC 2-4 nodes"| ORA3["Per node: 2× EPYC 32-64C<br/>512-2048 GB, FC SAN<br/>2× 100GbE (app)<br/>2× FC 64G (storage)<br/>2× 100GbE RoCE (interconnect)"]
ORACLE -->|"Exadata"| ORA4["Engineered system<br/>2-8 DB servers + 3-18 storage<br/>RoCE 100GbE, Smart Scan<br/>15-30 kW/rack"]
HV --> HVS{"Number of hosts"}
HVS -->|"2-3"| HV1["1× EPYC 12-24C, 128-256 GB<br/>RAID5/6 SSD, 2-4× 10/25GbE"]
HVS -->|"3-6"| HV2{"HCI"}
HV2 -->|"vSAN"| HV2V["1-2× EPYC 16-32C, 256-512 GB<br/>NVMe cache + SSD, 4× 25GbE"]
HV2 -->|"Ceph"| HV2C["1× EPYC 12-24C, 128-256 GB<br/>4-8× HBA NVMe/SSD, 4× 25GbE"]
HVS -->|"6-20"| HV3["2× EPYC 32-64C, 512-2048 GB<br/>FC SAN 32/64G, 4-8× 25/100GbE"]
HVS -->|"20+"| HV4["2× EPYC 64-128C, 512-1024 GB<br/>OpenStack + Ceph, 4-8× 100GbE"]
K8s --> K8T{"Node type"}
K8T -->|"General"| K8G["16-32C, 64-128 GB<br/>2× NVMe, 2× 25GbE"]
K8T -->|"Memory"| K8M["32-64C, 256-512 GB<br/>3× NVMe, 2× 25GbE"]
K8T -->|"GPU"| K8U["32-64C, 512-1024 GB<br/>6-10× NVMe, H100/B200, 4× 100GbE"]
K8T -->|"Storage"| K8S["16-32C, 64-128 GB<br/>6-14× HBA NVMe, 4× 25GbE"]
AI --> AIT{"Purpose"}
AIT -->|"Training"| AITR["GPU H100/B200, NVLink<br/>InfiniBand 400Gb/s, liquid cooling"]
AIT -->|"Inference"| AIIR["A100/H200, MIG<br/>PCIe 5.0, 2× 100GbE"]
ST --> STT{"Type"}
STT -->|"Ceph OSD"| STC["EPYC (PCIe lanes)<br/>4-8 GB/OSD, HBA, 2× 25/100GbE"]
STT -->|"MinIO"| STM["EPYC 8-16C, 32-64 GB<br/>4-16× NVMe direct, 2× 25/100GbE"]
STT -->|"NAS (ZFS)"| STN["EPYC 16-32C, 64-128 GB<br/>RAID-Z, SLOG NVMe, 2-4× 10/25GbE"]
WEB --> WEBE["EPYC high clock, 8-32C<br/>32-128 GB, 2× NVMe RAID1, 2× 10/25GbE"]
```
### Connectivity summary by platform
| Platform | App / VM network | Storage network | Replication / Cluster | Management |
|-----------|-------------|-------------|---------------------|------------|
| **DB local (small)** | 2× 25 GbE LACP | — | 2× 25 GbE (shared) | 1× 1 GbE (iLO) |
| **DB local (medium)** | 2× 25/100 GbE LACP | — | 2× 25 GbE dedicated | 1× 1 GbE (iLO) |
| **DB FC SAN** | 2× 25/100 GbE LACP | 2× 32/64 Gb FC multipath | FC replication | 1× 1 GbE (iLO) + SAN mgmt |
| **DB Ceph** | 2× 25/100 GbE | 2× 25/100 GbE (Ceph public) | 2× 25/100 GbE (Ceph cluster) | 1× 1 GbE (iLO) |
| **Hypervisor local** | 2-4× 10/25 GbE LACP | — (local) | — | 1× 1 GbE (iLO) |
| **Hypervisor vSAN** | 2× 25/100 GbE LACP | 2× 25/100 GbE (vSAN) | vSAN traffic | 1× 1 GbE (iLO) |
| **Hypervisor FC SAN** | 2-4× 25/100 GbE LACP | 2× 32/64 Gb FC multipath | 2× 25 GbE (vMotion) | 1× 1 GbE (iLO) |
| **Hypervisor Ceph** | 2× 25/100 GbE LACP | 2× 25/100 GbE (Ceph) | 2× 25 GbE (migration) | 1× 1 GbE (iLO) |
| **Kubernetes** | 2× 25/100 GbE | 2× 25/100 GbE (Ceph/Longhorn) | 2× 25/100 GbE (K8s cluster) | 1× 1 GbE (BMC) |
| **Web/API** | 2× 10/25 GbE LACP | — | — | 1× 1 GbE (BMC) |
| **Oracle Standalone** | 2× 25 GbE LACP | 2× FC 32G or NVMe local | Data Guard 2× 25 GbE | 1× 1 GbE (iLO) + ASM mgmt |
| **Oracle Data Guard** | 2× 25/100 GbE LACP | 2× FC 64G multipath | 2× 25 GbE (DG sync) | 1× 1 GbE (iLO) + SAN mgmt |
| **Oracle RAC** | 2× 100 GbE LACP (VIP/SCAN) | 2× FC 64G multipath | 2× 100 GbE RoCE (Cache Fusion) | 1× 1 GbE (iLO) + Clusterware |
| **Oracle Exadata** | 4-8× 100 GbE RoCE | NVMe over Fabric | RDMA interconnect | Exadata CLI + OEDA |
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-03*

353
SERVER-HW.en.md Normal file

@@ -0,0 +1,353 @@
# 🔧 Server hardware — components and architecture
## Form factors
| Type | Description | Advantages | Disadvantages |
|-----|-------|--------|----------|
| **Rack (1U/2U/4U)** | Standard rack mount, 19" width | Wide range of configurations, easy replacement | Limited PCIe slots in 1U |
| **Blade** | Modular server that slots into a shared chassis (HPE Synergy, Dell MX) | High density, shared power/cooling | Vendor lock-in, higher chassis cost |
| **Tower** | Standalone cabinet | Quiet, expandable | Takes space, not rack-optimized |
| **Edge / Micro** | Small, low power, industrial design | Environmental resistance, low consumption | Limited performance, fewer PCIe |
## Processors (CPU)
### Intel Xeon vs AMD EPYC
| Feature | Intel Xeon (6th gen Granite Rapids) | AMD EPYC (5th gen Turin) |
|-----------|-----------------------------------|------------------------|
| **Max cores** | 128 (P-cores) | 192 (Zen 5c) / 128 (Zen 5) |
| **PCIe lanes** | 80-96 per socket | 128 per socket |
| **Memory channels** | 8 (Xeon 6700P) / 12 (Xeon 6900P), DDR5 | 12 (DDR5) |
| **Max memory** | 4 TB | 6 TB+ |
| **Cache L3** | up to 504 MB | up to 384 MB (Zen 5c) / 512 MB (Zen 5) |
| **AVX-512** | Yes (full width) | Yes (full 512-bit datapath on Zen 5) |
| **AMX (matrix)** | Yes (Intel AMX) | No |
| **TDP** | 350-500 W | 360-500 W |
| **Infrastructure** | Intel QuickAssist, DSA, IAA | AMD Infinity Architecture |
| **Use case** | AI inference, networking, HPC | Virtualization, databases, general purpose |
### CPU selection guide
| Workload | Recommended CPU | Rationale |
|----------|---------------|------------|
| **Database (OLTP)** | EPYC (high core count, more memory channels) | More PCIe lanes for NVMe, higher memory bandwidth |
| **Database (OLAP/DW)** | Xeon (AVX-512, AMX) | Vector instructions for analytical queries |
| **Virtualization** | EPYC (more cores, lower TCO) | Higher core density, lower price per core |
| **HPC / AI training** | Xeon + GPU (AMX for preprocessing) | AMX for data preprocessing, GPU for training |
| **Web / API servers** | EPYC (good perf/core, low TDP variants) | Good performance/W ratio |
| **Storage** | EPYC (128 PCIe lanes for NVMe) | Maximum NVMe drives |
## Memory (RAM)
### DIMM types
| Type | Description | Use case | Server support |
|-----|-------|----------|---------------|
| **RDIMM** (Registered) | Registered, buffered address lines (1 register) | Standard server memory | All servers |
| **LRDIMM** (Load-Reduced) | Reduced electrical load (data buffers in addition to the registered command/address lines) | High-capacity configurations (more DIMMs per channel) | Enterprise, 4R+ |
| **NVDIMM** (Non-Volatile) | Battery-backed DRAM + flash | Write cache, metadata, persistence | Legacy (Intel Optane PMEM) |
| **3D XPoint / Optane** | PCM-based persistence (discontinued by Intel) | Legacy | Intel-only, discontinued |
### DDR5 vs DDR4 key differences
| Feature | DDR4 | DDR5 |
|-----------|------|------|
| **Channel architecture** | 1× 64-bit channel per DIMM | 2× 32-bit sub-channel per DIMM |
| **Bank groups** | 4 (single rank) | 8 (single rank) |
| **Burst length** | 8 (BL8) | 16 (BL16) |
| **On-die ECC** | No | Yes (for correcting bit errors in DRAM) |
| **PMIC** | On motherboard | On DIMM (power management IC) |
| **VDD** | 1.2 V | 1.1 V |
| **RCD** | 1× RCD per DIMM | 2× RCD (one per sub-channel) |
| **Max DIMM capacity** | 64 GB (LRDIMM) | 256 GB (RDIMM 3DS) |
| **Max speed** | 3200 MT/s | 6400 MT/s (currently 4800-5600) |
### Memory rank — detail
Rank = set of DRAM chips on a DIMM that are accessible simultaneously (64bit data + 8bit ECC).
| Rank | Number of DRAM chips (x8) | DIMM capacity (typ.) | Description |
|------|---------------------|---------------------|-------|
| **Single Rank (1R)** | 8-9 | 8-32 GB | All DRAM chips form one rank |
| **Dual Rank (2R)** | 16-18 | 16-128 GB | Two ranks, rank interleaving |
| **Quad Rank (4R)** | 32-36 | 64-256 GB (3DS) | Four ranks, higher capacity |
| **Octa Rank (8R)** | 64-72 | 256 GB (3DS) | Highest capacity, enterprise |
**Rank interleaving**: A dual-rank DIMM can address its two ranks alternately, which raises effective bandwidth by roughly 5-15 % over a single-rank DIMM at the same speed.
**DDR5 rank vs DDR4**: A DDR5 single-rank DIMM already contains 8 bank groups (equivalent to dual-rank DDR4), so moving to a higher rank count matters less on DDR5 than on DDR4.
**Rule**: Prefer dual-rank DIMMs over single-rank for higher density and bandwidth. Quad-rank and octa-rank DIMMs are available only as LRDIMM or 3DS.
### DIMM population — basic rules
#### 1DPC vs 2DPC (DIMMs Per Channel)
| Configuration | DIMMs per channel | Max speed DDR5 | Bandwidth | Capacity |
|------------|-----------------|---------------|-----------|----------|
| **1DPC** | 1 | 4800-5600 MT/s | 100 % | Lower |
| **2DPC** | 2 | 4000-4400 MT/s | ~80 % | Higher |
**Important**: Populating 2 DIMMs per channel reduces memory speed. E.g. Dell R760:
- 1DPC: 5600 MT/s (with 5th Gen Xeon)
- 2DPC: 4400 MT/s (always)
#### Channel architecture (Intel Xeon 4th/5th Gen — 8 channels per CPU)
```
CPU 1 — Channel A [Slot A1 (white)] [Slot A9 (black)] 1DPC: populate white slots
─ Channel B [Slot A7 (white)] [Slot A15 (black)] 2DPC: populate white + black
─ Channel C [Slot A3 (white)] [Slot A11 (black)]
─ Channel D [Slot A5 (white)] [Slot A13 (black)]
─ Channel E [Slot A4 (white)] [Slot A12 (black)]
─ Channel F [Slot A6 (white)] [Slot A14 (black)]
─ Channel G [Slot A2 (white)] [Slot A10 (black)]
─ Channel H [Slot A8 (white)] [Slot A16 (black)]
```
#### Channel architecture (AMD EPYC — 12 channels per CPU)
```
CPU 1 ─ Channel 0-11 (12× single channel, 2 DPC)
Slot A0 (P0) / Slot A1 (P1) — per specific server model
```
AMD EPYC has 12 memory channels per CPU versus 8 on 4th/5th Gen Intel Xeon, which gives 50 % higher theoretical memory bandwidth at the same DIMM speed.
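The bandwidth figures in this section follow from simple arithmetic: channels × transfer rate × 8 bytes per transfer (a DDR5 channel carries 64 data bits, split into two 32-bit sub-channels). The sketch below computes theoretical peak values only; `peak_bandwidth_gbs` is a hypothetical helper, and real sustained bandwidth is lower.

```python
def peak_bandwidth_gbs(channels: int, speed_mts: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth per CPU in GB/s."""
    return channels * speed_mts * bytes_per_transfer / 1000


xeon_1dpc = peak_bandwidth_gbs(8, 5600)    # 8 channels, 1DPC speed
xeon_2dpc = peak_bandwidth_gbs(8, 4400)    # 8 channels, 2DPC speed
epyc_same = peak_bandwidth_gbs(12, 5600)   # 12 channels at the same speed

print(f"8 ch @ 5600: {xeon_1dpc:.1f} GB/s")
print(f"8 ch @ 4400: {xeon_2dpc:.1f} GB/s ({xeon_2dpc / xeon_1dpc:.0%} of 1DPC)")
print(f"12 ch @ 5600: {epyc_same:.1f} GB/s ({epyc_same / xeon_1dpc:.2f}x)")
```

The 2DPC case lands at about 79 % of the 1DPC figure, in line with the "~80 %" in the table, and 12 channels give exactly 1.5× the 8-channel value at equal speed.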
### Population rules by vendor
#### Dell PowerEdge (R660 / R760)
| Number of DIMMs per CPU | 1DPC (white slots) | 2DPC (white + black) | Speed |
|-------------------|-------------------|---------------------|-------|
| **1 DIMM per CPU** | A1 (Channel A) | — | 5600 MT/s |
| **2 DIMMs per CPU** | A1, A7 | — | 5600 MT/s |
| **4 DIMMs per CPU** | A1, A7, A3, A5 | — | 5600 MT/s |
| **8 DIMMs per CPU** | A1-A8 (all white) | — | 5600 MT/s |
| **16 DIMMs per CPU** | A1-A8 (white) | A9-A16 (black) | 4400 MT/s |
**Key Dell rules**:
1. All DIMMs must be DDR5 (do not mix generations)
2. Do not mix DIMM capacities (all identical)
3. Do not mix x4 and x8 DRAM chips
4. Do not mix 3DS and non-3DS RDIMM
5. If mixing DIMM speeds, all run at the lowest
6. Balance capacity across processors
7. Optimal configuration: 16× identical DIMMs in a dual-socket server (8 per CPU, 1DPC on each channel)
8. Fault Resilient Memory (FRM): only 8 or 16 DIMMs per processor
#### HPE ProLiant (DL360 / DL380 Gen11)
**Population order** (16 slots per CPU, Intel):
| DIMMs | Population order |
|-------|---------------|
| 1 | 10 |
| 2 | 1, 3 |
| 4 | 1, 3, 7, 10 |
| 6 | 3, 5, 7, 10, 14, 16 |
| 8 | 1, 3, 5, 7, 10, 12, 14, 16 |
| 12 | 1, 2, 3, 5, 6, 7, 10, 11, 12, 14, 15, 16 |
| 16 | 1-16 |
**HPE SmartMemory rules**:
1. Most qualified configuration: 1DPC (white slots)
2. 2DPC (black slots) only after populating all white
3. HBM + 4th Gen Intel: does not support Hemi (hemisphere) and SGX
4. Heterogeneous mix: higher rank count into white slots
5. **Do not mix**: 3DS with non-3DS, x4 with x8, different ranks in channel, 16 Gb / 24 Gb / 32 Gb DRAM
#### HPE Gen11/Gen12 with AMD EPYC 9005 (a50012817enw)
AMD EPYC 9005 (Turin) delivers 12 memory channels per CPU and supports DDR5-6400.
| Feature | Detail |
|-----------|--------|
| **Memory channels** | 12 per CPU (vs 8 on Intel) |
| **Max DIMM slots** | 24 per CPU (2 DPC) |
| **Max speed** | DDR5-6400 (1 DPC), DDR5-4800 to 5600 (2 DPC) |
| **Max capacity** | 6 TB per CPU (24× 256 GB 3DS RDIMM, 2 DPC) |
| **DIMM types** | RDIMM (1R/2R/4R/8R), 3DS RDIMM, LRDIMM |
| **Population** | 1 DPC (white slots): 12 DIMMs, full speed; 2 DPC: 24 DIMMs, reduced speed |
| **Optimum** | 12× identical DIMMs (1 DPC on each channel) = max bandwidth |
**Rules for AMD EPYC 9005:**
1. Populate with equal capacities within a channel
2. 1 DPC = full speed 6400 MT/s, 2 DPC = lower speed
3. For optimal bandwidth: 12 DIMMs (1DPC) per CPU — all 12 channels utilized
4. Maximum capacity: 24 DIMMs (2DPC) — 24× 256 GB = 6 TB per CPU
5. Do not mix RDIMM and LRDIMM in the same system
### Memory population — decision flow
```
How many DIMMs per CPU? (8-channel platform, 2 slots per channel)
├── 1 DIMM → Channel A (slot 1), losing 87.5 % bandwidth
├── 2 DIMMs → Channels A+B, still losing 75 % bandwidth
├── 4 DIMMs → Channels A,B,C,D, still losing 50 % bandwidth
├── 8 DIMMs → 1DPC on all channels = MAX SPEED (5600 MT/s)
│ ✅ Recommended for performance
├── 12 DIMMs → 4 channels at 1DPC + 4 channels at 2DPC = unbalanced, all DIMMs at 4400 MT/s
├── 16 DIMMs → 2DPC on all channels = MAX CAPACITY (4400 MT/s)
│ ✅ For capacity-intensive workloads
└── More than 16 → LRDIMM / 3DS only, speed penalty
Conclusion: 8 DIMMs per CPU (1DPC) = highest performance
16 DIMMs per CPU (2DPC) = highest capacity
```
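The channel arithmetic behind this decision flow can be sketched as follows. It assumes DIMMs are spread one per channel before any second slot is used (as the vendor population tables prescribe) and covers only channel counts, not the speed change at 2DPC; `classify_population` is a hypothetical helper name.

```python
def classify_population(dimms_per_cpu: int, channels: int = 8, slots_per_channel: int = 2) -> dict:
    """Describe how a DIMM count maps onto memory channels of one CPU."""
    max_dimms = channels * slots_per_channel
    if not 1 <= dimms_per_cpu <= max_dimms:
        raise ValueError(f"dimms_per_cpu must be between 1 and {max_dimms}")
    populated = min(dimms_per_cpu, channels)
    at_2dpc = max(0, dimms_per_cpu - channels)
    return {
        "populated_channels": populated,
        "channels_at_2dpc": at_2dpc,
        # share of channel-level bandwidth left unused
        "bandwidth_lost": 1 - populated / channels,
        # balanced = every channel carries the same number of DIMMs
        "balanced": dimms_per_cpu in (channels, max_dimms),
    }


for count in (1, 2, 8, 12, 16):
    print(count, classify_population(count))
```

With 12 DIMMs the result shows four channels at 2DPC and `balanced: False`, which is why that configuration is a compromise rather than a recommendation.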
### Impact of configuration on performance
| Configuration | Relative bandwidth | Latency | Use case |
|------------|-------------------|---------|----------|
| **1DPC, 8 ch, 5600 MT/s** (8 DIMM) | 100 % | Lowest | OLTP databases, HPC, real-time |
| **2DPC, 8 ch, 4400 MT/s** (16 DIMM) | ~78 % | +10-15 % | Virtualization, VDI, in-memory DB |
| **Mixed 1+2DPC** (12 DIMM) | ~85 % | Medium | Capacity/performance compromise |
| **Unbalanced channels** | 50-70 % | High | **Avoid** |
**Vendor recommendations:**
- **Dell**: 16× identical DIMMs (8 per CPU), 1DPC, 5600 MT/s = optimal performance
- **HPE Intel**: Always populate white slots first, 1DPC for max performance, 2DPC for max capacity
- **HPE AMD EPYC 9005**: 12 channels per CPU, 1DPC = 12 DIMMs per CPU at 6400 MT/s (max bandwidth); 2DPC = 24 DIMMs per CPU (max capacity 6 TB)
- **Supermicro**: Consult specific manual for the given model (DSG, GPU, storage)
- **Lenovo**: Same rules as Intel/AMD platform — prefer 1DPC
### Memory sizing per workload
| Workload | RAM/core ratio | Typical pool | Recommended configuration |
|----------|---------------|--------------|----------------------|
| Database (OLTP) | 8-16 GB/core, DB in RAM | 256 GB - 2 TB | 8× 32-64 GB RDIMM, 1DPC |
| Database (OLAP) | 16-64 GB/core, columnstore | 512 GB - 4 TB+ | 16× 64-128 GB RDIMM, 2DPC |
| Virtualization (VM) | 4-8 GB/core, per VM density | 256 GB - 2 TB | 8-16× 32-64 GB RDIMM |
| Kubernetes (general) | 2-4 GB/core | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC |
| AI training (CPU preprocessing) | 2-4 GB/core | 128-512 GB | 8× 32-64 GB RDIMM, 1DPC |
| HPC | 1-2 GB/core | 64-128 GB | 8× 16 GB RDIMM, 1DPC, high-speed |
| In-memory DB (SAP HANA) | 8-32 GB/core | 1-6 TB+ | 16× 128-256 GB LRDIMM/3DS |
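A sizing table like this one can be applied mechanically: multiply the core count by the GB/core ratio, then pick the smallest DIMM size that reaches the target at 1DPC, falling back to 2DPC only when 1DPC cannot. The sketch below assumes an 8-channel CPU and the DIMM capacities 16-256 GB mentioned in this document; `size_memory` and `DIMM_SIZES_GB` are hypothetical names.

```python
DIMM_SIZES_GB = (16, 32, 64, 128, 256)  # assumption: capacities used in this document


def size_memory(cores: int, gb_per_core: float, channels: int = 8) -> tuple:
    """Return (dimm_count, dimm_size_gb, mode) for one CPU, preferring 1DPC."""
    target_gb = cores * gb_per_core
    for mode, dimm_count in (("1DPC", channels), ("2DPC", channels * 2)):
        for size in DIMM_SIZES_GB:
            if size * dimm_count >= target_gb:
                return (dimm_count, size, mode)
    raise ValueError(f"{target_gb} GB does not fit into {channels * 2} DIMM slots")


print(size_memory(32, 8))    # OLTP example: 32 cores at 8 GB/core
print(size_memory(64, 64))   # OLAP example: 64 cores at 64 GB/core
```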
## PCIe
| Generation | Year (spec) | Speed per lane | x16 throughput | x24 (GPU) |
|----------|-----|-------------------|-----------------|-----------|
| **PCIe 3.0** | 2010 | 985 MB/s | 15.8 GB/s | 23.6 GB/s |
| **PCIe 4.0** | 2017 | 1.97 GB/s | 31.5 GB/s | 47.3 GB/s |
| **PCIe 5.0** | 2019 | 3.94 GB/s | 63 GB/s | 94.5 GB/s |
| **PCIe 6.0** | 2022 | 7.88 GB/s | 126 GB/s | 189 GB/s |
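The per-lane figures for PCIe 3.0 to 5.0 can be reproduced from the raw transfer rate and the 128b/130b line encoding. The sketch below is a worked calculation under that assumption; it leaves out PCIe 6.0, which uses PAM4 signaling and FLIT-based encoding, so the same formula only approximates it.

```python
GT_PER_S = {"3.0": 8, "4.0": 16, "5.0": 32}  # raw transfer rate per lane
ENCODING_EFFICIENCY = 128 / 130              # 128b/130b line code


def lane_gbs(generation: str) -> float:
    """Usable throughput of one lane in GB/s (one direction)."""
    return GT_PER_S[generation] * ENCODING_EFFICIENCY / 8


def link_gbs(generation: str, width: int) -> float:
    """Usable throughput of a link with the given lane count in GB/s."""
    return lane_gbs(generation) * width


for gen in GT_PER_S:
    print(f"PCIe {gen}: {lane_gbs(gen):.3f} GB/s per lane, x16 = {link_gbs(gen, 16):.1f} GB/s")
```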
**PCIe lane allocation**:
- GPU (x16): NVIDIA H100, AMD MI300X
- NVMe U.2 (x4): each NVMe drive
- NIC 100 GbE (x16): dual-port 100 GbE
- RAID/HBA (x8): storage controller
**CPU PCIe lane count**:
- Intel Xeon Scalable (4th gen): 64-80 lanes per socket
- AMD EPYC (4th gen Genoa): 128 lanes per socket
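- Dual-socket AMD EPYC: typically 128 usable lanes in total (up to 160 on some platforms), because part of each socket's lanes carries the xGMI socket-to-socket links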
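Whether a planned configuration fits a given lane count is a matter of adding up the widths from the allocation list above. A minimal sketch, with `LANES_PER_DEVICE`, `lanes_required`, and `fits` as hypothetical names and 128 lanes assumed as the default budget:

```python
LANES_PER_DEVICE = {"gpu": 16, "nvme": 4, "nic_100gbe": 16, "hba": 8}


def lanes_required(devices: dict) -> int:
    """Sum the PCIe lanes needed for a {device_type: count} mapping."""
    return sum(LANES_PER_DEVICE[kind] * count for kind, count in devices.items())


def fits(devices: dict, available_lanes: int = 128) -> bool:
    """True when the devices fit into the available lane budget."""
    return lanes_required(devices) <= available_lanes


plan = {"gpu": 4, "nvme": 8, "nic_100gbe": 2, "hba": 1}
print(lanes_required(plan), fits(plan))
```

The example plan needs more lanes than a 128-lane budget offers, so it would require PCIe switches, fewer devices, or narrower links.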
## NUMA
### Topology
```
Socket 0 (NUMA node 0) Socket 1 (NUMA node 1)
├── Cores 0-31 ├── Cores 32-63
├── Memory 0-256 GB ├── Memory 256-512 GB
├── PCIe root complex (GPU, NVMe) ├── PCIe root complex (NIC, NVMe)
└── I/O hub └── I/O hub
│ │
└───────── Infinity Fabric / UPI ──┘
```
- **Local access** — CPU → own memory (low latency, full bandwidth)
- **Remote access** — CPU → second socket memory (higher latency, roughly 1.8-1.9× per the table below, and lower bandwidth)
- NUMA-aware applications: databases, VMs, DPDK, AI training
### Cross-NUMA penalty
| CPU | Local latency | Remote latency | Penalty |
|-----|--------------|----------------|---------|
| AMD EPYC (Genoa) | ~80 ns | ~150 ns | ~1.9× |
| Intel Xeon (Sapphire Rapids) | ~90 ns | ~160 ns | ~1.8× |
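The practical cost of the cross-NUMA penalty depends on how often a workload touches remote memory. The sketch below uses a simple weighted average with the approximate latencies from the table; `remote_fraction` is an assumed workload parameter, and the function names are hypothetical.

```python
def numa_penalty(local_ns: float, remote_ns: float) -> float:
    """Ratio of remote to local memory latency."""
    return remote_ns / local_ns


def average_latency_ns(local_ns: float, remote_ns: float, remote_fraction: float) -> float:
    """Mean access latency when remote_fraction of accesses cross sockets."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns


print(f"Genoa penalty: {numa_penalty(80, 150):.2f}x")
print(f"25 % remote accesses: {average_latency_ns(80, 150, 0.25):.1f} ns")
```

Pinning a VM or database instance to one NUMA node pushes `remote_fraction` toward zero, which is the point of NUMA-aware placement.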
## TDP and cooling
| CPU | TDP | Core count | Cooling |
|-----|-----|-----------|----------|
| Intel Xeon Platinum 8480+ | 350 W | 56 | Air (high-performance) |
| Intel Xeon 6980P (Granite Rapids) | 500 W | 128 | Liquid recommended |
| AMD EPYC 9654 (Genoa) | 360 W | 96 | Air / Liquid |
| AMD EPYC 9965 (Turin) | 500 W | 192 | Liquid recommended |
### Cooling requirements per rack density
| Rack density | kW/rack | Cooling |
|-------------|---------|---------|
| Low | 1-5 kW | Free air cooling |
| Medium | 5-15 kW | CRAC/CRAH, hot/cold aisle |
| High | 15-40 kW | In-row cooling, rear-door HX |
| Ultra | 40-100+ kW | Direct-to-chip liquid, immersion |
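The density table can be used as a quick lookup once the rack power is estimated. In the sketch below the treatment of boundary values (upper bound inclusive) and the per-server wattage are assumptions; `rack_kw` and `cooling_tier` are hypothetical names.

```python
COOLING_TIERS = (
    (5, "Low", "Free air cooling"),
    (15, "Medium", "CRAC/CRAH, hot/cold aisle"),
    (40, "High", "In-row cooling, rear-door HX"),
)


def rack_kw(servers: int, watts_per_server: float) -> float:
    """Estimated rack load in kW (servers only, no switches or PDU losses)."""
    return servers * watts_per_server / 1000


def cooling_tier(kw: float) -> tuple:
    """Map a rack load to (density, cooling) from the table above."""
    for upper_kw, density, cooling in COOLING_TIERS:
        if kw <= upper_kw:
            return (density, cooling)
    return ("Ultra", "Direct-to-chip liquid, immersion")


load = rack_kw(20, 1200)
print(load, cooling_tier(load))
```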
## BMC and management
| Vendor | BMC | API | Remote console | Features |
|--------|-----|-----|---------------|----------|
| **Dell** | iDRAC (9/10) | Redfish, RACADM | Virtual Console (HTML5) | Lifecycle Controller, SUU |
| **HPE** | iLO (5/6) | Redfish, iLOREST | Integrated Remote Console | Smart Update Manager (SUM) |
| **Supermicro** | BMC / IPMI | IPMI, Redfish | IPMIView, HTML5 KVM | SuperDoctor, SSM |
| **Lenovo** | XClarity Controller | Redfish, IPMI | Remote Console | XClarity Administrator |
| **Cisco** | CIMC / UCSM | Redfish, XML API | KVM Console | UCS Manager, Intersight |
### Standard functions
- Power: on/off/cycle/reset
- Boot: one-shot PXE, CD-ROM redirect, BIOS setup
- Monitoring: sensors (temp, voltage, fan, PSU)
- Alerting: SNMP traps, email, Redfish events
- Remote media: ISO mount over network
- Serial over LAN (SOL)
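Because every vendor in the table exposes Redfish, the power functions can be driven the same way across BMCs. The sketch below only builds the HTTP request for the standard `ComputerSystem.Reset` action and does not send it. The host name and system ID are placeholders (system IDs differ between vendors), authentication and TLS handling are left out, and `build_reset_request` is a hypothetical helper; which `ResetType` values a given BMC accepts should be checked on the system resource itself.

```python
import json
import urllib.request

# Mapping of the generic power operations to Redfish ResetType values
RESET_TYPES = {
    "on": "On",
    "off": "ForceOff",
    "cycle": "PowerCycle",
    "reset": "ForceRestart",
}


def build_reset_request(base_url: str, system_id: str, operation: str) -> urllib.request.Request:
    """Build (but do not send) a Redfish reset request for one system."""
    url = f"{base_url}/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    body = json.dumps({"ResetType": RESET_TYPES[operation]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder BMC address and system ID, for illustration only
request = build_reset_request("https://bmc.example.net", "1", "cycle")
print(request.get_method(), request.full_url)
```

Sending the request would additionally need credentials (basic auth or a Redfish session token) and a decision on certificate verification, both of which depend on the site.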
## Vendors and series
| Vendor | Rack series | Blade series | Management |
|---------|-------------|-------------|------------|
| **Dell** | PowerEdge R6xx/R7xx (R660, R760) | MX7000, FX2 | iDRAC, OpenManage Enterprise |
| **HPE** | ProLiant DL (DL360, DL380) | Synergy, BladeSystem | iLO, OneView, OpsRamp |
| **Cisco** | UCS C-Series (C240, C245) | UCS B-Series, Fabric Interconnect | UCS Manager, Intersight |
| **Lenovo** | ThinkSystem SR (SR630, SR650) | ThinkSystem SN | XClarity |
| **Supermicro** | SuperServer (for GPU, storage, cloud) | FatTwin, MicroBlade | IPMI, SuperDoctor |
## Server connectivity
Detailed chapter on network and storage connectivity: [CONNECTIVITY.md](CONNECTIVITY.md)
## Storage controllers
| Controller | Type | RAID | Cache | Protocol |
|-----------|-----|------|-------|----------|
| **Dell PERC** (H755, H965) | HW RAID | 0/1/5/6/10/50/60 | 4-8 GB NV | NVMe, SAS, SATA |
| **Broadcom / LSI** (9560, 9670) | HW RAID / HBA | 0/1/5/6/10/50/60 | 4 GB NV | NVMe, SAS, SATA |
| **Intel VROC** | SW RAID (CPU) | 0/1/5/10 | — | NVMe only |
| **M.2 HW RAID** (BOSS-S1) | HW RAID | 0/1 | — | 2× M.2 NVMe/SATA |
### IT vs HW RAID mode
| Feature | IT (Initiator Target) / HBA | HW RAID |
|-----------|---------------------------|---------|
| **OS sees** | Each disk individually | RAID virtual disk |
| **Caching** | OS cache | RAID controller cache (BBU) |
| **RAID** | Software (mdadm, ZFS, Ceph) | Hardware + SW driver |
| **Passthrough** | Yes | No |
| **Use case** | SDS (Ceph, MinIO), ZFS | VMware VMFS, Windows, legacy |
| **Battery/Backup** | Not needed | Write-back cache requires BBU |
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-03*

353
SERVER-HW.md Normal file

@@ -0,0 +1,353 @@
# 🔧 Server hardware: components and architecture
## Form factors
| Type | Description | Advantages | Disadvantages |
|-----|-------|--------|----------|
| **Rack (1U/2U/4U)** | Standard rack mount, 19" width | Wide range of configurations, easy replacement | Limited PCIe slots in 1U |
| **Blade** | Modular server that slots into a shared chassis (HPE Synergy, Dell MX) | High density, shared power/cooling | Vendor lock-in, higher chassis cost |
| **Tower** | Standalone cabinet | Quiet, expandable | Takes space, not rack-optimized |
| **Edge / Micro** | Small, low power, industrial design | Environmental resistance, low consumption | Limited performance, fewer PCIe |
## Processors (CPU)
### Intel Xeon vs AMD EPYC
| Feature | Intel Xeon (6th gen Granite Rapids) | AMD EPYC (5th gen Turin) |
|-----------|-----------------------------------|------------------------|
| **Max cores** | 128 (P-cores) | 192 (Zen 5c) / 128 (Zen 5) |
| **PCIe lanes** | 80-96 per socket | 128 per socket |
| **Memory channels** | 8 (Xeon 6700P) / 12 (Xeon 6900P), DDR5 | 12 (DDR5) |
| **Max memory** | 4 TB | 6 TB+ |
| **Cache L3** | up to 504 MB | up to 384 MB (Zen 5c) / 512 MB (Zen 5) |
| **AVX-512** | Yes (full width) | Yes (full 512-bit datapath on Zen 5) |
| **AMX (matrix)** | Yes (Intel AMX) | No |
| **TDP** | 350-500 W | 360-500 W |
| **Infrastructure** | Intel QuickAssist, DSA, IAA | AMD Infinity Architecture |
| **Use case** | AI inference, networking, HPC | Virtualization, databases, general purpose |
### CPU selection guide
| Workload | Recommended CPU | Rationale |
|----------|---------------|------------|
| **Database (OLTP)** | EPYC (high core count, more memory channels) | More PCIe lanes for NVMe, higher memory bandwidth |
| **Database (OLAP/DW)** | Xeon (AVX-512, AMX) | Vector instructions for analytical queries |
| **Virtualization** | EPYC (more cores, lower TCO) | Higher core density, lower price per core |
| **HPC / AI training** | Xeon + GPU (AMX for preprocessing) | AMX for data preprocessing, GPU for training |
| **Web / API servers** | EPYC (good perf/core, low TDP variants) | Good performance/W ratio |
| **Storage** | EPYC (128 PCIe lanes for NVMe) | Maximum NVMe drives |
## Memory (RAM)
### DIMM types
| Type | Description | Use case | Server support |
|-----|-------|----------|---------------|
| **RDIMM** (Registered) | Registered, buffered address lines (1 register) | Standard server memory | All servers |
| **LRDIMM** (Load-Reduced) | Reduced electrical load (data buffers in addition to the registered command/address lines) | High-capacity configurations (more DIMMs per channel) | Enterprise, 4R+ |
| **NVDIMM** (Non-Volatile) | Battery-backed DRAM + flash | Write cache, metadata, persistence | Legacy (Intel Optane PMEM) |
| **3D XPoint / Optane** | PCM-based persistence (discontinued by Intel) | Legacy | Intel-only, discontinued |
### DDR5 vs DDR4 key differences
| Feature | DDR4 | DDR5 |
|-----------|------|------|
| **Channel architecture** | 1× 64-bit channel per DIMM | 2× 32-bit sub-channel per DIMM |
| **Bank groups** | 4 (single rank) | 8 (single rank) |
| **Burst length** | 8 (BL8) | 16 (BL16) |
| **On-die ECC** | No | Yes (for correcting bit errors in DRAM) |
| **PMIC** | On motherboard | On DIMM (power management IC) |
| **VDD** | 1.2 V | 1.1 V |
| **RCD** | 1× RCD per DIMM | 2× RCD (one per sub-channel) |
| **Max DIMM capacity** | 64 GB (LRDIMM) | 256 GB (RDIMM 3DS) |
| **Max speed** | 3200 MT/s | 6400 MT/s (currently 4800-5600) |
### Memory rank — detail
Rank = set of DRAM chips on a DIMM that are accessed simultaneously (64-bit data + 8-bit ECC).
| Rank | Number of DRAM chips (x8) | DIMM capacity (typ.) | Description |
|------|---------------------|---------------------|-------|
| **Single Rank (1R)** | 8-9 | 8-32 GB | All DRAM chips form one rank |
| **Dual Rank (2R)** | 16-18 | 16-128 GB | Two ranks, rank interleaving |
| **Quad Rank (4R)** | 32-36 | 64-256 GB (3DS) | Four ranks, higher capacity |
| **Octa Rank (8R)** | 64-72 | 256 GB (3DS) | Highest capacity, enterprise |
**Rank interleaving**: A dual-rank DIMM can address its two ranks alternately, which raises effective bandwidth by roughly 5-15 % over a single-rank DIMM at the same speed.
**DDR5 rank vs DDR4**: A DDR5 single-rank DIMM already contains 8 bank groups (equivalent to dual-rank DDR4), so moving to a higher rank count matters less on DDR5 than on DDR4.
**Rule**: Prefer dual-rank DIMMs over single-rank for higher density and bandwidth. Quad-rank and octa-rank DIMMs are available only as LRDIMM or 3DS.
### DIMM population: basic rules
#### 1DPC vs 2DPC (DIMMs Per Channel)
| Configuration | DIMMs per channel | Max speed DDR5 | Bandwidth | Capacity |
|------------|-----------------|---------------|-----------|----------|
| **1DPC** | 1 | 4800-5600 MT/s | 100 % | Lower |
| **2DPC** | 2 | 4000-4400 MT/s | ~80 % | Higher |
**Important**: Populating 2 DIMMs per channel reduces memory speed. E.g. Dell R760:
- 1DPC: 5600 MT/s (with 5th Gen Xeon)
- 2DPC: 4400 MT/s (always)
#### Channel architecture (Intel Xeon 4th/5th Gen — 8 channels per CPU)
```
CPU 1 — Channel A [Slot A1 (white)] [Slot A9 (black)] 1DPC: populate white slots
─ Channel B [Slot A7 (white)] [Slot A15 (black)] 2DPC: populate white + black
─ Channel C [Slot A3 (white)] [Slot A11 (black)]
─ Channel D [Slot A5 (white)] [Slot A13 (black)]
─ Channel E [Slot A4 (white)] [Slot A12 (black)]
─ Channel F [Slot A6 (white)] [Slot A14 (black)]
─ Channel G [Slot A2 (white)] [Slot A10 (black)]
─ Channel H [Slot A8 (white)] [Slot A16 (black)]
```
#### Channel architecture (AMD EPYC — 12 channels per CPU)
```
CPU 1 ─ Channel 0-11 (12× single channel, 2 DPC)
Slot A0 (P0) / Slot A1 (P1), depending on the specific server model
```
AMD EPYC has 12 memory channels per CPU versus 8 on 4th/5th Gen Intel Xeon, which gives 50 % higher theoretical memory bandwidth at the same DIMM speed.
### Population rules by vendor
#### Dell PowerEdge (R660 / R760)
| Number of DIMMs per CPU | 1DPC (white slots) | 2DPC (white + black) | Speed |
|-------------------|-------------------|---------------------|-------|
| **1 DIMM per CPU** | A1 (Channel A) | — | 5600 MT/s |
| **2 DIMMs per CPU** | A1, A7 | — | 5600 MT/s |
| **4 DIMMs per CPU** | A1, A7, A3, A5 | — | 5600 MT/s |
| **8 DIMMs per CPU** | A1-A8 (all white) | — | 5600 MT/s |
| **16 DIMMs per CPU** | A1-A8 (white) | A9-A16 (black) | 4400 MT/s |
**Key Dell rules**:
1. All DIMMs must be DDR5 (do not mix generations)
2. Do not mix DIMM capacities (all identical)
3. Do not mix x4 and x8 DRAM chips
4. Do not mix 3DS and non-3DS RDIMM
5. If mixing DIMM speeds, all run at the lowest
6. Balance capacity across processors
7. Optimal configuration: 16× identical DIMMs in a dual-socket server (8 per CPU, 1DPC on each channel)
8. Fault Resilient Memory (FRM): only 8 or 16 DIMMs per processor
#### HPE ProLiant (DL360 / DL380 Gen11)
**Population order** (16 slots per CPU, Intel):
| DIMMs | Population order |
|-------|---------------|
| 1 | 10 |
| 2 | 1, 3 |
| 4 | 1, 3, 7, 10 |
| 6 | 3, 5, 7, 10, 14, 16 |
| 8 | 1, 3, 5, 7, 10, 12, 14, 16 |
| 12 | 1, 2, 3, 5, 6, 7, 10, 11, 12, 14, 15, 16 |
| 16 | 1-16 |
**HPE SmartMemory rules**:
1. Most qualified configuration: 1DPC (white slots)
2. 2DPC (black slots) only after populating all white
3. HBM + 4th Gen Intel: does not support Hemi (hemisphere) and SGX
4. Heterogeneous mix: higher rank count into white slots
5. **Do not mix**: 3DS with non-3DS, x4 with x8, different ranks in channel, 16 Gb / 24 Gb / 32 Gb DRAM
#### HPE Gen11/Gen12 with AMD EPYC 9005 (a50012817enw)
AMD EPYC 9005 (Turin) delivers 12 memory channels per CPU and supports DDR5-6400.
| Feature | Detail |
|-----------|--------|
| **Memory channels** | 12 per CPU (vs 8 on Intel) |
| **Max DIMM slots** | 24 per CPU (2 DPC) |
| **Max speed** | DDR5-6400 (1 DPC), DDR5-4800 to 5600 (2 DPC) |
| **Max capacity** | 6 TB per CPU (24× 256 GB 3DS RDIMM, 2 DPC) |
| **DIMM types** | RDIMM (1R/2R/4R/8R), 3DS RDIMM, LRDIMM |
| **Population** | 1 DPC (white slots): 12 DIMMs, full speed; 2 DPC: 24 DIMMs, reduced speed |
| **Optimum** | 12× identical DIMMs (1 DPC on each channel) = max bandwidth |
**Rules for AMD EPYC 9005:**
1. Populate with equal capacities within a channel
2. 1 DPC = full speed 6400 MT/s, 2 DPC = lower speed
3. For optimal bandwidth: 12 DIMMs (1DPC) per CPU, all 12 channels utilized
4. Maximum capacity: 24 DIMMs (2DPC), 24× 256 GB = 6 TB per CPU
5. Do not mix RDIMM and LRDIMM in the same system
### Memory population — decision flow
```
How many DIMMs per CPU? (8-channel platform, 2 slots per channel)
├── 1 DIMM → Channel A (slot 1), losing 87.5 % bandwidth
├── 2 DIMMs → Channels A+B, still losing 75 % bandwidth
├── 4 DIMMs → Channels A,B,C,D, still losing 50 % bandwidth
├── 8 DIMMs → 1DPC on all channels = MAX SPEED (5600 MT/s)
│ ✅ Recommended for performance
├── 12 DIMMs → 4 channels at 1DPC + 4 channels at 2DPC = unbalanced, all DIMMs at 4400 MT/s
├── 16 DIMMs → 2DPC on all channels = MAX CAPACITY (4400 MT/s)
│ ✅ For capacity-intensive workloads
└── More than 16 → LRDIMM / 3DS only, speed penalty
Conclusion: 8 DIMMs per CPU (1DPC) = highest performance
16 DIMMs per CPU (2DPC) = highest capacity
```
### Impact of configuration on performance
| Configuration | Relative bandwidth | Latency | Use case |
|------------|-------------------|---------|----------|
| **1DPC, 8 ch, 5600 MT/s** (8 DIMM) | 100 % | Lowest | OLTP databases, HPC, real-time |
| **2DPC, 8 ch, 4400 MT/s** (16 DIMM) | ~78 % | +10-15 % | Virtualization, VDI, in-memory DB |
| **Mixed 1+2DPC** (12 DIMM) | ~85 % | Medium | Capacity/performance compromise |
| **Unbalanced channels** | 50-70 % | High | **Avoid** |
**Vendor recommendations:**
- **Dell**: 16× identical DIMMs (8 per CPU), 1DPC, 5600 MT/s = optimal performance
- **HPE Intel**: Always populate white slots first, 1DPC for max performance, 2DPC for max capacity
- **HPE AMD EPYC 9005**: 12 channels per CPU, 1DPC = 12 DIMMs per CPU at 6400 MT/s (max bandwidth); 2DPC = 24 DIMMs per CPU (max capacity 6 TB)
- **Supermicro**: Consult the specific manual for the given model (DSG, GPU, storage)
- **Lenovo**: Same rules as the underlying Intel/AMD platform, prefer 1DPC
### Memory sizing per workload
| Workload | RAM/core ratio | Typical pool | Recommended configuration |
|----------|---------------|--------------|----------------------|
| Database (OLTP) | 8-16 GB/core, DB in RAM | 256 GB - 2 TB | 8× 32-64 GB RDIMM, 1DPC |
| Database (OLAP) | 16-64 GB/core, columnstore | 512 GB - 4 TB+ | 16× 64-128 GB RDIMM, 2DPC |
| Virtualization (VM) | 4-8 GB/core, per VM density | 256 GB - 2 TB | 8-16× 32-64 GB RDIMM |
| Kubernetes (general) | 2-4 GB/core | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC |
| AI training (CPU preprocessing) | 2-4 GB/core | 128-512 GB | 8× 32-64 GB RDIMM, 1DPC |
| HPC | 1-2 GB/core | 64-128 GB | 8× 16 GB RDIMM, 1DPC, high-speed |
| In-memory DB (SAP HANA) | 8-32 GB/core | 1-6 TB+ | 16× 128-256 GB LRDIMM/3DS |
## PCIe
| Generation | Year (spec) | Speed per lane | x16 throughput | x24 (GPU) |
|----------|-----|-------------------|-----------------|-----------|
| **PCIe 3.0** | 2010 | 985 MB/s | 15.8 GB/s | 23.6 GB/s |
| **PCIe 4.0** | 2017 | 1.97 GB/s | 31.5 GB/s | 47.3 GB/s |
| **PCIe 5.0** | 2019 | 3.94 GB/s | 63 GB/s | 94.5 GB/s |
| **PCIe 6.0** | 2022 | 7.88 GB/s | 126 GB/s | 189 GB/s |
**PCIe lane allocation**:
- GPU (x16): NVIDIA H100, AMD MI300X
- NVMe U.2 (x4): each NVMe drive
- NIC 100 GbE (x16): dual-port 100 GbE
- RAID/HBA (x8): storage controller
**CPU PCIe lane count**:
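- Intel Xeon Scalable (4th gen): 64-80 lanes per socket
- AMD EPYC (4th gen, Genoa): 128 lanes per socket
- Dual-socket: up to 160 lanes total (2× 80 on Intel; on EPYC 128-160, because part of each socket's lanes carries the inter-socket xGMI links)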
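A lane budget can be checked with simple arithmetic. The sketch below assumes the per-lane figures from the PCIe table and a hypothetical build (2 GPUs, 8 NVMe drives, one NIC, one HBA); the device mix and helper names are illustrative only.

```python
# Per-lane throughput in GB/s, taken from the PCIe generation table
LANE_GBPS = {3: 0.985, 4: 1.97, 5: 3.94, 6: 7.88}


def link_bandwidth(gen: int, lanes: int) -> float:
    """Theoretical one-direction throughput of a link in GB/s."""
    return LANE_GBPS[gen] * lanes


def lanes_used(devices: dict) -> int:
    """devices maps a label to (device count, lanes per device)."""
    return sum(count * width for count, width in devices.values())


# Hypothetical single-socket build
build = {"GPU": (2, 16), "NVMe U.2": (8, 4), "NIC 100 GbE": (1, 16), "RAID/HBA": (1, 8)}
used = lanes_used(build)
print(f"{used} lanes used, {128 - used} left on a 128-lane socket")
print(f"Gen5 x16 slot: {link_bandwidth(5, 16):.1f} GB/s")
```

The same build would not fit on an 80-lane socket without a PCIe switch or fewer devices, which is the kind of constraint this check is meant to surface early.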
## NUMA
### Topology
```
Socket 0 (NUMA node 0) Socket 1 (NUMA node 1)
├── Cores 0-31 ├── Cores 32-63
├── Memory 0-256 GB ├── Memory 256-512 GB
├── PCIe root complex (GPU, NVMe) ├── PCIe root complex (NIC, NVMe)
└── I/O hub └── I/O hub
│ │
└───────── Infinity Fabric / UPI ──┘
```
- **Local access**: CPU → its own memory (low latency, full bandwidth)
- **Remote access**: CPU → memory attached to the other socket (higher latency, roughly 1.8-1.9× per the table below, and lower bandwidth)
- NUMA-aware applications: databases, VMs, DPDK, AI training
### Cross-NUMA penalty
| CPU | Local latency | Remote latency | Penalty |
|-----|--------------|----------------|---------|
| AMD EPYC (Genoa) | ~80 ns | ~150 ns | ~1.9× |
| Intel Xeon (Sapphire Rapids) | ~90 ns | ~160 ns | ~1.8× |
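The effect of the penalty on a workload depends on how many accesses cross the socket boundary. A minimal sketch, assuming the approximate latencies from the table and a simple weighted average (real behavior also depends on caches and interconnect load):

```python
def effective_latency_ns(local_ns: float, remote_ns: float, remote_fraction: float) -> float:
    """Average memory latency when a share of accesses goes to the other socket."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns


# AMD EPYC (Genoa) figures from the table, 25 % remote accesses
print(effective_latency_ns(80, 150, 0.25))
# Penalty factor for the same CPU
print(150 / 80)
```

Pinning a VM or process to one NUMA node drives `remote_fraction` toward zero, which is why NUMA-aware placement matters for latency-sensitive workloads.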
## TDP and cooling
| CPU | TDP | Core count | Cooling |
|-----|-----|-----------|----------|
| Intel Xeon Platinum 8480+ | 350 W | 56 | Air (high-performance) |
| Intel Xeon 6980P (Granite Rapids) | 500 W | 128 | Liquid recommended |
| AMD EPYC 9654 (Genoa) | 360 W | 96 | Air / Liquid |
| AMD EPYC 9965 (Turin) | 500 W | 192 | Liquid recommended |
### Cooling requirements per rack density
| Rack density | kW/rack | Cooling |
|-------------|---------|---------|
| Low | 1-5 kW | Free air cooling |
| Medium | 5-15 kW | CRAC/CRAH, hot/cold aisle |
| High | 15-40 kW | In-row cooling, rear-door HX |
| Ultra | 40-100+ kW | Direct-to-chip liquid, immersion |
## BMC and management
| Vendor | BMC | API | Remote console | Features |
|--------|-----|-----|---------------|----------|
| **Dell** | iDRAC (9/10) | Redfish, RACADM | Virtual Console (HTML5) | Lifecycle Controller, SUU |
| **HPE** | iLO (5/6) | Redfish, iLOREST | Integrated Remote Console | Smart Update Manager (SUM) |
| **Supermicro** | BMC / IPMI | IPMI, Redfish | IPMIView, HTML5 KVM | SuperDoctor, SSM |
| **Lenovo** | XClarity Controller | Redfish, IPMI | Remote Console | XClarity Administrator |
| **Cisco** | CIMC / UCSM | Redfish, XML API | KVM Console | UCS Manager, Intersight |
### Standard functions
- Power: on/off/cycle/reset
- Boot: one-shot PXE, CD-ROM redirect, BIOS setup
- Monitoring: sensors (temp, voltage, fan, PSU)
- Alerting: SNMP traps, email, Redfish events
- Remote media: ISO mount over the network
- Serial over LAN (SOL)
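Power control and one-shot PXE boot map onto standard Redfish calls. The sketch below only builds the HTTP requests and does not send them; the host name `bmc.example.net` and the system ID `1` are hypothetical (the actual ID differs per vendor), and authentication and TLS verification are deliberately left out.

```python
import json
import urllib.request


def reset_request(bmc_host: str, system_id: str,
                  reset_type: str = "ForceRestart") -> urllib.request.Request:
    """Build (but do not send) a Redfish ComputerSystem.Reset action."""
    url = f"https://{bmc_host}/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    body = json.dumps({"ResetType": reset_type}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST",
                                  headers={"Content-Type": "application/json"})


def pxe_once_request(bmc_host: str, system_id: str) -> urllib.request.Request:
    """Build (but do not send) a one-shot PXE boot override."""
    url = f"https://{bmc_host}/redfish/v1/Systems/{system_id}"
    boot = {"Boot": {"BootSourceOverrideEnabled": "Once",
                     "BootSourceOverrideTarget": "Pxe"}}
    return urllib.request.Request(url, data=json.dumps(boot).encode("utf-8"),
                                  method="PATCH",
                                  headers={"Content-Type": "application/json"})


# Hypothetical BMC host and system ID
req = reset_request("bmc.example.net", "1")
print(req.get_method(), req.full_url)
```

Sending the request would need `urllib.request.urlopen` plus credentials or a session token, which depend on the BMC in use.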
## Vendors and product lines
| Vendor | Rack series | Blade series | Management |
|---------|-------------|-------------|------------|
| **Dell** | PowerEdge R6xx/R7xx (R660, R760) | MX7000, FX2 | iDRAC, OpenManage Enterprise |
| **HPE** | ProLiant DL (DL360, DL380) | Synergy, BladeSystem | iLO, OneView, OpsRamp |
| **Cisco** | UCS C-Series (C240, C245) | UCS B-Series, Fabric Interconnect | UCS Manager, Intersight |
| **Lenovo** | ThinkSystem SR (SR630, SR650) | ThinkSystem SN | XClarity |
| **Supermicro** | SuperServer (for GPU, storage, cloud) | FatTwin, MicroBlade | IPMI, SuperDoctor |
## Server connectivity
Detailed chapter on network and storage connectivity: [CONNECTIVITY.md](CONNECTIVITY.md)
## Storage controllers
| Controller | Type | RAID | Cache | Protocol |
|-----------|-----|------|-------|----------|
| **Dell PERC** (H755, H965) | HW RAID | 0/1/5/6/10/50/60 | 4-8 GB NV | NVMe, SAS, SATA |
| **Broadcom / LSI** (9560, 9670) | HW RAID / HBA | 0/1/5/6/10/50/60 | 4 GB NV | NVMe, SAS, SATA |
| **Intel VROC** | SW RAID (CPU) | 0/1/5/10 | — | NVMe only |
| **M.2 HW RAID** (BOSS-S1) | HW RAID | 0/1 | — | 2× M.2 NVMe/SATA |
### IT vs HW RAID mode
| Feature | IT (Initiator Target) / HBA | HW RAID |
|-----------|---------------------------|---------|
| **OS sees** | Each disk individually | RAID virtual disk |
| **Caching** | OS cache | RAID controller cache (BBU) |
| **RAID** | Software (mdadm, ZFS, Ceph) | Hardware + SW driver |
| **Passthrough** | Yes | No |
| **Use case** | SDS (Ceph, MinIO), ZFS | VMware VMFS, Windows, legacy |
| **Battery/Backup** | Not needed | Write-back cache requires a BBU |
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-03*

283
STORAGE.en.md Normal file

@@ -0,0 +1,283 @@
# 💾 Storage infrastructure
## Storage types
| Type | Description | Latency | Use case |
|-----|-------|---------|----------|
| **DAS** (Direct Attached) | Disks directly in server | <0.1 ms | OS, cache, local data |
| **SAN** (Storage Area Network) | Block devices over network | <1 ms | Databases, VM datastores |
| **NAS** (Network Attached Storage) | File access (NFS, SMB) | 1-3 ms | Shared files, home dirs |
| **Object storage** | REST API, flat namespace | 10-100 ms | Backups, media, big data |
## Protocols
| Protocol | Type | Speed | Note |
|----------|-----|----------|----------|
| **Fibre Channel** | SAN | 8/16/32/64 Gbps | Low latency, dedicated network |
| **iSCSI** | SAN (IP) | 1/10/25 GbE | Cheaper, over ethernet |
| **NVMe-oF** | SAN (NVMe) | 25/50/100 GbE | Lowest latency, emerging |
| **NFS** | NAS | 1/10/25 GbE | Universal, simple |
| **SMB/CIFS** | NAS | 1/10/25 GbE | Windows native |
| **S3 API** | Object | — | Standard for object storage |
## RAID
| RAID | Min. disks | Capacity | Protection | Read speed | Write speed | Use case |
|------|-----------|----------|---------|---------------|----------------|----------|
| **0** | 2 | 100 % | None | N × (striping) | N × | Temp data, cache (risky) |
| **1** | 2 | 50 % | 1 disk | N × (mirror) | 1 × | OS disk, critical data |
| **5** | 3 | 67-94 % | 1 disk | N-1 × | N-1 × (parity write penalty) | Universal file/VM storage |
| **6** | 4 | 50-88 % | 2 disks | N-2 × | N-2 × (double parity) | Large capacities, important data |
| **10** | 4 | 50 % | 1/mirror | N × | N/2 × | Databases, VM, high-performance |
| **50** | 6 | 67-94 % | 1/stripe | N-1 × | N-1 × | Large capacity + performance |
| **60** | 8 | 50-88 % | 2/stripe | N-2 × | N-2 × | Enterprise |
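The capacity column follows directly from the parity count. A minimal sketch, assuming equal-size disks and, for RAID 50/60, a hypothetical `groups` parameter giving the number of striped sub-arrays:

```python
def usable_fraction(level: str, disks: int, groups: int = 2) -> float:
    """Usable share of raw capacity for a RAID level with equal-size disks."""
    parity = {"5": 1, "6": 2}
    if level == "0":
        return 1.0
    if level in ("1", "10"):
        return 0.5  # two-way mirrors
    if level in parity:
        return (disks - parity[level]) / disks
    if level in ("50", "60"):
        # Each sub-array gives up its own parity disks
        return (disks - groups * parity[level[0]]) / disks
    raise ValueError(f"unsupported RAID level: {level}")


for level, disks in (("5", 3), ("5", 16), ("6", 4), ("6", 16), ("60", 8)):
    print(f"RAID {level} with {disks} disks: {usable_fraction(level, disks):.0%}")
```

The 3-disk and 16-disk RAID 5 cases reproduce the 67-94 % range in the table, and the 4-disk and 16-disk RAID 6 cases reproduce 50-88 %.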
### Stripe size
- Small stripe (16-64 KB) — better IOPS, worse throughput (databases, OLTP)
- Large stripe (128-1024 KB) — better throughput, worse IOPS (video, media, backup)
- Write hole on RAID 5/6: if power is lost while a stripe is being updated, data and parity can end up inconsistent (prevention: non-volatile cache, battery-backed RAID controller)
## Software-Defined Storage (SDS)
| Tool | Type | Use case |
|---------|-----|----------|
| **Ceph** | Object/Block/File (RADOS) | Universal SDS, OpenStack, Kubernetes |
| **MinIO** | Object (S3 API) | High-performance S3, AI/ML data lake |
| **GlusterFS** | Distributed File | Shared filesystem, POSIX |
| **Longhorn** | Block (Kubernetes) | K8s PVC, microservices |
| **Linstor** | Block (DRBD + LVM) | Linux SDS, Kubernetes |
| **VMware vSAN** | Block (HCI) | VMware ecosystem |
| **StarWind** | Block (HCI) | Hyper-V / VMware |
### Ceph
**Architecture**:
```
RADOS (Reliable Autonomic Distributed Object Store)
├── Monitors (MON) — cluster map, quorum (3/5)
├── Managers (MGR) — dashboard, balancer, orchestrator
├── OSDs (Object Storage Daemons) — data + replication
└── MDS (Metadata Server) — CephFS only
```
**CRUSH map** (Controlled Replication Under Scalable Hashing):
- Algorithm for calculating data placement (no central index)
- Layers: Root → Datacenter → Rack → Host → OSD
- Failure domain: replication across racks / hosts
- `ceph osd crush rule create-replicated replicated_rule default host`
**Access interfaces**:
| Interface | Type | Use case |
|----------|-----|----------|
| **RBD** (RADOS Block Device) | Block | VM images, Kubernetes PVC (csi-rbd) |
| **RGW** (RADOS Gateway) | Object (S3/Swift API) | S3-compatible storage, backup |
| **CephFS** | File (POSIX) | Shared filesystem, home dirs |
| **NFS-Ganesha** | File (NFS) | NFS export over CephFS |
**Erasure coding**:
- K+M (data + parity chunks), e.g. 8+3 (8 data, 3 parity)
- More space-efficient than 3× replication (1.375× vs 3×)
- Higher CPU overhead, lower IOPS
- Recommended for cold data (RGW) instead of replication
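The 1.375× figure is the ratio of stored chunks to data chunks. A short sketch of that arithmetic, with a hypothetical 1000 TB raw pool used only as an example:

```python
def ec_overhead(k: int, m: int) -> float:
    """Raw bytes stored per usable byte for a K+M erasure-coded pool."""
    return (k + m) / k


def usable_tb(raw_tb: float, overhead: float) -> float:
    """Usable capacity for a given raw capacity and storage overhead."""
    return raw_tb / overhead


print(ec_overhead(8, 3))                                # 8+3 erasure coding
print(round(usable_tb(1000, ec_overhead(8, 3)), 1))     # usable TB with 8+3
print(round(usable_tb(1000, 3.0), 1))                   # usable TB with 3x replication
```

With the same raw capacity, 8+3 yields a bit more than twice the usable space of 3× replication, at the cost of the CPU overhead and lower IOPS noted above.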
## Enterprise storage vendors
### Hitachi VSP (Virtual Storage Platform)
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **VSP 5200/5600** | Active-active, scale-up/out, 2-12 controllers | 69.3 PB raw, 287 PBe | 33M IOPS, 39 µs | FC-NVMe 32Gb, FC 16/32Gb, FICON 16Gb, iSCSI 10Gb | Mission-critical, mainframe, enterprise consolidation |
| **VSP E590/E790/E1090** | Symmetric active-active, up to 65 nodes/130 controllers | 10.62 PB raw (E1090) | 8.4M IOPS, <41 µs | FC 32Gb, iSCSI 25Gb, FC-NVMe 32Gb | Midrange enterprise, hybrid workloads |
**Key features**: SVOS common across entire portfolio, AI-driven data reduction 4:1 guarantee, Global-Active Device metro clustering, 8 nines availability (HW), 100% data availability guarantee.
---
### Huawei OceanStor Dorado
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **Dorado 8000/18000 V6** | SmartMatrix full-mesh, up to 32 controllers | 32 TB cache, 6400 SSD | 40M IOPS, 0.05 ms | FC 32/64Gb, FC-NVMe, iSCSI, NFS, SMB, NVMe/RoCE, S3 | Mission-critical, finance, govt, carrier |
| **Dorado 8000/18000 V7 (2025)** | SmartMatrix 4.0, up to 64/128 controllers | 500 PB+ | >100M IOPS, 0.03 ms | FC, RoCE, NVMe/TCP, NFS, SMB, S3 | AI workloads, converged block/file/object |
**Key features**: SmartMatrix tolerates the failure of 7 out of 8 controllers, FlashEver (three generations of online HW upgrades within 10 years), RAID-TP (survives a triple SSD failure), DPU-based SmartNIC, ML-based I/O prefetch, 100% ransomware detection (Tolly), #1 SPC-1 benchmark.
---
### Dell PowerStore & PowerMax
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **PowerStore 1500/5500/9500 (Gen 3)** | Active-active dual-node, PCIe Gen5, DDR5, RDMA 200GbE | 1.2 PB raw, 5.8 PBe | 3× IOPS vs Gen2 | FC 32/64Gb, iSCSI, NVMe/FC, NVMe/TCP, NFSv4, SMB3 | Midrange-to-high-end, VMware, containerized |
| **PowerMax 2500/8500** | Scale-out NVMe, Dynamic Fabric, up to 16 nodes | 8.8 PBe (2500), 18 PBe (8500) | 6 nines availability | FC 64Gb, FICON, NVMe/FC, NVMe/TCP, iSCSI, NFS, SMB | Mission-critical, mainframe, OLTP, cyber vault |
**Key features**: PowerStore 6:1 DRR guarantee, unified block/file/vVols out of box, Cyber Detect AI anomaly; PowerMax 5:1 DRR, Secure Snapshots 65M, SRDF/Metro, Flexible RAID up to 92% efficient, FIPS 140-3.
---
### HPE Alletra
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **Alletra 5000** | Active-active hybrid flash, dual controller | 1.2 PB raw | 99.9999% guarantee | FC, iSCSI | Mixed primary + secondary, cost-efficient hybrid |
| **Alletra 6000** | Active-active all-NVMe, dual controller | ~368 TB usable | <100 µs | FC, iSCSI | Business-critical DB, VDI, VMware |
| **Alletra 9000** | Active-active all-NVMe, multi-node scale-out | 24 PB+ usable | ~2-3M IOPS, <150 µs | FC, iSCSI, NVMe/FC | Mission-critical ERP, AI, consolidation |
| **Alletra Storage MP** | Disaggregated modular, block + file + object | 5.8 PB block, 11.8 PB object | 100% availability guarantee | FC, iSCSI, NVMe/FC, NFS, SMB, S3 | Multi-protocol consolidation, AI/analytics |
**Key features**: Triple Parity RAID (5000), InfoSight AI Ops, HPE GreenLake as-a-service, non-disruptive controller upgrades (MP), 100% data availability guarantee.
---
### Infinidat
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **InfiniBox SSA G4** | Triple-active controller, AMD EPYC PCIe 5.0, DDR5 | 1.97 PB usable / 5.9 PBe | 2.24M IOPS, 35 µs | FC 32Gb, 25/100GbE, NVMe-oF/TCP, iSCSI, NFS, SMB, S3 | Mission-critical Oracle/SQL, multi-site DR |
| **InfiniBox G4 Hybrid** | Triple-active hybrid (HDD + flash cache) | 10.9 PB raw / 32.8 PBe | 2.24M IOPS, 64 GB/s | FC, Ethernet, NVMe-oF, iSCSI, NFS, SMB, S3 | Backup, massive unstructured data |
**Key features**: Only 3-way active on the market, Neural Cache (ML-driven), InfiniRAID, Immutable snapshots, 100% availability + 1-min snapshot recovery guarantee, everything included in base price (no extra licensing).
---
### Pure Storage FlashArray
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **FlashArray//X (X20-X90 R5)** | Active-active, NVMe DirectFlash | 1.2 PB raw / 4.4 PBe | 250 µs, 5:1 DRR | FC, NVMe/FC, NVMe/RoCE, NVMe/TCP, iSCSI, NFS, SMB | Mission-critical DB, VMware, enterprise |
| **FlashArray//C (C50-C90 R5)** | Active-active, QLC DirectFlash | 4.2 PB raw / 16.3 PBe | 5:1 DRR | FC, NVMe-oF, iSCSI, NFS, SMB | Capacity-optimized, backup, file |
| **FlashArray//XL (XL190)** | Active-active, 40 DirectFlash modules | 1.9 PB raw / 9.4 PBe | >4M IOPS, <100 µs, 45 GB/s | FC 64Gb, 100GbE RoCE, NVMe/FC, NVMe/TCP, NFS, SMB | Largest DB consolidation, OLTP |
**Key features**: DirectFlash (no FTL layer), 99.9999% availability, Evergreen (never forklift upgrade), Purity OS unified across entire portfolio, ActiveCluster/ActiveDR, Pure1 AIOps.
---
### Lenovo ThinkSystem
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **DM Series** (DM3200F/5200F/7200F) | Active-active, all-NVMe, NetApp ONTAP | 1.8 PB raw / 6.8 PBe | Up to 120 NVMe SSD | FC 64Gb, iSCSI, NVMe/FC, NFS, SMB, S3 | Unified block/file, AI/ML, VMware |
| **DG Series** (DG5200/7200) | Active-active, all-QLC, ONTAP | 7.4 PB raw / 27 PBe | QLC economics | FC, NVMe/FC, NVMe/TCP, iSCSI, NFS, SMB, S3 | Capacity-optimized, backup, archive |
| **DE Series** (DE4000F-DE6600F) | Active-active, SAS/NVMe hybrid | 1.84 PB raw | 2M IOPS, <100 µs, 44 GB/s | FC 32Gb, iSCSI 25Gb, NVMe/FC, SAS, NVMe/RoCE | HPC, analytics, video surveillance |
**Key features**: DM/DG use ONTAP (SnapMirror, SnapVault, FabricPool, RAID-DP/RAID-TEC); cluster scale-out up to 12 HA pairs; DE series best price/performance in portfolio.
---
### Synology
| Model | Architecture | Max capacity | Protocols | Use case |
|-------|-------------|--------------|-----------|----------|
| **UC3200/UC3400** | Active-active dual-controller, SAS backend | 576 TB raw | iSCSI, FC 16Gb, 10/25GbE | SMB/midmarket SAN, VMware, HA |
| **DS/RS Series** (RS3626xs+, RS6426xs+) | Single-controller / HA pair, Btrfs | 864 TB raw, 1 PB volume | SMB, NFS, iSCSI, FC (HBA) | SME all-in-one NAS/SAN, backup, surveillance |
**Key features**: DSM UC for SAN, Synology HA, Snapshot Replication (16K snapshots), VMware VAAI/ODX/ALUA, Surveillance Station, low TCO.
---
### Vendor comparison — overview
| Vendor | Flagship | Max IOPS | Max capacity | Latency | Availability guarantee | Main differentiator |
|--------|----------|----------|-------------|---------|---------------------|----------------------|
| **Hitachi** | VSP 5600 | 33M | 287 PBe | 39 µs | 8 nines (HW) | Mainframe + open; 65-node cluster |
| **Huawei** | Dorado 18000 V7 | >100M | 500 PB+ | 0.03 ms | 99.99999% | SmartMatrix; #1 SPC-1 |
| **Dell** | PowerMax 8500 | — | 18 PBe | — | 6 nines | SRDF/Metro; mainframe |
| **HPE** | Alletra 9000/MP | ~3M | 11.8 PBe | <150 µs | 100% data guarantee | InfoSight AIOps; GreenLake |
| **Infinidat** | InfiniBox SSA G4 | 2.24M | 32.8 PBe | 35 µs | 100% availability | 3-way active; Neural Cache |
| **Pure** | FlashArray//XL | >4M | 16.3 PBe | <100 µs | 99.9999% | DirectFlash; Evergreen |
| **Lenovo** | DM7200F | — | 27 PBe | — | — | ONTAP ecosystem; broad portfolio |
| **Synology** | UC3400 | 690K | 576 TB | — | — | Lowest price for active-active SAN |
---
### Storage selection by use case
| Use case | Recommendation | Rationale |
|----------|-----------|-------------|
| **Mainframe + open hybrid** | Hitachi VSP / Dell PowerMax | Only ones with FICON + FC simultaneously |
| **AI/ML training** | Huawei Dorado V7 / Pure //XL | Highest IOPS, lowest latency |
| **Enterprise DB (Oracle, SQL Server)** | Infinidat / Pure //X | Low latency, consistent performance |
| **Virtualization (VMware, Hyper-V)** | Dell PowerStore / HPE Alletra 6000 | VAAI, vVols, InfoSight |
| **SMB / SME** | Synology / Lenovo DE | Low TCO, simple management |
| **Object storage / backup** | Pure //C / Lenovo DG / Infinidat Hybrid | QLC economics, high capacity |
| **Multi-protocol consolidation** | HPE Alletra MP / Huawei Dorado | Block + file + object in one platform |
## Decision diagram — storage platform selection
```mermaid
flowchart TD
Start(["Storage requirement"]) --> PROTO{"Access type"}
PROTO -->|"Block (SAN)"| BLOCK
PROTO -->|"File (NAS)"| FILE
PROTO -->|"Object"| OBJECT
BLOCK --> BPERF{"Performance tier"}
BPERF -->|"Tier 0/1<br/>< 100 µs, > 1M IOPS"| BT1["Infinidat / Pure //XL<br/>Huawei Dorado V7<br/>FC-NVMe, NVMe-oF"]
BPERF -->|"Tier 2<br/>100-500 µs"| BT2["Dell PowerStore / HPE Alletra 6000<br/>Hitachi VSP / Lenovo DM<br/>FC 32G, iSCSI 25GbE"]
BPERF -->|"Tier 3<br/>SME / low-cost"| BT3["Synology UC3400<br/>Lenovo DE / Dell PowerVault<br/>iSCSI, SAS"]
BLOCK --> BECOS{"Ecosystem"}
BECOS -->|"Mainframe"| BMF["Hitachi VSP / Dell PowerMax<br/>FICON + FC simultaneously"]
BECOS -->|"VMware"| BVM["Dell PowerStore / HPE Alletra<br/>VAAI, vVols, InfoSight"]
BECOS -->|"Oracle / SQL Server"| BDB["Infinidat / Pure //X<br/>Lowest latency"]
FILE --> FSIZE{"Scaling"}
FSIZE -->|"Enterprise"| FE["HPE Alletra MP (file)<br/>Lenovo DM / Dell PowerScale<br/>NFS, SMB, multi-protocol"]
FSIZE -->|"SMB"| FS["Synology DS/RS<br/>Lenovo DE / TrueNAS<br/>Btrfs, NFS, SMB, low TCO"]
OBJECT --> OUSE{"Use case"}
OUSE -->|"Backup / archive"| OB["Pure //C / Infinidat Hybrid<br/>Lenovo DG<br/>QLC, erasure coding, low cost/TB"]
OUSE -->|"AI/ML data lake"| OM["MinIO / Pure //C<br/>High throughput S3<br/>NVMe direct, erasure coding"]
OUSE -->|"Kubernetes PVC"| OK["Ceph RBD / Longhorn / Linstor<br/>SDS on K8s<br/>CSI, replication, snapshots"]
```
## OpenStack Storage
OpenStack offers three main storage services:
| Service | Type | Description |
|--------|-----|-------|
| **Cinder** | Block storage | Persistent volumes for instances (iSCSI, NFS, Ceph RBD) |
| **Swift** | Object storage | RESTful object store (S3-compatible via middleware) |
| **Manila** | File storage | Shared file systems (NFS, CIFS) as a managed service |
### Cinder (Block Storage)
- Multi-backend support: LVM, Ceph RBD, NFS, iSCSI, Fibre Channel
- Snapshotting, cloning, encryption at rest
- Cinder scheduler for volume distribution across backends
- QoS specs for IOPS/bandwidth limits
### Swift (Object Storage)
- Alternative to S3 for on-prem object storage
- Ring-based data distribution (consistent hashing)
- Multi-region replication (global clusters, container sync)
- Stateless REST API behind proxy servers (no single point of failure)
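Ring-based placement can be illustrated with a hash-to-partition lookup. This is a simplified sketch and not Swift's actual ring code: the partition power, the device names, and the round-robin replica assignment are all assumptions made for the example (a real ring is built offline and balances partitions by device weight and zone).

```python
import hashlib
import struct

PART_POWER = 8                    # 2**8 = 256 partitions (small, for illustration)
PART_SHIFT = 32 - PART_POWER
DEVICES = ["node1/sdb", "node2/sdb", "node3/sdb", "node4/sdb"]  # hypothetical


def partition_for(path: str) -> int:
    """Map an object path to a partition using the top bits of its MD5 hash."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return struct.unpack_from(">I", digest)[0] >> PART_SHIFT


def devices_for(path: str, replicas: int = 3) -> list:
    """Pick distinct devices for each replica of the object's partition."""
    part = partition_for(path)
    return [DEVICES[(part + r) % len(DEVICES)] for r in range(replicas)]


path = "/account/container/object.jpg"
print(partition_for(path), devices_for(path))
```

Because the lookup is a pure function of the path, any proxy can locate an object without consulting a central index, which is what keeps the API tier stateless.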
### Manila (Shared File Systems)
- Managed NFS/CIFS for sharing between instances
- Backends: NetApp, Dell EMC, CephFS, GlusterFS
- Access rules (IP-based, cert-based, user-based)
- Use case: HPC cluster home directories, NAS for legacy apps
### Container storage (OpenStack + Ceph)
Ceph is the most common storage backend for OpenStack: Cinder (RBD), Swift API (RGW), Manila (CephFS), Glance (RBD images).
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| Storage Systems | Ganger, Gibson | 978-1680837540 | Textbook covering the design, implementation and operation of storage systems — from device characteristics through OS, databases and networking to server distribution and large-scale systems. An essential resource for storage infrastructure architects. |
*Last revision: 2026-06-03*

283
STORAGE.md Normal file

@@ -0,0 +1,283 @@
# 💾 Storage infrastructure
## Storage types
| Type | Description | Latency | Use case |
|-----|-------|---------|----------|
| **DAS** (Direct Attached) | Disks directly in server | <0.1 ms | OS, cache, local data |
| **SAN** (Storage Area Network) | Block devices over network | <1 ms | Databases, VM datastores |
| **NAS** (Network Attached Storage) | File access (NFS, SMB) | 1-3 ms | Shared files, home dirs |
| **Object storage** | REST API, flat namespace | 10-100 ms | Backups, media, big data |
## Protocols
| Protocol | Type | Speed | Note |
|----------|-----|----------|----------|
| **Fibre Channel** | SAN | 8/16/32/64 Gbps | Low latency, dedicated network |
| **iSCSI** | SAN (IP) | 1/10/25 GbE | Cheaper, over Ethernet |
| **NVMe-oF** | SAN (NVMe) | 25/50/100 GbE | Lowest latency, emerging |
| **NFS** | NAS | 1/10/25 GbE | Universal, simple |
| **SMB/CIFS** | NAS | 1/10/25 GbE | Windows native |
| **S3 API** | Object | — | Standard for object storage |
## RAID
| RAID | Min. disks | Capacity | Protection | Read speed | Write speed | Use case |
|------|-----------|----------|---------|---------------|----------------|----------|
| **0** | 2 | 100 % | None | N × (striping) | N × | Temp data, cache (risky) |
| **1** | 2 | 50 % | 1 disk | N × (mirror) | 1 × | OS disk, critical data |
| **5** | 3 | 67-94 % | 1 disk | N-1 × | N-1 × (parity write penalty) | Universal file/VM storage |
| **6** | 4 | 50-88 % | 2 disks | N-2 × | N-2 × (double parity) | Large capacities, important data |
| **10** | 4 | 50 % | 1/mirror | N × | N/2 × | Databases, VM, high-performance |
| **50** | 6 | 67-94 % | 1/stripe | N-1 × | N-1 × | Large capacity + performance |
| **60** | 8 | 50-88 % | 2/stripe | N-2 × | N-2 × | Enterprise |
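The parity write penalty noted in the table matters most for small random writes. A rough sketch, assuming the commonly cited rule-of-thumb penalties (2 back-end I/Os per write for mirrors, 4 for RAID 5, 6 for RAID 6) and a hypothetical 200 IOPS per disk; controller caches and full-stripe writes change the picture in practice.

```python
# Rule-of-thumb back-end I/Os per front-end random write (assumption, not a measurement)
WRITE_PENALTY = {"0": 1, "1": 2, "10": 2, "5": 4, "6": 6}


def effective_iops(level: str, disks: int, iops_per_disk: float, write_share: float) -> float:
    """Front-end IOPS an array can sustain for a given read/write mix."""
    raw = disks * iops_per_disk
    return raw / ((1 - write_share) + write_share * WRITE_PENALTY[level])


# 8 disks, 200 IOPS each, 30 % writes
for level in ("10", "5", "6"):
    print(f"RAID {level}: {round(effective_iops(level, 8, 200, 0.3))} IOPS")
```

This is why the table recommends RAID 10 for databases: with a write-heavy mix it loses far less front-end performance than parity RAID.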
### Stripe size
- Small stripe (16-64 KB): better IOPS, worse throughput (databases, OLTP)
- Large stripe (128-1024 KB): better throughput, worse IOPS (video, media, backup)
- Write hole on RAID 5/6: if power is lost while a stripe is being updated, data and parity can end up inconsistent (prevention: non-volatile cache, battery-backed RAID controller)
## Software-Defined Storage (SDS)
| Tool | Type | Use case |
|---------|-----|----------|
| **Ceph** | Object/Block/File (RADOS) | Universal SDS, OpenStack, Kubernetes |
| **MinIO** | Object (S3 API) | High-performance S3, AI/ML data lake |
| **GlusterFS** | Distributed File | Shared filesystem, POSIX |
| **Longhorn** | Block (Kubernetes) | K8s PVC, microservices |
| **Linstor** | Block (DRBD + LVM) | Linux SDS, Kubernetes |
| **VMware vSAN** | Block (HCI) | VMware ecosystem |
| **StarWind** | Block (HCI) | Hyper-V / VMware |
### Ceph
**Architecture**:
```
RADOS (Reliable Autonomic Distributed Object Store)
├── Monitors (MON): cluster map, quorum (3/5)
├── Managers (MGR): dashboard, balancer, orchestrator
├── OSDs (Object Storage Daemons): data + replication
└── MDS (Metadata Server): CephFS only
```
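The "3/5" next to the monitors reflects majority quorum: the cluster map stays writable only while more than half of the monitors are up. A minimal sketch of that arithmetic (function names are illustrative):

```python
def quorum_size(monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can fail while a majority still remains."""
    return monitors - quorum_size(monitors)


for n in (3, 4, 5):
    print(f"{n} MONs: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Four monitors tolerate no more failures than three, which is why odd counts are the usual choice.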
**CRUSH map** (Controlled Replication Under Scalable Hashing):
- Algorithm for calculating data placement (no central index)
- Layers: Root → Datacenter → Rack → Host → OSD
- Failure domain: replication across racks / hosts
- `ceph osd crush rule create-replicated replicated_rule default host`
**Access interfaces**:
| Interface | Type | Use case |
|----------|-----|----------|
| **RBD** (RADOS Block Device) | Block | VM images, Kubernetes PVC (csi-rbd) |
| **RGW** (RADOS Gateway) | Object (S3/Swift API) | S3-compatible storage, backup |
| **CephFS** | File (POSIX) | Shared filesystem, home dirs |
| **NFS-Ganesha** | File (NFS) | NFS export over CephFS |
**Erasure coding**:
- K+M (data + parity chunks), e.g. 8+3 (8 data, 3 parity)
- More space-efficient than 3× replication (1.375× vs 3×)
- Higher CPU overhead, lower IOPS
- Recommended for cold data (RGW) instead of replication
## Enterprise storage vendors
### Hitachi VSP (Virtual Storage Platform)
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **VSP 5200/5600** | Active-active, scale-up/out, 2-12 controllers | 69.3 PB raw, 287 PBe | 33M IOPS, 39 µs | FC-NVMe 32Gb, FC 16/32Gb, FICON 16Gb, iSCSI 10Gb | Mission-critical, mainframe, enterprise consolidation |
| **VSP E590/E790/E1090** | Symmetric active-active, up to 65 nodes/130 controllers | 10.62 PB raw (E1090) | 8.4M IOPS, <41 µs | FC 32Gb, iSCSI 25Gb, FC-NVMe 32Gb | Midrange enterprise, hybrid workloads |
**Key features**: SVOS common across the entire portfolio, AI-driven data reduction with a 4:1 guarantee, Global-Active Device metro clustering, 8 nines availability (HW), 100% data availability guarantee.
---
### Huawei OceanStor Dorado
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **Dorado 8000/18000 V6** | SmartMatrix full-mesh, up to 32 controllers | 32 TB cache, 6400 SSD | 40M IOPS, 0.05 ms | FC 32/64Gb, FC-NVMe, iSCSI, NFS, SMB, NVMe/RoCE, S3 | Mission-critical, finance, govt, carrier |
| **Dorado 8000/18000 V7 (2025)** | SmartMatrix 4.0, up to 64/128 controllers | 500 PB+ | >100M IOPS, 0.03 ms | FC, RoCE, NVMe/TCP, NFS, SMB, S3 | AI workloads, converged block/file/object |
**Key features**: SmartMatrix tolerates the failure of 7 out of 8 controllers, FlashEver (three generations of online HW upgrades within 10 years), RAID-TP (survives a triple SSD failure), DPU-based SmartNIC, ML-based I/O prefetch, 100% ransomware detection (Tolly), #1 SPC-1 benchmark.
---
### Dell PowerStore & PowerMax
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **PowerStore 1500/5500/9500 (Gen 3)** | Active-active dual-node, PCIe Gen5, DDR5, RDMA 200GbE | 1.2 PB raw, 5.8 PBe | 3× IOPS vs Gen2 | FC 32/64Gb, iSCSI, NVMe/FC, NVMe/TCP, NFSv4, SMB3 | Midrange-to-high-end, VMware, containerized |
| **PowerMax 2500/8500** | Scale-out NVMe, Dynamic Fabric, up to 16 nodes | 8.8 PBe (2500), 18 PBe (8500) | 6 nines availability | FC 64Gb, FICON, NVMe/FC, NVMe/TCP, iSCSI, NFS, SMB | Mission-critical, mainframe, OLTP, cyber vault |
**Key features**: PowerStore 6:1 DRR guarantee, unified block/file/vVols out of the box, Cyber Detect AI anomaly detection; PowerMax 5:1 DRR, up to 65M Secure Snapshots, SRDF/Metro, Flexible RAID up to 92% efficient, FIPS 140-3.
---
### HPE Alletra
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **Alletra 5000** | Active-active hybrid flash, dual controller | 1.2 PB raw | 99.9999% guarantee | FC, iSCSI | Mixed primary + secondary, cost-efficient hybrid |
| **Alletra 6000** | Active-active all-NVMe, dual controller | ~368 TB usable | <100 µs | FC, iSCSI | Business-critical DB, VDI, VMware |
| **Alletra 9000** | Active-active all-NVMe, multi-node scale-out | 24 PB+ usable | ~2-3M IOPS, <150 µs | FC, iSCSI, NVMe/FC | Mission-critical ERP, AI, consolidation |
| **Alletra Storage MP** | Disaggregated modular, block + file + object | 5.8 PB block, 11.8 PB object | 100% availability guarantee | FC, iSCSI, NVMe/FC, NFS, SMB, S3 | Multi-protocol consolidation, AI/analytics |
**Key features**: Triple Parity RAID (5000), InfoSight AI Ops, HPE GreenLake as-a-service, non-disruptive controller upgrades (MP), 100% data availability guarantee.
---
### Infinidat
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **InfiniBox SSA G4** | Triple-active controller, AMD EPYC PCIe 5.0, DDR5 | 1.97 PB usable / 5.9 PBe | 2.24M IOPS, 35 µs | FC 32Gb, 25/100GbE, NVMe-oF/TCP, iSCSI, NFS, SMB, S3 | Mission-critical Oracle/SQL, multi-site DR |
| **InfiniBox G4 Hybrid** | Triple-active hybrid (HDD + flash cache) | 10.9 PB raw / 32.8 PBe | 2.24M IOPS, 64 GB/s | FC, Ethernet, NVMe-oF, iSCSI, NFS, SMB, S3 | Backup, massive unstructured data |
**Key features**: Only 3-way active architecture on the market, Neural Cache (ML-driven), InfiniRAID, immutable snapshots, 100% availability + 1-min snapshot recovery guarantee, everything included in the base price (no extra licensing).
---
### Pure Storage FlashArray
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **FlashArray//X (X20-X90 R5)** | Active-active, NVMe DirectFlash | 1.2 PB raw / 4.4 PBe | 250 µs, 5:1 DRR | FC, NVMe/FC, NVMe/RoCE, NVMe/TCP, iSCSI, NFS, SMB | Mission-critical DB, VMware, enterprise |
| **FlashArray//C (C50-C90 R5)** | Active-active, QLC DirectFlash | 4.2 PB raw / 16.3 PBe | 5:1 DRR | FC, NVMe-oF, iSCSI, NFS, SMB | Capacity-optimized, backup, file |
| **FlashArray//XL (XL190)** | Active-active, 40 DirectFlash modules | 1.9 PB raw / 9.4 PBe | >4M IOPS, <100 µs, 45 GB/s | FC 64Gb, 100GbE RoCE, NVMe/FC, NVMe/TCP, NFS, SMB | Largest DB consolidation, OLTP |
**Key features**: DirectFlash (no FTL layer), 99.9999% availability, Evergreen (no forklift upgrades), Purity OS unified across the entire portfolio, ActiveCluster/ActiveDR, Pure1 AIOps.
---
### Lenovo ThinkSystem
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|-------|-------------|--------------|----------------|-----------|----------|
| **DM Series** (DM3200F/5200F/7200F) | Active-active, all-NVMe, NetApp ONTAP | 1.8 PB raw / 6.8 PBe | Up to 120 NVMe SSD | FC 64Gb, iSCSI, NVMe/FC, NFS, SMB, S3 | Unified block/file, AI/ML, VMware |
| **DG Series** (DG5200/7200) | Active-active, all-QLC, ONTAP | 7.4 PB raw / 27 PBe | QLC economics | FC, NVMe/FC, NVMe/TCP, iSCSI, NFS, SMB, S3 | Capacity-optimized, backup, archive |
| **DE Series** (DE4000F-DE6600F) | Active-active, SAS/NVMe hybrid | 1.84 PB raw | 2M IOPS, <100 µs, 44 GB/s | FC 32Gb, iSCSI 25Gb, NVMe/FC, SAS, NVMe/RoCE | HPC, analytics, video surveillance |
**Key features**: DM/DG use ONTAP (SnapMirror, SnapVault, FabricPool, RAID-DP/RAID-TEC); cluster scale-out up to 12 HA pairs; DE series has the best price/performance in the portfolio.
---
### Synology
| Model | Architecture | Max capacity | Protocols | Use case |
|-------|-------------|--------------|-----------|----------|
| **UC3200/UC3400** | Active-active dual-controller, SAS backend | 576 TB raw | iSCSI, FC 16Gb, 10/25GbE | SMB/midmarket SAN, VMware, HA |
| **DS/RS Series** (RS3626xs+, RS6426xs+) | Single-controller / HA pair, Btrfs | 864 TB raw, 1 PB volume | SMB, NFS, iSCSI, FC (HBA) | SME all-in-one NAS/SAN, backup, surveillance |
**Key features**: DSM UC for SAN, Synology HA, Snapshot Replication (16K snapshots), VMware VAAI/ODX/ALUA, Surveillance Station, low TCO.
---
### Vendor comparison: overview
| Vendor | Flagship | Max IOPS | Max capacity | Latency | Availability guarantee | Main differentiator |
|--------|----------|----------|-------------|---------|---------------------|----------------------|
| **Hitachi** | VSP 5600 | 33M | 287 PBe | 39 µs | 8 nines (HW) | Mainframe + open; 65-node cluster |
| **Huawei** | Dorado 18000 V7 | >100M | 500 PB+ | 0.03 ms | 99.99999% | SmartMatrix; #1 SPC-1 |
| **Dell** | PowerMax 8500 | — | 18 PBe | — | 6 nines | SRDF/Metro; mainframe |
| **HPE** | Alletra 9000/MP | ~3M | 11.8 PBe | <150 µs | 100% data guarantee | InfoSight AIOps; GreenLake |
| **Infinidat** | InfiniBox SSA G4 | 2.24M | 32.8 PBe | 35 µs | 100% availability | 3-way active; Neural Cache |
| **Pure** | FlashArray//XL | >4M | 16.3 PBe | <100 µs | 99.9999% | DirectFlash; Evergreen |
| **Lenovo** | DM7200F | — | 27 PBe | — | — | ONTAP ecosystem; broad portfolio |
| **Synology** | UC3400 | 690K | 576 TB | — | — | Lowest price for active-active SAN |
---
### Storage selection by use case
| Use case | Recommendation | Rationale |
|----------|-----------|-------------|
| **Mainframe + open hybrid** | Hitachi VSP / Dell PowerMax | Only ones with FICON + FC simultaneously |
| **AI/ML training** | Huawei Dorado V7 / Pure //XL | Highest IOPS, lowest latency |
| **Enterprise DB (Oracle, SQL Server)** | Infinidat / Pure //X | Low latency, consistent performance |
| **Virtualization (VMware, Hyper-V)** | Dell PowerStore / HPE Alletra 6000 | VAAI, vVols, InfoSight |
| **SMB / SME** | Synology / Lenovo DE | Low TCO, simple management |
| **Object storage / backup** | Pure //C / Lenovo DG / Infinidat Hybrid | QLC economics, high capacity |
| **Multi-protocol consolidation** | HPE Alletra MP / Huawei Dorado | Block + file + object in one platform |
## Decision diagram — storage platform selection
```mermaid
flowchart TD
Start(["Storage requirement"]) --> PROTO{"Access type"}
PROTO -->|"Block (SAN)"| BLOCK
PROTO -->|"File (NAS)"| FILE
PROTO -->|"Object"| OBJECT
BLOCK --> BPERF{"Performance tier"}
BPERF -->|"Tier 0/1<br/>< 100 µs, > 1M IOPS"| BT1["Infinidat / Pure //XL<br/>Huawei Dorado V7<br/>FC-NVMe, NVMe-oF"]
BPERF -->|"Tier 2<br/>100-500 µs"| BT2["Dell PowerStore / HPE Alletra 6000<br/>Hitachi VSP / Lenovo DM<br/>FC 32G, iSCSI 25GbE"]
BPERF -->|"Tier 3<br/>SME / low-cost"| BT3["Synology UC3400<br/>Lenovo DE / Dell PowerVault<br/>iSCSI, SAS"]
BLOCK --> BECOS{"Ecosystem"}
BECOS -->|"Mainframe"| BMF["Hitachi VSP / Dell PowerMax<br/>FICON + FC simultaneously"]
BECOS -->|"VMware"| BVM["Dell PowerStore / HPE Alletra<br/>VAAI, vVols, InfoSight"]
BECOS -->|"Oracle / SQL Server"| BDB["Infinidat / Pure //X<br/>Lowest latency"]
FILE --> FSIZE{"Scale"}
FSIZE -->|"Enterprise"| FE["HPE Alletra MP (file)<br/>Lenovo DM / Dell PowerScale<br/>NFS, SMB, multi-protocol"]
FSIZE -->|"SMB"| FS["Synology DS/RS<br/>Lenovo DE / TrueNAS<br/>Btrfs, NFS, SMB, low TCO"]
OBJECT --> OUSE{"Use case"}
OUSE -->|"Backup / archive"| OB["Pure //C / Infinidat Hybrid<br/>Lenovo DG<br/>QLC, erasure coding, low cost/TB"]
OUSE -->|"AI/ML data lake"| OM["MinIO / Pure //C<br/>High throughput S3<br/>NVMe direct, erasure coding"]
OUSE -->|"Kubernetes PVC"| OK["Ceph RBD / Longhorn / Linstor<br/>SDS na K8s<br/>CSI, replication, snapshots"]
```
## OpenStack Storage
OpenStack offers three main storage services:
| Service | Type | Description |
|--------|-----|-------|
| **Cinder** | Block storage | Persistent volumes for instances (iSCSI, NFS, Ceph RBD) |
| **Swift** | Object storage | RESTful object store (S3-compatible via middleware) |
| **Manila** | File storage | Shared file systems (NFS, CIFS) as a managed service |
### Cinder (Block Storage)
- Multi-backend support: LVM, Ceph RBD, NFS, iSCSI, Fibre Channel
- Snapshots, cloning, encryption at rest
- Cinder scheduler for distributing volumes across backends
- QoS specs for limiting IOPS/bandwidth
### Swift (Object Storage)
- Alternative to S3 for on-prem object storage
- Ring-based data distribution (consistent hashing)
- Multi-region replication (syncopy)
- Stateless REST API (no single point of failure)
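The "ring-based data distribution" bullet can be illustrated with a generic consistent-hashing ring. This is a minimal sketch of the general technique, not Swift's actual ring implementation (which, as far as is documented, works with a precomputed partition table, zones and device weights). The class name `HashRing`, the `vnodes` parameter and the node names are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hashing ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def lookup(self, obj_name: str, replicas: int = 1):
        """Walk clockwise from the object's hash and collect distinct nodes."""
        start = bisect.bisect(self._keys, self._hash(obj_name))
        found = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == replicas:
                    break
        return found


ring = HashRing(["storage-1", "storage-2", "storage-3"])
print(ring.lookup("container/object.jpg", replicas=3))
```

The property that matters for a storage cluster is that adding a node only moves the objects that now map to the new node; everything else stays where it was.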
### Manila (Shared File Systems)
- Managed NFS/CIFS shares for sharing between instances
- Backends: NetApp, Dell EMC, CephFS, GlusterFS
- Access rules (IP-based, cert-based, user-based)
- Use case: HPC cluster home directories, NAS for legacy apps
### Container storage (OpenStack + Ceph)
Ceph is the most common storage backend for OpenStack: Cinder (RBD), Swift-compatible API (RGW), Manila (CephFS), Glance (RBD images).
## Sources
References, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| Storage Systems | Ganger, Gibson | 978-1680837540 | A textbook covering the design, implementation, and operation of storage systems, from the characteristics of individual devices through OS, databases, and networking to distribution across servers and large-scale systems. An essential resource for storage infrastructure architects. |
*Last revision: 2026-06-03*

105
VECTOR-DBS.en.md Normal file

@@ -0,0 +1,105 @@
# 🧠 Vector Databases
## Overview
Specialized databases for storing and searching **embeddings** — vector representations of unstructured data (text, images, audio, video). They enable **semantic search** based on similarity, not exact matching. A key building block for RAG (Retrieval-Augmented Generation) and AI applications.
## Embeddings
- Map unstructured data into a vector space (list of numbers)
- Proximity in vector space = semantic similarity
- Generated by models: Word2Vec, BERT, OpenAI embeddings, E5, Cohere, Mistral
- Dimensions: 384 (all-MiniLM) to 3072 (OpenAI text-embedding-3-large)
## Vector indexing
| Method | Algorithm | Description | Accuracy | Speed |
|--------|-----------|-------------|----------|-------|
| **Flat (brute-force)** | Full scan | Comparison with all vectors | 100% | O(N) — slow for > 100K |
| **IVF** (Inverted File) | K-means clustering | Partition into clusters, search nearest cluster | ~95-99% | O(sqrt(N)) |
| **HNSW** (Hierarchical Navigable Small World) | Navigable graph | Multi-level graph, greedy search | ~99-100% | O(log N) |
| **IVF-PQ** | IVF + Product Quantization | Vector compression, less memory | ~90-95% | O(sqrt(N)) |
| **DiskANN** | SSD-based graph | Vectors on disk, Vamana graph | ~95-98% | O(log N) + I/O |
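To make the Flat (brute-force) row concrete, here is a minimal sketch that scores a query against every stored vector with cosine similarity and returns the top-k ids. It uses only the standard library; the three-dimensional toy vectors and their labels are invented for illustration, and a real system would use a vectorized library and much higher dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def flat_search(query, vectors, k=2):
    """Flat index: compare the query with all vectors, O(N) per query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy "embeddings" (hypothetical values)
toy = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}
print(flat_search([1.0, 0.0, 0.0], toy))  # ['cat', 'kitten']
```

The approximate indexes in the table (IVF, HNSW, DiskANN) trade some of this exactness for not having to touch every vector.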
### Index selection
| Number of vectors | Requirement | Recommended index |
|------------------|-------------|------------------|
| < 100K | 100% accuracy | Flat |
| 100K - 10M | High accuracy, speed | HNSW |
| 10M+ | Memory efficiency | IVF-PQ, DiskANN |
| 100M+ | Scaling on SSD | DiskANN |
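The selection table can be read as a simple threshold rule. The sketch below encodes exactly the thresholds from the table; the function name is hypothetical and the boundaries are rules of thumb rather than hard limits.

```python
def recommend_index(num_vectors: int) -> str:
    """Map a collection size to the index family suggested in the table above."""
    if num_vectors < 100_000:
        return "Flat"
    if num_vectors < 10_000_000:
        return "HNSW"
    if num_vectors < 100_000_000:
        return "IVF-PQ or DiskANN"
    return "DiskANN"


for n in (50_000, 2_000_000, 50_000_000, 500_000_000):
    print(f"{n:>11,} vectors: {recommend_index(n)}")
```

In practice memory budget and recall targets shift these boundaries, so the output is a starting point for benchmarking.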
## Use case: RAG (Retrieval-Augmented Generation)
```text
User query → Embedding model → Vector DB search → Relevant chunks → LLM → Answer
```
Variants:
- **Naive RAG** — single retrieval + single generation
- **Advanced RAG** — pre-retrieval (query rewriting, HyDE) + post-retrieval (reranking, filtering)
- **Multi-modal RAG** — text + images + audio in one pipeline
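The Naive RAG flow (embed the query, retrieve chunks, hand them to the LLM) can be sketched end to end with toy components. Everything here is a stand-in: `embed` is a bag-of-words counter instead of a real embedding model, retrieval is brute force instead of a vector DB, and the LLM call is left out, so the sketch stops at the assembled prompt.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=1):
    """Single retrieval step: rank all chunks by similarity to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: similarity(q, embed(c)), reverse=True)[:k]


def build_prompt(query, chunks, k=1):
    """Single generation step input: retrieved context plus the question."""
    context = "\n".join(retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "ceph stores data on osd daemons",
    "proxmox uses corosync for cluster quorum",
]
print(build_prompt("what does ceph store", docs))
```

Advanced RAG adds steps before `retrieve` (query rewriting) and after it (reranking), but the overall shape stays the same.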
## Tools — comparison
| Tool | Type | Indexes | Cloud | Self-hosted | Note |
|------|------|---------|-------|-------------|------|
| **Pinecone** | Managed | HNSW, IVF-PQ | Yes | No | Fully managed, no ops. Pricing by dimension and vector count |
| **Weaviate** | Open source | HNSW, Flat | Yes (WCD) | Yes | Graph + vector, hybrid queries, modular (generative search) |
| **Qdrant** | Open source | HNSW + scalar/product/binary quantization | Yes (Cloud) | Yes | Rust, batch API, filtering applied during vector search |
| **Milvus** | Open source | IVF, HNSW, IVF-PQ, DiskANN | Yes (Zilliz) | Yes | GPU acceleration. More complex ops (K8s recommended for distributed deployments) |
| **pgvector** | PostgreSQL extension | IVFFlat, HNSW | Yes (managed PostgreSQL, e.g. RDS) | Yes | Embeddings directly in PostgreSQL. Hybrid SQL + vectors |
| **Chroma** | Open source | HNSW | No | Yes | Simple embedding + retrieval, Python-native |
| **LanceDB** | Open source | IVF-PQ | No | Yes | Multi-modal data, Arrow format, no server (embedded) |
| **Elasticsearch** | Search engine | HNSW (8.0+) | Yes (Cloud) | Yes | If you already run ES, you can use it for vectors too |
### pgvector vs standalone vector DB
| Feature | pgvector | Standalone (Pinecone, Qdrant, Milvus) |
|---------|----------|---------------------------------------|
| **Architecture** | Extension in PostgreSQL | Standalone service |
| **Hybrid queries** | Native SQL + vectors | Requires coordination of two systems |
| **Latency** | Higher (disk-based PG) | Lower (in-memory indexes) |
| **Scaling** | PG replication / Citus | Native sharding, rebalancing |
| **Consistency** | PG ACID transactions | Eventual consistency |
| **Operations** | One system | Two systems (operational overhead) |
## Recommendations — Tool selection
| Scenario | Recommendation | Rationale |
|----------|---------------|-----------|
| **RAG on PostgreSQL data** | pgvector | Hybrid SQL + vectors in one DB |
| **RAG production, no ops** | Pinecone | Fully managed, scalable, no operations |
| **Self-hosted RAG** | Qdrant (simpler) / Milvus (performance) | Open source, data control |
| **Full-text + vectors** | Elasticsearch / Weaviate | Combination of BM25 + vector score |
| **Research / prototyping** | Chroma | Python-native, quick start |
| **Embedded / edge** | LanceDB | No server, Arrow format |
| **Multi-modal data** | Weaviate / LanceDB | Native image, audio, video support |
| **GPU acceleration** | Milvus | CUDA support for index build |
## When to (not) use a vector DB
**Use** when:
- You need semantic search (similarity by meaning, not keywords)
- You are building a RAG / AI assistant over your own data
- You need document or image deduplication (near-duplicate detection)
- You are building a recommendation system (similar content, similar users)
**Do not use** when:
- You need exact matching (keys, IDs, foreign keys) → SQL
- Full-text search suffices (BM25, stemming) → Elasticsearch, PostgreSQL full-text
- Vectors are just a complement to the primary DB → pgvector (simplicity)
- Fewer than 1000 documents → brute-force in application is sufficient
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Book | Authors | Description |
|------|---------|-------------|
| Vector Databases | Borwankar (2026) | Comprehensive guide to vector DBs from concepts to production deployment |
*Last revision: 2026-06-03*

105
VEKTOROVE-DB.md Normal file

@@ -0,0 +1,105 @@
# 🧠 Vector Databases
## Overview
Specialized databases for storing and searching **embeddings** — vector representations of unstructured data (text, images, audio, video). They enable **semantic search** based on similarity, not exact matching. A key building block for RAG (Retrieval-Augmented Generation) and AI applications.
## Embeddings
- Map unstructured data into a vector space (list of numbers)
- Proximity in vector space = semantic similarity
- Generated by models: Word2Vec, BERT, OpenAI embeddings, E5, Cohere, Mistral
- Dimensions: 384 (all-MiniLM) to 3072 (OpenAI text-embedding-3-large)
## Vector indexing
| Method | Algorithm | Description | Accuracy | Speed |
|--------|-----------|-------|----------|----------|
| **Flat (brute-force)** | Full scan | Comparison with all vectors | 100% | O(N) — slow for > 100K |
| **IVF** (Inverted File) | K-means clustering | Partition into clusters, search the nearest cluster | ~95-99% | O(sqrt(N)) |
| **HNSW** (Hierarchical Navigable Small World) | Navigable graph | Multi-level graph, greedy search | ~99-100% | O(log N) |
| **IVF-PQ** | IVF + Product Quantization | Vector compression, less memory | ~90-95% | O(sqrt(N)) |
| **DiskANN** | SSD-based graph | Vectors on disk, Vamana graph | ~95-98% | O(log N) + I/O |
### Index selection
| Number of vectors | Requirement | Recommended index |
|--------------|-----------|-----------------|
| < 100K | 100% accuracy | Flat |
| 100K - 10M | High accuracy, speed | HNSW |
| 10M+ | Memory efficiency | IVF-PQ, DiskANN |
| 100M+ | Scaling on SSD | DiskANN |
## Use case: RAG (Retrieval-Augmented Generation)
```text
User query → Embedding model → Vector DB search → Relevant chunks → LLM → Answer
```
Variants:
- **Naive RAG** — single retrieval + single generation
- **Advanced RAG** — pre-retrieval (query rewriting, HyDE) + post-retrieval (reranking, filtering)
- **Multi-modal RAG** — text + images + audio in one pipeline
## Tools — comparison
| Tool | Type | Indexes | Cloud | Self-hosted | Note |
|---------|-----|--------|-------|-------------|----------|
| **Pinecone** | Managed | HNSW, IVF-PQ | Yes | No | Fully managed, no ops. Pricing by dimension and vector count |
| **Weaviate** | Open source | HNSW, Flat | Yes (WCD) | Yes | Graph + vector, hybrid queries, modular (generative search) |
| **Qdrant** | Open source | HNSW + scalar/product/binary quantization | Yes (Cloud) | Yes | Rust, batch API, filtering applied during vector search |
| **Milvus** | Open source | IVF, HNSW, IVF-PQ, DiskANN | Yes (Zilliz) | Yes | GPU acceleration. More complex ops (K8s recommended for distributed deployments) |
| **pgvector** | PostgreSQL extension | IVFFlat, HNSW | Yes (managed PostgreSQL, e.g. RDS) | Yes | Embeddings directly in PostgreSQL. Hybrid SQL + vectors |
| **Chroma** | Open source | HNSW | No | Yes | Simple embedding + retrieval, Python-native |
| **LanceDB** | Open source | IVF-PQ | No | Yes | Multi-modal data, Arrow format, no server (embedded) |
| **Elasticsearch** | Search engine | HNSW (8.0+) | Yes (Cloud) | Yes | If you already run ES, you can use it for vectors too |
### pgvector vs standalone vector DB
| Feature | pgvector | Standalone (Pinecone, Qdrant, Milvus) |
|-----------|----------|---------------------------------------|
| **Architecture** | Extension in PostgreSQL | Standalone service |
| **Hybrid queries** | Native SQL + vectors | Requires coordination of two systems |
| **Latency** | Higher (disk-based PG) | Lower (in-memory indexes) |
| **Scaling** | PG replication / Citus | Native sharding, rebalancing |
| **Consistency** | PG ACID transactions | Eventual consistency |
| **Operations** | One system | Two systems (operational overhead) |
## Recommendations — Tool selection
| Scenario | Recommendation | Rationale |
|--------|-----------|-------------|
| **RAG on PostgreSQL data** | pgvector | Hybrid SQL + vectors in one DB |
| **RAG production, no ops** | Pinecone | Fully managed, scalable, no operations |
| **Self-hosted RAG** | Qdrant (simpler) / Milvus (performance) | Open source, data control |
| **Full-text + vectors** | Elasticsearch / Weaviate | Combination of BM25 + vector score |
| **Research / prototyping** | Chroma | Python-native, quick start |
| **Embedded / edge** | LanceDB | No server, Arrow format |
| **Multi-modal data** | Weaviate / LanceDB | Native image, audio, video support |
| **GPU acceleration** | Milvus | CUDA support for index build |
## When to (not) use a vector DB
**Use** when:
- You need semantic search (similarity by meaning, not keywords)
- You are building a RAG / AI assistant over your own data
- You need document or image deduplication (near-duplicate detection)
- You are building a recommendation system (similar content, similar users)
**Do not use** when:
- You need exact matching (keys, IDs, foreign keys) → SQL
- Full-text search suffices (BM25, stemming) → Elasticsearch, PostgreSQL full-text
- Vectors are just a complement to the primary DB → pgvector (simplicity)
- Fewer than 1000 documents → brute-force in application is sufficient
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
### Recommended reading
| Book | Authors | Description |
|-------|--------|-------|
| Vector Databases | Borwankar (2026) | Comprehensive guide to vector DBs from concepts to production deployment |
*Last revision: 2026-06-03*


@@ -0,0 +1,378 @@
# Case study: Proxmox VE demo cluster (3× node, Ceph, HA)
## 1. Requirements and parameters
| Parameter | Value |
|----------|---------|
| Number of hosts | 3 |
| Purpose | demo, learning, development |
| Hypervisor | Proxmox VE (free) |
| Budget | low-cost (~$10,000-$15,000) |
| Storage | Ceph (HCI) |
| HA | yes |
| Location | 1 rack, standard office room |
---
## 2. Server configuration
Based on a combination of the **Mini variant** (2-3 hosts, single-socket) and the **pure Ceph variant** per SERVER-CONFIG.md. Each of the 3 nodes is identical.
### 2.1 Single node configuration
| Component | Specification | Rationale |
|------------|-------------|------------|
| **CPU** | 1× AMD EPYC 9224 (24C/48T, 200 W TDP) or Intel Xeon 5418Y (24C/48T) | SERVER-CONFIG.md: "Pure Ceph variant: CPU 1× EPYC 9224-9334 (12-24C)". Ceph requires 1-2 cores per OSD; with 3 OSD + Proxmox + VM, 12+ cores is the minimum. |
| **RAM** | 128 GB DDR5-4800 (4× 32 GB RDIMM, 1DPC) | SERVER-CONFIG.md: "RAM 128-256 GB" for Ceph variant. 128 GB is sufficient for demo; 4-8 GB per OSD + OS + lightweight VMs. |
| **OS disk** | 2× 240 GB SATA SSD, RAID 1 (HW RAID controller, or a software mirror such as mdadm when the controller runs in HBA mode) | "OS: 2× SATA SSD RAID 1" per Ceph variant. |
| **Ceph OSD** | 3× 960 GB SATA SSD (HBA/IT mode, no HW RAID) | "Ceph OSD: 4-8× NVMe/SATA SSD (RAW, HBA mode)". For demo we reduce to 3 OSD/node. Total 9 OSD in cluster. |
| **NIC** | 2× dual-port 10 GbE SFP+ (total 4× 10 GbE) | "Network: 2× 25 GbE public + 2× 25 GbE cluster". For low-cost we choose 10 GbE (SFP+), the concept remains the same. |
| **BMC** | 1× 1 GbE (iDRAC / iLO / IPMI) | Standard management port, CONNECTIVITY.md. |
| **Form factor** | 1U rack server (Dell R660, HPE DL360 Gen11, or Supermicro) | 19" rack, suitable for 1U. |
### 2.2 CPU choice rationale
KB states for the Mini variant "1× EPYC 4124 (4C) or Xeon E-2400". However, 4 cores are insufficient for Ceph (OSD + Proxmox + VM). Therefore we choose EPYC 9224 (24C) / Xeon 5418Y (24C), which corresponds to the Ceph variant in SERVER-CONFIG.md. The price is higher, but the cluster is usable for real-world testing.
---
## 3. Storage variant — Ceph
### 3.1 Topology
```
3× Proxmox node ─── each 3× OSD (SATA SSD)
         Ceph cluster
     ┌─────────┼─────────┐
  3× MON    3× MGR    9× OSD
```
### 3.2 Ceph configuration
| Parameter | Value | Note |
|----------|---------|---------|
| Replication | 3 (size = 3, min_size = 2) | Standard per STORAGE.md |
| Failure domain | host | CRUSH: replication across nodes |
| Raw capacity | 9 × 960 GB ≈ 8.6 TB | |
| Usable capacity | ~2.9 TB (8.6 / 3) | Sufficient for demo |
| OSD backend | BlueStore | Default in Ceph, recommended |
| MON quorum | 3 (1 per node) | Minimum for HA |
| Cache | RAM (BlueStore cache) | 1-2 GB per OSD |
| Network public | 2× 10 GbE LACP | VM traffic + Ceph frontend |
| Network cluster | 2× 10 GbE LACP | Ceph backend replication |
| MTU | 9000 (jumbo frames) | Recommended per NETWORKING.md |
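The raw and usable capacity figures follow directly from the OSD count and the replication factor. The sketch below reproduces that arithmetic; the third value applies a planning headroom of 85%, which is an assumption added here (it mirrors Ceph's default near-full warning threshold) and is not part of the table. The function name is illustrative.

```python
def ceph_capacity_tb(nodes, osds_per_node, osd_size_gb, replica_size, fill_limit=0.85):
    """Return (raw, usable, usable below the fill limit) in TB for a replicated pool."""
    raw_tb = nodes * osds_per_node * osd_size_gb / 1000
    usable_tb = raw_tb / replica_size
    return raw_tb, usable_tb, usable_tb * fill_limit


raw, usable, planned = ceph_capacity_tb(nodes=3, osds_per_node=3, osd_size_gb=960, replica_size=3)
print(f"raw {raw:.2f} TB, usable {usable:.2f} TB, plan for {planned:.2f} TB")
```

With 3 nodes and size = 3 every object lives on every host, so the usable capacity equals the capacity of a single node.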
### 3.3 Storage layout on disk
```
/dev/sda 240 GB OS (RAID 1, mirror with /dev/sdb)
/dev/sdc 960 GB OSD.0 (RAW, BlueStore)
/dev/sdd 960 GB OSD.1 (RAW, BlueStore)
/dev/sde 960 GB OSD.2 (RAW, BlueStore)
```
### 3.4 Ceph pool design
| Pool | PG count | Replication | Purpose |
|------|----------|-----------|-------|
| vms | 128 | 3× | VM disks (RBD) |
| data | 64 | 3× | Data volume |
| backups | 32 | 3× | Backups (low priority) |
PG count is approximate for demo (9 OSD). Production formula: (OSD_total × 100) / replication_size.
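The production formula can be checked against the pool table with a few lines of code. The sketch computes the total PG budget and rounds it to the nearest power of two, which is a common convention; the rounding rule and the function name are assumptions of this example, and the PG autoscaler in current Ceph releases can manage this automatically.

```python
def suggested_total_pgs(osd_total, replication_size, target_per_osd=100):
    """(OSD_total * target_per_osd) / replication_size, rounded to the nearest power of two."""
    raw = osd_total * target_per_osd / replication_size
    power = 1
    while power * 2 <= raw:
        power *= 2
    # `power` is the largest power of two <= raw; pick whichever neighbour is closer
    return power if raw - power < power * 2 - raw else power * 2


total = suggested_total_pgs(osd_total=9, replication_size=3)
pools = {"vms": 128, "data": 64, "backups": 32}
print(total, sum(pools.values()))  # 256 224
```

The three pools add up to 224 PGs, which stays within the 256 suggested for 9 OSDs.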
---
## 4. Network
### 4.1 Topology
```
                 ┌─────────────────┐
                 │  10 GbE Switch  │
                 │ (24-port SFP+)  │
                 └──┬─────┬─────┬──┘
        ┌───────────┘     │     └───────────┐
        │                 │                 │
  ┌─────┴─────┐     ┌─────┴─────┐     ┌─────┴─────┐
  │  Node 1   │     │  Node 2   │     │  Node 3   │
  │  4×10GbE  │     │  4×10GbE  │     │  4×10GbE  │
  │ ┌───────┐ │     │ ┌───────┐ │     │ ┌───────┐ │
  │ │ 1GbE  │ │     │ │ 1GbE  │ │     │ │ 1GbE  │ │
  │ │ BMC   │ │     │ │ BMC   │ │     │ │ BMC   │ │
  │ └───────┘ │     │ └───────┘ │     │ └───────┘ │
  └───────────┘     └───────────┘     └───────────┘
```
### 4.2 VLAN and traffic segmentation
| VLAN | Purpose | Ports | MTU |
|------|------|-------|-----|
| VLAN 10 | Management (Proxmox web UI, SSH) | 1× 1 GbE BMC | 1500 |
| VLAN 20 | VM traffic + Ceph public | 2× 10 GbE (bond) | 9000 |
| VLAN 30 | Ceph cluster (backend) | 2× 10 GbE (bond) | 9000 |
### 4.3 Switch
| Parameter | Value |
|----------|---------|
| Model | MikroTik CRS326-24S+2Q+RM or similar L2+ switch |
| Ports | 24× SFP+ 10 GbE |
| Management | VLAN 10, IP 10.0.0.254/24 |
| Features | VLAN, LACP (LAG), Jumbo frames (MTU 9000), SNMP |
### 4.4 Cabling
| Type | Length | Quantity | Purpose |
|-----|-------|-------|-------|
| SFP+ DAC (passive) | 3 m | 12 | 10 GbE connection server ↔ switch |
| Cat6A UTP | 3 m | 3 | Management (1 GbE BMC) |
| Cat6A UTP | 1 m | 1 | Internet uplink (patch panel) |
DAC cables are cheaper than SFP+ optics + patch cords — suitable for single-rack.
---
## 5. Rack layout
### 5.1 Dimensions and positions
| U | Device | Power (W) |
|---|----------|-----------|
| U1 | Switch 10 GbE (1U) | ~60 W |
| U2 | UPS (2U) | — |
| U3 | (empty, ventilation) | — |
| U4 | Server Node 1 (1U) | ~250 W |
| U5 | Server Node 2 (1U) | ~250 W |
| U6 | Server Node 3 (1U) | ~250 W |
| U7-U15 | Empty (optional storage, patch panel) | — |

| Parameter | Value |
|----------|---------|
| Rack type | 15U wall-mount, 19", 600×600 mm |
| Total IT load | ~810 W |
| PUE estimate | ~1.5 (office room, no precision cooling) |
| Cooling | Standard office AC (ASHRAE A2: 10-35 °C). Sufficient for <1 kW. |
**Note:** KB (DATACENTERS.md) states free air cooling for low density (<5 kW/rack). Standard ventilation and AC are sufficient in an office.
### 5.2 UPS
| Parameter | Value |
|----------|---------|
| Type | VI (line-interactive) — per DATACENTERS.md for smaller racks |
| Capacity | 2000 VA / 1200 W |
| Backup time | ~15-20 min at 810 W load |
| Output | 8× C13 (for servers + switch) |
| Battery | VRLA (cheaper) or Li-ion LFP |
| Management | USB / SNMP card (automatic Proxmox shutdown) |
The UPS can optionally be upgraded to a VFI (double-conversion) model for cleaner output, but VI is sufficient for a demo.
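The load figures in sections 5.1 and 5.2 can be cross-checked with a short calculation: sum the per-device power draw, compare it with the UPS real-power rating, and scale it by the estimated PUE. The device wattages, the 1200 W rating and the PUE of 1.5 come from the tables above; the function and key names are illustrative.

```python
def power_budget(loads_w, ups_capacity_w, pue):
    """Summarize IT load, UPS utilization and estimated total facility power."""
    it_load = sum(loads_w.values())
    return {
        "it_load_w": it_load,
        "ups_utilization": it_load / ups_capacity_w,
        "facility_power_w": it_load * pue,
    }


loads = {"switch": 60, "node1": 250, "node2": 250, "node3": 250}
result = power_budget(loads, ups_capacity_w=1200, pue=1.5)
print(result)
```

At about two thirds of the UPS rating there is some headroom left, but adding a fourth node of the same class would push the load close to the limit.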
### 5.3 PDU
1× basic 1U PDU (8× C13), 230 V / 10 A — for distribution to servers.
---
## 6. Hypervisor — Proxmox VE
### 6.1 Installation and configuration
| Component | Version / Configuration |
|------------|---------------------|
| Hypervisor | Proxmox VE 8.x (Debian 12 + KVM + LXC) |
| Storage backend | Ceph Reef (18.x) / Squid (19.x), integrated in Proxmox |
| Cluster | 3-node cluster, Corosync + PMXCFS |
| HA | Proxmox HA — 1 node failure tolerance (remaining 2 take over VMs) |
| Fencing | watchdog (softdog) + Proxmox HA manager |
### 6.2 License
| Item | Price | Note |
|---------|------|----------|
| Proxmox VE | $0 | Open source, full functionality without license |
| Proxmox community support | $0 | Forum, wiki |
| Proxmox enterprise support (optional) | ~€500/host/year | Can be purchased later |
HYPERVISORS.md: Proxmox VE is "open source (free)", no license required.
### 6.3 HA setup
- HA group: all 3 nodes; a node that loses quorum is fenced by the watchdog and its HA-managed VMs are recovered on the quorate nodes
- Max VM restart: 2 attempts
- Migration: live migration via Ceph RBD (shared storage)
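The "1 node failure tolerance" of a 3-node cluster follows from majority quorum, which applies both to Corosync votes and to Ceph monitors. A minimal sketch of that arithmetic, assuming one vote per node and no external quorum device (function names are illustrative):

```python
def quorum_votes(total_nodes: int) -> int:
    """Smallest strict majority of the cluster."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """How many nodes can fail while the rest still hold a majority."""
    return total_nodes - quorum_votes(total_nodes)


for n in (2, 3, 4, 5):
    print(f"{n} nodes: quorum {quorum_votes(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

This is also why 3 is the practical minimum for HA: a 2-node cluster needs both votes and tolerates no failure, and a 4th node adds capacity without adding failure tolerance.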
---
## 7. Budget estimate
**Disclaimer:** KB does not contain specific component prices. The following amounts are approximate market estimates (Q2 2026, USD).
### 7.1 Servers (3×)
| Item | Qty | Price/unit | Total |
|---------|------|----------|--------|
| 1U rack server (basic config, without CPU/RAM/disk) | 3 | ~$1,200 | $3,600 |
| AMD EPYC 9224 (24C) / Intel Xeon 5418Y (24C) — per KB | 3 | ~$900 | $2,700 |
| RAM 128 GB (4× 32 GB DDR5-4800 RDIMM) | 3 | ~$600 | $1,800 |
| 240 GB SATA SSD (OS) | 6 | ~$50 | $300 |
| 960 GB SATA SSD (Ceph OSD) | 9 | ~$150 | $1,350 |
| Dual-port 10 GbE SFP+ NIC (e.g. Intel X710-DA2) | 6 | ~$120 | $720 |
| **Servers total** | | | **~$10,470** |
### 7.2 Network
| Item | Qty | Price/unit | Total |
|---------|------|----------|--------|
| MikroTik CRS326-24S+2Q+RM (24× 10GbE SFP+) | 1 | ~$600 | $600 |
| SFP+ DAC cable 3 m (passive) | 12 | ~$15 | $180 |
| Network total | | | **~$780** |
### 7.3 Rack and power
| Item | Qty | Price/unit | Total |
|---------|------|----------|--------|
| 15U wall-mount rack 19" | 1 | ~$300 | $300 |
| UPS 2000 VA (line-interactive, VRLA) | 1 | ~$450 | $450 |
| 1U PDU basic (8× C13) | 1 | ~$60 | $60 |
| Rack + power total | | | **~$810** |
### 7.4 Other
| Item | Price |
|---------|------|
| Cat6A patch cables, management | ~$50 |
| Mounting material, velcro | ~$30 |
| Shipping and installation | ~$200 |
| Other total | **~$280** |
### 7.5 Total calculation
| Category | Amount |
|-----------|--------|
| Servers (3× node) | ~$10,470 |
| Network (switch + cables) | ~$780 |
| Rack + power | ~$810 |
| Other | ~$280 |
| **Total** | **~$12,340** |
| Reserve (10-15%) | ~$1,200-1,800 |
| **Total with reserve** | **~$13,500-$14,100** |
Budget **$10,000-$15,000** is achievable. Using cheaper CPUs (EPYC 4124P / Xeon E-2488), it can be built for ~$8,000-9,000, but with limited performance for Ceph.
**Possible savings:**
- CPU: 2× EPYC 4124P (4C) + 1× more powerful node → ~$800 savings (but asymmetric cluster)
- OSD: 2× instead of 3× SSD/node → ~$500 savings (less capacity)
- Switch: 12-port instead of 24-port → ~$300 savings
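The totals in section 7 can be recomputed from the line items, which also makes it easy to try out the savings options. The quantities and unit prices below are copied from sections 7.1 to 7.4; the data layout and function names are only a sketch.

```python
# (quantity, unit price in USD) per line item, taken from sections 7.1-7.4
servers = [(3, 1200), (3, 900), (3, 600), (6, 50), (9, 150), (6, 120)]
network = [(1, 600), (12, 15)]
rack_power = [(1, 300), (1, 450), (1, 60)]
other = [(1, 50), (1, 30), (1, 200)]


def subtotal(items):
    """Sum of quantity * unit price for one category."""
    return sum(qty * price for qty, price in items)


def budget(categories, reserve_low=0.10, reserve_high=0.15):
    """Return (base total, total with low reserve, total with high reserve)."""
    base = sum(subtotal(items) for items in categories)
    return base, base * (1 + reserve_low), base * (1 + reserve_high)


base, low, high = budget([servers, network, rack_power, other])
print(f"base ${base:,}, with reserve ${low:,.0f} to ${high:,.0f}")
```

The unrounded range is slightly higher than the figures in the summary table because the table rounds the reserve down to $1,200 and $1,800.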
---
## 8. Topology diagram
```mermaid
flowchart TB
subgraph Rack["15U Rack (office)"]
U1["U1: 10GbE Switch (MikroTik)"]
U2["U2: UPS 2000 VA"]
U4["U4: Node 1 — Proxmox + Ceph OSD"]
U5["U5: Node 2 — Proxmox + Ceph OSD"]
U6["U6: Node 3 — Proxmox + Ceph OSD"]
end
subgraph Node1["Node 1 (detail)"]
N1_CPU["CPU: EPYC 9224 (24C)"]
N1_RAM["RAM: 128 GB DDR5"]
N1_OS["OS: 2× 240 GB SSD (RAID 1)"]
N1_OSD1["OSD.0: 960 GB SSD"]
N1_OSD2["OSD.1: 960 GB SSD"]
N1_OSD3["OSD.2: 960 GB SSD"]
N1_NIC["NIC: 4× 10GbE SFP+"]
N1_BMC["BMC: 1× 1GbE"]
end
U1 ---|"4× 10GbE LACP<br/>(public + cluster)"| U4
U1 ---|"4× 10GbE LACP"| U5
U1 ---|"4× 10GbE LACP"| U6
U4 --- N1_CPU
U4 --- N1_RAM
U4 --- N1_OS
U4 --- N1_OSD1
U4 --- N1_OSD2
U4 --- N1_OSD3
U4 --- N1_NIC
U4 --- N1_BMC
subgraph Ceph["Ceph Cluster"]
CEPH_MON["3× MON (1 per node)"]
CEPH_MGR["3× MGR (1 per node)"]
CEPH_OSD["9× OSD (3 per node)"]
end
U4 --- CEPH_MON
U5 --- CEPH_MON
U6 --- CEPH_MON
U4 --- CEPH_MGR
U5 --- CEPH_MGR
U6 --- CEPH_MGR
U4 --- CEPH_OSD
U5 --- CEPH_OSD
U6 --- CEPH_OSD
subgraph Proxmox["Proxmox VE Cluster"]
PMX_HA["HA Group (3 nodes)"]
PMX_HA --- U4
PMX_HA --- U5
PMX_HA --- U6
end
subgraph Uplink["Internet / LAN"]
UPLINK_SW["Office LAN<br/>(1 GbE)"]
end
U1 ---|"1× Cat6A<br/>1 GbE"| UPLINK_SW
U1 ---|"Internet<br/>(ISP router)"| UPLINK_SW
```
---
## 9. Summary and key decisions
| Decision | Variant | Rationale |
|------------|----------|------------|
| Hypervisor | Proxmox VE | HYPERVISORS.md: "For SME / low budget — open source, built-in Ceph, no license costs". Ideal for demo. |
| Storage | Ceph (3× replication) | STORAGE.md + SERVER-CONFIG.md: Ceph is the recommended SDS for Proxmox, 3 nodes minimum for quorum. |
| CPU | Single-socket EPYC 9224 / Xeon 5418Y | Compromise between price (Mini variant ~1 socket) and performance for Ceph (Ceph variant ~12+ cores). |
| Network | 10 GbE SFP+ (instead of 25 GbE) | KB recommends 25 GbE, but for low-cost demo 10 GbE is sufficient. The concept (public/cluster network separation) remains the same. |
| Rack | 15U wall-mount | Suitable for office, no raised floor, no precision cooling. |
| UPS | 2000 VA line-interactive | DATACENTERS.md: VI type for smaller racks. Sufficient for demo. |
| License | Proxmox VE (free) | No license costs, support can be purchased later. |
### Compromises compared to production deployment
- **25 GbE → 10 GbE**: lower Ceph cluster network throughput (not an issue in demo environment)
- **HDD → SSD**: for Ceph OSD we choose SSD instead of HDD (higher price, better performance — demo focuses on functionality, not capacity)
- **2× 10 GbE public + 2× 10 GbE cluster → combined on LACP**: can be merged when ports are scarce, but separation is better
- **Cooling**: office AC, not DC-grade precision cooling (PUE ~1.5-1.8)
### What KB does not address (supplemented from practice)
KB does not contain specific component prices — the budget is an approximate market estimate. It also does not specify a concrete switch model with L2+ features (VLAN, LACP, Jumbo frames). Here we follow common practice for the SOHO/SME segment.
---
## 10. References from KB
- **DATACENTERS.md** — rack layout, power chain, UPS types, cooling classes (ASHRAE), cabling standards
- **HYPERVISORS.md** — Proxmox VE as open source variant, platform comparison, Mini variant (2-3 hosts), Ceph connectivity
- **SERVER-CONFIG.md** — Pure Ceph variant (3-6 hosts), HW specification, network design, BIOS settings
- **STORAGE.md** — Ceph architecture (MON/MGR/OSD, CRUSH map, BlueStore, replication), SDS overview
- **CONNECTIVITY.md** — Ethernet speeds (10/25 GbE), SFP+ form factor, NIC placement, management port
- **NETWORKING.md** — VLAN segmentation, MTU and jumbo frames, best practices
- **SERVER-HW.md** — CPU selection (EPYC vs Xeon), RAM population (1DPC/2DPC), NUMA, form factors
---
*Last revision: 2026-06-04*


@@ -0,0 +1,378 @@
# Case study: Proxmox VE demo cluster (3× node, Ceph, HA)
## 1. Requirements and parameters
| Parameter | Value |
|----------|---------|
| Number of hosts | 3 |
| Purpose | demo, learning, development |
| Hypervisor | Proxmox VE (free) |
| Budget | low-cost (~$10,000-$15,000) |
| Storage | Ceph (HCI) |
| HA | yes |
| Location | 1 rack, standard office room |
---
## 2. Server configuration
Based on a combination of the **Mini variant** (2-3 hosts, single-socket) and the **pure Ceph variant** per SERVER-CONFIG.md. Each of the 3 nodes is identical.
### 2.1 Single node configuration
| Component | Specification | Rationale |
|------------|-------------|------------|
| **CPU** | 1× AMD EPYC 9224 (24C/48T, 200 W TDP) or Intel Xeon 5418Y (24C/48T) | SERVER-CONFIG.md: "Pure Ceph variant: CPU 1× EPYC 9224-9334 (12-24C)". Ceph requires 1-2 cores per OSD; with 3 OSD + Proxmox + VM, 12+ cores is the minimum. |
| **RAM** | 128 GB DDR5-4800 (4× 32 GB RDIMM, 1DPC) | SERVER-CONFIG.md: "RAM 128-256 GB" for Ceph variant. 128 GB is sufficient for demo; 4-8 GB per OSD + OS + lightweight VMs. |
| **OS disk** | 2× 240 GB SATA SSD, RAID 1 (HW RAID controller, or a software mirror such as mdadm when the controller runs in HBA mode) | "OS: 2× SATA SSD RAID 1" per Ceph variant. |
| **Ceph OSD** | 3× 960 GB SATA SSD (HBA/IT mode, no HW RAID) | "Ceph OSD: 4-8× NVMe/SATA SSD (RAW, HBA mode)". For demo we reduce to 3 OSD/node. Total 9 OSD in cluster. |
| **NIC** | 2× dual-port 10 GbE SFP+ (total 4× 10 GbE) | "Network: 2× 25 GbE public + 2× 25 GbE cluster". For low-cost we choose 10 GbE (SFP+), the concept remains the same. |
| **BMC** | 1× 1 GbE (iDRAC / iLO / IPMI) | Standard management port, CONNECTIVITY.md. |
| **Form factor** | 1U rack server (Dell R660, HPE DL360 Gen11, or Supermicro) | 19" rack, suitable for 1U. |
### 2.2 CPU choice rationale
KB states for the Mini variant "1× EPYC 4124 (4C) or Xeon E-2400". However, 4 cores are insufficient for Ceph (OSD + Proxmox + VM). Therefore we choose EPYC 9224 (24C) / Xeon 5418Y (24C), which corresponds to the Ceph variant in SERVER-CONFIG.md. The price is higher, but the cluster is usable for real-world testing.
---
## 3. Storage variant — Ceph
### 3.1 Topology
```
3× Proxmox node ─── each 3× OSD (SATA SSD)
         Ceph cluster
     ┌─────────┼─────────┐
  3× MON    3× MGR    9× OSD
```
### 3.2 Ceph configuration
| Parameter | Value | Note |
|----------|---------|----------|
| Replication | 3 (size = 3, min_size = 2) | Standard per STORAGE.md |
| Failure domain | host | CRUSH: replication across nodes |
| Raw capacity | 9 × 960 GB ≈ 8.6 TB | |
| Usable capacity | ~2.9 TB (8.6 / 3) | Sufficient for demo |
| OSD backend | BlueStore | Default in Ceph, recommended |
| MON quorum | 3 (1 per node) | Minimum for HA |
| Cache | RAM (BlueStore cache) | 1-2 GB per OSD |
| Network public | 2× 10 GbE LACP | VM traffic + Ceph frontend |
| Network cluster | 2× 10 GbE LACP | Ceph backend replication |
| MTU | 9000 (jumbo frames) | Recommended per NETWORKING.md |
### 3.3 Storage layout on disk
```
/dev/sda 240 GB OS (RAID 1, mirror with /dev/sdb)
/dev/sdc 960 GB OSD.0 (RAW, BlueStore)
/dev/sdd 960 GB OSD.1 (RAW, BlueStore)
/dev/sde 960 GB OSD.2 (RAW, BlueStore)
```
### 3.4 Ceph pool design
| Pool | PG count | Replication | Purpose |
|------|----------|-----------|-------|
| vms | 128 | 3× | VM disks (RBD) |
| data | 64 | 3× | Data volume |
| backups | 32 | 3× | Backups (low priority) |
PG count is approximate for demo (9 OSD). Production formula: (OSD_total × 100) / replication_size.
---
## 4. Network
### 4.1 Topology
```
                 ┌─────────────────┐
                 │  10 GbE Switch  │
                 │ (24-port SFP+)  │
                 └──┬─────┬─────┬──┘
        ┌───────────┘     │     └───────────┐
        │                 │                 │
  ┌─────┴─────┐     ┌─────┴─────┐     ┌─────┴─────┐
  │  Node 1   │     │  Node 2   │     │  Node 3   │
  │  4×10GbE  │     │  4×10GbE  │     │  4×10GbE  │
  │ ┌───────┐ │     │ ┌───────┐ │     │ ┌───────┐ │
  │ │ 1GbE  │ │     │ │ 1GbE  │ │     │ │ 1GbE  │ │
  │ │ BMC   │ │     │ │ BMC   │ │     │ │ BMC   │ │
  │ └───────┘ │     │ └───────┘ │     │ └───────┘ │
  └───────────┘     └───────────┘     └───────────┘
```
### 4.2 VLAN and traffic segmentation
| VLAN | Purpose | Ports | MTU |
|------|------|-------|-----|
| VLAN 10 | Management (Proxmox web UI, SSH) | 1× 1 GbE BMC | 1500 |
| VLAN 20 | VM traffic + Ceph public | 2× 10 GbE (bond) | 9000 |
| VLAN 30 | Ceph cluster (backend) | 2× 10 GbE (bond) | 9000 |
### 4.3 Switch
| Parameter | Value |
|----------|---------|
| Model | MikroTik CRS326-24S+2Q+RM or similar L2+ switch |
| Ports | 24× SFP+ 10 GbE |
| Management | VLAN 10, IP 10.0.0.254/24 |
| Features | VLAN, LACP (LAG), Jumbo frames (MTU 9000), SNMP |
### 4.4 Cabling
| Type | Length | Quantity | Purpose |
|-----|-------|-------|-------|
| SFP+ DAC (passive) | 3 m | 12 | 10 GbE connection server ↔ switch |
| Cat6A UTP | 3 m | 3 | Management (1 GbE BMC) |
| Cat6A UTP | 1 m | 1 | Internet uplink (patch panel) |
DAC cables are cheaper than SFP+ optics + patch cords, which makes them suitable for a single rack.
---
## 5. Rack layout
### 5.1 Dimensions and positions
| U | Device | Power |
|---|--------|-------|
| U1 | 10 GbE switch (1U) | ~60 W |
| U2 | UPS (2U) | - |
| U3 | (free, ventilation) | - |
| U4 | Server Node 1 (1U) | ~250 W |
| U5 | Server Node 2 (1U) | ~250 W |
| U6 | Server Node 3 (1U) | ~250 W |
| U7-U15 | Free (e.g. storage, patch panel) | - |

| Parameter | Value |
|-----------|-------|
| Rack type | 15U wall-mount, 19", 600×600 mm |
| Total IT load | ~810 W |
| PUE estimate | ~1.5 (office room, no precision cooling) |
| Cooling | Standard office air conditioning (ASHRAE A2: 10-35 °C). Sufficient below 1 kW. |

**Note:** The KB (DATACENTERS.md) lists free air cooling for low density (<5 kW/rack). In an office, standard ventilation and AC are sufficient.
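
The load and PUE figures in this section can be cross-checked with a few lines of arithmetic. The per-device wattages and the PUE of 1.5 are the estimates from the rack table; the BTU conversion factor is a standard constant added here for sizing the office AC.

```python
# Figures from the rack table; PUE is the case study's own estimate.
RACK_LOADS_W = {"switch": 60, "node1": 250, "node2": 250, "node3": 250}
PUE_ESTIMATE = 1.5
BTU_PER_HOUR_PER_WATT = 3.412  # 1 W of load is about 3.412 BTU/h of heat

it_load_w = sum(RACK_LOADS_W.values())
facility_load_w = it_load_w * PUE_ESTIMATE   # PUE = total facility power / IT power
overhead_w = facility_load_w - it_load_w     # cooling, UPS losses and similar
heat_btu_per_hour = it_load_w * BTU_PER_HOUR_PER_WATT

print(f"IT load:       {it_load_w} W")
print(f"facility load: {facility_load_w:.0f} W (overhead {overhead_w:.0f} W)")
print(f"heat output:   {heat_btu_per_hour:.0f} BTU/h")
```

The IT load sums to the 810 W quoted in the table, and the heat output figure is the number to compare against the cooling capacity of the room's air conditioning.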
### 5.2 UPS
| Parameter | Value |
|-----------|-------|
| Type | VI (line-interactive), per DATACENTERS.md for smaller racks |
| Capacity | 2000 VA / 1200 W |
| Runtime | ~15-20 min at 810 W load |
| Outputs | 8× C13 (servers + switch) |
| Battery | VRLA (cheaper) or Li-ion LFP |
| Management | USB / SNMP card (automatic Proxmox shutdown) |

Optionally, this can be upgraded to a VFI (double-conversion) UPS for cleaner output, but VI is sufficient for a demo.
### 5.3 PDU
1× basic 1U PDU (8× C13), 230 V / 10 A, for power distribution to the servers.

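
A headroom check for the UPS and PDU ratings might look like the sketch below. The ratings and the 810 W load come from this section; `POWER_FACTOR` is an assumed value for server power supplies and not a KB figure. Runtime is not computed, because it depends on battery capacity, which the case study does not specify.

```python
IT_LOAD_W = 810            # total IT load from the rack table
UPS_RATING_W = 1200
UPS_RATING_VA = 2000
PDU_VOLTAGE_V = 230
PDU_MAX_CURRENT_A = 10
POWER_FACTOR = 0.95        # assumption for modern server PSUs

ups_load_fraction = IT_LOAD_W / UPS_RATING_W
apparent_power_va = IT_LOAD_W / POWER_FACTOR
pdu_current_a = apparent_power_va / PDU_VOLTAGE_V
pdu_load_fraction = pdu_current_a / PDU_MAX_CURRENT_A

print(f"UPS load: {ups_load_fraction:.1%} of the watt rating")
print(f"apparent power: {apparent_power_va:.0f} VA of {UPS_RATING_VA} VA")
print(f"PDU current: {pdu_current_a:.2f} A ({pdu_load_fraction:.0%} of {PDU_MAX_CURRENT_A} A)")
```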
---
## 6. Hypervisor — Proxmox VE
### 6.1 Installation and configuration
| Component | Version / configuration |
|-----------|-------------------------|
| Hypervisor | Proxmox VE 8.x (Debian 12 + KVM + LXC) |
| Storage backend | Ceph Reef (18.2.x) or Squid (19.2.x), integrated in Proxmox |
| Cluster | 3-node cluster, Corosync + pmxcfs |
| HA | Proxmox HA, tolerates 1 node failure (the other 2 nodes take over the VMs) |
| Fencing | watchdog (softdog) + Proxmox HA manager |
### 6.2 Licensing
| Item | Price | Note |
|------|-------|------|
| Proxmox VE | $0 | Open source, full functionality without a license |
| Proxmox community support | $0 | Forum, wiki |
| Proxmox enterprise support (optional) | ~€500/host/year | Can be purchased later |

HYPERVISORS.md: Proxmox VE is "open source (free)"; no license is required.
### 6.3 HA settings
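- HA group: all 3 nodes; a node that loses quorum is fenced by the watchdog and its VMs are recovered on the remaining nodes (`no-quorum-policy` is a Pacemaker option and does not exist in Proxmox HA)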
- Max VM restarts: 2 attempts
- Migration: live migration over Ceph RBD (shared storage)
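
The "1 node failure tolerance" of a 3-node cluster follows from majority quorum, which both Corosync and the Ceph monitors rely on. A minimal sketch of that arithmetic, assuming one vote per node (the function names are illustrative):

```python
def quorum_votes(total_votes: int) -> int:
    """Smallest strict majority of the given number of votes."""
    return total_votes // 2 + 1


def tolerated_failures(total_votes: int) -> int:
    """How many voters can fail while a majority still remains."""
    return total_votes - quorum_votes(total_votes)


for nodes in (2, 3, 4, 5):
    print(f"{nodes} nodes: quorum {quorum_votes(nodes)}, tolerates {tolerated_failures(nodes)} failure(s)")
```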
---
## 7. Budget estimate
**Note:** The KB contains no specific component prices. The amounts below are indicative market estimates (Q2 2026, USD).
### 7.1 Servers (3×)
| Item | Qty | Unit price | Total |
|------|-----|------------|-------|
| 1U rack server (basic config, without CPU/RAM/disks) | 3 | ~$1 200 | $3 600 |
| AMD EPYC 9224 (24C) / Intel Xeon Gold 5418Y (24C), per KB | 3 | ~$900 | $2 700 |
| RAM 128 GB (4× 32 GB DDR5-4800 RDIMM) | 3 | ~$600 | $1 800 |
| 240 GB SATA SSD (OS) | 6 | ~$50 | $300 |
| 960 GB SATA SSD (Ceph OSD) | 9 | ~$150 | $1 350 |
| Dual-port 10 GbE SFP+ NIC (e.g. Intel X710-DA2) | 6 | ~$120 | $720 |
| **Servers total** | | | **~$10 470** |
### 7.2 Network
| Item | Qty | Unit price | Total |
|------|-----|------------|-------|
| MikroTik CRS326-24S+2Q+RM (24× 10GbE SFP+) | 1 | ~$600 | $600 |
| SFP+ DAC cable 3 m (passive) | 12 | ~$15 | $180 |
| **Network total** | | | **~$780** |
### 7.3 Rack and power
| Item | Qty | Unit price | Total |
|------|-----|------------|-------|
| 15U wall-mount rack 19" | 1 | ~$300 | $300 |
| UPS 2000 VA (line-interactive, VRLA) | 1 | ~$450 | $450 |
| 1U PDU basic (8× C13) | 1 | ~$60 | $60 |
| **Rack + power total** | | | **~$810** |
### 7.4 Other
| Item | Price |
|------|-------|
| Cat6A patch cables, management | ~$50 |
| Mounting hardware, velcro | ~$30 |
| Shipping and installation | ~$200 |
| **Other total** | **~$280** |
### 7.5 Total cost
| Category | Amount |
|----------|--------|
| Servers (3× node) | ~$10 470 |
| Network (switch + cables) | ~$780 |
| Rack + power | ~$810 |
| Other | ~$280 |
| **Total** | **~$12 340** |
| Contingency (10-15 %) | ~$1 200-1 800 |
| **Total with contingency** | **~$13 500-$14 100** |

A budget of **$10 000-$15 000** is achievable. With cheaper CPUs (EPYC 4124P / Xeon E-2488) the build can come in at ~$8 000-9 000, but with limited performance for Ceph.

**Possible savings:**
- CPU: 2× EPYC 4124P (4C) + 1× stronger node → ~$800 saved (but an asymmetric cluster)
- OSD: 2× instead of 3× SSDs per node → ~$500 saved (less capacity)
- Switch: 12-port instead of 24-port → ~$300 saved
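
The budget arithmetic can be reproduced from the line items in sections 7.1 to 7.4. The sketch below uses the quantities and unit prices from those tables; the dictionary layout and names are illustrative. The unrounded totals with a 10 % and 15 % contingency come out slightly above the rounded range quoted in the summary table.

```python
# (quantity, unit price in USD) per line item, taken from sections 7.1 to 7.4.
LINE_ITEMS_USD = {
    "servers": [(3, 1200), (3, 900), (3, 600), (6, 50), (9, 150), (6, 120)],
    "network": [(1, 600), (12, 15)],
    "rack_power": [(1, 300), (1, 450), (1, 60)],
    "other": [(1, 50), (1, 30), (1, 200)],
}

subtotals = {
    category: sum(qty * price for qty, price in items)
    for category, items in LINE_ITEMS_USD.items()
}
total = sum(subtotals.values())

# Integer arithmetic keeps the contingency figures exact.
with_contingency_low = total * 110 // 100
with_contingency_high = total * 115 // 100

for category, amount in subtotals.items():
    print(f"{category:<11} ${amount}")
print(f"total       ${total}")
print(f"with 10-15% contingency: ${with_contingency_low} to ${with_contingency_high}")
```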
---
## 8. Topology diagram
```mermaid
flowchart TB
subgraph Rack["15U Rack (office)"]
U1["U1: 10GbE Switch (MikroTik)"]
U2["U2: UPS 2000 VA"]
U4["U4: Node 1 — Proxmox + Ceph OSD"]
U5["U5: Node 2 — Proxmox + Ceph OSD"]
U6["U6: Node 3 — Proxmox + Ceph OSD"]
end
subgraph Node1["Node 1 (detail)"]
N1_CPU["CPU: EPYC 9224 (24C)"]
N1_RAM["RAM: 128 GB DDR5"]
N1_OS["OS: 2× 240 GB SSD (RAID 1)"]
N1_OSD1["OSD.0: 960 GB SSD"]
N1_OSD2["OSD.1: 960 GB SSD"]
N1_OSD3["OSD.2: 960 GB SSD"]
N1_NIC["NIC: 4× 10GbE SFP+"]
N1_BMC["BMC: 1× 1GbE"]
end
U1 ---|"4× 10GbE LACP<br/>(public + cluster)"| U4
U1 ---|"4× 10GbE LACP"| U5
U1 ---|"4× 10GbE LACP"| U6
U4 --- N1_CPU
U4 --- N1_RAM
U4 --- N1_OS
U4 --- N1_OSD1
U4 --- N1_OSD2
U4 --- N1_OSD3
U4 --- N1_NIC
U4 --- N1_BMC
subgraph Ceph["Ceph Cluster"]
CEPH_MON["3× MON (1 per node)"]
CEPH_MGR["3× MGR (1 per node)"]
CEPH_OSD["9× OSD (3 per node)"]
end
U4 --- CEPH_MON
U5 --- CEPH_MON
U6 --- CEPH_MON
U4 --- CEPH_MGR
U5 --- CEPH_MGR
U6 --- CEPH_MGR
U4 --- CEPH_OSD
U5 --- CEPH_OSD
U6 --- CEPH_OSD
subgraph Proxmox["Proxmox VE Cluster"]
PMX_HA["HA Group (3 nodes)"]
PMX_HA --- U4
PMX_HA --- U5
PMX_HA --- U6
end
subgraph Uplink["Internet / LAN"]
UPLINK_SW["Office LAN<br/>(1 GbE)"]
end
U1 ---|"1× Cat6A<br/>1 GbE"| UPLINK_SW
U1 ---|"Internet<br/>(ISP router)"| UPLINK_SW
```
---
## 9. Summary and key decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| Hypervisor | Proxmox VE | HYPERVISORS.md: "For SME / low budget: open source, built-in Ceph, no licensing costs". Ideal for a demo. |
| Storage | Ceph (3× replication) | STORAGE.md + SERVER-CONFIG.md: Ceph is the recommended SDS for Proxmox; 3 nodes are the minimum for quorum. |
| CPU | Single-socket EPYC 9224 / Xeon 5418Y | A compromise between price (Mini variant, ~1 socket) and Ceph performance (Ceph variant, ~12+ cores). |
| Network | 10 GbE SFP+ (instead of 25 GbE) | The KB recommends 25 GbE, but 10 GbE is enough for a low-cost demo. The concept (separate public and cluster networks) stays the same. |
| Rack | 15U wall-mount | Suitable for an office; no raised floor, no precision cooling. |
| UPS | 2000 VA line-interactive | DATACENTERS.md: VI type for smaller racks. Sufficient for a demo. |
| License | Proxmox VE (free) | No licensing costs; support can be purchased later. |
### Trade-offs versus a production deployment
- **25 GbE → 10 GbE**: lower throughput on the Ceph cluster network (acceptable in a demo environment)
- **HDD → SSD**: SSDs are used for the Ceph OSDs instead of HDDs (higher price, better performance; the demo is about functionality, not capacity)
- **Separate 2× 10 GbE public + 2× 10 GbE cluster bonds → one shared LACP bond**: the two networks can be merged if ports are scarce, but keeping them separate is better
- **Cooling**: office AC rather than DC-grade precision cooling (PUE ~1.5-1.8)
### What the KB does not cover (filled in from practice)
The KB contains no specific component prices, so the budget is an indicative market estimate. It also does not name a specific switch model that provides the L2+ features (VLAN, LACP, jumbo frames). Here we rely on common practice for the SOHO/SME segment.

---
## 10. KB sources used
- **DATACENTERS.md**: rack layout, power chain, UPS types, cooling classes (ASHRAE), cabling standards
- **HYPERVISORS.md**: Proxmox VE as the open source option, platform comparison, Mini variant (2-3 hosts), Ceph connectivity
- **SERVER-CONFIG.md**: pure Ceph variant (3-6 hosts), HW specifications, network design, BIOS settings
- **STORAGE.md**: Ceph architecture (MON/MGR/OSD, CRUSH map, BlueStore, replication), SDS overview
- **CONNECTIVITY.md**: Ethernet speeds (10/25 GbE), SFP+ form factor, NIC placement, management port
- **NETWORKING.md**: VLAN segmentation, MTU and jumbo frames, best practices
- **SERVER-HW.md**: CPU selection (EPYC vs Xeon), RAM population (1DPC/2DPC), NUMA, form factors
---
*Last revised: 2026-06-04*

BIN
sources/.DS_Store vendored Normal file

Binary file not shown.

21
sources/README.en.md Normal file

@@ -0,0 +1,21 @@
# Raw sources — Immutable reference data
This directory contains raw reference data (links, books, standards, RFCs) from which the knowledge base is built.
**Rules:**
- Content is **immutable** — once added, it does not change (append only)
- A source is tagged `[done]` if it has already been processed into the KB
- Each area has its own `sources.md`
## Structure
```
sources/
├── README.md
├── cloud/
├── networking/
├── monitoring/
├── cicd/
├── databases/
└── infrastructure/
```
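The `[done]` convention makes it straightforward to report how much of each `sources.md` has been processed. The sketch below is one possible checker, not part of the repository: it counts rows in tables whose last column is `Status` and ignores tables without that column. The sample text, including the pending row pointing at `example.org`, is hypothetical.

```python
from collections import Counter

# Hypothetical sample in the same table layout as the sources.md files.
SAMPLE = """\
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| Terraform docs | https://developer.hashicorp.com/terraform/docs | `[done]` |
| Example pending source | https://example.org/ | |
## Certifications
| Certification | Area |
|-------------|--------|
| AWS Solutions Architect | AWS |
"""


def count_status(markdown: str) -> Counter:
    """Count done/pending rows in tables that have a trailing Status column."""
    counts: Counter = Counter()
    in_status_table = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            in_status_table = False  # any non-table line ends the current table
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        last = cells[-1]
        if last == "Status":
            in_status_table = True   # header row of a tracked table
            continue
        if not in_status_table:
            continue
        if last and set(last) <= set("-:"):
            continue                 # separator row such as |-----|
        counts["done" if "[done]" in last else "pending"] += 1
    return counts


counts = count_status(SAMPLE)
print(dict(counts))
```

Reading a real file would only need `count_status(Path("sources/cicd/sources.md").read_text(encoding="utf-8"))` with `pathlib.Path` imported.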

21
sources/README.md Normal file

@@ -0,0 +1,21 @@
# Raw sources: immutable reference data
This directory contains raw reference data (links, books, standards, RFCs) on which the knowledge base is built.
**Rules:**
- Content is **immutable**: once added, it does not change (append only)
- A source is tagged `[done]` once it has been processed into the KB
- Each area has its own `sources.md`
## Structure
```
sources/
├── README.md
├── cloud/
├── networking/
├── monitoring/
├── cicd/
├── databases/
└── infrastructure/
```


@@ -0,0 +1,35 @@
# CI/CD and DevOps — Sources
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| Terraform docs | https://developer.hashicorp.com/terraform/docs | `[done]` |
| ArgoCD docs | https://argo-cd.readthedocs.io/ | `[done]` |
| Flux docs | https://fluxcd.io/flux/ | `[done]` |
| GitHub Actions docs | https://docs.github.com/en/actions | `[done]` |
| GitLab CI docs | https://docs.gitlab.com/ee/ci/ | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| The DevOps Handbook | Kim, Humble, Debois, Willis | 978-1942788003 | `[done]` |
| Infrastructure as Code (2nd ed.) | Kief Morris | 978-1098114671 | `[done]` |
| Terraform: Up and Running (3rd ed.) | Yevgeniy Brikman | 978-1098166045 | `[done]` |
| Continuous Delivery | Humble, Farley | 978-0321601912 | `[done]` |
## Standards
| Standard | URL | Status |
|----------|-------|--------|
| 12 Factor App | https://12factor.net/ | `[done]` |
| CNCF Cloud Native Landscape | https://landscape.cncf.io/ | `[done]` |
## New books (2024-2026)
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| CI/CD Design Patterns | Bajpai, Schildmeijer, Piwosz, Mishra | 978-1-83588-965-7 | `[done]` |
| AI-Native Software Delivery | Durkin, Minick, Gaikwad | — (O'Reilly, 2025) | `[done]` |
| DevOps Frameworks, Techniques, and Tools | Vijayakumaran, Kofler, Öggl, Springer | 978-1-4932-2670-2 | `[done]` |

35
sources/cicd/sources.md Normal file

@@ -0,0 +1,35 @@
# CI/CD and DevOps: Sources
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| Terraform docs | https://developer.hashicorp.com/terraform/docs | `[done]` |
| ArgoCD docs | https://argo-cd.readthedocs.io/ | `[done]` |
| Flux docs | https://fluxcd.io/flux/ | `[done]` |
| GitHub Actions docs | https://docs.github.com/en/actions | `[done]` |
| GitLab CI docs | https://docs.gitlab.com/ee/ci/ | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| The DevOps Handbook | Kim, Humble, Debois, Willis | 978-1942788003 | `[done]` |
| Infrastructure as Code (2nd ed.) | Kief Morris | 978-1098114671 | `[done]` |
| Terraform: Up and Running (3rd ed.) | Yevgeniy Brikman | 978-1098166045 | `[done]` |
| Continuous Delivery | Humble, Farley | 978-0321601912 | `[done]` |
## Standards
| Standard | URL | Status |
|----------|-------|--------|
| 12 Factor App | https://12factor.net/ | `[done]` |
| CNCF Cloud Native Landscape | https://landscape.cncf.io/ | `[done]` |
## New books (2024-2026)
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| CI/CD Design Patterns | Bajpai, Schildmeijer, Piwosz, Mishra | 978-1-83588-965-7 | `[done]` |
| AI-Native Software Delivery | Durkin, Minick, Gaikwad | — (O'Reilly, 2025) | `[done]` |
| DevOps Frameworks, Techniques, and Tools | Vijayakumaran, Kofler, Öggl, Springer | 978-1-4932-2670-2 | `[done]` |


@@ -0,0 +1,37 @@
# Cloud architecture — Sources
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| AWS Well-Architected Framework | https://docs.aws.amazon.com/wellarchitected/latest/framework/ | `[done]` |
| Azure Well-Architected Framework | https://learn.microsoft.com/en-us/azure/well-architected/ | `[done]` |
| Google Cloud Architecture Framework | https://cloud.google.com/architecture/framework | `[done]` |
| AWS Multi-AZ / Multi-Region whitepaper | https://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/ | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Cloud Architecture Patterns | Bill Wilder | 978-1449319779 | `[done]` |
| Building Evolutionary Architectures | Ford, Parsons, Kua | 978-1492097549 | `[done]` |
## New books (2024-2026)
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Multi-Cloud Administration Guide | Jeroen Mulder | 978-1-5015-1948-2 | `[done]` |
| AWS for Solutions Architects (3rd ed.) | Shrivastava, Srivastav, Thakur | 978-1-83664-193-3 | `[done]` |
| Engineering Resilient Systems on AWS | Schwarz, Moran, Bachmeier | 978-1-098-16241-2 | `[done]` |
| Building Resilient Architectures on AWS | — | 978-1-83588-711-0 | `[done]` |
| Multi-Cloud Handbook for Developers | Natarajan, Jacob | 978-1-80461-709-0 | `[done]` |
| The Azure Cloud Native Architecture Mapbook (2nd ed.) | Stéphane Eyskens | 978-1-80580-505-2 | `[done]` |
| Cloud Computing: AWS, Azure, and Google Cloud | Azhar ul Haque Sario | 978-3384756886 | `[done]` |
## Certifications
| Certification | Area |
|-------------|--------|
| AWS Solutions Architect — Associate | AWS |
| Azure Solutions Architect Expert | Azure |
| Google Professional Cloud Architect | GCP |

37
sources/cloud/sources.md Normal file

@@ -0,0 +1,37 @@
# Cloud architecture: Sources
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| AWS Well-Architected Framework | https://docs.aws.amazon.com/wellarchitected/latest/framework/ | `[done]` |
| Azure Well-Architected Framework | https://learn.microsoft.com/en-us/azure/well-architected/ | `[done]` |
| Google Cloud Architecture Framework | https://cloud.google.com/architecture/framework | `[done]` |
| AWS Multi-AZ / Multi-Region whitepaper | https://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/ | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Cloud Architecture Patterns | Bill Wilder | 978-1449319779 | `[done]` |
| Building Evolutionary Architectures | Ford, Parsons, Kua | 978-1492097549 | `[done]` |
## New books (2024-2026)
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Multi-Cloud Administration Guide | Jeroen Mulder | 978-1-5015-1948-2 | `[done]` |
| AWS for Solutions Architects (3rd ed.) | Shrivastava, Srivastav, Thakur | 978-1-83664-193-3 | `[done]` |
| Engineering Resilient Systems on AWS | Schwarz, Moran, Bachmeier | 978-1-098-16241-2 | `[done]` |
| Building Resilient Architectures on AWS | — | 978-1-83588-711-0 | `[done]` |
| Multi-Cloud Handbook for Developers | Natarajan, Jacob | 978-1-80461-709-0 | `[done]` |
| The Azure Cloud Native Architecture Mapbook (2nd ed.) | Stéphane Eyskens | 978-1-80580-505-2 | `[done]` |
| Cloud Computing: AWS, Azure, and Google Cloud | Azhar ul Haque Sario | 978-3384756886 | `[done]` |
## Certifications
| Certification | Area |
|-------------|--------|
| AWS Solutions Architect — Associate | AWS |
| Azure Solutions Architect Expert | Azure |
| Google Professional Cloud Architect | GCP |


@@ -0,0 +1,34 @@
# Database architecture — Sources
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| PostgreSQL docs | https://www.postgresql.org/docs/ | `[done]` |
| MySQL docs | https://dev.mysql.com/doc/ | `[done]` |
| MongoDB docs | https://www.mongodb.com/docs/ | `[done]` |
| Redis docs | https://redis.io/docs/ | `[done]` |
| Cassandra docs | https://cassandra.apache.org/doc/ | `[done]` |
| Amazon DynamoDB docs | https://docs.aws.amazon.com/dynamodb/ | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Designing Data-Intensive Applications (1st ed.) | Martin Kleppmann | 978-1449373320 | `[done]` |
| Designing Data-Intensive Applications (2nd ed.) | Kleppmann, Riccomini | 978-1098119058 | `[done]` |
| Database Internals | Alex Petrov | 978-1492040346 | `[done]` |
| High Performance MySQL | Schwartz, Zaitsev, Tkachenko | 978-1492080510 | `[done]` |
| PostgreSQL: Up and Running | Regina Obe, Leo Hsu | 978-1491963418 | `[done]` |
| Architecting an Apache Iceberg Lakehouse | Alex Merced | 978-1-63343-510-0 | `[done]` |
| More SQL Antipatterns | Bill Karwin | 979-8888652060 | `[done]` |
| AI-Ready PostgreSQL 18 | Vibhor Kumar, Marc Linster | 978-1-80602-847-4 | `[done]` |
| Vector Databases | Nitin Borwankar | 978-1-098-17758-4 | `[done]` |
## Articles / talks
| Name | URL | Status |
|-------|-----|--------|
| CAP Theorem (Eric Brewer) | https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed/ | `[done]` |
| PACELC theorem | https://www.cs.umd.edu/~abadi/papers/abadi-pacelc.pdf | `[done]` |
| Amazon Dynamo paper (SOSP 2007) | https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf | `[done]` |


@@ -0,0 +1,34 @@
# Database architecture: Sources
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| PostgreSQL docs | https://www.postgresql.org/docs/ | `[done]` |
| MySQL docs | https://dev.mysql.com/doc/ | `[done]` |
| MongoDB docs | https://www.mongodb.com/docs/ | `[done]` |
| Redis docs | https://redis.io/docs/ | `[done]` |
| Cassandra docs | https://cassandra.apache.org/doc/ | `[done]` |
| Amazon DynamoDB docs | https://docs.aws.amazon.com/dynamodb/ | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Designing Data-Intensive Applications (1st ed.) | Martin Kleppmann | 978-1449373320 | `[done]` |
| Designing Data-Intensive Applications (2nd ed.) | Kleppmann, Riccomini | 978-1098119058 | `[done]` |
| Database Internals | Alex Petrov | 978-1492040346 | `[done]` |
| High Performance MySQL | Schwartz, Zaitsev, Tkachenko | 978-1492080510 | `[done]` |
| PostgreSQL: Up and Running | Regina Obe, Leo Hsu | 978-1491963418 | `[done]` |
| Architecting an Apache Iceberg Lakehouse | Alex Merced | 978-1-63343-510-0 | `[done]` |
| More SQL Antipatterns | Bill Karwin | 979-8888652060 | `[done]` |
| AI-Ready PostgreSQL 18 | Vibhor Kumar, Marc Linster | 978-1-80602-847-4 | `[done]` |
| Vector Databases | Nitin Borwankar | 978-1-098-17758-4 | `[done]` |
## Articles / talks
| Name | URL | Status |
|-------|-----|--------|
| CAP Theorem (Eric Brewer) | https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed/ | `[done]` |
| PACELC theorem | https://www.cs.umd.edu/~abadi/papers/abadi-pacelc.pdf | `[done]` |
| Amazon Dynamo paper (SOSP 2007) | https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf | `[done]` |


@@ -0,0 +1,124 @@
# Infrastructure — Sources
Split into separate files:
- [HYPERVISORS.md](../../HYPERVISORS.md) — hypervisors and virtualization
- [DATACENTERS.md](../../DATACENTERS.md) — data centers
- [STORAGE.md](../../STORAGE.md) — storage
- [HARDWARE.md](../../HARDWARE.md) — hardware and servers
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| VMware vSphere docs | https://docs.vmware.com/en/VMware-vSphere/ | `[done]` |
| Microsoft Hyper-V docs | https://learn.microsoft.com/en-us/windows-server/virtualization/hyper-v/ | `[done]` |
| Proxmox VE docs | https://pve.proxmox.com/wiki/Main_Page | `[done]` |
| OpenStack docs | https://docs.openstack.org/ | `[done]` |
| Ceph docs | https://docs.ceph.com/ | `[done]` |
| Redfish specification | https://www.dmtf.org/standards/redfish | `[done]` |
## Standards
| Standard | Description | Status |
|----------|-------|--------|
| TIA-942 | Telecommunications Infrastructure Standard for Data Centers | `[done]` |
| Uptime Institute Tier Standard | Data Center Tier Classification | `[done]` |
| ASHRAE TC 9.9 | Thermal Guidelines for Data Processing Environments | `[done]` |
| S.M.A.R.T. | Self-Monitoring, Analysis and Reporting Technology | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| The Data Center as a Computer (1st ed. → 4th ed. 2025) | Barroso, Hölzle, Ranganathan | 978-3-031-99488-3 | `[done]` |
| Storage Systems | Ganger, Gibson | 978-1680837540 | `[done]` |
| Virtualization Essentials | Matthew Portnoy | 978-1119481513 | `[done]` |
| VMware vSphere Design (2nd ed.) | Forbes Guthrie, Scott Lowe | 978-1119130312 | `[done]` |
| AI Data Center Network Design and Technologies (1st ed.) | Subramaniam, Styszynski, Tambakuwala | 978-0-13-543628-8 | `[done]` |
| Electronics Cooling: From the Chip to the Datacenter | Abraham et al. | 978-0-443-47084-4 | `[done]` |
| The AI Cloud Infrastructure Blueprint | Thummarakoti, Vududala, Madupati, Kaushik | 978-1-041-16642-9 | `[done]` |
## Server connectivity
| Source | URL | Status |
|-------|-----|--------|
| HPE Gen11 NIC selection guide | https://www.hpe.com/psnow/doc/a50007643enw | `[done]` |
| Broadcom / Emulex FC HBA specs | https://www.broadcom.com/products/storage/fibre-channel-host-bus-adapters | `[done]` |
| NVIDIA Mellanox Ethernet + InfiniBand adapters | https://www.nvidia.com/en-us/networking/ethernet/ | `[done]` |
| NVMe-oF specification (NVM Express Inc.) | https://nvmexpress.org/specifications/ | `[done]` |
| Dell PowerEdge R760 NIC placement guide | https://www.dell.com/support/manuals/en-us/poweredge-r760/per760_ism_pub/ | `[done]` |
## Server memory — DIMM population
| Source | URL | Status |
|-------|-----|--------|
| Dell PowerEdge R760 Installation & Service Manual — System Memory Guidelines | https://www.dell.com/support/manuals/en-al/oth-r760/per760_ism_pub/system-memory-guidelines | `[done]` |
| Dell PowerEdge R760 — General Memory Module Installation Guidelines | https://www.dell.com/support/manuals/en-al/oth-r760/per760_ism_pub/general-memory-module-installation-guidelines | `[done]` |
| HPE Gen11 Server Memory Population Rules (4th Gen Intel Xeon) | https://www.hpe.com/psnow/doc/a50007437enw | `[done]` |
| HPE Gen11 Server Memory Population Rules (5th Gen Intel Xeon) | https://www.hpe.com/psnow/doc/a50010242enw | `[done]` |
| HPE Gen11/Gen12 Server Memory Population Rules (AMD EPYC 9005) | https://www.hpe.com/psnow/doc/a50012817enw | `[done]` |
| Single Rank vs Dual Rank vs Quad Rank vs Octa Rank Memory | https://corewavelabs.com/single-rank-vs-dual-rank-vs-quad-vs-octa-memory/ | `[done]` |
## Enterprise storage
| Source | URL | Status |
|-------|-----|--------|
| Hitachi VSP 5000 series datasheet | https://www.hitachivantara.com/en-us/products/storage/vsp-5000-series | `[done]` |
| Hitachi VSP E series datasheet | https://www.hitachivantara.com/en-us/products/storage/vsp-e-series | `[done]` |
| Huawei OceanStor Dorado V6 datasheet | https://e.huawei.com/en/products/storage/all-flash-storage/dorado-8000 | `[done]` |
| Huawei OceanStor Dorado V7 announcement | https://e.huawei.com/en/news/2025/oceanstor-dorado-v7 | `[done]` |
| Dell PowerStore documentation | https://www.dell.com/en-us/dt/storage/powerstore.htm | `[done]` |
| Dell PowerMax documentation | https://www.dell.com/en-us/dt/storage/powermax.htm | `[done]` |
| HPE Alletra documentation | https://www.hpe.com/us/en/storage/alletra.html | `[done]` |
| Infinidat InfiniBox SSA G4 datasheet | https://www.infinidat.com/en/products/infinibox-ssa | `[done]` |
| Pure Storage FlashArray documentation | https://www.purestorage.com/products/flasharray.html | `[done]` |
| Lenovo ThinkSystem DM series docs | https://lenovopress.com/storage/thinkstorage/dm-series | `[done]` |
| Lenovo ThinkSystem DE series docs | https://lenovopress.com/storage/thinkstorage/de-series | `[done]` |
| Synology Unified Controller datasheet | https://www.synology.com/en-us/products/UC3400 | `[done]` |
## OpenStack
| Source | URL | Status |
|-------|-----|--------|
| OpenStack Neutron networking docs | https://docs.openstack.org/neutron/latest/ | `[done]` |
| OpenStack Cinder block storage docs | https://docs.openstack.org/cinder/latest/ | `[done]` |
| OpenStack Swift object storage docs | https://docs.openstack.org/swift/latest/ | `[done]` |
| OpenStack Cyborg GPU lifecycle docs | https://docs.openstack.org/cyborg/latest/ | `[done]` |
| OpenStack Ironic bare metal docs | https://docs.openstack.org/ironic/latest/ | `[done]` |
| TripleO deployment docs | https://docs.openstack.org/tripleo-docs/latest/ | `[done]` |
| OpenStack Kolla (containerized deployment) docs | https://docs.openstack.org/kolla/latest/ | `[done]` |
| Canonical Charmed OpenStack docs | https://ubuntu.com/openstack/docs | `[done]` |
| OpenStack Ceilometer / Telemetry docs | https://docs.openstack.org/ceilometer/latest/ | `[done]` |
| OpenStack Masakari (VM HA) docs | https://docs.openstack.org/masakari/latest/ | `[done]` |
| OpenStack Cyborg (GPU lifecycle management) | https://docs.openstack.org/cyborg/latest/ | `[done]` |
| OpenQA — OpenStack CI/CD | https://github.com/openstack-infra/openqa | `[done]` |
| OpenStack Charms (Juju) deployment | https://charmhub.io/openstack | `[done]` |
| OpenStack Zuul CI/CD system | https://zuul-ci.org/docs/zuul/ | `[done]` |
## VMware exit strategy
| Source | URL | Status |
|-------|-----|--------|
| VMware Alternatives in 2026: A Practical Exit Playbook — Platform9 | https://platform9.com/blog/vmware-alternatives-in-2026-a-practical-exit-playbook | `[done]` |
| VMware Exit Strategy — Intelligent Visibility | https://intelligentvisibility.com/data-center-infrastructure/vmware-exit-strategy | `[done]` |
| VMware to Proxmox Migration Guide 2026 — Petronella Tech | https://petronellatech.com/blog/vmware-to-proxmox-migration-guide | `[done]` |
| Migrating from VMware to Proxmox — Hornetsecurity | https://www.hornetsecurity.com/en/blog/migrate-vmware-to-proxmox | `[done]` |
| The Great VMware Exodus — Virtualization Howto | https://www.virtualizationhowto.com/2025/07/the-great-vmware-exodus-real-migration-stories-and-alternatives-for-2025/ | `[done]` |
| VMware to Hyper-V Migration 2026 — iShift | https://www.ishift.net/vmware-hyper-v-migration-2026 | `[done]` |
| VMware to Nutanix Migration 2026 — Redress Compliance | https://redresscompliance.com/vmware-to-nutanix | `[done]` |
| Hyper-V Licensing 2026 — Redress Compliance | https://redresscompliance.com/hyper-v-licensing-2026 | `[done]` |
| Beyond virtualization: a guide to modern vSphere alternatives — Spectro Cloud | https://www.spectrocloud.com/blog/vsphere-alternatives | `[done]` |
| VMware Migration in 2026: Proxmox, KVM, XCP-ng & Veeam — StarWind | https://starwindsoftware.com/blog/vmware-migration-to-proxmox-kvm-xcp-ng-2026 | `[done]` |
| Complete guide to modern vSphere alternatives — Spectro Cloud | https://www.spectrocloud.com/blog/vsphere-alternatives | `[done]` |
| Broadcom VMware Acquisition: What's Next — Sayers | https://www.sayers.com/blog/after-the-deal-whats-next-for-vmware-customers | `[done]` |
| Stanford University migration from VMware to Proxmox | https://itcommunity.stanford.edu/news/enterprise-technology-completes-successful-virtual-infrastructure-migration-vmware-proxmox | `[done]` |
## Hardware manufacturers
| Manufacturer | Server series | Management |
|---------|---------------|------------|
| Dell | PowerEdge (R6xx, R7xx) | iDRAC / OpenManage |
| HPE | ProLiant (DL, ML, Synergy) | iLO / OneView |
| Cisco | UCS (B-Series, C-Series) | UCS Manager / Intersight |
| Lenovo | ThinkSystem (SR, ST) | XClarity |
| Supermicro | SuperServer (cloud, storage, GPU) | IPMI / SuperDoctor |


@@ -0,0 +1,124 @@
# Infrastructure: Sources
Split into separate files:
- [HYPERVISORS.md](../../HYPERVISORS.md): hypervisors and virtualization
- [DATACENTERS.md](../../DATACENTERS.md): data centers
- [STORAGE.md](../../STORAGE.md): storage
- [HARDWARE.md](../../HARDWARE.md): hardware and servers
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| VMware vSphere docs | https://docs.vmware.com/en/VMware-vSphere/ | `[done]` |
| Microsoft Hyper-V docs | https://learn.microsoft.com/en-us/windows-server/virtualization/hyper-v/ | `[done]` |
| Proxmox VE docs | https://pve.proxmox.com/wiki/Main_Page | `[done]` |
| OpenStack docs | https://docs.openstack.org/ | `[done]` |
| Ceph docs | https://docs.ceph.com/ | `[done]` |
| Redfish specification | https://www.dmtf.org/standards/redfish | `[done]` |
## Standards
| Standard | Description | Status |
|----------|-------|--------|
| TIA-942 | Telecommunications Infrastructure Standard for Data Centers | `[done]` |
| Uptime Institute Tier Standard | Data Center Tier Classification | `[done]` |
| ASHRAE TC 9.9 | Thermal Guidelines for Data Processing Environments | `[done]` |
| S.M.A.R.T. | Self-Monitoring, Analysis and Reporting Technology | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| The Data Center as a Computer (1st ed. → 4th ed. 2025) | Barroso, Hölzle, Ranganathan | 978-3-031-99488-3 | `[done]` |
| Storage Systems | Ganger, Gibson | 978-1680837540 | `[done]` |
| Virtualization Essentials | Matthew Portnoy | 978-1119481513 | `[done]` |
| VMware vSphere Design (2nd ed.) | Forbes Guthrie, Scott Lowe | 978-1119130312 | `[done]` |
| AI Data Center Network Design and Technologies (1st ed.) | Subramaniam, Styszynski, Tambakuwala | 978-0-13-543628-8 | `[done]` |
| Electronics Cooling: From the Chip to the Datacenter | Abraham et al. | 978-0-443-47084-4 | `[done]` |
| The AI Cloud Infrastructure Blueprint | Thummarakoti, Vududala, Madupati, Kaushik | 978-1-041-16642-9 | `[done]` |
## Server connectivity
| Source | URL | Status |
|-------|-----|--------|
| HPE Gen11 NIC selection guide | https://www.hpe.com/psnow/doc/a50007643enw | `[done]` |
| Broadcom / Emulex FC HBA specs | https://www.broadcom.com/products/storage/fibre-channel-host-bus-adapters | `[done]` |
| NVIDIA Mellanox Ethernet + InfiniBand adapters | https://www.nvidia.com/en-us/networking/ethernet/ | `[done]` |
| NVMe-oF specification (NVM Express Inc.) | https://nvmexpress.org/specifications/ | `[done]` |
| Dell PowerEdge R760 NIC placement guide | https://www.dell.com/support/manuals/en-us/poweredge-r760/per760_ism_pub/ | `[done]` |
## Server memory: DIMM population
| Source | URL | Status |
|-------|-----|--------|
| Dell PowerEdge R760 Installation & Service Manual — System Memory Guidelines | https://www.dell.com/support/manuals/en-al/oth-r760/per760_ism_pub/system-memory-guidelines | `[done]` |
| Dell PowerEdge R760 — General Memory Module Installation Guidelines | https://www.dell.com/support/manuals/en-al/oth-r760/per760_ism_pub/general-memory-module-installation-guidelines | `[done]` |
| HPE Gen11 Server Memory Population Rules (4th Gen Intel Xeon) | https://www.hpe.com/psnow/doc/a50007437enw | `[done]` |
| HPE Gen11 Server Memory Population Rules (5th Gen Intel Xeon) | https://www.hpe.com/psnow/doc/a50010242enw | `[done]` |
| HPE Gen11/Gen12 Server Memory Population Rules (AMD EPYC 9005) | https://www.hpe.com/psnow/doc/a50012817enw | `[done]` |
| Single Rank vs Dual Rank vs Quad Rank vs Octa Rank Memory | https://corewavelabs.com/single-rank-vs-dual-rank-vs-quad-vs-octa-memory/ | `[done]` |
## Enterprise storage
| Source | URL | Status |
|-------|-----|--------|
| Hitachi VSP 5000 series datasheet | https://www.hitachivantara.com/en-us/products/storage/vsp-5000-series | `[done]` |
| Hitachi VSP E series datasheet | https://www.hitachivantara.com/en-us/products/storage/vsp-e-series | `[done]` |
| Huawei OceanStor Dorado V6 datasheet | https://e.huawei.com/en/products/storage/all-flash-storage/dorado-8000 | `[done]` |
| Huawei OceanStor Dorado V7 announcement | https://e.huawei.com/en/news/2025/oceanstor-dorado-v7 | `[done]` |
| Dell PowerStore documentation | https://www.dell.com/en-us/dt/storage/powerstore.htm | `[done]` |
| Dell PowerMax documentation | https://www.dell.com/en-us/dt/storage/powermax.htm | `[done]` |
| HPE Alletra documentation | https://www.hpe.com/us/en/storage/alletra.html | `[done]` |
| Infinidat InfiniBox SSA G4 datasheet | https://www.infinidat.com/en/products/infinibox-ssa | `[done]` |
| Pure Storage FlashArray documentation | https://www.purestorage.com/products/flasharray.html | `[done]` |
| Lenovo ThinkSystem DM series docs | https://lenovopress.com/storage/thinkstorage/dm-series | `[done]` |
| Lenovo ThinkSystem DE series docs | https://lenovopress.com/storage/thinkstorage/de-series | `[done]` |
| Synology Unified Controller datasheet | https://www.synology.com/en-us/products/UC3400 | `[done]` |
## OpenStack
| Source | URL | Status |
|-------|-----|--------|
| OpenStack Neutron networking docs | https://docs.openstack.org/neutron/latest/ | `[done]` |
| OpenStack Cinder block storage docs | https://docs.openstack.org/cinder/latest/ | `[done]` |
| OpenStack Swift object storage docs | https://docs.openstack.org/swift/latest/ | `[done]` |
| OpenStack Cyborg GPU lifecycle docs | https://docs.openstack.org/cyborg/latest/ | `[done]` |
| OpenStack Ironic bare metal docs | https://docs.openstack.org/ironic/latest/ | `[done]` |
| TripleO deployment docs | https://docs.openstack.org/tripleo-docs/latest/ | `[done]` |
| OpenStack Kolla (container-based deployment) docs | https://docs.openstack.org/kolla/latest/ | `[done]` |
| Canonical Charmed OpenStack docs | https://ubuntu.com/openstack/docs | `[done]` |
| OpenStack Ceilometer / Telemetry docs | https://docs.openstack.org/ceilometer/latest/ | `[done]` |
| OpenStack Masakari (VM HA) docs | https://docs.openstack.org/masakari/latest/ | `[done]` |
| OpenQA — OpenStack CI/CD | https://github.com/openstack-infra/openqa | `[done]` |
| OpenStack Charms (Juju) deployment | https://charmhub.io/openstack | `[done]` |
| OpenStack Zuul CI/CD system | https://zuul-ci.org/docs/zuul/ | `[done]` |
## VMware exit strategies
| Source | URL | Status |
|-------|-----|--------|
| VMware Alternatives in 2026: A Practical Exit Playbook — Platform9 | https://platform9.com/blog/vmware-alternatives-in-2026-a-practical-exit-playbook | `[done]` |
| VMware Exit Strategy — Intelligent Visibility | https://intelligentvisibility.com/data-center-infrastructure/vmware-exit-strategy | `[done]` |
| VMware to Proxmox Migration Guide 2026 — Petronella Tech | https://petronellatech.com/blog/vmware-to-proxmox-migration-guide | `[done]` |
| Migrating from VMware to Proxmox — Hornetsecurity | https://www.hornetsecurity.com/en/blog/migrate-vmware-to-proxmox | `[done]` |
| The Great VMware Exodus — Virtualization Howto | https://www.virtualizationhowto.com/2025/07/the-great-vmware-exodus-real-migration-stories-and-alternatives-for-2025/ | `[done]` |
| VMware to Hyper-V Migration 2026 — iShift | https://www.ishift.net/vmware-hyper-v-migration-2026 | `[done]` |
| VMware to Nutanix Migration 2026 — Redress Compliance | https://redresscompliance.com/vmware-to-nutanix | `[done]` |
| Hyper-V Licensing 2026 — Redress Compliance | https://redresscompliance.com/hyper-v-licensing-2026 | `[done]` |
| Beyond virtualization: a guide to modern vSphere alternatives — Spectro Cloud | https://www.spectrocloud.com/blog/vsphere-alternatives | `[done]` |
| VMware Migration in 2026: Proxmox, KVM, XCP-ng & Veeam — StarWind | https://starwindsoftware.com/blog/vmware-migration-to-proxmox-kvm-xcp-ng-2026 | `[done]` |
| Broadcom VMware Acquisition: What's Next — Sayers | https://www.sayers.com/blog/after-the-deal-whats-next-for-vmware-customers | `[done]` |
| Stanford University migration from VMware to Proxmox | https://itcommunity.stanford.edu/news/enterprise-technology-completes-successful-virtual-infrastructure-migration-vmware-proxmox | `[done]` |
## Hardware vendors
| Vendor | Server lines | Management |
|---------|---------------|------------|
| Dell | PowerEdge (R6xx, R7xx) | iDRAC / OpenManage |
| HPE | ProLiant (DL, ML, Synergy) | iLO / OneView |
| Cisco | UCS (B-Series, C-Series) | UCS Manager / Intersight |
| Lenovo | ThinkSystem (SR, ST) | XClarity |
| Supermicro | SuperServer (cloud, storage, GPU) | IPMI / SuperDoctor |


@@ -0,0 +1,50 @@
# Monitoring and observability — Sources
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| Prometheus docs | https://prometheus.io/docs/ | `[done]` |
| Grafana docs | https://grafana.com/docs/ | `[done]` |
| Zabbix docs | https://www.zabbix.com/documentation/ | `[done]` |
| OpenTelemetry specification | https://opentelemetry.io/docs/specs/otel/ | `[done]` |
| OpenMetrics standard | https://openmetrics.io/ | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Site Reliability Engineering | Beyer, Jones, Petoff, Murphy | 978-1491929124 | `[done]` |
| The Site Reliability Workbook | Beyer, Murphy, Rensin, Kawahara, Thorne | 978-1492029502 | `[done]` |
| Observability Engineering | Majors, Fong-Jones, Miranda | 978-1492076445 | `[done]` |
## Articles
| Name | URL | Status |
|-------|-----|--------|
| The USE Method (Brendan Gregg) | https://www.brendangregg.com/usemethod.html | `[done]` |
| The RED Method (Tom Wilkie) | https://grafana.com/blog/2018/08/02/the-red-method-how-to-instrument-your-services/ | `[done]` |
| Google SRE book (free) | https://sre.google/sre-book/table-of-contents/ | `[done]` |
## New books (2024–2026)
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Mastering OpenTelemetry and Observability | Steve Flanders | 978-1-394-25312-8 | `[done]` |
| OpenTelemetry Cookbook | — | 978-9349174238 | `[done]` |
| Cloud Observability in Action | Michael Hausenblas | — (Manning, 2023) | `[done]` |
| Observability in the AI-Native Era | Lipsig, Grabner, Rati | 978-1-80638-959-9 | `[done]` |
| Mastering Prometheus | William Hegedus | 978-1-80512-566-2 | `[done]` |
| Observability with Grafana (LGTM stack) | Chapman, Holmes | 978-1-80324-964-3 | `[done]` |
| Open Source Observability | Corless, Pawar | — (O'Reilly, 2025) | `[done]` |
| Hands-On Monitoring and Alerting with Prometheus | Muhammad Badawy | 978-9349887565 | `[done]` |
## New tools (2024–2026)
| Tool | Description | URL | Status |
|---------|-------|-----|--------|
| Grafana Sigil | AI observability (OpenTelemetry-native) | https://github.com/grafana/sigil | `[done]` |
| InfraLens | eBPF-based zero-instrumentation observability | https://github.com/Herenn/Infralens | `[done]` |
| Ingero | GPU causal observability (eBPF) | https://github.com/ingero-io/ingero | `[done]` |
| GreptimeDB | Unified observability DB (OTel-native) | https://github.com/GreptimeTeam/greptimedb | `[done]` |
| Netdata | AI-powered full-stack observability | https://github.com/netdata/netdata | `[done]` |


@@ -0,0 +1,50 @@
# Monitoring and observability: Sources
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| Prometheus docs | https://prometheus.io/docs/ | `[done]` |
| Grafana docs | https://grafana.com/docs/ | `[done]` |
| Zabbix docs | https://www.zabbix.com/documentation/ | `[done]` |
| OpenTelemetry specification | https://opentelemetry.io/docs/specs/otel/ | `[done]` |
| OpenMetrics standard | https://openmetrics.io/ | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Site Reliability Engineering | Beyer, Jones, Petoff, Murphy | 978-1491929124 | `[done]` |
| The Site Reliability Workbook | Beyer, Murphy, Rensin, Kawahara, Thorne | 978-1492029502 | `[done]` |
| Observability Engineering | Majors, Fong-Jones, Miranda | 978-1492076445 | `[done]` |
## Articles
| Name | URL | Status |
|-------|-----|--------|
| The USE Method (Brendan Gregg) | https://www.brendangregg.com/usemethod.html | `[done]` |
| The RED Method (Tom Wilkie) | https://grafana.com/blog/2018/08/02/the-red-method-how-to-instrument-your-services/ | `[done]` |
| Google SRE book (free) | https://sre.google/sre-book/table-of-contents/ | `[done]` |
## New books (2024–2026)
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Mastering OpenTelemetry and Observability | Steve Flanders | 978-1-394-25312-8 | `[done]` |
| OpenTelemetry Cookbook | — | 978-9349174238 | `[done]` |
| Cloud Observability in Action | Michael Hausenblas | — (Manning, 2023) | `[done]` |
| Observability in the AI-Native Era | Lipsig, Grabner, Rati | 978-1-80638-959-9 | `[done]` |
| Mastering Prometheus | William Hegedus | 978-1-80512-566-2 | `[done]` |
| Observability with Grafana (LGTM stack) | Chapman, Holmes | 978-1-80324-964-3 | `[done]` |
| Open Source Observability | Corless, Pawar | — (O'Reilly, 2025) | `[done]` |
| Hands-On Monitoring and Alerting with Prometheus | Muhammad Badawy | 978-9349887565 | `[done]` |
## New tools (2024–2026)
| Tool | Description | URL | Status |
|---------|-------|-----|--------|
| Grafana Sigil | AI observability (OpenTelemetry-native) | https://github.com/grafana/sigil | `[done]` |
| InfraLens | eBPF-based zero-instrumentation observability | https://github.com/Herenn/Infralens | `[done]` |
| Ingero | GPU causal observability (eBPF) | https://github.com/ingero-io/ingero | `[done]` |
| GreptimeDB | Unified observability DB (OTel-native) | https://github.com/GreptimeTeam/greptimedb | `[done]` |
| Netdata | AI-powered full-stack observability | https://github.com/netdata/netdata | `[done]` |


@@ -0,0 +1,40 @@
# Network architecture — Sources
## RFCs and standards
| RFC | Name | Status |
|-----|-------|--------|
| RFC 791 | Internet Protocol | `[done]` |
| RFC 793 | Transmission Control Protocol | `[done]` |
| RFC 1034/1035 | Domain Names: Concepts and Facilities / Implementation and Specification | `[done]` |
| RFC 4271 | Border Gateway Protocol (BGP-4) | `[done]` |
| RFC 5246 | TLS 1.2 | `[done]` |
| RFC 8446 | TLS 1.3 | `[done]` |
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| AWS VPC docs | https://docs.aws.amazon.com/vpc/ | `[done]` |
| Azure Virtual Network docs | https://learn.microsoft.com/en-us/azure/virtual-network/ | `[done]` |
| Google VPC docs | https://cloud.google.com/vpc/docs | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Computer Networking: A Top-Down Approach | Kurose, Ross | 978-0133594140 | `[done]` |
| TCP/IP Illustrated | W. Richard Stevens | 978-0321336316 | `[done]` |
## New books (2024–2026)
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| AI Data Center Network Design and Technologies | Subramaniam, Styszynski, Tambakuwala | 978-0-13-543628-8 | `[done]` |
| Cloud Networking and Resilience | Cristian Critelli | 979-8868824357 | `[done]` |
| Zero Trust in Resilient Cloud and Network Architectures | Halley, Prajapati, Leza, Saini | 978-0-13-820460-0 | `[done]` |
| The Segmentation Blueprint | Kulkarni, Sivakumar, Morais, Lloyd | 978-0-13-546236-2 | `[done]` |
| Segment Routing for SP and Enterprise Networks | Deragisch et al. | 978-0-13-823101-9 | `[done]` |
| Understanding and Designing Azure Networking | Stuart, Moreno | — (2025) | `[done]` |
| Mastering Next-Gen Juniper Data Centers | Aninda Chatterjee | 978-0-13-533636-6 | `[done]` |
| Intelligent Cloud Networking: AI-Driven Resource Management | Manoj Yadav | 9364220110 | `[done]` |


@@ -0,0 +1,40 @@
# Network architecture: Sources
## RFCs and standards
| RFC | Name | Status |
|-----|-------|--------|
| RFC 791 | Internet Protocol | `[done]` |
| RFC 793 | Transmission Control Protocol | `[done]` |
| RFC 1034/1035 | Domain Names: Concepts and Facilities / Implementation and Specification | `[done]` |
| RFC 4271 | Border Gateway Protocol (BGP-4) | `[done]` |
| RFC 5246 | TLS 1.2 | `[done]` |
| RFC 8446 | TLS 1.3 | `[done]` |
## Official documentation
| Source | URL | Status |
|-------|-----|--------|
| AWS VPC docs | https://docs.aws.amazon.com/vpc/ | `[done]` |
| Azure Virtual Network docs | https://learn.microsoft.com/en-us/azure/virtual-network/ | `[done]` |
| Google VPC docs | https://cloud.google.com/vpc/docs | `[done]` |
## Books
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| Computer Networking: A Top-Down Approach | Kurose, Ross | 978-0133594140 | `[done]` |
| TCP/IP Illustrated | W. Richard Stevens | 978-0321336316 | `[done]` |
## New books (2024–2026)
| Name | Author | ISBN | Status |
|-------|-------|------|--------|
| AI Data Center Network Design and Technologies | Subramaniam, Styszynski, Tambakuwala | 978-0-13-543628-8 | `[done]` |
| Cloud Networking and Resilience | Cristian Critelli | 979-8868824357 | `[done]` |
| Zero Trust in Resilient Cloud and Network Architectures | Halley, Prajapati, Leza, Saini | 978-0-13-820460-0 | `[done]` |
| The Segmentation Blueprint | Kulkarni, Sivakumar, Morais, Lloyd | 978-0-13-546236-2 | `[done]` |
| Segment Routing for SP and Enterprise Networks | Deragisch et al. | 978-0-13-823101-9 | `[done]` |
| Understanding and Designing Azure Networking | Stuart, Moreno | — (2025) | `[done]` |
| Mastering Next-Gen Juniper Data Centers | Aninda Chatterjee | 978-0-13-533636-6 | `[done]` |
| Intelligent Cloud Networking: AI-Driven Resource Management | Manoj Yadav | 9364220110 | `[done]` |

50
templates/ADR.en.md Normal file

@@ -0,0 +1,50 @@
# ADR — Architecture Decision Record
## Decision title
<!-- Brief title (e.g. "Using PostgreSQL as primary database") -->
## Status
<!--
Proposed | Approved | Deprecated | Superseded by [ADR-XXX]
-->
## Context
<!--
Describe the problem we are solving. What are the circumstances, constraints, and requirements?
-->
## Decision
<!--
What solution did we choose and why? Describe the architectural approach.
-->
## Rationale
<!--
Why did we choose this solution? What are the main benefits compared to alternatives?
-->
## Alternatives
<!--
What other options did we consider and why did we reject them?
-->
## Consequences
<!--
- What changes? What needs to be done?
- What are the trade-offs (e.g. higher complexity for lower latency)?
- Impact on other teams / systems?
-->
## Metadata
- **Date**: YYYY-MM-DD
- **Author**: name
- **Stakeholders**: team A, team B
- **References**: [link to design doc], [link to issue]

50
templates/ADR.md Normal file

@@ -0,0 +1,50 @@
# ADR — Architecture Decision Record
## Decision title
<!-- Brief title (e.g. "Using PostgreSQL as primary database") -->
## Status
<!--
Proposed | Approved | Deprecated | Superseded by [ADR-XXX]
-->
## Context
<!--
Describe the problem we are solving. What are the circumstances, constraints, and requirements?
-->
## Decision
<!--
What solution did we choose and why? Describe the architectural approach.
-->
## Rationale
<!--
Why did we choose this solution? What are the main benefits compared to alternatives?
-->
## Alternatives
<!--
What other options did we consider and why did we reject them?
-->
## Consequences
<!--
- What changes? What needs to be done?
- What are the trade-offs (e.g. higher complexity for lower latency)?
- Impact on other teams / systems?
-->
## Metadata
- **Date**: YYYY-MM-DD
- **Author**: name
- **Stakeholders**: team A, team B
- **References**: [link to design doc], [link to issue]