commit

2026-06-03 14:47:26 +02:00
parent 70ee14c2c2
commit c6fa0bff6a
31 changed files with 4212 additions and 0 deletions
--- a/SERVER-CONFIG.md
+++ b/SERVER-CONFIG.md
@@ -0,0 +1,285 @@
+# ⚙️ Server configuration: best practices by workload
+
+## General BIOS/UEFI settings
+
+| Setting | Recommendation | Rationale |
+|---------|----------------|-----------|
+| **Boot mode** | UEFI | Secure Boot, GPT, larger disks |
+| **Power profile** | Performance / OS Control | Maximum performance, C-States disabled |
+| **Hyper-Threading** | Enabled | Higher throughput for multi-threaded workloads (workload-dependent, often cited as up to ~30 %) |
+| **Virtualization** | Enabled (VT-x/AMD-V) | Required for hypervisors; plain containers do not need it |
+| **SR-IOV** | Enabled | GPU, NIC passthrough |
+| **NUMA** | Enabled | NUMA-aware scheduling |
+| **ACPI** | Enabled | Power management, OS-level |
+| **Secure Boot** | Enabled | Secure boot chain |
+| **TPM** | Enabled | Measured boot, key storage |
+
+---
+
+## 1. Database servers
+
+### CPU selection
+
+| DB type | CPU preference | Rationale |
+|---------|----------------|-----------|
+| **OLTP** (PostgreSQL, MySQL) | High clock, moderate cores | Low per-transaction latency, limited parallelism |
+| **OLAP** (ClickHouse, Snowflake) | Many cores, AVX-512 | Columnstore, high parallelism |
+| **In-memory** (Redis, Memcached) | High clock, low cache latency | Single-threaded (Redis), RAM bandwidth |
+| **Document** (MongoDB) | Balance (clock × cores) | Mixed workload |
+| **Distributed** (Cassandra, Scylla) | Many cores, high cache | Shard-per-core (Scylla), compaction |
+
+### Storage layout
+
+```
+Mount point    FS      RAID                  Disk type    Purpose
+─────────────  ─────   ────────────────────  ──────────   ────────────────────────────
+/              ext4    1 (mirror)            2× SSD       OS, binaries
+/data          xfs     10 (striped mirrors)  4-8× NVMe    Database data
+/wal           xfs     1 (mirror)            2× NVMe      Write-ahead log (PostgreSQL)
+/tmp           tmpfs   n/a                   RAM          Temporary files
+```
+
+### PostgreSQL specific
+
+| Parameter | Recommendation | Note |
+|-----------|----------------|------|
+| `shared_buffers` | 25 % RAM | Cache for database blocks |
+| `effective_cache_size` | 75 % RAM | Planner's estimate of memory available for caching (shared buffers + OS page cache) |
+| `work_mem` | 4-64 MB per operation | Sorts and hash joins; one query can use it several times over, so size it against `max_connections` |
+| `maintenance_work_mem` | 1-10 % RAM | VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY |
+| `wal_buffers` | 64-256 MB | Write-ahead log buffer |
+| `max_connections` | 50-500 | Use connection pooling (PgBouncer) |
+| `random_page_cost` | 1.1 (NVMe), 4 (HDD) | Planner cost of a random page read (on NVMe almost the same as a sequential read) |
+| `effective_io_concurrency` | 200 (NVMe), 2 (HDD) | Parallel I/O |
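+
+As a rough illustration, the table above could translate into the following `postgresql.conf` fragment. The host (64 GB RAM, NVMe storage, dedicated to PostgreSQL, a pooler in front) is an assumption made for this sketch, and the values picked inside each range are starting points, not measured optima.
+
+```
+# postgresql.conf (sketch; assumes a dedicated host with 64 GB RAM and NVMe)
+shared_buffers = 16GB                 # 25 % of 64 GB
+effective_cache_size = 48GB           # 75 % of 64 GB
+work_mem = 16MB                       # per sort/hash operation, not per connection
+maintenance_work_mem = 2GB            # about 3 % of RAM, inside the 1-10 % range
+wal_buffers = 64MB
+max_connections = 200                 # assumes PgBouncer in front
+random_page_cost = 1.1                # NVMe
+effective_io_concurrency = 200        # NVMe
+```
+
+A quick sanity check is `work_mem` × `max_connections`: 16 MB × 200 is about 3.2 GB if every connection runs one sort at the same time, which still fits next to the 16 GB of shared buffers.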
+
+### MySQL / MariaDB specific
+
+| Parameter | Recommendation | Note |
+|-----------|----------------|------|
+| `innodb_buffer_pool_size` | 70-80 % RAM | Main InnoDB cache |
+| `innodb_log_file_size` | 1-4 GB | Redo log; larger means better write performance but longer crash recovery (MySQL 8.0.30+ uses `innodb_redo_log_capacity` instead) |
+| `innodb_flush_log_at_trx_commit` | 1 (ACID) / 2 (perf) | 1 = flush to disk at every commit; 2 = write at commit, flush about once per second |
+| `innodb_io_capacity` | 2000 (NVMe) / 200 (HDD) | IOPS budget for InnoDB background tasks |
+| `innodb_write_io_threads` | 4-8 | Parallel write threads |
+| `max_connections` | 100-500 | Connection pooling recommended |
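+
+The same recommendations could be written as a `my.cnf` fragment along these lines. The 64 GB NVMe host is again an assumption for the sketch, and the concrete values are illustrative picks from the ranges above.
+
+```
+# my.cnf (sketch; assumes a dedicated host with 64 GB RAM and NVMe)
+[mysqld]
+innodb_buffer_pool_size = 48G         # 75 % of 64 GB
+innodb_log_file_size = 2G             # on MySQL 8.0.30+ set innodb_redo_log_capacity instead
+innodb_flush_log_at_trx_commit = 1    # full ACID durability
+innodb_io_capacity = 2000             # NVMe
+innodb_write_io_threads = 8
+max_connections = 300
+```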
+
+### MongoDB specific
+
+| Parameter | Recommendation | Note |
+|-----------|----------------|------|
+| **WiredTiger cache** | 50-80 % RAM | Storage engine cache |
+| **WiredTiger compression** | Snappy / Zstd | On-disk compression (zlib is slow) |
+| **Filesystem** | XFS | Recommended FS (ext4 is OK, NTFS is not) |
+| **ReadConcern/WriteConcern** | majority/majority | For important data |
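+
+The cache and compression rows map onto `mongod.conf` roughly as follows. This is a minimal sketch for a hypothetical dedicated host with 64 GB RAM; the cache size is an illustrative pick from the low end of the range.
+
+```
+# mongod.conf (sketch; assumes a dedicated host with 64 GB RAM)
+storage:
+  wiredTiger:
+    engineConfig:
+      cacheSizeGB: 32                 # 50 % of 64 GB
+    collectionConfig:
+      blockCompressor: zstd           # snappy is the default
+```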
+
+### Kernel tuning for databases
+
+```
+# /etc/sysctl.d/99-database.conf
+
+vm.swappiness = 1              # Minimize swapping, prefer page cache
+vm.dirty_ratio = 30            # % of available memory dirty before writers are forced to flush
+vm.dirty_background_ratio = 5  # Start background writeback at 5 %
+vm.nr_hugepages = 0            # 0 = none reserved; set a page count if the DB uses explicit huge pages (e.g. PostgreSQL)
+net.core.rmem_max = 134217728
+net.core.wmem_max = 134217728
+net.ipv4.tcp_rmem = 4096 87380 134217728
+net.ipv4.tcp_wmem = 4096 65536 134217728
+net.core.somaxconn = 4096
+
+# I/O scheduler (NVMe = none, SSD = mq-deadline, HDD = mq-deadline/bfq)
+# echo none > /sys/block/nvme0n1/queue/scheduler
+```
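+
+If explicit huge pages are used for PostgreSQL, the page count follows from the shared memory size. A worked sketch, assuming 2 MB huge pages and `shared_buffers = 16GB` (both assumptions for this example); the headroom is a guess, and PostgreSQL 15 and later can report the exact figure through `shared_memory_size_in_huge_pages`.
+
+```
+# /etc/sysctl.d/99-database.conf (sketch; assumes 2 MB pages, shared_buffers = 16GB)
+# 16 GB / 2 MB = 8192 pages, plus headroom for PostgreSQL's other shared memory
+vm.nr_hugepages = 8400
+
+# postgresql.conf
+# huge_pages = try             # falls back to normal pages if the reservation is too small
+```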
+
+---
+
+## 2. Hypervisor host (ESXi / KVM / Hyper-V)
+
+### CPU and NUMA
+
+| Setting | Recommendation | Note |
+|---------|----------------|------|
+| **Overcommit ratio** | 1:1 to 4:1 (vCPU:pCPU) | Depends on workload (1:1 for DB, 4:1 for web) |
+| **NUMA-aware sizing** | VM ≤ 1 NUMA node | Cross-NUMA penalty ~1.5-2× latency |
+| **CPU pinning** | Yes for dedicated VMs | Reduces context switching and vCPU migration |
+| **C-States** | Disabled (in BIOS) | Lower latency, higher power draw |
+| **P-State** | OS control / Performance | HW power management |
+| **Hyper-Threading** | Enabled | More vCPUs; watch for noisy neighbors |
+
+### Storage for the hypervisor
+
+```
+VM storage:
+├── OS datastore          RAID 1 (2× SATA SSD) — ESXi boot, images
+├── VM datastore (gold)   RAID 10 (4× NVMe) — critical VMs, DB
+├── VM datastore (silver) RAID 5 (6× SAS SSD) — general VMs
+└── VM datastore (bronze) RAID 6 (8× SATA HDD) — backup, archive
+
+Swap datastore:                   1× NVMe or SATA SSD (dedicated)
+```
+
+### Network design
+
+| Traffic | VLAN | Speed | NIC teaming |
+|---------|------|-------|-------------|
+| **Management** | Mgmt VLAN | 1 GbE | Active/Passive |
+| **VM traffic** | VM VLANs | 25/100 GbE | LACP (802.3ad) |
+| **Storage** | Storage VLAN | 25/100 GbE | LACP / RDMA |
+| **vMotion** | vMotion VLAN | 25/100 GbE | Dedicated, multi-NIC |
+| **FT (Fault Tolerance)** | FT VLAN | 10 GbE | Dedicated, low latency |
+
+### BIOS for the hypervisor
+
+| Setting | Value | Rationale |
+|---------|-------|-----------|
+| Hyper-Threading | Enabled | Higher VM density |
+| Virtualization Technology | Enabled | VT-x/AMD-V |
+| VT-d / IOMMU | Enabled | Passthrough, SR-IOV |
+| Power Management | Performance / OS | Minimizes VM latency |
+| C-States | Disabled | Lower VM exit latency |
+| NUMA | Enabled | NUMA-aware VM placement |
+| SR-IOV | Enabled | NIC/GPU virtualization |
+
+---
+
+## 3. Kubernetes node
+
+### Node profiles
+
+| Role | CPU | RAM | Storage | Use case |
+|------|-----|-----|---------|----------|
+| **General purpose** | 16-32 cores | 64-128 GB | 1× NVMe OS + 1× NVMe local | Web, API, microservices |
+| **Memory optimized** | 32-64 cores | 256-512 GB | 1× NVMe OS + 2× NVMe local | In-memory cache, DB |
+| **Compute optimized** | 64-128 cores | 128-256 GB | 1× NVMe OS | Batch, CI/CD |
+| **GPU node** | 32-64 cores | 512-1024 GB | 1× NVMe OS + 4-8× NVMe local | AI/ML training, inference |
+| **Storage node** | 16-32 cores | 64-128 GB | 4-12× NVMe/SATA (Ceph/Longhorn) | SDS, persistent volumes |
+
+### Kernel tuning
+
+```
+# /etc/sysctl.d/99-kubernetes.conf
+net.bridge.bridge-nf-call-iptables = 1
+net.bridge.bridge-nf-call-ip6tables = 1
+net.ipv4.ip_forward = 1
+net.ipv4.conf.all.forwarding = 1
+
+# Connection tracking (for NodePort, Service)
+net.netfilter.nf_conntrack_max = 2097152
+net.netfilter.nf_conntrack_tcp_timeout_established = 86400
+
+# File watchers (for kubelet, containerd)
+fs.inotify.max_user_instances = 8192
+fs.inotify.max_user_watches = 524288
+
+# Memory management
+vm.swappiness = 0
+vm.overcommit_memory = 1      # Allow overcommit (CRI-O, containerd)
+vm.panic_on_oom = 0
+kernel.panic = 10
+kernel.panic_on_oops = 1
+```
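+
+The `net.bridge.*` keys only exist once the `br_netfilter` kernel module is loaded, so a sysctl file like the one above is typically paired with a module list. A minimal sketch; the file name is an assumption chosen for this example.
+
+```
+# /etc/modules-load.d/k8s.conf (file name is an assumption)
+overlay
+br_netfilter
+```
+
+After a reboot, or after running `modprobe overlay` and `modprobe br_netfilter` by hand, `sysctl --system` applies the settings.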
+
+### Container storage
+
+| Type | Recommendation | Note |
+|------|----------------|------|
+| **OS disk** | RAID 1 (2× NVMe) | Ext4/XFS, 100-200 GB |
+| **Container runtime image** | RAID 1 (2× NVMe) | /var/lib/containerd, 200-500 GB |
+| **Local PV** | Single NVMe | Raw device, no RAID |
+| **Rook/Ceph OSD** | Raw NVMe/SATA | HBA/IT mode, no RAID |
+| **Longhorn** | Raw NVMe/SATA | Ext4/XFS per volume |
+
+---
+
+## 4. Storage server (Ceph / MinIO / NAS)
+
+### Ceph OSD node
+
+| Component | Recommendation | Note |
+|-----------|----------------|------|
+| **CPU** | 1-2 cores per OSD | Up to 12 OSDs per node (24 cores) |
+| **RAM** | 4-8 GB per OSD + OS | BlueStore cache, 16-64 GB min |
+| **Network** | 2× 25/100 GbE | Public + Cluster network |
+| **Storage** | 10-12× NVMe/SATA SSD OSD | HBA/IT mode, no RAID |
+| **OS disk** | 2× SATA SSD RAID 1 | OS, Ceph MON/MGR |
+
+**BIOS for Ceph:**
+- SATA/NVMe: AHCI/NVMe mode (not RAID)
+- C-States: Disabled (lower OSD latency)
+- NUMA: Enabled
+- Power: Performance
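+
+The per-OSD RAM figure from the table is usually enforced through BlueStore's memory target. A sketch for a hypothetical node with 12 OSDs and 128 GB RAM (both assumptions for this example); note that the value is a target that BlueStore tunes its cache toward, not a hard limit.
+
+```
+# ceph.conf (sketch; assumes 12 OSDs on a node with 128 GB RAM)
+[osd]
+osd_memory_target = 6442450944        # 6 GiB per OSD, 72 GiB for 12 OSDs
+```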
+
+### MinIO node
+
+| Component | Recommendation |
+|-----------|----------------|
+| **CPU** | 8-16 cores (32+ for erasure coding) |
+| **RAM** | 32-64 GB + 1 GB per 1 TB storage |
+| **Storage** | 4-16× NVMe (direct, no RAID) |
+| **Network** | 2× 25/100 GbE |
+| **OS** | Ubuntu / RHEL, XFS (for data) |
+
+### NAS (TrueNAS / FreeNAS)
+
+- **ZFS**: RAID-Z1/Z2/Z3, compression (lz4, zstd), dedup
+- **ARC cache**: 1 GB per 1 TB storage (max 64 GB)
+- **L2ARC**: NVMe cache (optional, read-heavy)
+- **SLOG**: NVDIMM / Optane (sync write, ZIL)
+- **Network**: 2-4× 10/25 GbE LACP
+
+---
+
+## 5. Web / API servers
+
+| Parameter | Recommendation |
+|-----------|----------------|
+| **CPU** | High clock, 8-32 cores |
+| **RAM** | 32-128 GB |
+| **Storage** | 2× NVMe RAID 1 (OS + app) |
+| **OS** | Ubuntu / RHEL, optimized kernel |
+| **Network** | 2× 10/25 GbE (bonding) |
+
+**Kernel tuning:**
+```
+net.ipv4.tcp_tw_reuse = 1
+net.ipv4.tcp_fin_timeout = 15
+net.core.somaxconn = 65535
+net.ipv4.tcp_max_syn_backlog = 65535
+net.core.netdev_max_backlog = 65535
+```
+
+---
+
+## Quick decision tree: choosing a server
+
+```
+Which workload?
+│
+├── Database (OLTP)
+│   → EPYC, high clock, 8-16 GB/core, RAID10 NVMe, huge pages
+│
+├── Database (OLAP)
+│   → Xeon AMX/AVX-512, 16-64 GB/core, many cores
+│
+├── Virtualization
+│   → EPYC, many cores, 4-8 GB/core, shared storage (SAN/NFS/vSAN)
+│
+├── Kubernetes
+│   → EPYC, balance, 2-4 GB/core, local NVMe
+│
+├── AI/ML training
+│   → GPU node (H100/B200), NVLink, InfiniBand, liquid cooling
+│
+├── AI/ML inference
+│   → A100/H200, MIG, large VRAM, PCIe 5.0
+│
+├── Storage (Ceph/SDS)
+│   → EPYC (PCIe lanes), HBA mode, 4-8 GB/OSD
+│
+└── Web / API
+    → EPYC, high clock, 2-4 GB/core, 10 GbE
+```
+
+## Sources
+
+Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)