SERVER-CONFIG.md
# ⚙️ Server configuration: best practices by workload

## General BIOS/UEFI settings

| Setting | Recommendation | Rationale |
|---------|----------------|-----------|
| **Boot mode** | UEFI | Secure Boot, GPT, boot disks larger than 2 TB |
| **Power profile** | Performance / OS Control | Maximum performance, C-states disabled |
| **Hyper-Threading** | Enabled | Higher throughput for multithreaded workloads (gain is workload-dependent, often cited at up to about 30 %) |
| **Virtualization** | Enabled (VT-x/AMD-V) | Required for hypervisors; plain containers do not need it, VM-isolated runtimes (e.g. Kata Containers) do |
| **SR-IOV** | Enabled | Virtual functions for NIC and GPU passthrough |
| **NUMA** | Enabled | NUMA-aware scheduling |
| **ACPI** | Enabled | OS-level power management |
| **Secure Boot** | Enabled | Verified boot chain |
| **TPM** | Enabled | Measured boot, key storage |

---

## 1. Database servers

### CPU selection

| DB type | CPU preference | Rationale |
|---------|----------------|-----------|
| **OLTP** (PostgreSQL, MySQL) | High clock, moderate core count | Low per-transaction latency, limited parallelism |
| **OLAP** (ClickHouse, Snowflake) | Many cores, AVX-512 | Columnar storage, high parallelism |
| **In-memory** (Redis, Memcached) | High clock, low cache latency | Single-threaded command execution (Redis), RAM bandwidth |
| **Document** (MongoDB) | Balanced (clock × cores) | Mixed workload |
| **Distributed** (Cassandra, Scylla) | Many cores, large cache | Shard-per-core (Scylla), compaction |

### Storage layout

```
Mount point     FS      RAID                   Disk type    Purpose
─────────────   ─────   ────────────────────   ──────────   ────────────────────────────
/               ext4    1 (mirror)             2× SSD       OS, binaries
/data           xfs     10 (striped mirrors)   4-8× NVMe    Database data
/wal            xfs     1 (mirror)             2× NVMe      Write-ahead log (PostgreSQL)
/tmp            tmpfs   -                      RAM          Temporary files
```

### PostgreSQL specific

| Parameter | Recommendation | Note |
|-----------|----------------|------|
| `shared_buffers` | 25 % RAM | Cache for database blocks |
| `effective_cache_size` | 75 % RAM | Planner's estimate of memory available for caching (shared buffers plus OS page cache); allocates nothing |
| `work_mem` | 4-64 MB per operation | Sorts and hash joins; size it against `max_connections`, because every active query can use it several times |
| `maintenance_work_mem` | 1-10 % RAM | VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY |
| `wal_buffers` | 64-256 MB | Write-ahead log buffer; the default (-1) auto-sizes to 1/32 of `shared_buffers`, capped at one WAL segment (typically 16 MB) |
| `max_connections` | 50-500 | Keep it low and use connection pooling (PgBouncer) |
| `random_page_cost` | 1.1 (NVMe), 4 (HDD) | Planner cost of a random page read relative to a sequential one (1.0); on NVMe the two are nearly equal |
| `effective_io_concurrency` | 200 (NVMe), 2 (HDD) | Number of concurrent I/O requests the storage can service |

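
The percentages in the PostgreSQL table can be turned into concrete starting values for a given host. The following is a minimal Python sketch, not an official tuning tool. The function name `postgres_memory_settings` is hypothetical, and the 25 % RAM budget used to derive `work_mem` is an illustrative assumption; only the 25 % / 75 % ratios and the 4-64 MB range come from the table.

```python
def postgres_memory_settings(ram_gb: int, max_connections: int = 200) -> dict:
    """Derive starting values from total RAM using the table's rules of thumb."""
    ram_mb = ram_gb * 1024
    # Assumption (not from the table): all connections together may spend
    # at most 25 % of RAM on sorts and hashes, one operation each.
    work_mem_mb = ram_mb * 0.25 / max_connections
    return {
        "shared_buffers": f"{ram_mb // 4}MB",                   # 25 % of RAM
        "effective_cache_size": f"{ram_mb * 3 // 4}MB",         # 75 % of RAM
        "work_mem": f"{int(max(4, min(64, work_mem_mb)))}MB",   # clamped to 4-64 MB
    }


if __name__ == "__main__":
    for name, value in postgres_memory_settings(64, max_connections=500).items():
        print(f"{name} = {value}")
```

Treat the result as a first guess to validate under real load; a query with several sort or hash nodes can use `work_mem` more than once.
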
### MySQL / MariaDB specific

| Parameter | Recommendation | Note |
|-----------|----------------|------|
| `innodb_buffer_pool_size` | 70-80 % RAM | Main InnoDB cache |
| `innodb_log_file_size` | 1-4 GB | Redo log; larger means better write performance but longer crash recovery. Deprecated since MySQL 8.0.30 in favor of `innodb_redo_log_capacity` |
| `innodb_flush_log_at_trx_commit` | 1 (ACID) / 2 (perf) | 1 = write and fsync on every commit; 2 = write on commit, fsync about once per second |
| `innodb_io_capacity` | 2000 (NVMe) / 200 (HDD) | IOPS budget for InnoDB background work such as flushing |
| `innodb_write_io_threads` | 4-8 | Background write threads |
| `max_connections` | 100-500 | Connection pooling recommended |

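
MySQL keeps the buffer pool at a multiple of `innodb_buffer_pool_chunk_size` × `innodb_buffer_pool_instances` and adjusts other values automatically, so it can help to compute an aligned size up front. The sketch below is one way to do that in Python. The function name is hypothetical, and the defaults (75 % of RAM, 8 instances, 128 MiB chunks) are assumptions to adapt to the server version in use.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def innodb_buffer_pool_size(ram_gib: int, fraction: float = 0.75,
                            instances: int = 8, chunk_mib: int = 128) -> int:
    """Return a buffer pool size in bytes, rounded down to a whole number of
    (chunk size x instances) units so the server has nothing to adjust."""
    unit = chunk_mib * MIB * instances
    target = int(ram_gib * GIB * fraction)
    # Never go below a single unit, the smallest aligned size.
    return max(unit, target // unit * unit)


if __name__ == "__main__":
    size = innodb_buffer_pool_size(64)
    print(f"innodb_buffer_pool_size = {size}  # {size // GIB} GiB")
```

Rounding down rather than up keeps the pool inside the 70-80 % budget from the table.
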
### MongoDB specific

| Parameter | Recommendation | Note |
|-----------|----------------|------|
| **WiredTiger cache** | 50-80 % RAM | Storage engine cache; the default is the larger of 50 % of (RAM - 1 GB) and 256 MB |
| **WiredTiger compression** | Snappy / Zstd | On-disk compression (zlib is slower) |
| **Filesystem** | XFS | Recommended FS (ext4 is acceptable, NTFS is not) |
| **ReadConcern/WriteConcern** | majority/majority | For important data |

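
To judge whether the WiredTiger cache needs changing at all, compare the recommended 50-80 % range with what MongoDB picks when nothing is configured. MongoDB documents that default as the larger of 50 % of (RAM - 1 GB) and 256 MB. The Python sketch below only does that arithmetic; both function names are hypothetical.

```python
def wiredtiger_default_cache_gb(ram_gb: float) -> float:
    """Documented default: the larger of 50 % of (RAM - 1 GB) and 256 MB."""
    return max(0.5 * (ram_gb - 1), 0.25)


def wiredtiger_cache_range_gb(ram_gb: float) -> tuple[float, float]:
    """The 50-80 % of RAM range recommended in the table."""
    return (0.5 * ram_gb, 0.8 * ram_gb)


if __name__ == "__main__":
    ram = 64
    low, high = wiredtiger_cache_range_gb(ram)
    print(f"default: {wiredtiger_default_cache_gb(ram)} GB, "
          f"recommended range: {low}-{high} GB")
```

The upper end of the range leaves little room for the filesystem cache that WiredTiger also relies on, so it suits dedicated database hosts only.
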
### Kernel tuning for DB

```
# /etc/sysctl.d/99-database.conf
# Comments sit on their own lines; sysctl.d files do not define inline comments.

# Minimize swapping, prefer keeping the page cache
vm.swappiness = 1
# % of available memory that may be dirty before writers block on writeback
vm.dirty_ratio = 30
# Background flush starts at 5 %
vm.dirty_background_ratio = 5
# 0 = nothing reserved; set a page count if the DB uses explicit huge pages (e.g. PostgreSQL)
vm.nr_hugepages = 0
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.somaxconn = 4096

# I/O scheduler (NVMe = none, SATA/SAS SSD = mq-deadline, HDD = mq-deadline/bfq)
# echo none > /sys/block/nvme0n1/queue/scheduler
```

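
When explicit huge pages are used, `vm.nr_hugepages` has to cover the database's shared memory segment. A rough way to estimate the count is sketched below in Python, assuming the common 2 MB (2048 kB) huge page size on x86-64. The function name and the 5 % headroom are illustrative assumptions; recent PostgreSQL versions can report the exact figure through the `shared_memory_size_in_huge_pages` parameter, which should be preferred where available.

```python
import math


def hugepages_needed(shared_mem_gb: float, hugepage_kb: int = 2048,
                     headroom: float = 0.05) -> int:
    """Estimate vm.nr_hugepages for a shared memory segment of the given size.

    headroom is an assumed safety margin for memory beyond shared_buffers.
    """
    bytes_needed = shared_mem_gb * 1024 ** 3 * (1 + headroom)
    return math.ceil(bytes_needed / (hugepage_kb * 1024))


if __name__ == "__main__":
    # Example: 16 GB of shared_buffers (25 % of a 64 GB host)
    print(f"vm.nr_hugepages = {hugepages_needed(16)}")
```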
---

## 2. Hypervisor host (ESXi / KVM / Hyper-V)

### CPU and NUMA

| Setting | Recommendation | Note |
|---------|----------------|------|
| **Overcommit ratio** | 1:1 to 4:1 (vCPU:pCPU) | Depends on the workload (1:1 for DB, 4:1 for web) |
| **NUMA-aware sizing** | VM ≤ 1 NUMA node | Cross-NUMA access costs roughly 1.5-2× the memory latency |
| **CPU pinning** | Yes for dedicated VMs | Keeps vCPUs on fixed cores, reducing migrations and context switches |
| **C-States** | Disabled (in BIOS) | Lower latency, higher power draw |
| **P-State** | OS control / Performance | HW power management |
| **Hyper-Threading** | Enabled | More vCPUs; watch for noisy neighbors |

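
The two sizing rules from the table (keep a VM within one NUMA node, and cap the vCPU:pCPU ratio) are easy to check before placing a VM. Here is a minimal Python sketch; the `Host` class and both function names are hypothetical, and the model assumes every NUMA node has the same number of cores and the same amount of memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    numa_nodes: int        # often one per socket
    cores_per_node: int    # physical cores in each NUMA node
    ram_gb_per_node: int   # memory attached to each NUMA node


def fits_one_numa_node(host: Host, vcpus: int, ram_gb: int) -> bool:
    """True if the VM can be placed without spanning NUMA nodes."""
    return vcpus <= host.cores_per_node and ram_gb <= host.ram_gb_per_node


def vcpu_budget(host: Host, overcommit: float) -> int:
    """Total vCPUs the host may hand out at the given vCPU:pCPU ratio."""
    return int(host.numa_nodes * host.cores_per_node * overcommit)


if __name__ == "__main__":
    host = Host(numa_nodes=2, cores_per_node=32, ram_gb_per_node=256)
    print(fits_one_numa_node(host, vcpus=16, ram_gb=128))
    print(vcpu_budget(host, overcommit=4.0))
```

A VM that fails the first check can still run, but it will pay the cross-NUMA latency penalty on part of its memory accesses.
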
### Storage for the hypervisor

```
VM storage:
├── OS datastore            RAID 1  (2× SATA SSD)   - ESXi boot, images
├── VM datastore (gold)     RAID 10 (4× NVMe)       - critical VMs, DB
├── VM datastore (silver)   RAID 5  (6× SAS SSD)    - general VMs
└── VM datastore (bronze)   RAID 6  (8× SATA HDD)   - backup, archive

Swap datastore: 1× NVMe or SATA SSD (dedicated)
```

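
The RAID levels in the datastore tiers trade capacity for redundancy in different ways. The Python sketch below computes how many disks' worth of capacity each layout leaves usable; the function name is hypothetical, and only the four levels used in the tiers are handled.

```python
def raid_usable_disks(level: int, disks: int) -> int:
    """Number of disks' worth of usable capacity for common RAID levels."""
    if level == 1 and disks >= 2:
        return 1                 # every disk holds the same data
    if level == 10 and disks >= 4 and disks % 2 == 0:
        return disks // 2        # striped two-way mirrors
    if level == 5 and disks >= 3:
        return disks - 1         # one disk's worth of parity
    if level == 6 and disks >= 4:
        return disks - 2         # two disks' worth of parity
    raise ValueError(f"unsupported layout: RAID {level} with {disks} disks")


if __name__ == "__main__":
    tiers = {"OS": (1, 2), "gold": (10, 4), "silver": (5, 6), "bronze": (6, 8)}
    for name, (level, disks) in tiers.items():
        usable = raid_usable_disks(level, disks)
        print(f"{name}: RAID {level}, {disks} disks, {usable} usable")
```

Multiplying the result by the size of a single disk gives the raw usable capacity before filesystem overhead.
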
### Network design

| Traffic | VLAN | Speed | NIC teaming |
|---------|------|-------|-------------|
| **Management** | Mgmt VLAN | 1 GbE | Active/Passive |
| **VM traffic** | VM VLANs | 25/100 GbE | LACP (802.3ad) |
| **Storage** | Storage VLAN | 25/100 GbE | LACP / RDMA |
| **vMotion** | vMotion VLAN | 25/100 GbE | Dedicated, multi-NIC |
| **FT (Fault Tolerance)** | FT VLAN | 10 GbE | Dedicated, low latency |

### BIOS for the hypervisor

| Setting | Value | Rationale |
|---------|-------|-----------|
| Hyper-Threading | Enabled | Higher VM density |
| Virtualization Technology | Enabled | VT-x/AMD-V |
| VT-d / IOMMU | Enabled | Passthrough, SR-IOV |
| Power Management | Performance / OS | Minimizes VM latency |
| C-States | Disabled | Lower wake-up latency for vCPUs |
| NUMA | Enabled | NUMA-aware VM placement |
| SR-IOV | Enabled | NIC/GPU virtualization |

---

## 3. Kubernetes node

### Node profiles

| Role | CPU | RAM | Storage | Use case |
|------|-----|-----|---------|----------|
| **General purpose** | 16-32 cores | 64-128 GB | 1× NVMe OS + 1× NVMe local | Web, API, microservices |
| **Memory optimized** | 32-64 cores | 256-512 GB | 1× NVMe OS + 2× NVMe local | In-memory cache, DB |
| **Compute optimized** | 64-128 cores | 128-256 GB | 1× NVMe OS | Batch, CI/CD |
| **GPU node** | 32-64 cores | 512-1024 GB | 1× NVMe OS + 4-8× NVMe local | AI/ML training, inference |
| **Storage node** | 16-32 cores | 64-128 GB | 4-12× NVMe/SATA (Ceph/Longhorn) | SDS, persistent volumes |

### Kernel tuning

```
# /etc/sysctl.d/99-kubernetes.conf
# The net.bridge.* keys exist only after the br_netfilter module is loaded
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.ipv4.conf.all.forwarding = 1

# Connection tracking (for NodePort, Service)
net.netfilter.nf_conntrack_max = 2097152
net.netfilter.nf_conntrack_tcp_timeout_established = 86400

# File watchers (for kubelet, containerd)
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288

# Memory management
vm.swappiness = 0
# Always allow overcommit (CRI-O, containerd)
vm.overcommit_memory = 1
vm.panic_on_oom = 0
kernel.panic = 10
kernel.panic_on_oops = 1
```

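
A drop-in file like the one above is easier to audit when it is parsed into key/value pairs that can be compared with the live values under `/proc/sys`. The sketch below is a minimal Python illustration. The helper names `parse_sysctl_conf` and `proc_path` are hypothetical, and the parser deliberately handles only the simple `key = value` form with full-line `#` or `;` comments.

```python
def parse_sysctl_conf(text: str) -> dict[str, str]:
    """Parse sysctl.d-style text into {key: value}; later entries win."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue                      # blank line or full-line comment
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def proc_path(key: str) -> str:
    """Map a sysctl key to its /proc/sys path (dots become slashes)."""
    return "/proc/sys/" + key.replace(".", "/")


if __name__ == "__main__":
    sample = """
    # Connection tracking
    net.netfilter.nf_conntrack_max = 2097152
    vm.swappiness = 0
    """
    for key, wanted in parse_sysctl_conf(sample).items():
        print(f"{proc_path(key)} should contain {wanted}")
```

The simple dot-to-slash mapping is enough for the keys listed here; keys that embed interface names containing dots would need extra handling.
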
### Container storage

| Type | Recommendation | Note |
|------|----------------|------|
| **OS disk** | RAID 1 (2× NVMe) | ext4/XFS, 100-200 GB |
| **Container runtime images** | RAID 1 (2× NVMe) | /var/lib/containerd, 200-500 GB |
| **Local PV** | Single NVMe | Raw device, no RAID |
| **Rook/Ceph OSD** | Raw NVMe/SATA | HBA/IT mode, no RAID |
| **Longhorn** | Raw NVMe/SATA | ext4/XFS per volume |

---

## 4. Storage server (Ceph / MinIO / NAS)

### Ceph OSD node

| Component | Recommendation | Note |
|-----------|----------------|------|
| **CPU** | 1-2 cores per OSD | Up to 12 OSDs per node (24 cores) |
| **RAM** | 4-8 GB per OSD + OS | BlueStore cache; 16-64 GB minimum |
| **Network** | 2× 25/100 GbE | Public + cluster network |
| **Storage** | 10-12× NVMe/SATA SSD OSD | HBA/IT mode, no RAID |
| **OS disk** | 2× SATA SSD RAID 1 | OS, Ceph MON/MGR |

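
The per-OSD figures in the table translate directly into a node specification. The Python sketch below does that multiplication; the function name is hypothetical, and the 16 GB reserved for the OS and co-located daemons is an assumption, since the table only says "+ OS".

```python
def ceph_osd_node_sizing(osds: int, cores_per_osd: int = 2,
                         ram_gb_per_osd: int = 8, os_ram_gb: int = 16) -> dict:
    """Size a Ceph OSD node from per-OSD rules of thumb.

    os_ram_gb is an assumed reserve for the OS and MON/MGR daemons.
    """
    return {
        "cores": osds * cores_per_osd,
        "ram_gb": osds * ram_gb_per_osd + os_ram_gb,
    }


if __name__ == "__main__":
    print(ceph_osd_node_sizing(12))                                     # upper end of the table
    print(ceph_osd_node_sizing(12, cores_per_osd=1, ram_gb_per_osd=4))  # lower end
```

Running both ends of the range gives a bracket for budgeting rather than a single number.
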
**BIOS for Ceph:**
- SATA/NVMe: AHCI/NVMe mode (not RAID)
- C-States: Disabled (lower OSD latency)
- NUMA: Enabled
- Power: Performance

### MinIO node

| Component | Recommendation |
|-----------|----------------|
| **CPU** | 8-16 cores (32+ for erasure coding) |
| **RAM** | 32-64 GB + 1 GB per 1 TB of storage |
| **Storage** | 4-16× NVMe (direct, no RAID) |
| **Network** | 2× 25/100 GbE |
| **OS** | Ubuntu / RHEL, XFS (for data) |

### NAS (TrueNAS / FreeNAS)

- **ZFS**: RAID-Z1/Z2/Z3, compression (lz4, zstd), dedup (only with ample RAM for the dedup table)
- **ARC cache**: 1 GB per 1 TB of storage (max 64 GB)
- **L2ARC**: NVMe cache (optional, for read-heavy workloads)
- **SLOG**: NVDIMM / Optane (holds the ZIL for sync writes)
- **Network**: 2-4× 10/25 GbE LACP

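
The ARC rule of thumb and the RAID-Z parity levels from the list above can be combined into a quick sizing check. This is a minimal Python sketch with hypothetical function names; the capacity figure ignores ZFS padding and metadata overhead, so real usable space will be somewhat lower.

```python
RAIDZ_PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def arc_size_gb(storage_tb: float, cap_gb: float = 64) -> float:
    """1 GB of ARC per 1 TB of storage, capped at cap_gb."""
    return min(float(storage_tb), cap_gb)


def raidz_usable_tb(level: str, disks: int, disk_tb: float) -> float:
    """Raw usable capacity of one RAID-Z vdev (parity disks subtracted)."""
    parity = RAIDZ_PARITY[level]
    if disks <= parity:
        raise ValueError(f"{level} needs more than {parity} disks")
    return (disks - parity) * disk_tb


if __name__ == "__main__":
    usable = raidz_usable_tb("raidz2", disks=8, disk_tb=16)
    print(f"usable: {usable} TB, suggested ARC: {arc_size_gb(usable)} GB")
```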
---

## 5. Web / API servers

| Parameter | Recommendation |
|-----------|----------------|
| **CPU** | High clock, 8-32 cores |
| **RAM** | 32-128 GB |
| **Storage** | 2× NVMe RAID 1 (OS + app) |
| **OS** | Ubuntu / RHEL, tuned kernel |
| **Network** | 2× 10/25 GbE (bonding) |

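**Kernel tuning:**
```
# Reuse TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
# Seconds an orphaned connection may stay in FIN_WAIT_2
net.ipv4.tcp_fin_timeout = 15
# Upper bound on the accept queue requested via listen()
net.core.somaxconn = 65535
# Queue of half-open connections waiting for the final ACK
net.ipv4.tcp_max_syn_backlog = 65535
# Packets queued per CPU when the NIC outpaces the kernel
net.core.netdev_max_backlog = 65535
```
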
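
Raising `net.core.somaxconn` only lifts the ceiling: the accept queue a server actually gets is the smaller of the backlog it passes to `listen()` and the kernel limit. The Python sketch below shows that relationship; the function name is hypothetical, and 511 is used as an example because nginx requests that backlog by default on Linux.

```python
def effective_backlog(requested: int, somaxconn: int) -> int:
    """Accept queue length the kernel grants for a listen() backlog request."""
    return min(requested, somaxconn)


if __name__ == "__main__":
    # The application asks for 511, the kernel allows 65535: 511 wins.
    print(effective_backlog(511, 65535))
    # The application asks for 65535, the kernel allows 4096: 4096 wins.
    print(effective_backlog(65535, 4096))
```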
---

## Quick decision tree: choosing a server

```
Which workload?
│
├── Database (OLTP)
│   → EPYC, high clock, 8-16 GB/core, RAID10 NVMe, huge pages
│
├── Database (OLAP)
│   → Xeon AMX/AVX-512, 16-64 GB/core, many cores
│
├── Virtualization
│   → EPYC, many cores, 4-8 GB/core, shared storage (SAN/NFS/vSAN)
│
├── Kubernetes
│   → EPYC, balanced, 2-4 GB/core, local NVMe
│
├── AI/ML training
│   → GPU node (H100/B200), NVLink, InfiniBand, liquid cooling
│
├── AI/ML inference
│   → A100/H200, MIG, large VRAM, PCIe 5.0
│
├── Storage (Ceph/SDS)
│   → EPYC (PCIe lanes), HBA mode, 4-8 GB/OSD
│
└── Web / API
    → EPYC, high clock, 2-4 GB/core, 10 GbE
```

## Sources

Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)