Compare commits

..

2 Commits

Author SHA1 Message Date
Stanislav Hubacek
ef3c2f75b1 18.6.2026 2026-06-18 16:25:33 +02:00
Stanislav Hubacek
b53714113c new files 2026-06-16 15:47:45 +02:00
47 changed files with 5923 additions and 124 deletions

600
AI-INFRASTRUCTURE.en.md Normal file

@@ -0,0 +1,600 @@
# 🧠 AI/ML Infrastructure
## Component overview
```mermaid
flowchart TD
subgraph Compute
GPU["GPU (H100/B200/Instinct)"]
CPU["CPU (AMD EPYC / Intel Xeon)"]
ASIC["ASIC (TPU, Trainium, Inferentia)"]
end
subgraph Network
IB["InfiniBand NDR/XDR"]
ROCE["RoCEv2"]
NVL["NVLink / NVSwitch"]
end
subgraph Storage
FS["Parallel FS (Lustre, GPFS, Weka)"]
OBJ["Object Store (S3, MinIO)"]
NVME["Local NVMe cache"]
end
subgraph Orchestration
S["Slurm"]
K["Kubernetes + Volcano/Kueue"]
end
subgraph Cooling
DLC["Direct-to-chip liquid"]
IMM["Immersion"]
AIR["Air (high-density)"]
end
Compute --> Network --> Storage
Orchestration --> Compute
Cooling --> Compute
```
---
## GPU compute
### NVIDIA
| GPU | Architecture | FP8 | FP16/BF16 | FP64 | HBM | NVLink | TDP | Rack config |
|-----|-------------|-----|-----------|------|-----|--------|-----|------|
| **H100 SXM** | Hopper | 3,958 TFLOPS | 1,979 TFLOPS | 67 TFLOPS | 80 GB HBM3 | 900 GB/s | 700 W | 6–8× in DGX H100 |
| **H200 SXM** | Hopper (HBM3e) | 3,958 TFLOPS | 1,979 TFLOPS | 67 TFLOPS | 141 GB HBM3e | 900 GB/s | 700 W | 6–8× in DGX H200 |
| **B200** | Blackwell | ~9,000 TFLOPS | ~4,500 TFLOPS | ~40 TFLOPS | 192 GB HBM3e | 1,800 GB/s | 1,000 W | 6–8× in DGX B200 |
| **GB200 Grace Blackwell** | Blackwell | ~18,000 TFLOPS | ~9,000 TFLOPS | — | 192 GB + 480 GB (Grace) | NVLink-C2C | 1,000 W (GPU) + 500 W (CPU) | DGX GB200 (36× GPU) |
| **L40S** | Ada Lovelace | 733 TFLOPS | 367 TFLOPS | — | 48 GB GDDR6 | N/A | 350 W | Inference, enterprise |
| **A100 SXM** | Ampere | — (no FP8; INT8: 1,248 TOPS) | 624 TFLOPS | 19.5 TFLOPS | 80 GB HBM2e | 600 GB/s | 400 W | DGX A100 |
### AMD
| GPU | Architecture | FP8 | FP16/BF16 | FP64 | HBM | Infinity Fabric | TDP |
|-----|-------------|-----|-----------|------|-----|----------------|-----|
| **MI300X** | CDNA 3 | 2,615 TFLOPS | 1,307 TFLOPS | 81 TFLOPS | 192 GB HBM3 | 896 GB/s | 750 W |
| **MI250** | CDNA 2 | — | 383 TFLOPS | 95.7 TFLOPS | 128 GB HBM2e | 400 GB/s | 500 W |
### Intel
| GPU | Architecture | FP16/BF16 | FP32 | HBM | TDP |
|-----|-------------|-----------|------|-----|-----|
| **Gaudi 3** | Custom | 1,835 TFLOPS | — | 128 GB HBM2e | 600 W |
| **Max 1550** | Xe HPC | 600+ TFLOPS | 200 TFLOPS | 128 GB HBM2e | 600 W |
### Cloud ASIC
| ASIC | Provider | Use case | Performance |
|------|----------|----------|-------|
| **TPU v5p** | Google | Training | ~459 TFLOPS (BF16) per chip |
| **Trainium 2** | AWS | Training | ~1,000 TFLOPS (BF16) per chip |
| **Inferentia 2** | AWS | Inference | ~400 TOPS (INT8) per chip |
| **Maia 100** | Microsoft | Training + inference | Custom, 800 W TDP |
---
## AI networking
### Technology comparison
| Technology | Bandwidth per link | Latency | Topology | Use case |
|-------------|-------------------|---------|-----------|----------|
| **InfiniBand NDR200** | 200 Gb/s | < 1 µs | Fat-tree, Dragonfly+ | Training (NVIDIA) |
| **InfiniBand NDR400** | 400 Gb/s | < 1 µs | Fat-tree, Dragonfly+ | Training (NVIDIA) |
| **InfiniBand XDR** | 800 Gb/s (planned) | < 1 µs | Dragonfly+ | Next-gen training |
| **RoCEv2** (CX-7/8) | 200–400 Gb/s | 1–2 µs | Fat-tree, Spine-leaf | Training (AMD, Intel, open) |
| **NVLink 4.0** | 900 GB/s per GPU | < 0.5 µs | NVSwitch full-mesh | Intra-node GPU comm |
| **NVLink 5.0** | 1,800 GB/s per GPU | < 0.5 µs | NVSwitch full-mesh | Intra-node (Blackwell) |
| **Ethernet (400 GbE)** | 400 Gb/s | 2–5 µs | Spine-leaf | Inference, data pipeline |
### AI fabric principles
- **Rail-optimized topology** — each GPU communicates on dedicated "rails" (same GPU indices across nodes connect to the same switch)
- **Fat-tree (Clos)** — standard for InfiniBand and RoCE, non-blocking bisection bandwidth
- **Dragonfly+** — reduces hop count while maintaining bandwidth (used in largest clusters)
- **GPU Direct RDMA** — direct GPU ↔ GPU communication without CPU involvement, supports InfiniBand and RoCE
- **SHARP (Scalable Hierarchical Aggregation and Reduction Protocol)** — in-network reduction for AllReduce (InfiniBand only)
### Bandwidth sizing
```text
Rule of thumb: InfiniBand bandwidth ≥ 50 % GPU HBM bandwidth for scalable training
Example: H100 has 3.35 TB/s HBM
→ Rule would need min. 1.6 TB/s network bandwidth per GPU
→ Hypothetical: 4× NDR400 IB per GPU = 4 × 50 GB/s = 200 GB/s = ~6 % of HBM
→ Reality: DGX H100 has 1× NDR400 (400 Gb/s = 50 GB/s) per GPU = ~1.5 % of HBM → bottleneck
```
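
The arithmetic behind this sizing rule can be reproduced with a short script. This is a minimal sketch: the function names are hypothetical, link rates are converted at 8 bits per byte with encoding overhead ignored, and the 3.35 TB/s HBM figure is the one quoted above.

```python
def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert a link rate from Gb/s to GB/s (8 bits per byte, overhead ignored)."""
    return gbit_per_s / 8.0


def network_to_hbm_ratio(links_per_gpu: int, link_gbit_per_s: float,
                         hbm_tbyte_per_s: float) -> float:
    """Fraction of one GPU's HBM bandwidth that its network links can deliver."""
    network_gbyte_per_s = links_per_gpu * gbit_to_gbyte(link_gbit_per_s)
    return network_gbyte_per_s / (hbm_tbyte_per_s * 1000.0)


if __name__ == "__main__":
    # Assumed inputs: NDR400 links (400 Gb/s) and H100 HBM at 3.35 TB/s.
    for links in (1, 4):
        ratio = network_to_hbm_ratio(links, 400.0, 3.35)
        print(f"{links}x NDR400 per GPU: {ratio:.1%} of HBM bandwidth")
```

Either way the network stays far below the 50 % target, which is why collective-communication efficiency matters more than raw link count.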
---
## AI storage
### Requirements
| Dataset size | IO pattern | Recommended storage | Bandwidth |
|-------------|-----------|-------------------|-----------|
| < 10 TB | Sequential read (data loading) | Local NVMe | > 10 GB/s per node |
| 10–100 TB | Random read (checkpointing) | Parallel FS (Lustre, Weka) | > 100 GB/s cluster-wide |
| 100 TB–10 PB | Mixed (training + checkpoint) | Parallel FS + object store | > 500 GB/s |
| 10 PB+ | Multi-modal, video, LLM | Tiered (NVMe cache + parallel FS + object) | > 1 TB/s |
### Storage solution comparison
| Solution | Type | Bandwidth per node | Max capacity | Scaling | Use case |
|--------|-----|-------------------|-------------|-----------|----------|
| **Lustre** | Parallel FS (POSIX) | > 100 GB/s (cluster) | 100s PB | OST + MDS | HPC, LLM training (standard) |
| **GPFS / StorageScale** | Parallel FS (POSIX) | > 100 GB/s | 100s PB | NSD servers | HPC, AI (IBM) |
| **WekaFS** | Parallel FS (POSIX + NFS/SMB) | ~80 GB/s per 10 nodes | 10s PB | Container-native | AI/ML, NVIDIA DGX preferred |
| **VAST Data** | Universal storage (NVMe + QLC) | ~100 GB/s per cluster | 10s PB | Scale-out | AI, checkpoint, data lake |
| **Pure Storage//E** | All-flash (NVMe) | ~50 GB/s | ~30 PB | Scale-out | Enterprise AI, database |
| **MinIO / S3** | Object store | ~20 GB/s per gateway | EB | Erasure coding | Dataset repository, checkpoint |
| **NetApp AFF** | NAS + S3 | ~10 GB/s per controller | ~50 PB | HA pair | Enterprise, NFS baseline |
### Checkpointing strategies
| Strategy | RPO | Storage impact | Description |
|-----------|-----|---------------|-------|
| **Full checkpoint** | every N steps | High (stops training) | Full model + optimizer state |
| **Async checkpoint** | every N steps | Medium (non-blocking) | Copy to staging buffer, async write |
| **Distributed checkpoint** (NVIDIA NeMo) | every N steps | Low | Each rank writes its own shard |
| **In-memory checkpoint** (IBM) | on failover | Minimal (DRAM) | Replication to another node's DRAM |
| **Continuous checkpoint** (Microsoft) | every 15 min | Low (delta) | Changed shards only |
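
The async strategy in the table (copy to a staging buffer, then write in the background) can be sketched in a few lines. This is an illustration only, not how any particular framework implements it: `async_checkpoint` is a hypothetical helper, the state is a plain Python dict rather than real model tensors, and `pickle` stands in for a real checkpoint format.

```python
import copy
import pickle
import threading
from pathlib import Path


def async_checkpoint(state: dict, path: Path) -> threading.Thread:
    """Snapshot `state` now, then write it to `path` in a background thread.

    Hypothetical helper for illustration only.
    """
    # Blocking part: copy into a staging buffer so the training loop can
    # keep mutating `state` while the write is in flight.
    staged = copy.deepcopy(state)

    def _write() -> None:
        tmp = path.with_name(path.name + ".tmp")
        with open(tmp, "wb") as f:
            pickle.dump(staged, f)
        # Rename last so readers never see a half-written checkpoint.
        tmp.replace(path)

    writer = threading.Thread(target=_write)
    writer.start()
    return writer


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as d:
        state = {"step": 100, "weights": [0.1, 0.2]}
        writer = async_checkpoint(state, Path(d) / "ckpt.pkl")
        state["step"] = 101  # training continues immediately
        writer.join()
        with open(Path(d) / "ckpt.pkl", "rb") as f:
            print(pickle.load(f)["step"])
```

The only time the training loop is blocked is the in-memory copy; the storage write overlaps with the next training steps.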
---
## AI cluster architecture
### Physical topology — DGX H100 example
```
┌──────── DGX H100 (8× GPU) ────────┐
│ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │
│ │GPU 0│ │GPU 1│ │GPU 2│ │GPU 3│ │
│ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ │
│ ┌──┴──┐ ┌──┴──┐ ┌──┴──┐ ┌──┴──┐ │
│ │GPU 4│ │GPU 5│ │GPU 6│ │GPU 7│ │
│ └─────┘ └─────┘ └─────┘ └─────┘ │
│ NVSwitch (NVLink 4.0, 900 GB/s) │
│ InfiniBand CX-7: 8× NDR400 │
└────────────────────────────────────┘
│ 8× IB rails
┌────┴──────────────┐
│ IB NDR400 Switches │ (rail-optimized)
└────────────────────┘
```
### Kubernetes for AI
| Component | Role |
|-----------|------|
| **Volcano** | Batch scheduling, gang scheduling, queue management |
| **Kueue** | Multi-tenant admission, resource quotas, fair sharing |
| **NVIDIA GPU Operator** | Driver, container toolkit, MIG, DCGM, monitoring |
| **HAMi** (ex k8s-vGPU-scheduler) | GPU sharing, MIG partitioning, fractional GPU |
| **Node Feature Discovery** | GPU type detection, NUMA topology |
| **Topology Manager** | NUMA-aware pod placement |
| **DPDK / SR-IOV** | High-performance networking for GPU Direct RDMA |
### Slurm for AI
| Component | Role |
|-----------|------|
| **slurm.conf** | Partition for GPU nodes, GRES (Generic Resource) |
| **gres.conf** | GPU type, GPU count per node |
| **srun --gres=gpu:8** | Allocate 8 GPUs per job |
| **sbatch --nodes=64 --ntasks=512** | 64 nodes, 512 ranks (8 GPU/node) |
| **Pyxis** | NVIDIA SPANK plugin for Slurm that runs containerized jobs (via Enroot) |
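
The relationship between node count, GPUs per node, and rank count in the table can be made explicit with a small helper. This is a sketch under assumptions: `sbatch_command` and the script name `train.sh` are hypothetical, and it assumes one task (rank) per GPU, which is a common but not universal layout.

```python
def sbatch_command(nodes: int, gpus_per_node: int, script: str) -> list:
    """Build an sbatch argument list with one task (rank) per GPU."""
    if nodes < 1 or gpus_per_node < 1:
        raise ValueError("nodes and gpus_per_node must be positive")
    return [
        "sbatch",
        f"--nodes={nodes}",
        f"--ntasks={nodes * gpus_per_node}",
        f"--ntasks-per-node={gpus_per_node}",
        f"--gres=gpu:{gpus_per_node}",
        script,
    ]


if __name__ == "__main__":
    # 64 nodes with 8 GPUs each gives 512 ranks, as in the table above.
    print(" ".join(sbatch_command(64, 8, "train.sh")))
```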
---
## AI cluster cooling
### Power density comparison
| Configuration | TDP per node | Racks | kW/rack | Note |
|-------------|-------------|-------|---------|----------|
| Standard server (2U) | 1 kW | 20 | 5–10 | Typical DC |
| GPU server (DGX H100, 6×) | 42 kW | 6 | 45–50 | Air cooling limit |
| GPU server (DGX B200, 6×) | 72 kW | 6 | 90–100 | Liquid cooling required |
| GPU server (GB200 NVL72) | 120 kW | — | ~120 | Liquid cooling mandatory |
| NVIDIA NVL72 rack | 120 kW | 1 | 120 | Fully liquid cooled |
### Cooling technologies
| Method | Max kW/rack | CAPEX | OPEX | Complexity |
|--------|-------------|-------|------|-----------|
| **Air cooling (CRAC/CRAH)** | < 15 | Low | Medium | Low |
| **Air cooling (in-row)** | 15–30 | Medium | Medium | Low |
| **Rear-door heat exchanger** | 30–50 | Medium | Low | Medium |
| **Direct-to-chip liquid (cold plate)** | 50–150 | High | Low | High |
| **Immersion (single-phase)** | 100–200 | High | Low | High |
| **Immersion (two-phase)** | 200+ | Very high | Low | Very high |
---
## Inference infrastructure
### Inference server comparison
| Tool | Frameworks | Optimization | Use case |
|---------|-----------|-------------|----------|
| **vLLM** | Megatron, HF, AWQ, GPTQ | PagedAttention, KV cache, continuous batching | LLM inference (open source) |
| **TensorRT-LLM** | TensorRT | INT4/INT8/FP8, inflight batching, attention optimizations | Production (NVIDIA) |
| **Triton Inference Server** | All (TensorRT, vLLM, PyTorch) | Model ensemble, model caching, concurrent execution | Enterprise, multi-model |
| **SageMaker** | Managed | Auto-scaling, model parallelism | AWS managed |
| **OpenAI API / TGI** | HF Transformers | Continuous batching, flash attention | Hosting |
### Inference optimization
| Technique | Latency improvement | Throughput improvement | Memory reduction |
|----------|-----------------|---------------------|------------------|
| **FP8/INT8 quantization** | — | 2× | 2× |
| **INT4 quantization** | — | 4× | 4× |
| **Flash Attention 2/3** | 2–4× | — | 50 % (KV cache) |
| **PagedAttention** | — | 2–5× | 95 % (KV cache fragmentation) |
| **Continuous batching** | — | 10–20× | — |
| **Speculative decoding** | 2–3× | — | — |
| **Multi-LoRA / S-LoRA** | — | 8–16× | — |
---
## Distributed training techniques
| Technique | Description | Frameworks |
|----------|-------|------------|
| **Data Parallelism (DDP/FSDP)** | Each GPU has model copy, different batch | PyTorch DDP, FSDP |
| **Tensor Parallelism (TP)** | Each layer's weight matrices split across GPUs (typically intra-node) | Megatron-LM, DeepSpeed |
| **Pipeline Parallelism (PP)** | Layers split across nodes | Megatron-LM, DeepSpeed |
| **Sequence Parallelism (SP)** | Sequence split across GPUs | Megatron-LM |
| **Expert Parallelism (EP)** | Different expert subnets on different GPUs | Mixture-of-Experts (MoE) |
| **3D Parallelism** | TP + PP + DP combination | Megatron-LM, NeMo |
| **ZeRO (1/2/3)** | Optimizer/gradient/parameter sharding | DeepSpeed |
| **NCCL / RCCL** | GPU collective communication library | NVIDIA/AMD |
---
## Operating systems for AI
### Distribution comparison
| OS | GPU driver | CUDA | Container toolkit | IB/RoCE | Lustre client | Production support |
|----|-----------|------|-------------------|---------|--------------|-------------------|
| **Ubuntu 22.04 LTS** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED, rdma-core | Yes (lustre-client) | NVIDIA DGX standard |
| **Ubuntu 24.04 LTS** | NVIDIA 550+ | 12.5+ | nvidia-container-toolkit | MLNX_OFED, rdma-core | Yes | Latest GPU support |
| **RHEL 9 / Rocky 9** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED | Yes (EL repo) | Red Hat, enterprise |
| **DGX OS** (Ubuntu-based) | NVIDIA custom | 12.x | Pre-installed | Pre-configured | Yes | NVIDIA DGX only supported |
| **SLES 15 SP5** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED | Yes | HPC, some Lustre clusters |
| **Debian 12** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | rdma-core | Yes (backports) | Community, research |
| **Flatcar / Bottlerocket** | Container-host | — | nvidia-container-toolkit | Limited | No | K8s-only, minimal footprint |
### Limitations and constraints
#### GPU drivers and CUDA
| Constraint | Detail |
|----------|--------|
| **Driver-CUDA compatibility** | NVIDIA driver version must be at least the minimum required by the CUDA toolkit (driver ≥ CUDA req); a newer driver can run older CUDA toolkits. E.g., CUDA 12.5 requires driver ≥ 550 |
| **Kernel version** | NVIDIA driver not compatible with all kernels. New kernel (6.8+) may require DKMS build or delayed support |
| **Secure Boot** | NVIDIA driver requires signed module (MOK, shim) or disabled Secure Boot — common enterprise issue |
| **Open vs Proprietary driver** | NVIDIA `nvidia-open` (since R515) — open source kernel module. GPU support: DC (H100+) → OK, older GPUs → proprietary required |
| **nvidia-persistenced** | Required to maintain GPU initialization; without it GPUs may sleep after idle timeout (`nvidia-smi -pm 1`) |
| **GPU reset** | After crashed training job, GPU may hang. `nvidia-smi --gpu-reset` or reboot node, sometimes power cycle |
| **Multi-instance GPU (MIG)** | Requires specific driver, MIG mode on GPU, GPU restart. Cannot be changed at runtime. A100, H100, B200 only |
#### Network (InfiniBand / RoCE)
| Constraint | Detail |
|----------|--------|
| **MLNX_OFED vs rdma-core** | MLNX_OFED (NVIDIA) — full support, but own kernel modules, kernel version compatibility needed. `rdma-core` (open) — limited support, no custom modules |
| **Kernel compatibility** | MLNX_OFED supports only specific kernel versions (major.minor). Kernel upgrade → MLNX_OFED rebuild required |
| **NCCL** | NCCL version must be compatible with CUDA and IB firmware. `nccl-tests` for validation |
| **SHARP** | In-network reduction requires specific MLNX_OFED + IB switch firmware combination |
| **GPU Direct RDMA** | Requires `nvidia-peermem` module + MLNX_OFED. Does not work with all GPU and IB card combinations |
| **RoCE PFC/ECN** | RoCE requires lossless fabric (PFC, ECN, DCQCN). Switch and host configuration — complex tuning |
#### Storage
| Constraint | Detail |
|----------|--------|
| **Lustre client** | Client version must match server. Server upgrade → upgrade all clients. Compatible with RHEL/Debian derivatives only |
| **POSIX locking** | NFS and Lustre have different POSIX locking behavior. Distributed training relies on flock → problematic with mixed FS |
| **Filesystem cache** | Page cache can mask IO bottlenecks. Training jobs often require `O_DIRECT` or sync IO |
| **Local NVMe vs parallel FS** | Dataset staging on local NVMe eliminates network dependency but requires space and pre-fetch pipeline |
#### Container runtime
| Constraint | Detail |
|----------|--------|
| **Docker + GPU** | `nvidia-container-toolkit` (formerly nvidia-docker2). Requires runtime installation and config in `/etc/docker/daemon.json` |
| **Podman + GPU** | Requires `nvidia-container-toolkit` + podman hook. Less tested than Docker |
| **containerd + GPU** | Standard for K8s. Requires `cdi` (Container Device Interface) or `nvidia-container-runtime` |
| **Enroot + Pyxis** | NVIDIA container stack for Slurm (Enroot = daemonless container runtime, Pyxis = Slurm plugin) |
| **User namespace mapping** | Container GPU access requires device cgroup; rootless may fail (exception for /dev/dri and /dev/nvidia*) |
#### Kernel parameters
```text
# AI workload recommended sysctl
net.core.rmem_max = 134217728 # sufficient for NCCL
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.netdev_budget = 600 # for high packet rate
vm.max_map_count = 1048576 # PyTorch DataLoader workers
kernel.numa_balancing = 0 # disable NUMA balancing (breaks locality)
kernel.sched_min_granularity_ns = 10000000 # older kernels only; moved to debugfs in 5.13+
# Kernel boot parameters (kernel command line, not sysctl)
# Disable security mitigations for perf (dedicated AI clusters only)
mitigations=off
transparent_hugepage=never # or madvise; THP may cause latency spikes
intel_idle.max_cstate=1 # reduce C-state transition latency
```
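
A node can be checked against the sysctl part of this list by comparing it with the live values under `/proc/sys`. The script below is a minimal sketch: `parse_sysctl`, `read_live_sysctl`, and `find_mismatches` are hypothetical helper names, and it only understands `key = value` lines, so boot parameters such as `mitigations=off` need to be verified separately (for example in `/proc/cmdline`).

```python
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse 'key = value # comment' lines into a dict with normalized spacing."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = " ".join(value.split())
    return settings


def read_live_sysctl(key: str) -> Optional[str]:
    """Read the running value from /proc/sys, or None if it is not available."""
    path = Path("/proc/sys") / key.replace(".", "/")
    try:
        return " ".join(path.read_text().split())
    except OSError:
        return None


def find_mismatches(
    expected: Dict[str, str],
    reader: Callable[[str], Optional[str]] = read_live_sysctl,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {key: (expected, actual)} for every setting that differs."""
    result = {}
    for key, want in expected.items():
        have = reader(key)
        if have != want:
            result[key] = (want, have)
    return result


if __name__ == "__main__":
    wanted = parse_sysctl("net.core.rmem_max = 134217728\nvm.max_map_count = 1048576\n")
    for key, (want, have) in find_mismatches(wanted).items():
        print(f"{key}: expected {want}, found {have}")
```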
#### Firmware and HW
| Constraint | Detail |
|----------|--------|
| **GPU firmware (VBIOS)** | NVIDIA datacenter GPUs (H100, B200) have VBIOS updates via NVFlash. Without update → missing partitioning support or newer CUDA features |
| **InfiniBand firmware** | IB switch and HCA firmware must be compatible. Mix old switch + new HCA → degraded perf |
| **NVSwitch firmware** | DGX systems have NVSwitch firmware updatable only via NVIDIA DGX tools |
| **Power capping (nvidia-smi)** | `nvidia-smi -pl <power>` — limit TDP for power budget management. Test impact on training throughput |
| **GPU clock locking** | `nvidia-smi -lgc <min,max>` locks the GPU clock range for stable benchmarks; `nvidia-smi -ac <mem,graphics>` sets application clocks (memory clock first). Apply after `nvidia-persistenced` is running |
| **PCIe Gen** | GPU in PCIe Gen4 slot (instead of Gen5) → bottleneck for CPU↔GPU data transfer. Important for FSDP sharding |
### Recommended OS per use case
| Use case | OS | Rationale |
|----------|-----|-------|
| **DGX cluster (production)** | DGX OS / Ubuntu 22.04 LTS | NVIDIA standard, best driver support |
| **Enterprise K8s (OpenShift)** | RHEL 9 / RHCOS | Red Hat support, GPU Operator compatible |
| **Vanilla K8s (on-prem)** | Ubuntu 22.04 LTS + Flatcar (workers) | Widest community support, Flatcar for minimal footprint |
| **Slurm cluster (HPC/AI)** | Rocky Linux 9 / Ubuntu 22.04 LTS | EL ecosystem (Lustre, OFED) or Ubuntu (community) |
| **Research / rapid prototyping** | Ubuntu 24.04 LTS | Latest CUDA, PyTorch, driver support |
| **Edge inference** | NVIDIA JetPack / Ubuntu (ARM) | Embedded GPU (Jetson Orin, AGX) |
---
## AI-ready data center — check-list
| Area | Requirement |
|--------|-----------|
| **Power** | 30–120 kW/rack, HVDC (400 V DC), UPS supporting GPU spikes |
| **Cooling** | Liquid cooling ready (direct-to-chip), rear-door for 30+ kW |
| **Network** | InfiniBand (NDR/XDR) or RoCEv2, rail-optimized fat-tree |
| **Storage** | Parallel FS (Lustre/Weka), checkpoint bandwidth > 100 GB/s |
| **GPU density** | Max GPU/rack, minimize NVSwitch hops |
| **Physical** | Floor load 1,500+ kg/m², rack 52U–60U |
| **Security** | Tenant isolation, network segmentation, data encryption |
| **Monitoring** | DCGM, NCCL health checks, thermals, power capping |
---
## Model and throughput limitations
### Model size per GPU
Maximum model size fitting on a single GPU depends on HBM capacity and precision:
| GPU | HBM | FP32 | FP16/BF16 | INT8 | INT4 |
|-----|-----|------|-----------|------|------|
| **H100 80GB** | 80 GB | ~20B | ~40B | ~80B | ~160B |
| **H200 141GB** | 141 GB | ~35B | ~70B | ~140B | ~280B |
| **B200 192GB** | 192 GB | ~48B | ~96B | ~192B | ~384B |
| **MI300X 192GB** | 192 GB | ~48B | ~96B | ~192B | ~384B |
| **A100 80GB** | 80 GB | ~20B | ~40B | ~80B | ~160B |
| **GB200 (192+480)** | 192 GB GPU + 480 GB Grace | — | ~96B + CPU offload | — | — |
*Approximate: 1B params ≈ 2 GB FP16 ≈ 4 GB FP32 ≈ 1 GB INT8 ≈ 0.5 GB INT4. Subtract ~10–15 % HBM for activations, KV cache, optimizer states.*
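
The rule of thumb above translates directly into code. This is a minimal sketch with hypothetical names; `reserve_fraction` models the HBM set aside for activations and KV cache and is an assumption to tune, not a measured value.

```python
# Bytes per parameter for each precision, as in the rule of thumb above.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def max_params_billion(hbm_gb: float, precision: str,
                       reserve_fraction: float = 0.0) -> float:
    """Largest model (in billions of parameters) whose weights fit in HBM."""
    if not 0.0 <= reserve_fraction < 1.0:
        raise ValueError("reserve_fraction must be in [0, 1)")
    usable_gb = hbm_gb * (1.0 - reserve_fraction)
    # 1 GB holds 1e9 bytes, so GB / (bytes per param) gives billions of params.
    return usable_gb / BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        full = max_params_billion(80, precision)
        reserved = max_params_billion(80, precision, reserve_fraction=0.15)
        print(f"80 GB, {precision}: {full:.0f}B max, {reserved:.0f}B with 15 % reserved")
```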
### Memory breakdown inference
| Component | Llama 3 70B (FP16) | Llama 3 8B (FP16) |
|------------|-------------------|-------------------|
| Model weights | 140 GB | 16 GB |
| KV cache (4K context, batch 1) | ~2 GB | ~0.2 GB |
| KV cache (128K context, batch 1) | ~60 GB | ~6.5 GB |
| Activations (peak) | ~5 GB | ~1 GB |
| **Total 4K ctx** | ~147 GB | ~17 GB |
| **Total 128K ctx** | ~205 GB | ~23 GB |
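**Conclusion:** Llama 3 70B FP16 (~140 GB of weights) does not fit on a single H100 (80 GB). Options: INT8 (~70 GB of weights, so 2× H100 or 1× H200 once KV cache and activations are added), INT4 (~35 GB of weights, fits on 1× H100), or tensor parallelism across several GPUs.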
### Context length vs memory
| Context | KV cache 70B (FP16) | KV cache 8B (FP16) | Note |
|---------|-------------------|-------------------|------|
| 4K | ~2.2 GB | ~0.25 GB | Typical chat |
| 32K | ~18 GB | ~2 GB | Documents |
| 128K | ~72 GB | ~8 GB | Long-context (Claude, Gemini) |
| 1M | ~560 GB | ~64 GB | Experimental (Gemini 1.5 Pro) |
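KV cache size is **linear in context length**, and also linear in batch size, layer count, and the number of KV heads; it is attention compute, not the cache, that grows quadratically with context length. Critical for long-context inference.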
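
The size of the KV cache follows from the model shape: two tensors (keys and values) per layer, each holding one vector per KV head per token. The sketch below assumes the publicly documented Llama 3 70B shape (80 layers, 8 KV heads with grouped-query attention, head dimension 128) and 2-byte elements. The absolute figures depend entirely on the values plugged in, so they need not match the rounded estimates in the table.

```python
def kv_cache_bytes(seq_len: int, n_layers: int, n_kv_heads: int, head_dim: int,
                   batch: int = 1, bytes_per_elem: int = 2) -> int:
    """Bytes needed to cache keys and values for `batch` sequences of `seq_len` tokens."""
    # Factor 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem


if __name__ == "__main__":
    # Assumed shape: 80 layers, 8 KV heads, head_dim 128, FP16/BF16 cache.
    for context in (4_096, 32_768, 131_072):
        size = kv_cache_bytes(context, n_layers=80, n_kv_heads=8, head_dim=128)
        print(f"{context:>7} tokens: {size / 2**30:.2f} GiB")
```

Doubling the context, the batch size, or the number of KV heads doubles the result, which is the linear behavior described above.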
### Throughput inference
| Model | GPU | Precision | Batch size | Tokens/s | QPS (1K output) |
|-------|-----|-----------|-----------|----------|-----------------|
| Llama 3 8B | H100 | FP16 | 1 | ~800 | ~0.8 |
| Llama 3 8B | H100 | FP16 | 128 | ~4,500 | ~4.5 |
| Llama 3 8B | H100 | INT4 | 128 | ~8,000 | ~8 |
| Llama 3 70B | 4× H100 | FP16 | 1 | ~180 | ~0.18 |
| Llama 3 70B | 4× H100 | INT4 | 64 | ~1,200 | ~1.2 |
| Llama 3 70B | 8× H100 | FP16 (TP=8) | 128 | ~2,500 | ~2.5 |
| DeepSeek-R1 671B | 8× H200 | FP8 (MoE) | 64 | ~500 | ~0.5 |
| GPT-4 class (est.) | — | — | — | ~100–300 | ~0.1–0.3 |
**Notes:**
- QPS (queries per second) depends on output length (1K tokens ≈ ~1 query)
- Larger batch increases throughput but also increases TTFT (time to first token)
- Tensor Parallelism (TP) scales throughput, but communication overhead grows with the TP degree
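
Two different numbers can be derived from an aggregate tokens/s figure, and they are easy to confuse. The sketch below (hypothetical function names) separates them: queries per second for a fixed output length, and the generation speed each individual request sees when the batch shares the GPU.

```python
def queries_per_second(tokens_per_s: float, output_tokens: int = 1000) -> float:
    """Completed queries per second if every query generates `output_tokens`."""
    return tokens_per_s / output_tokens


def per_request_tokens_per_s(tokens_per_s: float, batch_size: int) -> float:
    """Generation speed seen by one request when `batch_size` requests share the GPU."""
    return tokens_per_s / batch_size


if __name__ == "__main__":
    # Example input: 4,500 aggregate tokens/s at batch 128.
    print(f"QPS at 1K output: {queries_per_second(4500):.1f}")
    print(f"Per-request speed: {per_request_tokens_per_s(4500, 128):.1f} tok/s")
```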
### Training limits
#### Scaling efficiency
| GPU count | Model | Efficiency | Reason |
|-----------|-------|-----------|-------|
| 8 (1 node) | Llama 3 8B | ~95 % | NVLink intra-node |
| 64 (8 nodes) | Llama 3 8B | ~85 % | IB inter-node |
| 512 (64 nodes) | Llama 3 70B | ~75 % | Communication overhead |
| 4,096 (512 nodes) | Llama 3 70B | ~60 % | Pipeline bubble, network |
| 16,384 (2,048 nodes) | Llama 3 405B | ~45 % | Synchronous SGD overhead |
**Note:** Efficiency = (actual throughput) / (ideal linear speedup). Decreases logarithmically with GPU count.
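
The efficiency definition in the note can be written as a one-line function. This is a sketch with hypothetical names; the throughput numbers in the example are made up for illustration and are not measurements.

```python
def scaling_efficiency(cluster_throughput: float, n_gpus: int,
                       single_gpu_throughput: float) -> float:
    """Measured cluster throughput divided by ideal linear scaling of one GPU."""
    if n_gpus < 1 or single_gpu_throughput <= 0:
        raise ValueError("n_gpus and single_gpu_throughput must be positive")
    ideal = n_gpus * single_gpu_throughput
    return cluster_throughput / ideal


if __name__ == "__main__":
    # Illustrative values only: 1,000 samples/s on one GPU, 54,400 on 64 GPUs.
    print(f"{scaling_efficiency(54_400.0, 64, 1_000.0):.0%}")
```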
#### Memory breakdown training
| Component | Llama 3 70B (BF16) | Llama 3 8B (BF16) |
|------------|-------------------|-------------------|
| Model weights | 140 GB | 16 GB |
| Optimizer states (Adam) | 280 GB | 32 GB |
| Gradients | 140 GB | 16 GB |
| Activations (peak) | ~30 GB | ~4 GB |
| **Total (DDP)** | ~590 GB | ~68 GB |
| **Total (FSDP shard=8)** | ~74 GB | ~8.5 GB |
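**Conclusion:** FSDP (Fully Sharded Data Parallelism) or an equivalent sharding scheme is required for training models > 10B. With Adam, weights + gradients + optimizer states take roughly 4× the memory of the weights alone (560 GB vs 140 GB for 70B), before activations.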
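
The breakdown above can be recomputed from bytes per parameter. The sketch below uses the same accounting as the table (2 bytes for BF16 weights, 2 for gradients, 4 for Adam states). One deliberate difference is an assumption of this sketch: only the model states are divided across shards, while activations stay on every GPU, so the sharded figure comes out higher than simply dividing the DDP total by the shard count.

```python
def training_memory_gb(params_billion: float, activations_gb: float,
                       shards: int = 1, bytes_weights: float = 2.0,
                       bytes_grads: float = 2.0, bytes_optimizer: float = 4.0) -> float:
    """Per-GPU memory in GB: sharded model states plus unsharded activations."""
    if shards < 1:
        raise ValueError("shards must be at least 1")
    # 1B parameters at N bytes each occupy N GB.
    states_gb = params_billion * (bytes_weights + bytes_grads + bytes_optimizer)
    return states_gb / shards + activations_gb


if __name__ == "__main__":
    print(f"70B, no sharding: {training_memory_gb(70, 30):.0f} GB per GPU")
    print(f"70B, 8 shards:    {training_memory_gb(70, 30, shards=8):.0f} GB per GPU")
```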
#### Time to train
| Model | GPU count | GPU type | Precision | Time | Cost (on-prem estimate) |
|-------|-----------|---------|-----------|------|---------------------|
| Llama 3 8B | 64 | H100 | BF16 | ~3 days | ~$5,000 |
| Llama 3 70B | 512 | H100 | BF16 | ~14 days | ~$100,000 |
| Llama 3 405B | 16,384 | H100 | BF16 | ~60 days | ~$14 M |
| DeepSeek-R1 671B (MoE) | 2,048 | H800 | BF16 | ~30 days | ~$6 M |
| GPT-4 (est.) | 25,000 | A100/H100 | Mixed | ~90–100 days | ~$100 M |
### Power and thermal limits
| Configuration | TDP limit | Throughput loss | Reason |
|-------------|-----------|------------------|--------|
| H100 SXM | 700 W (default) | 0 % | Nominal |
| H100 SXM | 600 W (-15 %) | ~5–8 % | Power capping |
| H100 SXM | 500 W (-30 %) | ~15–25 % | Significant throttling |
| H100 SXM | 400 W (-43 %) | ~30–50 % | Emergency only |
| DGX H100 (8×) | 5.6 kW (max) | 0 % | Liquid cooling required |
| DGX H100 (8×) | 4.5 kW (air) | ~10–15 % | Rear-door heat exchanger |
GPU throttles when exceeding TDP or temperature (85°C+). Power capping correlates linearly with frequency but non-linearly with throughput.
### API and operational limits
| Limit | Description | Typical value |
|-------|-------|-----------------|
| **Rate limit** | Max requests per minute/hour | 100–10,000 RPM (per tier) |
| **Tokens per minute (TPM)** | Max tokens per minute | 1M–300M (per model) |
| **Context window** | Max input tokens | 4K–2M (per model) |
| **Max output tokens** | Max generated tokens | 4K–32K (per model) |
| **Concurrent requests** | Parallel request count | 10–10,000 (per backend) |
| **Batch window** | Time to accumulate batch | 0–20 s (vLLM, TGI) |
| **TTFT timeout** | Max latency to first token | 30–120 s |
| **Idle timeout** | GPU idle → scale to 0 | 5–15 min (cloud) |
### Limits per deployment model
| Dimension | On-prem HW | Managed cloud (SageMaker, Vertex) | API (OpenAI, Anthropic) |
|-----------|--------------|----------------------------------|------------------------|
| **Model size** | Limited by HBM (max 192 GB/GPU) | Unlimited (cluster scaling) | Unlimited |
| **Queries** | Limited by GPU count | Auto-scaling | Rate limit (per tier) |
| **Latency** | < 10 ms (same node) | 10–100 ms (network hop) | 100 ms–10 s |
| **Customization** | Full (fine-tuning, quantization) | Managed (SageMaker, Bedrock) | Prompt engineering only |
| **Data privacy** | Yes (on-prem) | Contractual (region, encryption) | Limited |
| **Cost per 1M tokens** | ~$0.10–0.50 (FP16 inference) | ~$0.20–1.00 | ~$0.15–15.00 |
| **Max context** | 128K+ (depending on GPU count) | 128K+ | 32K–2M |
| **Cold start** | 0 (always-on) | 30 s–5 min | 0 (shared infra) |
---
## GPU pricing and price/performance (2026)
> Prices are approximate — NVIDIA does not publish official datacenter GPU price lists. Cloud prices from public providers (Q2 2026). HW purchase prices vary by volume, reseller, and region.
### Purchase price (buy)
| GPU | Price/GPU | Price 8× GPU baseboard | $/PFLOPS (FP16) | Note |
|-----|---------|----------------------|----------------|------|
| **H100 SXM** | $27,000–40,000 | ~$200,000 | $25,000 | Scarcity 2023–2024, now stabilized |
| **H200 SXM** | $35,000–50,000 | ~$280,000 | ~$35,000 | H100 upgrade, HBM3e |
| **B200** | ~$60,000–70,000 | ~$500,000+ | ~$31,000 | Blackwell, FP4 support |
| **B100** | ~$30,000 | ~$240,000 | ~$20,000 | Lower price than B200, similar FP8 perf |
| **GB200** (Grace+Blackwell) | ~$70,000–100,000 | ~$2,000,000 (rack) | — | CPU+GPU unified, high-density |
| **A100 80GB** | ~$10,000–15,000 | ~$120,000 | ~$19,200 | Previous gen, still relevant |
| **MI300X** | ~$12,000–18,000 | ~$100,000 | ~$9,600 | AMD, 192 GB HBM3 |
| **Gaudi 3** | ~$15,625 | ~$125,000 | **$8,515** | Intel, best $/PFLOPS |
| **L40S** | ~$8,000–10,000 | — | — | Inference, enterprise |
### Cloud pricing (on-demand $/GPU/hr)
| GPU | Cheapest | Mid-range (CoreWeave, Lambda) | Hyperscaler (AWS, GCP, Azure) |
|-----|----------|-----------------------------|-------------------------------|
| **H100 SXM** | $1.38 (Thunder) | $2.89–3.29 | $4.15–6.88 |
| **H100 PCIe** | $2.01 (Spheron) | $2.50 | — |
| **H200 SXM** | $3.89 (Spheron) | $4.54 | $5.00+ |
| **B200** | **$3.39** (Spheron) | $6.02 | $14.24 (AWS) |
| **B200 spot** | **$2.12** (Spheron) | — | — |
| **GB200** | $3.50 (Runcrate) | $5.85 (Oracle) | $6.95 (GCP) |
| **MI300X** | **$1.50** (TensorWave) | $1.85 (Vultr) | $7.86 (Azure) |
| **A100 80GB** | $1.07 (Spheron) | $1.50–2.00 | $3.00+ |
| **Gaudi 3** | ~$1.50–2.50 | — | — |
| **L40S** | $0.91 (Spheron) | $1.50–2.00 | — |
### Inference cost ($/M tokens)
| GPU | Provider | $/hr | Est. tok/s | $/M tok |
|-----|----------|------|-----------|--------|
| **B200** | Spheron | $3.39 | ~4,000 | **$0.24** |
| **B200 spot** | Spheron | $2.12 | ~4,000 | **$0.15** |
| **H100 PCIe** | Spheron | $2.01 | ~1,200 | $0.47 |
| **A100 80GB** | Spheron | $1.07 | ~520 | $0.57 |
| **H100 SXM** | AWS | $6.88 | ~1,200 | $1.59 |
| **H200 SXM** | Spheron | $4.54 | ~1,800 | $0.70 |
| **L40S** | Spheron | $0.91 | ~450 | $0.56 |
*Values for Llama 3 70B (INT8, batch=1, output 1K tok). Actual values vary by batch size, context, and quantization.*
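
The $/M tok column follows from the hourly price and the sustained token rate. A minimal sketch of that conversion, with a hypothetical function name and assuming the GPU is fully utilized for the whole billed hour:

```python
def usd_per_million_tokens(usd_per_hour: float, tokens_per_s: float) -> float:
    """Cost of generating one million tokens at a given hourly price and token rate."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    tokens_per_hour = tokens_per_s * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Inputs taken from the table above: ($/hr, estimated tok/s).
    for name, price, rate in [("B200 spot", 2.12, 4000), ("H100 SXM (AWS)", 6.88, 1200)]:
        print(f"{name}: ${usd_per_million_tokens(price, rate):.2f} per M tokens")
```

Idle time is not free, so real costs are higher whenever utilization drops below 100 %.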
### Cost per GB HBM
| GPU | HBM | Price/hr cloud | $/GB/hr | Best for memory-bound workloads |
|-----|-----|-------------|--------|--------------------------------|
| **MI300X** | 192 GB | $1.50 | **$0.0078** | ✅ Best |
| **B200** | 192 GB | $3.39 | $0.0177 | ✅ Good |
| **H200** | 141 GB | $3.89 | $0.0276 | ⚠️ |
| **H100 SXM** | 80 GB | $1.38 | $0.0173 | ⚠️ Only up to 70B models |
| **GB200** | 384 GB | $3.50 | $0.0091 | ✅✅ (2× MI300X capacity) |
### Price/performance by scenario
| Scenario | Winner | Rationale |
|----------|--------|-----------|
| **Absolute performance** (cost no object) | **GB200 DGX NVL72** | 72× GPU, 18 PFLOPS FP8, 384 GB HBM/GPU |
| **Cloud inference** — best $/token | **B200 spot** | $0.15/M tok; 4× H100 throughput at lower cost |
| **Cloud inference** — on-demand | **B200** | ~$0.24/M tok at $3.39/hr |
| **Cloud inference** — budget | **A100 / L40S** | $0.56–0.57/M tok |
| **Training** — price/perf on purchase | **Gaudi 3** | $8,515/PFLOPS, 2.5–3× better than H100 |
| **Training** — cloud | **H100 SXM** | $1.38/hr, CUDA ecosystem, NCCL |
| **Memory-bound** — long context, 70B+ | **MI300X / GB200** | 192–384 GB, $0.0078–0.0091/GB |
| **Ecosystem + safe choice** | **H100/H200** | CUDA, widest SW, NVIDIA tools |
| **Spot / preemptible** — lowest cost | **A100 / H100** | $1.07–1.38/hr, 50–90% off on-demand |
### 2026 Trends
- **H100** — price dropped 64% from peak $8/hr to $1.38–2.89/hr, then 40% rebound from inference demand
- **B200** — new high-end, $3.39/hr cloud → ~$0.15/M tok on spot — new inference benchmark
- **MI300X** — supply growing (TensorWave, Vultr, CoreWeave, Oracle, Azure), from $1.50/hr
- **Gaudi 3** — best $/PFLOPS on purchase, but narrow ecosystem and limited cloud availability
- **Market bifurcation** — prior gen (H100, A100) commoditizing, new gen (B200, GB200) commanding premium
- [GPU.en.md](GPU.en.md) — GPU architecture, NVIDIA/AMD, vGPU, MIG
- [NETWORKING.en.md](NETWORKING.en.md) — InfiniBand, RoCE, network topology
- [STORAGE.en.md](STORAGE.en.md) — parallel filesystem, object store
- [DATACENTERS.en.md](DATACENTERS.en.md) — DC layout, power, cooling
- [CLOUD.en.md](CLOUD.en.md) — cloud AI services (SageMaker, Vertex AI)
## Sources
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-18*

602
AI-INFRASTRUCTURE.md Normal file

@@ -0,0 +1,602 @@
# 🧠 AI/ML Infrastructure
## Component overview
```mermaid
flowchart TD
subgraph Compute
GPU["GPU (H100/B200/Instinct)"]
CPU["CPU (AMD EPYC / Intel Xeon)"]
ASIC["ASIC (TPU, Trainium, Inferentia)"]
end
subgraph Network
IB["InfiniBand NDR/XDR"]
ROCE["RoCEv2"]
NVL["NVLink / NVSwitch"]
end
subgraph Storage
FS["Parallel FS (Lustre, GPFS, Weka)"]
OBJ["Object Store (S3, MinIO)"]
NVME["Local NVMe cache"]
end
subgraph Orchestration
S["Slurm"]
K["Kubernetes + Volcano/Kueue"]
end
subgraph Cooling
DLC["Direct-to-chip liquid"]
IMM["Immersion"]
AIR["Air (high-density)"]
end
Compute --> Network --> Storage
Orchestration --> Compute
Cooling --> Compute
```
---
## GPU compute
### NVIDIA
| GPU | Architecture | FP8 | FP16/BF16 | FP64 | HBM | NVLink | TDP | Rack |
|-----|-------------|-----|-----------|------|-----|--------|-----|------|
| **H100 SXM** | Hopper | 3,958 TFLOPS | 1,979 TFLOPS | 67 TFLOPS | 80 GB HBM3 | 900 GB/s | 700 W | 6–8× in DGX H100 |
| **H200 SXM** | Hopper (HBM3e) | 3,958 TFLOPS | 1,979 TFLOPS | 67 TFLOPS | 141 GB HBM3e | 900 GB/s | 700 W | 6–8× in DGX H200 |
| **B200** | Blackwell | ~9,000 TFLOPS | ~4,500 TFLOPS | ~40 TFLOPS | 192 GB HBM3e | 1,800 GB/s | 1,000 W | 6–8× in DGX B200 |
| **GB200 Grace Blackwell** | Blackwell | ~18,000 TFLOPS | ~9,000 TFLOPS | — | 192 GB + 480 GB (Grace) | NVLink-C2C | 1,000 W (GPU) + 500 W (CPU) | DGX GB200 (36× GPU) |
| **L40S** | Ada Lovelace | 733 TFLOPS | 367 TFLOPS | — | 48 GB GDDR6 | N/A | 350 W | Inference, enterprise |
| **A100 SXM** | Ampere | — (no FP8; INT8: 1,248 TOPS) | 624 TFLOPS | 19.5 TFLOPS | 80 GB HBM2e | 600 GB/s | 400 W | DGX A100 |
### AMD
| GPU | Architecture | FP8 | FP16/BF16 | FP64 | HBM | Infinity Fabric | TDP |
|-----|-------------|-----|-----------|------|-----|----------------|-----|
| **MI300X** | CDNA 3 | 2,615 TFLOPS | 1,307 TFLOPS | 81 TFLOPS | 192 GB HBM3 | 896 GB/s | 750 W |
| **MI250** | CDNA 2 | — | 383 TFLOPS | 95.7 TFLOPS | 128 GB HBM2e | 400 GB/s | 500 W |
### Intel
| GPU | Architecture | FP16/BF16 | FP32 | HBM | TDP |
|-----|-------------|-----------|------|-----|-----|
| **Gaudi 3** | Custom | 1,835 TFLOPS | — | 128 GB HBM2e | 600 W |
| **Max 1550** | Xe HPC | 600+ TFLOPS | 200 TFLOPS | 128 GB HBM2e | 600 W |
### Cloud ASIC
| ASIC | Provider | Use case | Performance |
|------|----------|----------|-------|
| **TPU v5p** | Google | Training | ~459 TFLOPS (BF16) per chip |
| **Trainium 2** | AWS | Training | ~1,000 TFLOPS (BF16) per chip |
| **Inferentia 2** | AWS | Inference | ~400 TOPS (INT8) per chip |
| **Maia 100** | Microsoft | Training + inference | Custom, 800 W TDP |
---
## AI networking
### Technology comparison
| Technology | Bandwidth per link | Latency | Topology | Use case |
|-------------|-------------------|---------|-----------|----------|
| **InfiniBand NDR200** | 200 Gb/s | < 1 µs | Fat-tree, Dragonfly+ | Training (NVIDIA) |
| **InfiniBand NDR400** | 400 Gb/s | < 1 µs | Fat-tree, Dragonfly+ | Training (NVIDIA) |
| **InfiniBand XDR** | 800 Gb/s (planned) | < 1 µs | Dragonfly+ | Next-gen training |
| **RoCEv2** (CX-7/8) | 200–400 Gb/s | 1–2 µs | Fat-tree, Spine-leaf | Training (AMD, Intel, open) |
| **NVLink 4.0** | 900 GB/s per GPU | < 0.5 µs | NVSwitch full-mesh | Intra-node GPU comm |
| **NVLink 5.0** | 1,800 GB/s per GPU | < 0.5 µs | NVSwitch full-mesh | Intra-node (Blackwell) |
| **Ethernet (400 GbE)** | 400 Gb/s | 2–5 µs | Spine-leaf | Inference, data pipeline |
### AI fabric principles
- **Rail-optimized topology**: each GPU communicates on a dedicated "rail" (the same GPU index on every node connects to the same switch)
- **Fat-tree (Clos)**: the standard for InfiniBand and RoCE, with non-blocking bisection bandwidth
- **Dragonfly+**: fewer hops at the same bandwidth (used in the largest clusters)
- **GPU Direct RDMA**: direct GPU ↔ GPU communication without CPU involvement, supported on InfiniBand and RoCE
- **SHARP (Scalable Hierarchical Aggregation and Reduction Protocol)**: in-network reduction for AllReduce (InfiniBand only)
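
The rail-optimized rule from the first bullet can be sketched as a simple mapping. This is only an illustration, assuming 8 GPUs per node and one leaf switch per rail; the function name `rail_switch` and the `leaf-N` labels are hypothetical.

```python
GPUS_PER_NODE = 8  # assumption: DGX-style node with 8 GPUs and 8 NICs


def rail_switch(node: int, gpu_index: int) -> str:
    """Return the leaf switch a GPU's NIC is cabled to in a rail-optimized fabric.

    The node number does not affect the result: the same GPU index on every
    node lands on the same switch, which is what defines a rail.
    """
    if not 0 <= gpu_index < GPUS_PER_NODE:
        raise ValueError(f"gpu_index must be in 0..{GPUS_PER_NODE - 1}")
    return f"leaf-{gpu_index}"


# GPU 3 on node 0 and GPU 3 on node 41 share a leaf switch.
same_rail = rail_switch(0, 3) == rail_switch(41, 3)
```

Traffic between equal GPU indices therefore stays on one leaf switch, while traffic between different indices has to cross the spine layer or use NVLink inside the node first.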
### Bandwidth sizing
```text
Rule: InfiniBand bandwidth ≥ 50 % of GPU HBM bandwidth for scalable training
Example: H100 has 3.35 TB/s of HBM bandwidth
→ Needs at least 1.6 TB/s of bisection bandwidth per GPU
→ 8× H100 in a DGX: 4× NDR400 IB per GPU = 4 × 50 GB/s = 200 GB/s
→ In practice: 8× 200 Gb/s links (25 GB/s each) = 200 GB/s per node in a typical configuration = ~6 % of one GPU's HBM bandwidth → bottleneck
```
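
The arithmetic above can be reproduced with a small helper. This is a sketch under the same assumptions as the example (link speeds in Gb/s, HBM bandwidth in TB/s); the function name `nic_to_hbm_ratio` is hypothetical.

```python
def nic_to_hbm_ratio(links: int, link_gbit_s: float, hbm_tb_s: float) -> float:
    """Fraction of HBM bandwidth that the network links can deliver."""
    nic_gb_s = links * link_gbit_s / 8  # Gb/s -> GB/s
    hbm_gb_s = hbm_tb_s * 1000          # TB/s -> GB/s
    return nic_gb_s / hbm_gb_s


# 4x NDR400 (200 GB/s in total) against H100 HBM at 3.35 TB/s.
ratio = nic_to_hbm_ratio(links=4, link_gbit_s=400, hbm_tb_s=3.35)
print(f"{ratio:.1%}")
```

Even the optimistic 200 GB/s case covers only about 6 % of HBM bandwidth, far below the 50 % rule.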
---
## AI storage
### Requirements
| Dataset size | IO pattern | Recommended storage | Bandwidth |
|-------------|-----------|-------------------|-----------|
| < 10 TB | Sequential read (data loading) | Local NVMe | > 10 GB/s per node |
| 10–100 TB | Random read (checkpointing) | Parallel FS (Lustre, Weka) | > 100 GB/s cluster-wide |
| 100 TB–10 PB | Mixed (training + checkpoint) | Parallel FS + object store | > 500 GB/s |
| 10 PB+ | Multi-modal, video, LLM | Tiered (NVMe cache + parallel FS + object) | > 1 TB/s |
### Storage solution comparison
| Solution | Type | Bandwidth per node | Max capacity | Scaling | Use case |
|--------|-----|-------------------|-------------|-----------|----------|
| **Lustre** | Parallel FS (POSIX) | > 100 GB/s (cluster) | 100s PB | OST + MDS | HPC, LLM training (standard) |
| **GPFS / StorageScale** | Parallel FS (POSIX) | > 100 GB/s | 100s PB | NSD servers | HPC, AI (IBM) |
| **WekaFS** | Parallel FS (POSIX + NFS/SMB) | ~80 GB/s per 10 nodes | 10s PB | Container-native | AI/ML, NVIDIA DGX preferred |
| **VAST Data** | Universal storage (NVMe + QLC) | ~100 GB/s per cluster | 10s PB | Scale-out | AI, checkpoint, data lake |
| **Pure Storage//E** | All-flash (NVMe) | ~50 GB/s | ~30 PB | Scale-out | Enterprise AI, database |
| **MinIO / S3** | Object store | ~20 GB/s per gateway | EB | Erasure coding | Dataset repository, checkpoint |
| **NetApp AFF** | NAS + S3 | ~10 GB/s per controller | ~50 PB | HA pair | Enterprise, NFS baseline |
### Checkpointing strategies
| Strategy | RPO | Storage impact | Description |
|-----------|-----|---------------|-------|
| **Full checkpoint** | every N steps | High (pauses training) | Entire model + optimizer state |
| **Async checkpoint** | every N steps | Medium (non-blocking) | Copy to a staging buffer, write in the background |
| **Distributed checkpoint** (NVIDIA NeMo) | every N steps | Low | Each rank writes its own shard |
| **In-memory checkpoint** (IBM) | on failover | Minimal (DRAM) | Replication into another node's DRAM |
| **Continuous checkpoint** (Microsoft) | every 15 min | Low (delta) | Only changed shards |
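
The async checkpoint row can be illustrated with a minimal, framework-free sketch. It assumes the training state is a plain picklable `dict`; real frameworks stage GPU tensors into pinned host memory instead. The function name `async_checkpoint` is hypothetical.

```python
import copy
import pickle
import threading
from pathlib import Path


def async_checkpoint(state: dict, path: Path) -> threading.Thread:
    """Snapshot `state` synchronously, then write it to `path` in the background."""
    # Blocking part: the deep copy acts as the staging buffer, so the training
    # loop may keep mutating `state` while the write is in flight.
    snapshot = copy.deepcopy(state)

    def _write() -> None:
        tmp = path.with_suffix(path.suffix + ".tmp")
        with open(tmp, "wb") as fh:
            pickle.dump(snapshot, fh)
        tmp.replace(path)  # rename last, so readers never see a partial file

    writer = threading.Thread(target=_write)
    writer.start()
    return writer
```

The caller should `join()` the returned thread before starting the next checkpoint, otherwise two writers could race on the same file.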
---
## AI cluster architecture
### Physical topology: DGX H100 example
```
┌──────── DGX H100 (8× GPU) ────────┐
│ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │
│ │GPU 0│ │GPU 1│ │GPU 2│ │GPU 3│ │
│ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ │
│ ┌──┴──┐ ┌──┴──┐ ┌──┴──┐ ┌──┴──┐ │
│ │GPU 4│ │GPU 5│ │GPU 6│ │GPU 7│ │
│ └─────┘ └─────┘ └─────┘ └─────┘ │
│ NVSwitch (NVLink 4.0, 900 GB/s) │
│ InfiniBand CX-7: 8× NDR400 │
└────────────────────────────────────┘
│ 8× IB rails
┌────┴──────────────┐
│ IB NDR400 Switches │ (rail-optimized)
└────────────────────┘
```
### Kubernetes for AI
| Component | Role |
|-----------|------|
| **Volcano** | Batch scheduling, gang scheduling, queue management |
| **Kueue** | Multi-tenant admission, resource quotas, fair sharing |
| **NVIDIA GPU Operator** | Driver, container toolkit, MIG, DCGM, monitoring |
| **HAMi** (ex k8s-vGPU-scheduler) | GPU sharing, MIG partitioning, fractional GPU |
| **Node Feature Discovery** | Detects GPU type and NUMA topology |
| **Topology Manager** | NUMA-aware pod placement |
| **DPDK / SR-IOV** | High-performance networking for GPU Direct RDMA |
### Slurm for AI
| Component | Role |
|-----------|------|
| **slurm.conf** | Partitions for GPU nodes, GRES (Generic Resource) |
| **gres.conf** | GPU type, number of GPUs per node |
| **srun --gres=gpu:8** | Allocates 8 GPUs per node for a job |
| **sbatch --nodes=64 --ntasks=512** | 64 nodes, 512 ranks (8 GPUs/node) |
| **Pyxis** | NVIDIA container plugin for Slurm (used with Enroot) |
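
The allocation flags in the table can be combined into a job script header. The sketch below renders one, assuming one rank per GPU; the helper name `sbatch_header` and the default job name are hypothetical, while the `#SBATCH` options are standard Slurm flags.

```python
def sbatch_header(nodes: int, gpus_per_node: int, job_name: str = "train") -> str:
    """Render #SBATCH directives that request one task (rank) per GPU."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={gpus_per_node}",
        f"#SBATCH --gres=gpu:{gpus_per_node}",
    ]
    return "\n".join(lines)


header = sbatch_header(nodes=64, gpus_per_node=8)
total_ranks = 64 * 8  # same job size as --nodes=64 --ntasks=512
```

Using `--ntasks-per-node` instead of a global `--ntasks` keeps the rank layout explicit when the node count changes.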
---
## Cooling AI clusters
### Power density comparison
| Configuration | TDP per node | Units per rack | kW/rack | Note |
|-------------|-------------|-------|---------|----------|
| Standard server (2U) | 1 kW | 20 | 5–10 | Typical DC |
| GPU server (DGX H100, 6×) | 42 kW | 6 | 45–50 | Air cooling limit |
| GPU server (DGX B200, 6×) | 72 kW | 6 | 90–100 | Liquid cooling required |
| GPU server (GB200 NVL72) | 120 kW | — | ~120 | Liquid cooling mandatory |
| NVIDIA NVL72 rack | 120 kW | 1 | 120 | Fully liquid cooled |
### Cooling technologies
| Method | Max kW/rack | CAPEX | OPEX | Complexity |
|--------|-------------|-------|------|-----------|
| **Air cooling (CRAC/CRAH)** | < 15 | Low | Medium | Low |
| **Air cooling (in-row)** | 15–30 | Medium | Medium | Low |
| **Rear-door heat exchanger** | 30–50 | Medium | Low | Medium |
| **Direct-to-chip liquid (cold plate)** | 50–150 | High | Low | High |
| **Immersion (single-phase)** | 100–200 | High | Low | High |
| **Immersion (two-phase)** | 200+ | Very high | Low | Very high |
---
## Inference infrastructure
### Inference server comparison
| Tool | Frameworks | Optimizations | Use case |
|---------|-----------|-------------|----------|
| **vLLM** | Megatron, HF, AWQ, GPTQ | PagedAttention, KV cache, continuous batching | LLM inference (open source) |
| **TensorRT-LLM** | TensorRT | INT4/INT8/FP8, inflight batching, attention optimizations | Production (NVIDIA) |
| **Triton Inference Server** | All (TensorRT, vLLM, PyTorch) | Model ensemble, model caching, concurrent execution | Enterprise, multi-model |
| **SageMaker** | Managed | Auto-scaling, model parallelism | AWS managed |
| **OpenAI API / TGI** | HF Transformers | Continuous batching, flash attention | Hosting |
### Inference optimizations
| Technique | Latency improvement | Throughput improvement | Memory reduction |
|----------|-----------------|---------------------|------------------|
| **FP8/INT8 quantization** | — | 2× | 2× |
| **INT4 quantization** | — | 4× | 4× |
| **Flash Attention 2/3** | 2–4× | — | 50 % (KV cache) |
| **PagedAttention** | — | 2–5× | 95 % (KV cache fragmentation) |
| **Continuous batching** | — | 10–20× | — |
| **Speculative decoding** | 2–3× | — | — |
| **Multi-LoRA / S-LoRA** | — | 8–16× | — |
---
## Distributed training techniques
| Technique | Description | Frameworks |
|----------|-------|------------|
| **Data Parallelism (DDP/FSDP)** | Each GPU works on a different batch; DDP keeps a full model copy per GPU, FSDP shards it | PyTorch DDP, FSDP |
| **Tensor Parallelism (TP)** | Each layer's tensors are split across GPUs (intra-node) | Megatron-LM, DeepSpeed |
| **Pipeline Parallelism (PP)** | Layers are split into stages across nodes | Megatron-LM, DeepSpeed |
| **Sequence Parallelism (SP)** | The sequence is split across GPUs | Megatron-LM |
| **Expert Parallelism (EP)** | Different expert sub-networks on different GPUs | Mixture-of-Experts (MoE) |
| **3D Parallelism** | Combination of TP + PP + DP | Megatron-LM, NeMo |
| **ZeRO (1/2/3)** | Sharding of optimizer states / gradients / parameters | DeepSpeed |
| **NCCL / RCCL** | GPU collective communication library | NVIDIA/AMD |
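
In 3D parallelism the three degrees multiply to the total GPU count, so fixing TP and PP determines DP. A minimal sketch, assuming every GPU belongs to exactly one TP group, one PP stage and one DP replica; the function name is hypothetical.

```python
def data_parallel_degree(world_size: int, tp: int, pp: int) -> int:
    """Return the DP degree that remains once TP and PP are fixed."""
    model_parallel = tp * pp  # GPUs needed to hold one model replica
    if world_size % model_parallel != 0:
        raise ValueError("world_size must be divisible by tp * pp")
    return world_size // model_parallel


# 512 GPUs with TP=8 inside a node and PP=4 across nodes.
dp = data_parallel_degree(world_size=512, tp=8, pp=4)
```

With these numbers one replica spans 32 GPUs, leaving 16 data-parallel replicas.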
---
## Operating systems for AI
### Distribution comparison
| OS | GPU driver | CUDA | Container toolkit | IB/RoCE | Lustre client | Production support |
|----|-----------|------|-------------------|---------|--------------|-------------------|
| **Ubuntu 22.04 LTS** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED, rdma-core | Yes (lustre-client) | NVIDIA DGX standard |
| **Ubuntu 24.04 LTS** | NVIDIA 550+ | 12.5+ | nvidia-container-toolkit | MLNX_OFED, rdma-core | Yes | Newest GPU support |
| **RHEL 9 / Rocky 9** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED | Yes (EL repo) | Red Hat, enterprise |
| **DGX OS** (Ubuntu-based) | NVIDIA custom | 12.x | Pre-installed | Pre-configured | Yes | The only OS supported on NVIDIA DGX |
| **SLES 15 SP5** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED | Yes | HPC, some Lustre clusters |
| **Debian 12** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | rdma-core | Yes (backports) | Community, research |
| **Flatcar / Bottlerocket** | Container-host | — | nvidia-container-toolkit | Limited | No | K8s-only, minimal footprint |
### Constraints and limits
#### GPU drivers and CUDA
| Constraint | Detail |
|----------|--------|
| **Driver-CUDA compatibility** | The NVIDIA driver version must meet the minimum required by the CUDA toolkit (driver ≥ CUDA requirement). For example, CUDA 12.5 requires driver ≥ 550 |
| **Kernel version** | The NVIDIA driver is not compatible with every kernel. A new kernel (6.8+) may need a DKMS build or may only be supported later |
| **Secure Boot** | The NVIDIA driver needs a signed module (MOK, shim) or Secure Boot disabled; a common problem in enterprise environments |
| **Open vs proprietary driver** | `nvidia-open` (since R515) is the open-source kernel module. Data center GPUs (H100 and newer) work with it; older GPUs need the proprietary driver |
| **nvidia-persistenced** | Required to keep GPUs initialized; without it GPUs drop their state after an idle timeout (`nvidia-smi -pm 1`) |
| **GPU reset** | A GPU can hang after a training job crashes. Use `nvidia-smi --gpu-reset` or reboot the node; sometimes a power cycle is needed |
| **Multi-instance GPU (MIG)** | Requires a supporting driver, MIG mode enabled on the GPU, and a GPU reset. Cannot be changed at runtime. Supported only on A100, H100, B200 |
#### Network (InfiniBand / RoCE)
| Constraint | Detail |
|----------|--------|
| **MLNX_OFED vs rdma-core** | MLNX_OFED (NVIDIA): full support, but it ships its own kernel modules, which must be compatible with the kernel version. `rdma-core` (open source): limited support, but no extra modules |
| **Kernel compatibility** | MLNX_OFED supports only specific kernel versions (major.minor). A kernel upgrade requires rebuilding MLNX_OFED |
| **NCCL** | The NCCL version must be compatible with CUDA and the IB firmware. Use `nccl-tests` for validation |
| **SHARP** | In-network reduction requires a specific combination of MLNX_OFED and IB switch firmware |
| **GPU Direct RDMA** | Requires the `nvidia-peermem` module plus MLNX_OFED. Does not work with every GPU and IB adapter |
| **RoCE with PFC/ECN** | RoCE needs a lossless fabric (PFC, ECN, DCQCN). Both switches and hosts must be configured, which makes tuning complex |
#### Storage
| Constraint | Detail |
|----------|--------|
| **Lustre client** | The client version must match the server. A server upgrade means upgrading all clients. Compatible only with RHEL and Debian derivatives |
| **POSIX locking** | NFS and Lustre differ in POSIX locking behavior. Distributed training relies on flock, which causes problems on mixed filesystems |
| **Filesystem cache** | The page cache can mask an IO bottleneck. Training jobs often require `O_DIRECT` or `sync` IO |
| **Local NVMe vs parallel FS** | Staging datasets on local NVMe removes the network dependency but requires space and a pre-fetch pipeline |
#### Container runtime
| Constraint | Detail |
|----------|--------|
| **Docker + GPU** | `nvidia-container-toolkit` (formerly nvidia-docker2). The runtime must be installed and configured in `/etc/docker/daemon.json` |
| **Podman + GPU** | Requires `nvidia-container-toolkit` plus a Podman hook. Less tested than Docker |
| **containerd + GPU** | The standard for K8s. Requires CDI (Container Device Interface) or `nvidia-container-runtime` |
| **Enroot + Pyxis** | NVIDIA container stack for Slurm (Enroot = daemonless container runtime, Pyxis = Slurm plugin) |
| **User namespace mapping** | GPU access from containers requires device cgroup rules, and rootless mode can fail (exceptions are needed for /dev/dri and /dev/nvidia*) |
#### Kernel parameters
```text
# Recommended sysctl settings for AI workloads
net.core.rmem_max = 134217728 # large enough for NCCL
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.netdev_budget = 600 # for high packet rates
vm.max_map_count = 1048576 # PyTorch DataLoader workers
kernel.numa_balancing = 0 # disable NUMA balancing (it disrupts locality)
kernel.sched_min_granularity_ns = 10000000 # sysctl on older kernels; moved to debugfs in 5.13+
# Kernel boot parameters (kernel command line, not sysctl)
# Disable security mitigations for performance (dedicated AI clusters only)
mitigations=off
transparent_hugepage=never # or madvise; THP can cause latency spikes
intel_idle.max_cstate=1 # reduce C-state transition latency
```
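
Whether a node actually runs with the recommended values can be checked by reading `/proc/sys`. The sketch below assumes a Linux host; the helper names are hypothetical, and keys missing on a given kernel are reported as `None`.

```python
from pathlib import Path
from typing import Dict, Optional


def sysctl_path(key: str) -> Path:
    """Map a sysctl key such as 'net.core.rmem_max' to its /proc/sys file."""
    return Path("/proc/sys") / key.replace(".", "/")


def read_sysctl(key: str) -> Optional[str]:
    """Return the current value, or None if the key is absent or unreadable."""
    try:
        return sysctl_path(key).read_text().strip()
    except OSError:
        return None


def mismatches(expected: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return {key: actual} for every key whose value differs from `expected`."""
    result: Dict[str, Optional[str]] = {}
    for key, want in expected.items():
        got = read_sysctl(key)
        # Compare token by token so tab-separated triples (tcp_rmem) match.
        if got is None or got.split() != want.split():
            result[key] = got
    return result
```

For example, `mismatches({"kernel.numa_balancing": "0"})` returns an empty dict on a correctly tuned node.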
#### Firmware and hardware
| Constraint | Detail |
|----------|--------|
| **GPU firmware (VBIOS)** | NVIDIA data center GPUs (H100, B200) receive VBIOS updates through NVFlash. Without updates, support for partitioning or newer CUDA features may be missing |
| **InfiniBand firmware** | IB switch and HCA firmware must be compatible. Mixing an old switch with a new HCA degrades performance |
| **NVSwitch firmware** | On DGX systems, NVSwitch firmware can be updated only through NVIDIA DGX tools |
| **Power capping (nvidia-smi)** | `nvidia-smi -pl <power>` limits board power for power budget management. Test the effect on training throughput |
| **GPU clock locking** | `nvidia-smi -ac <mem,graphics>` locks application clocks for stable benchmarks. Apply only after `nvidia-persistenced` is running |
| **PCIe Gen** | A GPU in a PCIe Gen4 slot (instead of Gen5) bottlenecks CPU↔GPU data transfer. Important for FSDP sharding |
### Recommended OS per use case
| Use case | OS | Rationale |
|----------|-----|-------|
| **DGX cluster (production)** | DGX OS / Ubuntu 22.04 LTS | NVIDIA standard, best driver support |
| **Enterprise K8s (OpenShift)** | RHEL 9 / RHCOS | Red Hat support, compatible with GPU Operator |
| **Vanilla K8s (on-prem)** | Ubuntu 22.04 LTS + Flatcar (workers) | Broadest community support, Flatcar for minimal footprint |
| **Slurm cluster (HPC/AI)** | Rocky Linux 9 / Ubuntu 22.04 LTS | EL ecosystem (Lustre, OFED) or Ubuntu (community) |
| **Research / rapid prototyping** | Ubuntu 24.04 LTS | Newest CUDA, PyTorch, driver support |
| **Edge inference** | NVIDIA JetPack / Ubuntu (ARM) | Embedded GPU (Jetson Orin, AGX) |
---
## AI-ready data center checklist
| Area | Requirement |
|--------|-----------|
| **Power** | 30–120 kW/rack, HVDC (400 V DC), UPS that handles GPU power spikes |
| **Cooling** | Liquid cooling ready (direct-to-chip), rear-door for 30+ kW |
| **Network** | InfiniBand (NDR/XDR) or RoCEv2, rail-optimized fat-tree |
| **Storage** | Parallel FS (Lustre/Weka), checkpoint bandwidth > 100 GB/s |
| **GPU density** | Max GPUs per rack, minimal NVSwitch hops |
| **Physical** | Floor load capacity 1,500+ kg/m², 52U–60U racks |
| **Security** | Tenant isolation, network segmentation, data encryption |
| **Monitoring** | DCGM, NCCL health checks, thermals, power capping |
---
## Model and throughput limits
### Model size per GPU
The largest model that fits on a single GPU depends on HBM capacity and precision:
| GPU | HBM | FP32 | FP16/BF16 | INT8 | INT4 |
|-----|-----|------|-----------|------|------|
| **H100 80GB** | 80 GB | ~20B | ~40B | ~80B | ~160B |
| **H200 141GB** | 141 GB | ~35B | ~70B | ~140B | ~280B |
| **B200 192GB** | 192 GB | ~48B | ~96B | ~192B | ~384B |
| **MI300X 192GB** | 192 GB | ~48B | ~96B | ~192B | ~384B |
| **A100 80GB** | 80 GB | ~20B | ~40B | ~80B | ~160B |
| **GB200 (192+480)** | 192 GB GPU + 480 GB Grace | — | ~96B + CPU offload | — | — |
*Approximate values: 1B parameters ≈ 2 GB FP16 ≈ 4 GB FP32 ≈ 1 GB INT8 ≈ 0.5 GB INT4. In practice, subtract ~10–15 % of HBM for activations, KV cache, and optimizer states.*
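
The bytes-per-parameter rule translates directly into code. This is a rough sketch that counts weights only; the function name and the `reserve` parameter (the share of HBM held back for activations and KV cache) are hypothetical.

```python
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def max_params_billions(hbm_gb: float, precision: str, reserve: float = 0.0) -> float:
    """Largest model, in billions of parameters, whose weights fit in HBM."""
    usable_gb = hbm_gb * (1.0 - reserve)
    return usable_gb / BYTES_PER_PARAM[precision]


h100_fp16 = max_params_billions(80, "fp16")                    # upper bound: 40B
h100_fp16_reserved = max_params_billions(80, "fp16", 0.15)     # about 34B
```

Holding back 15 % of an 80 GB H100 lowers the FP16 ceiling from 40B to roughly 34B parameters.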
### Memory breakdown inference
| Component | Llama 3 70B (FP16) | Llama 3 8B (FP16) |
|------------|-------------------|-------------------|
| Model weights | 140 GB | 16 GB |
| KV cache (4K context, batch 1) | ~2 GB | ~0.2 GB |
| KV cache (128K context, batch 1) | ~60 GB | ~6.5 GB |
| Activations (peak) | ~5 GB | ~1 GB |
| **Total, 4K ctx** | ~147 GB | ~17 GB |
| **Total, 128K ctx** | ~205 GB | ~23 GB |
**Conclusion:** Llama 3 70B in FP16 does not fit on a single H100 (80 GB). Options: INT8 (~70 GB of weights, so 2× H100 once KV cache is added), INT4 (~35 GB of weights, fits on 1× H100 or H200), or tensor parallelism.
### Context length vs memory
| Context | KV cache 70B (FP16) | KV cache 8B (FP16) | Note |
|---------|-------------------|-------------------|----------|
| 4K | ~2.2 GB | ~0.25 GB | Typical chat |
| 32K | ~18 GB | ~2 GB | Documents |
| 128K | ~72 GB | ~8 GB | Long-context (Claude, Gemini) |
| 1M | ~560 GB | ~64 GB | Experimental (Gemini 1.5 Pro) |
The KV cache grows **linearly with context length**, and also linearly with batch size, layer count, and the number of KV heads. It becomes the critical constraint for long-context workloads.
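
A minimal sketch of the usual KV cache size formula: one key and one value vector per layer, KV head, and token. The shape used below (80 layers, 8 KV heads with grouped-query attention, head dimension 128) is an assumption for a Llama-3-70B-like model, so the results need not match the rounded table values.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int = 1, bytes_per_value: int = 2) -> int:
    """KV cache size in bytes; the leading 2 accounts for keys plus values."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


# Assumed Llama-3-70B-like shape in FP16 (2 bytes per value).
small = kv_cache_bytes(80, 8, 128, seq_len=4_096)
large = kv_cache_bytes(80, 8, 128, seq_len=131_072)
```

Because the formula is a plain product, going from 4K to 128K context multiplies the cache by 32, and models without grouped-query attention (more KV heads) pay proportionally more.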
### Throughput inference
| Model | GPU | Precision | Batch size | Tokens/s | QPS (1K output) |
|-------|-----|-----------|-----------|----------|-----------------|
| Llama 3 8B | H100 | FP16 | 1 | ~800 | ~0.8 |
| Llama 3 8B | H100 | FP16 | 128 | ~4,500 | ~35 |
| Llama 3 8B | H100 | INT4 | 128 | ~8,000 | ~62 |
| Llama 3 70B | 4× H100 | FP16 | 1 | ~180 | ~0.18 |
| Llama 3 70B | 4× H100 | INT4 | 64 | ~1,200 | ~19 |
| Llama 3 70B | 8× H100 | FP16 (TP=8) | 128 | ~2,500 | ~20 |
| DeepSeek-R1 671B | 8× H200 | FP8 (MoE) | 64 | ~500 | ~8 |
| GPT-4 class (est.) | — | — | — | ~100–300 | ~1–3 |
**Notes:**
- QPS (queries per second) depends on output length (here 1 query ≈ 1K output tokens)
- A larger batch size raises throughput but also raises TTFT (time to first token)
- Tensor parallelism (TP) scales, but communication overhead grows linearly
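
The relation between token throughput and QPS in the first note can be written down directly. A minimal sketch, assuming every query generates the same number of output tokens; the function name is hypothetical.

```python
def queries_per_second(tokens_per_s: float, output_tokens: int = 1000) -> float:
    """Sustained queries/s when every query generates `output_tokens` tokens."""
    return tokens_per_s / output_tokens


# First table row: 800 tokens/s with 1K-token answers.
qps = queries_per_second(800)
```

Shorter answers raise QPS proportionally: the same 800 tokens/s serves 3.2 queries per second at 250 output tokens.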
### Training limits
#### Scaling efficiency
| GPU count | Model | Efficiency | Reason |
|-----------|-------|-----------|-------|
| 8 (1 node) | Llama 3 8B | ~95 % | NVLink intra-node |
| 64 (8 nodes) | Llama 3 8B | ~85 % | IB inter-node |
| 512 (64 nodes) | Llama 3 70B | ~75 % | Communication overhead |
| 4,096 (512 nodes) | Llama 3 70B | ~60 % | Pipeline bubble, network |
| 16,384 (2,048 nodes) | Llama 3 405B | ~45 % | Synchronous SGD overhead |
**Note:** Efficiency = (actual throughput) / (throughput under ideal linear scaling). It falls roughly with the logarithm of the GPU count.
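
The efficiency definition can be expressed as a one-line function. The throughput numbers in the example are hypothetical and chosen only to illustrate the calculation.

```python
def scaling_efficiency(throughput_n: float, throughput_1: float, n_gpus: int) -> float:
    """Measured throughput on n GPUs divided by n times the single-GPU throughput."""
    return throughput_n / (throughput_1 * n_gpus)


# Hypothetical run: 1 GPU reaches 1,000 tokens/s, 64 GPUs reach 54,400 tokens/s.
eff = scaling_efficiency(54_400, 1_000, 64)
```

This run would land at 85 % efficiency, in line with the 64-GPU row of the table.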
#### Memory breakdown training
| Component | Llama 3 70B (BF16) | Llama 3 8B (BF16) |
|------------|-------------------|-------------------|
| Model weights | 140 GB | 16 GB |
| Optimizer states (Adam) | 280 GB | 32 GB |
| Gradients | 140 GB | 16 GB |
| Activations (peak) | ~30 GB | ~4 GB |
| **Total (DDP)** | ~590 GB | ~68 GB |
| **Total (FSDP shard=8)** | ~74 GB | ~8.5 GB |
**Conclusion:** FSDP (Fully Sharded Data Parallel) is necessary for training models > 10B. Adam's optimizer states take twice the memory of the weights, so weights + optimizer states + gradients need about 4× the memory of the inference weights alone.
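
The table totals follow from a fixed number of bytes per parameter. The sketch below uses the same assumptions as the table (2 bytes each for weights and gradients, 4 bytes for Adam states) and, like the FSDP row, splits everything evenly across shards; the function name is hypothetical.

```python
def training_memory_gb(params_b: float, activations_gb: float, shards: int = 1) -> float:
    """Per-GPU training memory in GB for a model with `params_b` billion parameters."""
    weights = 2 * params_b      # BF16 weights
    gradients = 2 * params_b    # BF16 gradients
    optimizer = 4 * params_b    # Adam states, as counted in the table
    return (weights + gradients + optimizer + activations_gb) / shards


ddp_70b = training_memory_gb(70, 30)            # every GPU holds everything
fsdp_70b = training_memory_gb(70, 30, shards=8)
```

Implementations that keep FP32 master weights and FP32 Adam moments need more than 4 bytes per parameter for optimizer state, so treat these figures as a lower bound.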
#### Time to train
| Model | GPU count | GPU type | Precision | Time | Cost (on-prem estimate) |
|-------|-----------|---------|-----------|------|---------------------|
| Llama 3 8B | 64 | H100 | BF16 | ~3 days | ~$5,000 |
| Llama 3 70B | 512 | H100 | BF16 | ~14 days | ~$100,000 |
| Llama 3 405B | 16,384 | H100 | BF16 | ~60 days | ~$14 M |
| DeepSeek-R1 671B (MoE) | 2,048 | H800 | BF16 | ~30 days | ~$6 M |
| GPT-4 (est.) | 25,000 | A100/H100 | Mixed | ~90–100 days | ~$100 M |
### Power and thermal limits
| Configuration | TDP limit | Throughput loss | Reason |
|-------------|-----------|------------------|-------|
| H100 SXM | 700 W (default) | 0 % | Nominal |
| H100 SXM | 600 W (-15 %) | ~5–8 % | Power capping |
| H100 SXM | 500 W (-30 %) | ~15–25 % | Significant throttling |
| H100 SXM | 400 W (-43 %) | ~30–50 % | Emergency use only |
| DGX H100 (8×) | 5.6 kW (max) | 0 % | Liquid cooling required |
| DGX H100 (8×) | 4.5 kW (air) | ~10–15 % | Rear-door heat exchanger |
The GPU throttles when it exceeds its TDP or temperature limit (85 °C+). Power capping correlates linearly with clock frequency but non-linearly with throughput.
### API and operational limits
| Limit | Description | Typical value |
|-------|-------|-----------------|
| **Rate limit** | Max requests per minute/hour | 100–10,000 RPM (by tier) |
| **Tokens per minute (TPM)** | Max tokens per minute | 1M–300M (by model) |
| **Context window** | Max input tokens | 4K–2M (by model) |
| **Max output tokens** | Max generated tokens | 4K–32K (by model) |
| **Concurrent requests** | Number of parallel requests | 10–10,000 (by backend) |
| **Batch window** | Time spent collecting a batch | 0–20 s (vLLM, TGI) |
| **TTFT timeout** | Max latency to first token | 30–120 s |
| **Idle timeout** | GPU idle → scale to 0 | 5–15 min (cloud) |
### Limits per deployment model
| Model | Self-hosted HW | Managed cloud (SageMaker, Vertex) | API (OpenAI, Anthropic) |
|-------|--------------|----------------------------------|------------------------|
| **Model size** | Limited by HBM (max 192 GB/GPU) | Unlimited (cluster scaling) | Unlimited |
| **Queries** | Limited by GPU count | Auto-scaling | Rate limit (by tier) |
| **Latency** | < 10 ms (same node) | 10–100 ms (network hop) | 100 ms–10 s |
| **Customization** | Full (fine-tuning, quantization) | Managed (SageMaker, Bedrock) | Prompt engineering only |
| **Data privacy** | Yes (on-prem) | Contractual (region, encryption) | Limited |
| **Cost per 1M tokens** | ~$0.10–0.50 (FP16 inference) | ~$0.20–1.00 | ~$0.15–15.00 |
| **Max context** | 128K+ (by GPU count) | 128K+ | 32K–2M |
| **Cold start** | 0 (always-on) | 30 s–5 min | 0 (shared infra) |
---
## GPU prices and price/performance (2026)
> Prices are indicative: NVIDIA does not publish an official price list for data center GPUs. Cloud prices are taken from public providers (Q2 2026). Hardware purchase prices vary by volume, reseller, and region.
### Purchase price (buy)
| GPU | Price/GPU | Price of 8× GPU baseboard | $/PFLOPS (FP16) | Note |
|-----|---------|----------------------|----------------|----------|
| **H100 SXM** | $27,000–40,000 | ~$200,000 | $25,000 | Scarcity in 2023–2024, now stabilizing |
| **H200 SXM** | $35,000–50,000 | ~$280,000 | ~$35,000 | H100 upgrade, HBM3e |
| **B200** | ~$60,000–70,000 | ~$500,000+ | ~$31,000 | Blackwell, FP4 support |
| **B100** | ~$30,000 | ~$240,000 | ~$20,000 | Cheaper than B200, similar FP8 performance |
| **GB200** (Grace+Blackwell) | ~$70,000–100,000 | ~$2,000,000 (rack) | — | CPU+GPU unified, high-density |
| **A100 80GB** | ~$10,000–15,000 | ~$120,000 | ~$19,200 | Previous generation, still relevant |
| **MI300X** | ~$12,000–18,000 | ~$100,000 | ~$9,600 | AMD, 192 GB HBM3 |
| **Gaudi 3** | ~$15,625 | ~$125,000 | **$8,515** | Intel, best $/PFLOPS |
| **L40S** | ~$8,000–10,000 | — | — | Inference, enterprise |
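
The $/PFLOPS column divides the baseboard price by the aggregate FP16 peak of its GPUs. A sketch using the Gaudi 3 row and its peak figure from the Intel table earlier in this document; the function name is hypothetical.

```python
def dollars_per_pflops(board_price: float, gpus: int, tflops_per_gpu: float) -> float:
    """Purchase price divided by aggregate peak compute, in $/PFLOPS."""
    total_pflops = gpus * tflops_per_gpu / 1000  # TFLOPS -> PFLOPS
    return board_price / total_pflops


# Gaudi 3: ~$125,000 for 8 accelerators at 1,835 TFLOPS each.
gaudi3 = dollars_per_pflops(125_000, 8, 1_835)
```

The result depends heavily on which peak figure is used (dense or sparse, FP16 or FP8), so compare rows only when they use the same basis.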
### Cloud prices (on-demand $/GPU/hr)
| GPU | Cheapest | Mid-range (CoreWeave, Lambda) | Hyperscaler (AWS, GCP, Azure) |
|-----|--------------|-------------------------------|-------------------------------|
| **H100 SXM** | $1.38 (Thunder) | $2.89–3.29 | $4.15–6.88 |
| **H100 PCIe** | $2.01 (Spheron) | $2.50 | — |
| **H200 SXM** | $3.89 (Spheron) | $4.54 | $5.00+ |
| **B200** | **$3.39** (Spheron) | $6.02 | $14.24 (AWS) |
| **B200** | **$2.12** (spot) | — | — |
| **GB200** | $3.50 (Runcrate) | $5.85 (Oracle) | $6.95 (GCP) |
| **MI300X** | **$1.50** (TensorWave) | $1.85 (Vultr) | $7.86 (Azure) |
| **A100 80GB** | $1.07 (Spheron) | $1.50–2.00 | $3.00+ |
| **Gaudi 3** | ~$1.50–2.50 | — | — |
| **L40S** | $0.91 (Spheron) | $1.50–2.00 | — |
### Inference cost ($/M tokens)
| GPU | Provider | $/hr | Est. tok/s | $/M tok |
|-----|----------|------|-----------|--------|
| **B200** | Spheron | $3.39 | ~4,000 | **$0.42** |
| **B200** (spot) | Spheron | $2.12 | ~4,000 | **$0.15** |
| **H100 PCIe** | Spheron | $2.01 | ~1,200 | $0.47 |
| **A100 80GB** | Spheron | $1.07 | ~520 | $0.57 |
| **H100 SXM** | AWS | $6.88 | ~1,200 | $1.59 |
| **H200 SXM** | Spheron | $4.54 | ~1,800 | $0.70 |
| **L40S** | Spheron | $0.91 | ~450 | $0.56 |
*Values for Llama 3 70B (INT8, batch=1, 1K output tokens). Real values vary with batch size, context, and quantization.*
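
The cost per million tokens follows from the hourly price and sustained throughput. A minimal sketch, assuming the GPU is fully utilized for the whole billed hour; the function name is hypothetical.

```python
def dollars_per_million_tokens(price_per_hour: float, tokens_per_s: float) -> float:
    """Cost of generating one million tokens at a sustained token rate."""
    tokens_per_hour = tokens_per_s * 3600
    return price_per_hour / tokens_per_hour * 1_000_000


# H100 PCIe row: $2.01/hr at ~1,200 tokens/s.
h100_pcie = dollars_per_million_tokens(2.01, 1_200)
```

Idle time raises the effective cost proportionally: at 50 % utilization the same GPU costs twice as much per token.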
### Price per GB of HBM
| GPU | HBM | Cloud price/hr | $/GB/hr | Suitability for memory-bound workloads |
|-----|-----|-------------|--------|-----------------------------------|
| **MI300X** | 192 GB | $1.50 | **$0.0078** | ✅ Best |
| **B200** | 192 GB | $3.39 | $0.0177 | ✅ Good |
| **H200** | 141 GB | $3.89 | $0.0276 | ⚠️ |
| **H100 SXM** | 80 GB | $1.38 | $0.0173 | ⚠️ Only up to 70B models |
| **GB200** | 384 GB | $3.50 | $0.0091 | ✅✅ (2× MI300X capacity) |
### Price/performance by scenario
| Scenario | Winner | Rationale |
|--------|-------|-------|
| **Absolute performance** (price is no object) | **GB200 DGX NVL72** | 72× GPU, 18 PFLOPS FP8, 384 GB HBM/GPU |
| **Cloud inference**, best $/token | **B200 spot** | $0.15/M tok; 4× H100 throughput at a lower price |
| **Cloud inference**, on-demand | **B200** | $0.42/M tok |
| **Cloud inference**, budget | **A100 / L40S** | $0.56–0.57/M tok |
| **Training**, price/performance when buying | **Gaudi 3** | $8,515/PFLOPS, 2.5–3× better than H100 |
| **Training**, cloud | **H100 SXM** | $1.38/hr, CUDA ecosystem, NCCL |
| **Memory-bound**, long context, 70B+ | **MI300X / GB200** | 192–384 GB, $0.0078–0.0091/GB |
| **Ecosystem + safe choice** | **H100/H200** | CUDA, broadest software support, NVIDIA tools |
| **Spot / preemptible**, lowest price | **A100 / H100** | $1.07–1.38/hr, 50–90 % discount versus on-demand |
### Trends 2026
- **H100**: price fell 64 % from the $8/hr peak to $1.38–2.89/hr, then rebounded 40 % thanks to the inference boom
- **B200**: the new high end, $3.39/hr in the cloud and ~$0.15/M tok on spot; the benchmark for inference
- **MI300X**: supply is growing (TensorWave, Vultr, CoreWeave, Oracle, Azure), prices from $1.50/hr
- **Gaudi 3**: best $/PFLOPS when buying, but a narrow ecosystem and limited cloud availability
- **The market has bifurcated**: older generations (H100, A100) are commoditizing, newer ones (B200, GB200) hold a premium
## Related
- [GPU.md](GPU.md): GPU architecture, NVIDIA/AMD, vGPU, MIG
- [NETWORKING.md](NETWORKING.md): InfiniBand, RoCE, network topologies
- [STORAGE.md](STORAGE.md): parallel filesystems, object store
- [DATACENTERS.md](DATACENTERS.md): DC layout, power, cooling
- [CLOUD.md](CLOUD.md): cloud AI services (SageMaker, Vertex AI)
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revised: 2026-06-18*

232
BIG-DATA.en.md Normal file

@@ -0,0 +1,232 @@
# 🗄️ Big Data — ecosystem, architecture, tools
## Overview
The Big Data ecosystem in 2026: "Hadoop is dead, and yet it's everywhere." HDFS has shrunk, MapReduce is effectively gone, the Cloudera/Hortonworks era is over. But YARN lives on, the Hive Metastore has changed clothes into Iceberg/Delta, and the lakehouse pattern (cheap object storage + table format + distributed engine) is the inheritance Hadoop left behind.
The modern Big Data stack has 8 layers:
1. **Storage** — HDFS, S3, GCS, ABFS, MinIO
2. **Table format** — Apache Iceberg, Delta Lake, Apache Hudi, Apache Paimon
3. **Catalog** — Hive Metastore, Unity Catalog, Polaris, Nessie, AWS Glue
4. **Batch processing** — Apache Spark, Trino-on-Spark, Dremio
5. **Stream processing** — Apache Flink, Spark Structured Streaming, Kafka Streams
6. **Distributed SQL** — Trino, Presto, StarRocks, ClickHouse
7. **Transformation** — dbt, SQLMesh
8. **Orchestration** — Apache Airflow 3.0, Dagster, Prefect, Kestra
---
## Storage
### HDFS (Hadoop Distributed File System)
| Feature | Detail |
|---------|--------|
| **Architecture** | Master/worker: NameNode (metadata) + DataNode (data) |
| **Replication** | Default 3×, configurable (rack-aware) |
| **Block size** | Default 128 MB (range 64–256 MB) |
| **Limits** | NameNode memory ~ 1 GB / 1 million blocks; ~1000 DataNodes per cluster |
| **Use case** | On-prem clusters, sequential read/write, large files |
| **Status 2026** | Declining — most projects migrate to object storage (S3, GCS, MinIO) |
HDFS remains relevant for on-prem environments where object storage is unavailable, or for specific use cases (YARN clusters, Spark shuffle). For new projects, object storage is recommended.
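
The NameNode memory rule of thumb from the table (~1 GB per million blocks) gives a quick heap estimate. This is a rough sketch that assumes every file fills its blocks completely; the function name is hypothetical.

```python
def namenode_heap_gb(data_bytes: int, block_bytes: int = 128 * 1024**2) -> float:
    """Rough NameNode heap in GB, using ~1 GB per million blocks.

    Many small files produce far more blocks (and file objects) than this
    estimate, so real clusters need headroom on top of it.
    """
    blocks = -(-data_bytes // block_bytes)  # ceiling division
    return blocks / 1_000_000


heap = namenode_heap_gb(1024**5)  # 1 PiB of logical data at 128 MiB blocks
```

One pebibyte of large files yields about 8.4 million blocks, which points to a heap in the range of 8 to 9 GB before any safety margin.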
### Object storage as Data Lake
| Platform | Service | Use case |
|----------|--------|----------|
| **AWS** | S3 | Primary data lake, Iceberg/Delta on S3 |
| **Azure** | ADLS Gen2 / Blob | Data lake for Azure ecosystem |
| **GCP** | GCS | Data lake for GCP (Dataproc, BigQuery) |
| **On-prem** | MinIO | S3-compatible object storage on own HW |
### HDFS capacity planning
| Data size | Configuration |
|-----------|-------------|
| **< 100 TB** | 3–5 DataNodes, 10 GbE, replication 3× |
| **100 TB – 1 PB** | 5–20 DataNodes, 25/100 GbE, rack-aware, NameNode HA |
| **1 PB+** | 20+ DataNodes, 100 GbE, Federation (multiple NameNodes) |
---
## Open Table Formats
Table formats bring ACID transactions, schema evolution, and time travel to data lake object storage.
| Format | Organization | Engine compatibility | Streaming | Catalog |
|--------|-------------|---------------------|-----------|---------|
| **Apache Iceberg** | Apache Foundation | Spark, Flink, Trino, Dremio, Athena, Snowflake | Flink sink, snapshot-based | REST catalog, Polaris, Glue, Hive |
| **Delta Lake** | Linux Foundation (Databricks) | Spark (native), Trino, Flink (limited), Athena | Spark Streaming, DLT | Unity Catalog (proprietary), Hive |
| **Apache Hudi** | Apache Foundation | Spark, Flink, Trino (connector) | Built-in CDC, incremental | Hive, Glue (limited) |
| **Apache Paimon** | Apache Foundation | Flink (native), Spark | LSM-tree, changelog mode | Hive, REST |
**Recommendation 2026:**
- **Iceberg** — broadest multi-engine support, vendor-neutral, open catalog (Polaris)
- **Delta Lake** — best for Spark/Databricks ecosystem, UniForm for cross-format reads
- **Hudi** — losing momentum, only if already in production
- **Paimon** — emerging, Flink-native, LSM architecture
---
## Processing Engines
### Apache Spark
Dominant batch processing engine and unifying engine (batch + streaming + SQL + ML).
| Feature | Detail |
|---------|--------|
| **Version 2026** | Spark 4.x (4.1.0), native Kubernetes support, Structured Streaming, Delta Lake integration |
| **API** | Scala, Java, Python (PySpark), SQL, R (SparkR) |
| **Batch** | DataFrame/Dataset, RDD, SQL queries; 10–100× faster than MapReduce |
| **Streaming** | Structured Streaming (micro-batch), latency ~100 ms – 5 s |
| **SQL** | Spark SQL, ANSI SQL, Hive compatible |
| **ML** | MLlib, SparkML, MLflow integration |
| **Scheduler** | YARN, Kubernetes (production-ready since Spark 3.x), standalone |
| **Fault tolerance** | RDD lineage, checkpointing |
**When to use Spark:**
- Batch ETL/ELT pipelines
- Unified engine for batch + streaming (team preference)
- Machine learning pipelines (MLlib, SparkML)
- SQL analytics on large datasets
### Apache Flink
Highest-performance engine for true streaming (per-event processing).
| Feature | Detail |
|---------|--------|
| **Version 2026** | Flink 2.x (streaming-first, batch as bounded stream) |
| **API** | DataStream API, Table/SQL API, ProcessFunction (low-level) |
| **Latency** | < 100 ms (true streaming, Chandy-Lamport checkpointing) |
| **State management** | Managed state (ValueState, ListState, MapState), RocksDB backend |
| **Event time** | Native, watermarks, out-of-order handling |
| **Batch** | Batch as bounded stream (same runtime) |
| **Deployment** | YARN, Kubernetes, standalone |
| **Economics** | Higher memory requirements (managed state), requires careful tuning |
**When to use Flink:**
- Fraud detection, real-time bidding, IoT (< 100 ms latency)
- Complex stateful stream processing
- CDC pipelines
- Event-driven architectures
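
Event-time processing with watermarks, listed among Flink's features above, can be illustrated without Flink itself. The following is a minimal pure-Python sketch, not the Flink API: `BoundedOutOfOrderness`, `max_delay_ms`, and `split_late` are hypothetical names chosen for illustration. It assumes a watermark that trails the highest timestamp seen so far by a fixed bound, and it treats any event older than the current watermark as late.

```python
from dataclasses import dataclass


@dataclass
class BoundedOutOfOrderness:
    """Toy watermark generator: watermark = max event time seen - allowed delay."""
    max_delay_ms: int
    max_seen_ms: int = -1  # -1 means no event has been observed yet

    def current_watermark(self) -> int:
        if self.max_seen_ms < 0:
            return -1  # no events yet, so nothing can be late
        return self.max_seen_ms - self.max_delay_ms

    def observe(self, event_time_ms: int) -> bool:
        """Return True if the event is on time, False if it is late."""
        if event_time_ms < self.current_watermark():
            return False  # the watermark has already passed this timestamp
        self.max_seen_ms = max(self.max_seen_ms, event_time_ms)
        return True


def split_late(events_ms, max_delay_ms):
    """Partition event timestamps (given in arrival order) into on-time and late."""
    generator = BoundedOutOfOrderness(max_delay_ms)
    on_time, late = [], []
    for ts in events_ms:
        (on_time if generator.observe(ts) else late).append(ts)
    return on_time, late
```

A larger `max_delay_ms` tolerates more disorder at the price of later results, which is the same trade-off a real watermark strategy has to make.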
### Trino (ex PrestoSQL)
Distributed SQL query engine — federated queries across various sources.
| Feature | Detail |
|---------|--------|
| **Architecture** | Coordinator + Worker (no storage, no scheduler) |
| **Connectors** | Iceberg, Delta, Hive, HDFS, S3, GCS, ADLS, PostgreSQL, MySQL, Kafka, Elasticsearch |
| **Use case** | Interactive SQL, federated queries, lakehouse queries |
| **Version 2026** | Trino 470+, Iceberg native, Delta Lake connector |
---
## Spark vs Flink vs Trino comparison
| Criteria | Spark | Flink | Trino |
|----------|-------|-------|-------|
| **Primary use case** | Batch + unifying | True streaming | Interactive SQL |
| **Streaming latency** | 100 ms – 5 s (micro-batch) | < 100 ms (true streaming) | N/A |
| **Throughput** | High (batch-optimized) | High (pipeline-optimized) | Medium (ad-hoc) |
| **State management** | State store (external) | Managed state (embedded) | N/A |
| **SQL support** | Spark SQL | Flink SQL | ANSI SQL (broadest) |
| **ML/AI** | MLlib, SparkML | — | — |
| **Kubernetes** | Native (production) | Native (production) | Native (production) |
| **Learning curve** | Medium | High | Low |
| **Operational complexity** | Medium | High | Medium |
---
## Orchestration
| Tool | Version 2026 | Use case |
|------|-------------|----------|
| **Apache Airflow** | 3.0+ (taskflow API, dynamic tasks, deferrable operators) | Universal orchestration, largest ecosystem |
| **Dagster** | 1.x (asset-oriented, software-defined assets) | Data pipelines, observability, asset lineage |
| **Prefect** | 3.x (native async, workers, blocks) | Python-native, serverless workers |
| **Kestra** | 1.x (YAML-native, declarative) | Event-driven orchestration |
| **Apache NiFi** | 2.x (flow-based, visual) | Data ingestion, CDC, streaming |
---
## Lakehouse architecture
Lakehouse combines data lake flexibility (object storage) with data warehouse performance and governance.
```
┌──────────────────────────────────────────────────────┐
│ Query Engines │
│ Trino Spark SQL Flink SQL Dremio Athena │
└─────────────────────────┬────────────────────────────┘
┌─────────────────────────▼────────────────────────────┐
│ Table Format Layer │
│ Apache Iceberg / Delta Lake / Hudi │
│ (ACID, time travel, schema evolution) │
└─────────────────────────┬────────────────────────────┘
┌─────────────────────────▼────────────────────────────┐
│ Storage Layer │
│ S3 / GCS / ADLS / MinIO / HDFS │
│ (Parquet / ORC / Avro) │
└──────────────────────────────────────────────────────┘
```
For Iceberg details see [DATABASES.en.md — Apache Iceberg Lakehouse](DATABASES.en.md#apache-iceberg-lakehouse).
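
The table format layer in the diagram is what provides atomic commits and time travel. As rough intuition only, the sketch below models a table as an append-only log of immutable snapshots, each listing its data files. `ToyTable`, `commit`, and `files_as_of` are hypothetical names; real formats such as Iceberg or Delta Lake track much more (manifests, schemas, statistics) and commit through a catalog or transaction log.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    files: Tuple[str, ...]  # data files visible in this snapshot


class ToyTable:
    """Append-only snapshot log: readers pick a snapshot, writers add a new one."""

    def __init__(self) -> None:
        self._snapshots: List[Snapshot] = [Snapshot(0, ())]

    def commit(self, add: Sequence[str] = (), remove: Sequence[str] = ()) -> int:
        """Create a new snapshot from the latest one; older snapshots stay readable."""
        current = self._snapshots[-1]
        removed = set(remove)
        files = tuple(f for f in current.files if f not in removed) + tuple(add)
        snapshot = Snapshot(current.snapshot_id + 1, files)
        self._snapshots.append(snapshot)
        return snapshot.snapshot_id

    def files_as_of(self, snapshot_id: Optional[int] = None) -> Tuple[str, ...]:
        """Time travel: file list of a given snapshot (default: the latest)."""
        if snapshot_id is None:
            return self._snapshots[-1].files
        if not 0 <= snapshot_id < len(self._snapshots):
            raise KeyError(f"unknown snapshot {snapshot_id}")
        return self._snapshots[snapshot_id].files
```

Because data files are never modified in place, reading an old snapshot needs no copy of the data, only the old file list.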
---
## Big Data Infrastructure
### Cluster sizing
| Component | Spark (batch) | Flink (streaming) | Trino (SQL) |
|-----------|--------------|-------------------|-------------|
| **CPU** | 16–64 cores/node | 16–32 cores/node | 8–32 cores/node |
| **RAM** | 64–256 GB/node | 64–256 GB/node (incl. managed state) | 64–256 GB/node |
| **Storage** | HDFS / object storage | Object storage (checkpoints) | None (stateless) |
| **Network** | 25–100 GbE (shuffle-heavy) | 25–100 GbE (checkpointing) | 25–100 GbE |
| **Disk** | NVMe (scratch, shuffle) | NVMe (RocksDB state backend) | — |
| **Cluster size** | 5–200+ nodes | 3–100+ nodes | 5–50 nodes |
### Network considerations
- **Spark shuffle**: heavy network traffic between nodes; 25–100 GbE recommended, ideally without oversubscription
- **Flink checkpointing**: periodic state writes to object storage; requires stable latency
- **HDFS rack awareness**: optimizes replication across racks
- **Data locality**: HDFS reads from local disk; object storage is network-bound
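
A back-of-the-envelope estimate shows why shuffle-heavy Spark jobs push toward faster NICs. The sketch below assumes an all-to-all shuffle spread evenly across nodes, where each node sends the fraction (n-1)/n of its share over its own NIC, and a hypothetical `efficiency` factor for protocol overhead. Real jobs are also limited by disk speed, data skew, and switch oversubscription, so treat the result as a lower bound.

```python
def shuffle_seconds(shuffle_tb: float, nodes: int, nic_gbit: float,
                    efficiency: float = 0.7) -> float:
    """Rough lower bound on the network time of an evenly spread all-to-all shuffle.

    shuffle_tb  -- total shuffle volume in terabytes (10**12 bytes)
    nodes       -- number of worker nodes
    nic_gbit    -- per-node NIC speed in Gbit/s
    efficiency  -- assumed usable fraction of line rate (hypothetical default)
    """
    if nodes < 2:
        return 0.0  # a single node shuffles locally, no network transfer
    bytes_per_node = shuffle_tb * 1e12 / nodes
    remote_bytes = bytes_per_node * (nodes - 1) / nodes  # the rest stays local
    usable_bytes_per_s = nic_gbit * 1e9 / 8 * efficiency
    return remote_bytes / usable_bytes_per_s
```

Under these assumptions, a 10 TB shuffle on 20 nodes takes roughly 217 s of pure network time at 25 GbE and about a quarter of that at 100 GbE.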
### Kubernetes vs YARN
| Criteria | YARN | Kubernetes |
|----------|------|-----------|
| **Resource isolation** | Cgroups (YARN containers) | Cgroups + namespaces (pods) |
| **Ecosystem fit** | Hadoop-native (HDFS, Hive, Spark) | Cloud-native, Spark, Flink, Trino |
| **Operational complexity** | Lower (single cluster manager) | Higher (requires K8s cluster) |
| **Multi-tenant isolation** | YARN queues (Capacity/Fair Scheduler) | Namespaces, ResourceQuotas, LimitRanges |
| **Stateful workloads** | Limited | StatefulSets, PVC, Operators |
| **2026 trend** | Legacy (declining) | Standard for new projects |
---
## Cloud deployment
| Cloud | Batch processing | Streaming | SQL | Managed K8s |
|-------|-----------------|-----------|-----|-------------|
| **AWS** | EMR (Spark, Hive, Flink) | Kinesis, MSK (Kafka), EMR Flink | Athena (Trino), Redshift | EKS |
| **Azure** | HDInsight (Spark, Hive), Synapse | Event Hubs, HDInsight Flink | Synapse SQL, Azure Data Explorer | AKS |
| **GCP** | Dataproc (Spark, Flink, Hive, Trino) | Pub/Sub, Dataflow (Beam), Dataproc Flink | BigQuery | GKE |
---
## Sources
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-18*

232
BIG-DATA.md Normal file
@@ -0,0 +1,232 @@
# 🗄️ Big Data: ecosystem, architecture, tools
## Overview
The Big Data ecosystem in 2026: "Hadoop is dead, and yet it is everywhere." HDFS has shrunk, MapReduce is effectively dead, and the Cloudera/Hortonworks era is over. But YARN lives on, Hive Metastore has re-emerged in Iceberg/Delta guise, and the lakehouse pattern (cheap object storage + table format + distributed engine) is the legacy Hadoop left behind.
The modern Big Data stack has 8 layers:
1. **Storage**: HDFS, S3, GCS, ABFS, MinIO
2. **Table format**: Apache Iceberg, Delta Lake, Apache Hudi, Apache Paimon
3. **Catalog**: Hive Metastore, Unity Catalog, Polaris, Nessie, AWS Glue
4. **Batch processing**: Apache Spark, Trino-on-Spark, Dremio
5. **Stream processing**: Apache Flink, Spark Structured Streaming, Kafka Streams
6. **Distributed SQL**: Trino, Presto, StarRocks, ClickHouse
7. **Transformation**: dbt, SQLMesh
8. **Orchestration**: Apache Airflow 3.0, Dagster, Prefect, Kestra
---
## Storage
### HDFS (Hadoop Distributed File System)
| Feature | Detail |
|-----------|--------|
| **Architecture** | Master/worker: NameNode (metadata) + DataNode (data) |
| **Replication** | Default 3×, configurable (rack-aware) |
| **Block size** | Default 128 MB (64 MB – 256 MB possible) |
| **Limits** | NameNode memory ~1 GB per 1 million blocks; ~1000 DataNodes per cluster |
| **Use case** | On-prem clusters, sequential read/write, large files |
| **Status 2026** | Declining share; most deployments are migrating to object storage (S3, GCS, MinIO) |
HDFS remains relevant for on-prem environments where object storage is not available, or for specific use cases (YARN cluster, Spark shuffle). Object storage is recommended for new projects.
### Object storage as a data lake
| Platform | Service | Use case |
|-----------|--------|----------|
| **AWS** | S3 | Primary data lake, Iceberg/Delta on S3 |
| **Azure** | ADLS Gen2 / Blob | Data lake for the Azure ecosystem |
| **GCP** | GCS | Data lake for GCP (Dataproc, BigQuery) |
| **On-prem** | MinIO | S3-compatible object storage on own hardware |
### HDFS capacity planning
| Data size | Configuration |
|-------------|------------|
| **< 100 TB** | 3–5 DataNodes, 10 GbE, replication 3× |
| **100 TB – 1 PB** | 5–20 DataNodes, 25/100 GbE, rack-aware, NameNode HA |
| **1 PB+** | 20+ DataNodes, 100 GbE, Federation (multiple NameNodes) |
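
The HDFS figures above can be turned into a quick estimate. This sketch assumes the default 3× replication and 128 MB block size from the HDFS table, the rule of thumb of roughly 1 GB of NameNode heap per million blocks, and a hypothetical 75 % target disk utilization. It also assumes files fill whole blocks, so a workload with many small files will need more blocks and more heap than shown.

```python
import math


def hdfs_estimate(data_tb: float, replication: int = 3, block_mb: int = 128,
                  target_utilization: float = 0.75) -> dict:
    """Estimate raw disk capacity and NameNode heap for a logical data size."""
    # Raw capacity: every block is stored `replication` times, with headroom.
    raw_tb = data_tb * replication / target_utilization
    # Logical block count (replicas are tracked per block, not as extra blocks).
    blocks = math.ceil(data_tb * 1e12 / (block_mb * 1024 * 1024))
    # Rule of thumb from the table above: ~1 GB of heap per million blocks.
    namenode_heap_gb = blocks / 1e6
    return {"raw_tb": raw_tb, "blocks": blocks, "namenode_heap_gb": namenode_heap_gb}
```

For 100 TB of logical data these assumptions give about 400 TB of raw disk and under 1 GB of NameNode heap for block metadata, which shows that at this scale disk capacity, not NameNode memory, is the binding constraint.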
---
## Table formats (Open Table Formats)
Table formats bring ACID transactions, schema evolution, and time travel to data lake object storage.
| Format | Organization | Engine compatibility | Streaming | Catalog |
|--------|-----------|---------------------|-----------|---------|
| **Apache Iceberg** | Apache Foundation | Spark, Flink, Trino, Dremio, Athena, Snowflake | Flink sink, snapshot-based | REST catalog, Polaris, Glue, Hive |
| **Delta Lake** | Linux Foundation (Databricks) | Spark (native), Trino, Flink (limited), Athena | Spark Streaming, DLT | Unity Catalog (proprietary), Hive |
| **Apache Hudi** | Apache Foundation | Spark, Flink, Trino (connector) | Built-in CDC, incremental | Hive, Glue (limited) |
| **Apache Paimon** | Apache Foundation | Flink (native), Spark | LSM-tree, changelog mode | Hive, REST |
**Recommendation 2026:**
- **Iceberg**: broadest multi-engine support, vendor-neutral, open catalog (Polaris)
- **Delta Lake**: best for the Spark/Databricks ecosystem, UniForm for cross-format reads
- **Hudi**: losing momentum, only if already in production
- **Paimon**: emerging, Flink-native, LSM architecture
---
## Processing Engines
### Apache Spark
The dominant batch-processing engine, also used as a unified engine (batch + streaming + SQL + ML).
| Feature | Detail |
|-----------|--------|
| **Version 2026** | Spark 4.x (4.1.0), native Kubernetes support, Structured Streaming, Delta Lake integration |
| **API** | Scala, Java, Python (PySpark), SQL, R (SparkR) |
| **Batch** | DataFrame/Dataset, RDD, SQL queries; 10–100× faster than MapReduce |
| **Streaming** | Structured Streaming (micro-batch), latency ~100 ms – 5 s |
| **SQL** | Spark SQL, ANSI SQL, Hive-compatible |
| **ML** | MLlib, SparkML, MLflow integration |
| **Scheduler** | YARN, Kubernetes (production-ready since Spark 3.x), standalone |
| **Fault tolerance** | RDD lineage, checkpointing |
**When to use Spark:**
- Batch ETL/ELT pipelines
- Unified engine for batch + streaming (team preference)
- Machine learning pipelines (MLlib, SparkML)
- SQL analytics on large datasets
### Apache Flink
Highest-performance engine for true streaming (per-event processing).
| Feature | Detail |
|-----------|--------|
| **Version 2026** | Flink 2.x (streaming-first, batch as a special case of a stream) |
| **API** | DataStream API, Table/SQL API, ProcessFunction (low-level) |
| **Latency** | < 100 ms (true streaming, Chandy-Lamport checkpointing) |
| **State management** | Managed state (ValueState, ListState, MapState), RocksDB backend |
| **Event time** | Native, watermarks, out-of-order handling |
| **Batch** | Batch as bounded stream (same runtime) |
| **Deployment** | YARN, Kubernetes, standalone |
| **Economics** | Higher memory requirements (managed state), requires careful tuning |
**When to use Flink:**
- Fraud detection, real-time bidding, IoT (< 100 ms latency)
- Complex stateful stream processing
- CDC pipelines
- Event-driven architectures
### Trino (ex PrestoSQL)
Distributed SQL query engine for federated queries across various sources.
| Feature | Detail |
|-----------|--------|
| **Architecture** | Coordinator + Worker (no storage, no scheduler) |
| **Connectors** | Iceberg, Delta, Hive, HDFS, S3, GCS, ADLS, PostgreSQL, MySQL, Kafka, Elasticsearch |
| **Use case** | Interactive SQL, federated queries, lakehouse queries |
| **Version 2026** | Trino 470+, Iceberg native, Delta Lake connector |
---
## Spark vs Flink vs Trino comparison
| Criteria | Spark | Flink | Trino |
|-----------|-------|-------|-------|
| **Primary use case** | Batch + unifying | True streaming | Interactive SQL |
| **Streaming latency** | 100 ms – 5 s (micro-batch) | < 100 ms (true streaming) | N/A |
| **Throughput** | High (batch-optimized) | High (pipeline-optimized) | Medium (ad-hoc) |
| **State management** | State store (external) | Managed state (embedded) | N/A |
| **SQL support** | Spark SQL | Flink SQL | ANSI SQL (broadest) |
| **ML/AI** | MLlib, SparkML | — | — |
| **Kubernetes** | Native (production) | Native (production) | Native (production) |
| **Learning curve** | Medium | High | Low |
| **Operational complexity** | Medium | High | Medium |
---
## Orchestration
| Tool | Version 2026 | Use case |
|---------|-----------|----------|
| **Apache Airflow** | 3.0+ (taskflow API, dynamic tasks, deferrable operators) | Universal orchestration, largest ecosystem |
| **Dagster** | 1.x (asset-oriented, software-defined assets) | Data pipelines, observability, asset lineage |
| **Prefect** | 3.x (native async, workers, blocks) | Python-native, serverless workers |
| **Kestra** | 1.x (YAML-native, declarative) | Event-driven orchestration |
| **Apache NiFi** | 2.x (flow-based, visual) | Data ingestion, CDC, streaming |
---
## Lakehouse architecture
Lakehouse combines data lake flexibility (object storage) with data warehouse performance and governance.
```
┌──────────────────────────────────────────────────────┐
│ Query Engines │
│ Trino Spark SQL Flink SQL Dremio Athena │
└─────────────────────────┬────────────────────────────┘
┌─────────────────────────▼────────────────────────────┐
│ Table Format Layer │
│ Apache Iceberg / Delta Lake / Hudi │
│ (ACID, time travel, schema evolution) │
└─────────────────────────┬────────────────────────────┘
┌─────────────────────────▼────────────────────────────┐
│ Storage Layer │
│ S3 / GCS / ADLS / MinIO / HDFS │
│ (Parquet / ORC / Avro) │
└──────────────────────────────────────────────────────┘
```
For Iceberg details see [DATABASES.md: Apache Iceberg Lakehouse](DATABASES.md#apache-iceberg-lakehouse).
---
## Big Data Infrastructure
### Cluster sizing
| Component | Spark (batch) | Flink (streaming) | Trino (SQL) |
|------------|--------------|-------------------|-------------|
| **CPU** | 16–64 cores/node | 16–32 cores/node | 8–32 cores/node |
| **RAM** | 64–256 GB/node | 64–256 GB/node (incl. managed state) | 64–256 GB/node |
| **Storage** | HDFS / object storage | Object storage (checkpoints) | None (stateless) |
| **Network** | 25–100 GbE (shuffle-heavy) | 25–100 GbE (checkpointing) | 25–100 GbE |
| **Disk** | NVMe (scratch, shuffle) | NVMe (RocksDB state backend) | — |
| **Cluster size** | 5–200+ nodes | 3–100+ nodes | 5–50 nodes |
### Network considerations
- **Spark shuffle**: heavy network traffic between nodes; 25–100 GbE recommended, ideally without oversubscription
- **Flink checkpointing**: periodic state writes to object storage; requires stable latency
- **HDFS rack awareness**: optimizes replication across racks
- **Data locality**: HDFS reads from local disk; object storage is network-bound
### Kubernetes vs YARN
| Criteria | YARN | Kubernetes |
|-----------|------|-----------|
| **Resource isolation** | Cgroups (YARN containers) | Cgroups + namespaces (pods) |
| **Ecosystem fit** | Hadoop-native (HDFS, Hive, Spark) | Cloud-native, Spark, Flink, Trino |
| **Operational complexity** | Lower (single cluster manager) | Higher (requires K8s cluster) |
| **Multi-tenant isolation** | YARN queues (Capacity/Fair Scheduler) | Namespaces, ResourceQuotas, LimitRanges |
| **Stateful workloads** | Limited | StatefulSets, PVC, Operators |
| **2026 trend** | Legacy (declining) | Standard for new projects |
---
## Cloud deployment
| Cloud | Batch processing | Streaming | SQL | Managed K8s |
|-------|-------------------|-----------|-----|-------------|
| **AWS** | EMR (Spark, Hive, Flink) | Kinesis, MSK (Kafka), EMR Flink | Athena (Trino), Redshift | EKS |
| **Azure** | HDInsight (Spark, Hive), Synapse | Event Hubs, HDInsight Flink | Synapse SQL, Azure Data Explorer | AKS |
| **GCP** | Dataproc (Spark, Flink, Hive, Trino) | Pub/Sub, Dataflow (Beam), Dataproc Flink | BigQuery | GKE |
---
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-18*

@@ -123,7 +123,7 @@ ScyllaDB is advantageous when:
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
### Recommended reading

@@ -637,7 +637,7 @@ New tools: Harness (AI-native CD), GitLab 19.0 (agentic MR workflows, secrets ma
## Resources
Links, books and standards: [sources/cicd/sources.md](sources/cicd/sources.md)
Links, books and standards: [sources/cicd/sources.en.md](sources/cicd/sources.en.md)
### Recommended Reading

@@ -144,7 +144,7 @@ Analogues: Azure Well-Architected Framework, GCP Architecture Framework
| **Storage optimized** | I4i, im4gn | 1:4 + NVMe | Transactional DB, data warehousing, Kafka | i4i.large ~$0.138/h |
| **GPU / ML** | P5, g5, trn1 | GPU attach | AI training (P5), inference (g5), ML (trn1) | g5.xlarge ~$1.006/h |
See [GPU.md](GPU.md) for GPU model and configuration details.
See [GPU.en.md](GPU.en.md) for GPU model and configuration details.
### Storage
@@ -287,7 +287,7 @@ Automated checks of architectural characteristics — analogous to tests for arc
## Hybrid Cloud Connectivity
See also: [NETWORKING.md](NETWORKING.md) — network architecture (VPN, BGP, VPC design).
See also: [NETWORKING.en.md](NETWORKING.en.md) — network architecture (VPN, BGP, VPC design).
- **Site-to-Site VPN** — IPSec tunnel over the internet
- **Direct Connect / ExpressRoute / Dedicated Interconnect** — private physical connection
@@ -480,7 +480,7 @@ OpenStack is the dominant open-source platform for building private clouds (IaaS
## Resources
Links, books and standards: [sources/cloud/sources.md](sources/cloud/sources.md)
Links, books and standards: [sources/cloud/sources.en.md](sources/cloud/sources.en.md)
- **Cost tagging** — assign tags for chargeback/showback (Environment, Team, Cost Center, Application)
- **Automated compliance** — AWS Config, Azure Policy, GCP Org Policies for guardrails
- **Multi-account strategy** — AWS Control Tower, Azure Landing Zones, GCP Resource Hierarchy

@@ -259,7 +259,7 @@ HPE ProLiant Gen11 (DL360/DL380) supports:
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
### Recommended literature

@@ -90,7 +90,7 @@ Each transaction sees a snapshot of data as of the start time. Old row versions
## Resources
Links, books and standards: [sources/databases/sources.md](sources/databases/sources.md)
Links, books and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
### Recommended Reading

@@ -6,20 +6,20 @@
| DB | License | Use Case | Details |
|----|---------|----------|--------|
| **PostgreSQL** | Open source | Universal, geospatial, analytics, AI | [POSTGRESQL.md](POSTGRESQL.md) |
| **MySQL / MariaDB** | Open source | Web, LAMP stack, e-commerce | [MYSQL.md](MYSQL.md) |
| **PostgreSQL** | Open source | Universal, geospatial, analytics, AI | [POSTGRESQL.en.md](POSTGRESQL.en.md) |
| **MySQL / MariaDB** | Open source | Web, LAMP stack, e-commerce | [MYSQL.en.md](MYSQL.en.md) |
| **Microsoft SQL Server** | Proprietary | Enterprise .NET, Windows ecosystem | — |
| **Oracle DB** | Proprietary | Enterprise, finance, mainframe, RAC cluster | [ORACLE.md](ORACLE.md) |
| **Oracle DB** | Proprietary | Enterprise, finance, mainframe, RAC cluster | [ORACLE.en.md](ORACLE.en.md) |
| **Amazon Aurora** | Managed | MySQL/PostgreSQL compatible, cloud-native | — |
### NoSQL
| Type | DB | Use Case | Details |
|-----|----|----------|--------|
| **Document** | MongoDB, Couchbase | JSON data, flexible schema | [MONGODB.md](MONGODB.md) |
| **Key-Value / Cache** | Redis, Memcached, DynamoDB | Cache, session store, real-time | [REDIS.md](REDIS.md) |
| **Wide-column** | Cassandra, ScyllaDB | Time-series, IoT, big data | [CASSANDRA.md](CASSANDRA.md) |
| **Vector** | Pinecone, Qdrant, Milvus, pgvector | Embeddings, RAG, semantic search | [VEKTOROVE-DB.md](VEKTOROVE-DB.md) |
| **Document** | MongoDB, Couchbase | JSON data, flexible schema | [MONGODB.en.md](MONGODB.en.md) |
| **Key-Value / Cache** | Redis, Memcached, DynamoDB | Cache, session store, real-time | [REDIS.en.md](REDIS.en.md) |
| **Wide-column** | Cassandra, ScyllaDB | Time-series, IoT, big data | [CASSANDRA.en.md](CASSANDRA.en.md) |
| **Vector** | Pinecone, Qdrant, Milvus, pgvector | Embeddings, RAG, semantic search | [VECTOR-DBS.en.md](VECTOR-DBS.en.md) |
| **Graph** | Neo4j, Dgraph | Relationships, recommendations, social graphs | — |
### Storage Engines
@@ -258,6 +258,8 @@ Table metadata (.metadata.json)
| **Hidden partitioning** | Automatic partition filters (user does not need to specify) |
| **Multi-engine** | Spark, Flink, Trino, Dremio, Snowflake over the same data |
For a broader overview of the Big Data ecosystem (HDFS, Spark, Flink, Trino, Delta Lake, Hudi) see [BIG-DATA.en.md](BIG-DATA.en.md).
### When to Use Iceberg
- Multi-tool access to the same governed data
@@ -305,7 +307,7 @@ Table metadata (.metadata.json)
## Resources
Links, books and standards: [sources/databases/sources.md](sources/databases/sources.md)
Links, books and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
### Recommended Reading

@@ -258,6 +258,8 @@ Table metadata (.metadata.json)
| **Hidden partitioning** | Automatic partition filters (user does not need to specify) |
| **Multi-engine** | Spark, Flink, Trino, Dremio, Snowflake over the same data |
For a more detailed overview of the Big Data ecosystem (HDFS, Spark, Flink, Trino, Delta Lake, Hudi) see [BIG-DATA.md](BIG-DATA.md).
### When to use Iceberg
- Multi-tool access to the same governed data

@@ -658,6 +658,281 @@ flowchart TD
CLIM -->|"Cold (SE, NO)"| FC3["Free cooling 7000+ h/year<br/>Air-side economizer<br/>PUE < 1.2"]
```
## Secondary data center topologies
When planning a second DC, the choice of topology depends mainly on distance, RPO/RTO targets, and budget.
### Distance classification
| Category | Distance | Latency (round-trip) | Use case |
|-----------|-----------|---------------------|----------|
| **Metro (Campus)** | 1–20 km | < 1 ms | Synchronous replication, stretched cluster |
| **Metro** | 20–100 km | 1–5 ms | Metro cluster, mostly sync replication |
| **Regional** | 100–500 km | 5–20 ms | Asynchronous replication, warm standby |
| **Continent** | 500–3000 km | 20–100 ms | Asynchronous replication, cold standby |
| **Global** | 3000+ km | > 100 ms | Async only, no real-time dependencies |
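
The distance classes can be expressed as a small lookup, together with the propagation-only latency floor of a fiber path. This is a sketch under stated assumptions: the upper bound of each range is treated as inclusive, light in fiber is approximated at 5 µs per km one-way, and `classify_distance` and `fiber_rtt_ms` are hypothetical helper names. Real round-trip times are higher than this floor because of equipment, routing, and indirect fiber paths.

```python
FIBER_US_PER_KM = 5.0  # ~5 microseconds per km one-way in optical fiber (approximation)


def classify_distance(km: float) -> str:
    """Map an inter-DC distance to the categories in the table above."""
    if km <= 20:
        return "Metro (Campus)"
    if km <= 100:
        return "Metro"
    if km <= 500:
        return "Regional"
    if km <= 3000:
        return "Continent"
    return "Global"


def fiber_rtt_ms(path_km: float) -> float:
    """Propagation-only round-trip time over a fiber path of the given length."""
    return 2 * path_km * FIBER_US_PER_KM / 1000.0
```

For example, a 100 km fiber path has a propagation floor of about 1 ms RTT, so the latency bands in the table leave room for the additional delay of real networks.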
### Topologies by operational mode
#### Active-Active (Hot-Hot)
```
DC-A (Primary) DC-B (Active)
┌────────────────────┐ ┌────────────────────┐
│ App Active │ │ App Active │
│ DB Active │◄─sync─►│ DB Active │
│ Users → LB → A │ │ Users → LB → B │
└────────────────────┘ └────────────────────┘
│ │
└──── Global Load Balancer ────┘
```
| Parameter | Value |
|----------|---------|
| **RTO** | 0–seconds (automatic failover, traffic is redirected) |
| **RPO** | 0 (sync replication, commit is confirmed only after write to both DCs) |
| **Max distance** | < 100 km (latency < 5 ms RTT for sync DB replication) |
| **Operating costs** | 2× (both DCs fully active, both fully equipped) |
| **Advantages** | Zero downtime, instant switchover, full utilization of both DCs |
| **Disadvantages** | Requires synchronous replication → distance limit, complex networking, split-brain risk |
**Split-brain solutions**: STONITH (Shoot The Other Node In The Head), watchdog, quorum (3rd node in 3rd location / cloud), fencing, SCSI-3 persistent reservation.
**Use case**: Financial services, telco, payment gateways, where even a minute of downtime can cost millions.
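
The quorum approach to split-brain comes down to a strict-majority rule: a partition may stay active only if it can reach more than half of all voters, which is why a third site or cloud witness is needed with two DCs. A minimal sketch, with hypothetical function and partition names:

```python
def has_quorum(reachable_voters: int, total_voters: int) -> bool:
    """A partition holds quorum only with a strict majority of all voters."""
    return reachable_voters > total_voters // 2


def surviving_side(partitions: dict, total_voters: int):
    """Return the name of the partition allowed to stay active, or None.

    `partitions` maps a partition name to the number of voters it can reach.
    At most one partition can hold a strict majority, so two sides can never
    both stay active.
    """
    winners = [name for name, votes in partitions.items()
               if has_quorum(votes, total_voters)]
    return winners[0] if winners else None
```

With only two voters, a link failure leaves neither side with a majority and both must stop; adding a witness as a third voter lets exactly one side continue.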
#### Active-Passive (Hot-Warm, MetroCluster)
```
DC-A (Primary) DC-B (Standby)
┌────────────────────┐ ┌────────────────────┐
│ App Active │ │ App Standby │
│ DB Primary │──sync──►│ DB Standby │
│ Users → LB → A │ │ ~~~ (waiting) ~~~ │
│ DNS: A-record │ │ DNS: health check │
└────────────────────┘ └────────────────────┘
```
| Parameter | Value |
|----------|---------|
| **RTO** | tens of seconds–minutes (DNS failover + app startup) |
| **RPO** | 0 (sync) or seconds (async) |
| **Max distance** | sync < 100 km, async unlimited |
| **Operating costs** | 1.5–1.8× (second DC has reduced or idle compute) |
| **MetroCluster** | Specific implementation: FC SAN over DWDM, sync mirror, automatic failover |
**MetroCluster** (NetApp, Dell EMC, HPE):
- Storage-based cluster with synchronous mirroring between DCs
- Automatic failover on entire DC failure
- Requires dedicated DWDM or dark fiber interconnection
- Typical distance: up to 50 km (for latency < 1 ms RTT)
- Use case: enterprise storage, primary+secondary DC in metropolitan area
#### Hot-Cold (Warm Standby → Cold)
```
DC-A (Primary) DC-B (Cold Standby)
┌────────────────────┐ ┌────────────────────┐
│ App Active │ │ ~~~ powered off ~~~│
│ DB Active │──async─►│ Backup storage │
│ Users → A │ │ ~~~ no compute ~~~│
└────────────────────┘ └────────────────────┘
```
| Parameter | Value |
|----------|---------|
| **RTO** | hours–days (purchase/rent HW, restore from backup) |
| **RPO** | hours (last backup) |
| **Max distance** | unlimited |
| **Operating costs** | 1.1–1.3× (only storage and facility, compute only at failover) |
| **Typical use case** | Low-cost DR, compliance, last resort |
#### Pilot Light
```
DC-A (Primary) DC-B (Pilot Light)
┌────────────────────┐ ┌────────────────────┐
│ App Active │ │ ~~~ off ~~~ │
│ DB Active │──async─►│ DB replica (mini) │
│ All services │ │ Core services only│
│ │ │ (DNS, LDAP, mon) │
└────────────────────┘ └────────────────────┘
On DR: spin-up compute
from IaC, rest from backup
```
- DC-B runs with minimum compute (only core services and DB replica)
- Application layer is spun up from IaC (Terraform, Ansible) only during DR
- Compromise between cost and RTO
### Comparison table
| Topology | RTO | RPO | Cost (× primary) | Max distance | Failover |
|-----------|-----|-----|-------------------|-------------|----------|
| **Active-Active** | 0–seconds | 0 | 2.0× | < 100 km | Auto (traffic) |
| **MetroCluster** | seconds–minutes | 0 | 1.8–2.0× | < 50 km | Auto (storage) |
| **Active-Passive (sync)** | minutes | 0 | 1.5–1.8× | < 100 km | Semi-auto |
| **Active-Passive (async)** | minutes–hours | seconds–minutes | 1.3–1.5× | unlimited | Semi-auto |
| **Pilot Light** | hours | minutes–hours | 1.2–1.4× | unlimited | Manual |
| **Warm Standby** | minutes–hours | seconds–minutes | 1.5–1.8× | unlimited | Semi-auto |
| **Cold Standby** | days | hours | 1.1–1.3× | unlimited | Manual |
### Stretched Cluster
```
┌──── Site A (50 km) ────┐ ┌──── Site B ──────────┐
│ ┌──────────────────┐ │ │ ┌──────────────────┐ │
│ │ ESXi / Hyper-V │ │ │ │ ESXi / Hyper-V │ │
│ │ VM │ │ │ │ VM (complement) │ │
│ └────────┬─────────┘ │ │ └────────┬─────────┘ │
│ │ │ │ │ │
│ ┌────────▼─────────┐ │ │ ┌────────▼─────────┐ │
│ │ Storage (SAN) │──┼────┼──│ Storage (SAN) │ │
│ │ MetroCluster │ │ │ │ MetroCluster │ │
│ └──────────────────┘ │ │ └──────────────────┘ │
└────────────────────────┘ └────────────────────────┘
┌─────▼──────┐
│ vCenter / │
│ Cluster │
│ (single) │
└────────────┘
```
- One cluster stretched across two sites (single management domain)
- VMs can live-migrate between sites (vMotion over distance)
- Storage synchronously mirrored (MetroCluster, VPLEX, vSAN Stretched Cluster)
- **Requirements**: dark fiber / DWDM, low latency (< 5 ms), high link reliability
- **Risks**: split-brain when the inter-site link partitions the cluster, dependency on the inter-site network
- **Use case**: enterprise with own dark fiber between two DCs in a metropolitan area
### Decision tree
```mermaid
flowchart TD
Start(["Secondary DC"]) --> RPO{"Required RPO?"}
RPO -->|"0 (no data loss)"| SYNC{"Sync replication possible?"}
SYNC -->|"Yes, < 100 km"| ACT{"Want zero downtime?"}
ACT -->|"Yes"| AA["Active-Active<br/>RTO=0, RPO=0, 2× cost"]
ACT -->|"No"| AP["Active-Passive<br/>RTO=min, RPO=0, 1.5×"]
SYNC -->|"No, > 100 km"| ASYNC["Active-Passive (async)<br/>RTO=min, RPO=s, 1.3×"]
    RPO -->|"minutes–hours"| WARM{"Want fast failover?"}
WARM -->|"Yes"| PILOT["Pilot Light<br/>RTO=h, RPO=min, 1.2×"]
WARM -->|"No"| COLD["Cold Standby<br/>RTO=days, RPO=h, 1.1×"]
Start --> DIST{"Distance between DCs"}
DIST -->|"< 50 km, own fiber"| MC["MetroCluster / Stretched Cluster<br/>Single management, sync storage"]
    DIST -->|"50–300 km"| REG["Regional DR<br/>Active-Passive, async replication"]
DIST -->|"> 300 km"| GLOBAL["Global DR<br/>Cold standby, backup & restore"]
```
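
The RPO branch of the decision tree can also be written as a small function. This is a sketch that mirrors the diagram, not a sizing tool: `choose_topology` and its parameters are hypothetical names, and treating exactly 100 km as too far for synchronous replication is an assumption, because the diagram leaves that boundary open.

```python
def choose_topology(rpo_zero: bool, distance_km: float,
                    zero_downtime: bool = False,
                    fast_failover: bool = False) -> str:
    """Mirror the RPO branch of the decision tree above."""
    if rpo_zero:
        if distance_km < 100:  # sync replication is considered feasible
            return "Active-Active" if zero_downtime else "Active-Passive (sync)"
        # Beyond the sync limit the diagram falls back to async (RPO of seconds).
        return "Active-Passive (async)"
    # An RPO of minutes to hours is acceptable.
    return "Pilot Light" if fast_failover else "Cold Standby"
```

The distance branch of the diagram (MetroCluster, regional DR, global DR) is a separate entry point and is intentionally not modeled here.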
### Physical infrastructure for DC interconnection
| Technology | Bandwidth | Max distance | Latency | Use case |
|------------|-----------|-------------|---------|----------|
| **Dark fiber** | 100 GbE–800 GbE | 10–80 km (single-mode) | ≈ 0.05–0.4 ms one-way (≈ 5 µs/km) | MetroCluster, stretched cluster |
| **DWDM** | 400 GbE–1.6 TbE (per lambda) | 80–120 km (without amplifier) | ≈ 0.4–0.6 ms one-way | Metro, metro cluster |
| **CWDM** | 10–25 GbE (per channel) | 10–40 km | ≈ 0.05–0.2 ms one-way | Campus, smaller metro |
| **MPLS L2VPN** | 10–100 GbE | unlimited | 1–10 ms | Regional DR, async replication |
| **Internet IPsec** | 1–10 GbE | unlimited | 5–50 ms | Cold standby, backup |
### Impact of individual technologies on DC topology selection
Choosing a secondary DC topology is not purely an infrastructure decision — each layer (DB, hypervisor, orchestration, messaging) brings its own constraints.
#### Databases
| DB technology | Sync replication | Max distance | Auto-failover | Split-brain handling | Note |
|---------------|---------------|-------------|---------------|-------------------|----------|
| **PostgreSQL** | Synchronous commit (`synchronous_standby_names`) | < 100 km (latency < 10 ms) | Patroni / repmgr + etcd | Quorum (etcd, 3+ nodes) | Streaming replication; needs `wal_keep_size` (`wal_keep_segments` before PostgreSQL 13) or replication slots |
| **MySQL** | Group Replication (multi-primary, single-primary) | < 100 km | MySQL InnoDB Cluster + MySQL Router | Paxos (Group Replication, 3+ node) | Semi-sync as compromise |
| **Oracle** | Data Guard (SYNC/FASTSYNC/ASYNC), RAC extended | sync < 100 km, async unlimited | Data Guard Broker / FSFO (Fast Start Failover) | Observer (3rd node) | Far Sync for remote DCs |
| **MSSQL** | AlwaysOn Availability Groups (SYNCHRONOUS_COMMIT) | < 100 km | AlwaysOn + Cluster quorum | File share majority / cloud witness | Multi-site cluster support |
| **MongoDB** | Majority write concern + journaling | < 100 km | Replica set auto-election | Arbiter (voting member that holds no data) | Priority-based failover |
| **Cassandra** | N/A (multi-master, eventual consistency) | unlimited | Yes (peer-to-peer) | None (multi-master, gossip protocol) | Snitch-aware topology, NetworkTopologyStrategy |
| **Redis** | Redis Sentinel / Redis Cluster (async) | unlimited (async) | Sentinel / Cluster failover | Quorum (Sentinel, majority) | PSYNC replication, replication lag |
Key limitation for **sync replication**: keep latency below about 5 ms RTT, because every commit must wait for confirmation from both DCs. At 100 km the fiber RTT is about 1 ms, which is acceptable. At 1000 km (about 10 ms RTT) sync replication can cut per-session transaction throughput by 80 % or more.
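
The effect of RTT on synchronous commits can be estimated with simple arithmetic. The sketch assumes that each commit waits one full round trip to the remote DC on top of a hypothetical local commit time (`local_ms`), and that a single session commits serially. Concurrent sessions and group commit raise aggregate throughput, so the result is a per-session bound rather than a cluster-wide figure.

```python
def serial_commits_per_second(rtt_ms: float, local_ms: float = 1.0) -> float:
    """Upper bound on commits/s for one session that waits for a remote ack."""
    return 1000.0 / (local_ms + rtt_ms)


def throughput_loss(rtt_ms: float, local_ms: float = 1.0) -> float:
    """Fraction of the per-session commit rate lost versus a purely local commit."""
    local_rate = 1000.0 / local_ms
    return 1.0 - serial_commits_per_second(rtt_ms, local_ms) / local_rate
```

Under these assumptions a 1 ms RTT allows about 500 serial commits per second per session and a 10 ms RTT about 91, a loss of roughly 90 % relative to a 1 ms local commit.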
Suitable for **Active-Active**:
- **Cassandra / ScyllaDB** — native multi-DC, eventual consistency, no split-brain
- **MySQL Group Replication (multi-primary)** — 3+ DC for quorum
- **CockroachDB / TiDB** — native multi-region, ACID across DCs
- **Redis Enterprise** — Active-Active (CRDT-based)
Suitable for **Active-Passive**:
- **PostgreSQL + Patroni** — auto-failover, etcd quorum
- **Oracle Data Guard** — FSFO, far sync for remote DCs
- **MSSQL AlwaysOn** — cloud witness
- **MongoDB Replica Set** — arbitration node in 3rd location
#### Hypervisors
| Hypervisor | Cluster technology | Stretched cluster | Max distance | Split-brain |
|-----------|-------------------|-------------------|-------------|-------------|
| **VMware vSphere** | vSAN Stretched Cluster, Metro vCenter, Site Recovery Manager | Yes (vSAN Stretched Cluster, Metro Cluster) | < 50 km (vSAN Stretched Cluster), < 10 ms RTT | Fencing (STONITH), witness host |
| **Hyper-V** | Storage Replica + Failover Cluster | Yes (Cluster Sets) | < 50 km (sync), unlimited (async) | File share witness / cloud witness |
| **Proxmox VE** | Proxmox HA + Ceph | Limited (Ceph stretch cluster) | < 50 km (Ceph sync) | Ceph monitor quorum (3+ DC) |
| **XCP-ng / XenServer** | Xen Orchestra HA + SR (Storage Repository) replication | Limited | depends on storage replication | — |
| **Nutanix AHV** | Metro Availability (sync), Async DR | Yes (Metro) | < 100 km (sync), unlimited (async) | Witness VM (cloud / 3rd site) |
| **KVM / oVirt** | oVirt HA + GlusterFS / NFS | Limited | depends on storage replication | — |
**vSAN Stretched Cluster specific requirements:**
- Dedicated vSAN network (25 GbE min., < 5 ms RTT)
- Witness host in 3rd location (or cloud witness)
- All VM storage policies set to FTT=1 (mirroring, striped)
- Storage policy: `site-A + site-B + witness`
#### Kubernetes and container platforms
| Platform | Multi-cluster DR | Replication | Max distance | Failover |
|-----------|-----------------|-----------|-------------|----------|
| **Vanilla K8s** | KubeFed, Cluster API, Velero + Restic | Velero (backup/restore), Rook (Ceph) | unlimited | Manual (Velero restore) |
| **OpenShift** | ACM (Advanced Cluster Management), Velero | OADP (OpenShift API for Data Protection) | unlimited | ACM failover (subscription) |
| **Rancher** | Rancher Multi-Cluster App, Velero | Longhorn (sync/async DR), Velero | unlimited | Semi-auto |
| **Google GKE** | Multi-cluster Services, Backup for GKE | Config Sync, Backup for GKE | unlimited | Manual |
| **Azure AKS** | Azure Arc + Velero + Azure Traffic Manager | AKS backup (Velero), Azure Site Recovery | unlimited | Manual (Velero) |
| **AWS EKS** | EKS multi-cluster, Velero + S3 cross-region | Velero (S3), Rook (EBS snapshots) | unlimited | Manual |
**Key K8s DR principles:**
- **Applications must be stateless** (or state externalized to DB/storage)
- **Velero** — backup/restore entire cluster (PV, resources, helm releases)
- **Rook/Ceph** — cross-region mirroring RBD volumes
- **KubeFed / ACM** — subscription-based deploy to multiple clusters
- **Ingress/Gateway API** — traffic routing between clusters
- **External DNS** — DNS failover on cluster outage
#### Messaging / streaming
| Platform | Replication | Topology | DR support | Max distance |
|-----------|-----------|-----------|------------|-------------|
| **Apache Kafka** | MirrorMaker 2, Confluent Cluster Linking, KRaft quorum | Active-Passive (MM2), Active-Active (Cluster Linking) | MM2: async, Cluster Linking: async | unlimited |
| **RabbitMQ** | Classic Queue Mirroring, Quorum Queues | Active-Passive (Warm Standby) | Federation / Shovel (async) | unlimited |
| **Red Hat AMQ** | (Artemis) Cluster + HA | Active-Passive (shared store / replication) | Live-backup pair | < 100 km (sync) |
| **NATS** | NATS JetStream (cluster + cross-account) | Active-Active (Leaf nodes, cross-account) | Super-cluster, failover | unlimited |
| **Apache Pulsar** | BookKeeper (bookie rack-aware), geo-replication | Active-Active (geo-replication) | Built-in (cluster-level) | unlimited (async) |
| **AWS SQS/SNS** | Managed, AWS region pairs | Active-Active (multi-region) | Built-in (AWS managed) | unlimited |
| **Azure Service Bus** | Managed, paired region | Active-Passive (paired region) | Built-in (geo-recovery) | unlimited |
| **Oracle Service Bus (OSB)** | Oracle WebLogic Cluster + JDBC store + AQ | Active-Passive (WebLogic Cluster + Data Guard) | OSB/WLS cluster + Oracle RAC/Data Guard sync | < 100 km (Data Guard sync), unlimited (async) |
**Messaging DR recommendations:**
- **Kafka**: use Cluster Linking for Active-Active, or MirrorMaker 2 for Active-Passive; replicate only critical topics
- **RabbitMQ**: Quorum Queues + Federation upstream for DR; avoid Classic Queue Mirroring (deprecated)
- **Pulsar**: native geo-replication, bookie rack-aware for stretched cluster; easiest DR among messaging platforms
- **OSB**: WebLogic cluster + Oracle RAC/Data Guard; DR depends on DB layer, not on OSB itself
### Per-layer limitations summary table
| Layer | Limiting factor for secondary DC | Max distance for sync | Impact on topology selection |
|--------|-----------------------------------|----------------------|--------------------------|
| **Storage** | Sync mirror latency, DWDM cost | < 50 km (MetroCluster) | Stretched cluster only in metro |
| **Databases** | Commit wait for sync replication | < 100 km (5 ms RTT) | Active-Active only with multi-master DB |
| **Hypervisor** | Stretched cluster quorum + fencing | < 50 km (vSAN, 5 ms) | MetroCluster / stretched cluster |
| **Kubernetes** | Velero restore time, Rook mirror latency | unlimited (async) | Active-Passive, cold standby |
| **Messaging** | Replication lag, offset management | unlimited (async) | Active-Active (Kafka, Pulsar, NATS) or Active-Passive |
| **Network** | Dark fiber/DWDM cost, latency | < 100 km (metro fiber) | Limits sync replication options |
| **Application** | Stateful/stateless, connection draining | depends on architecture | Stateless app → any topology |
## Disk monitoring — S.M.A.R.T.
Self-Monitoring, Analysis and Reporting Technology — predictive monitoring of HDD/SSD.
@@ -675,7 +950,7 @@ Tools: `smartmontools` (smartctl, smartd), Prometheus exporter (`node_exporter`)
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
### Recommended literature
@@ -735,7 +1010,7 @@ Best practices: separate auth and recursive resolvers, DNSSEC, split-horizon (in
### Monitoring and observability
See [MONITORING.md](MONITORING.md). Before running first workloads, DC must have:
See [MONITORING.en.md](MONITORING.en.md). Before running first workloads, DC must have:
- Metric collection (Prometheus, Zabbix)
- Centralized logs (Loki, ELK)
- Alerting (Alertmanager, PagerDuty)


@@ -658,6 +658,281 @@ flowchart TD
CLIM -->|"Cold (SE, NO)"| FC3["Free cooling 7000+ h/year<br/>Air-side economizer<br/>PUE < 1.2"]
```
## Secondary data center topologies
When planning a second DC, the key decision is the topology, chosen according to distance, RPO/RTO, and budget.
### Distance classification
| Category | Distance | Latency (round-trip) | Use case |
|-----------|-----------|---------------------|----------|
| **Metro (Campus)** | 1–20 km | < 1 ms | Synchronous replication, stretched cluster |
| **Metro** | 20–100 km | 1–5 ms | Metro cluster, mostly sync replication |
| **Regional** | 100–500 km | 5–20 ms | Asynchronous replication, warm standby |
| **Continent** | 500–3000 km | 20–100 ms | Asynchronous replication, cold standby |
| **Global** | 3000+ km | > 100 ms | Async only, no real-time dependencies |
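The distance bands can be encoded as a small lookup. This is a sketch that reads the ranges as 1–20, 20–100, 100–500, and 500–3000 km; the names `DISTANCE_BANDS` and `classify_distance` are hypothetical, and assigning a boundary value to the lower band is an assumption, since the table does not say which band owns it.

```python
# Distance bands from the classification table: (upper bound in km, category).
DISTANCE_BANDS = [
    (20, "Metro (Campus)"),
    (100, "Metro"),
    (500, "Regional"),
    (3000, "Continent"),
]


def classify_distance(distance_km: float) -> str:
    """Return the distance category for a DC pair.

    Assumption: a boundary value (e.g. exactly 100 km) goes to the lower band.
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    for upper_km, category in DISTANCE_BANDS:
        if distance_km <= upper_km:
            return category
    return "Global"
```

For example, `classify_distance(250)` falls into the Regional band, where the table suggests asynchronous replication.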
### Topologies by operating mode
#### Active-Active (Hot-Hot)
```
DC-A (Primary) DC-B (Active)
┌────────────────────┐ ┌────────────────────┐
│ App Active │ │ App Active │
│ DB Active │◄─sync─►│ DB Active │
│ Users → LB → A │ │ Users → LB → B │
└────────────────────┘ └────────────────────┘
│ │
└──── Global Load Balancer ────┘
```
| Parameter | Value |
|----------|---------|
| **RTO** | 0–seconds (automatic failover, traffic is redirected) |
| **RPO** | 0 (sync replication; a commit is acknowledged only after it is written in both DCs) |
| **Max distance** | < 100 km (latency < 5 ms RTT for sync DB replication) |
| **Operating cost** | 2× (both DCs fully active and fully equipped) |
| **Advantages** | Zero downtime, instant switchover, full use of both DCs |
| **Disadvantages** | Requires synchronous replication → distance limit, complex networking, split-brain risk |
**Split-brain mitigation**: STONITH (Shoot The Other Node In The Head), watchdog, quorum (3rd node in a 3rd location / cloud), fencing, SCSI-3 persistent reservation.
**Use case**: Financial services, telco, payment gateways, where even a minute of downtime costs millions.
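The quorum rule behind split-brain protection is a strict majority of votes. The sketch below shows only that rule, not any specific cluster manager; `has_quorum` is a hypothetical helper, and the vote counts in the note are illustrative.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition may stay active only if it sees a strict majority of all votes."""
    if not 0 <= reachable_votes <= total_votes:
        raise ValueError("reachable_votes must be between 0 and total_votes")
    return 2 * reachable_votes > total_votes
```

With three votes (DC-A, DC-B, and a witness in a third location), the side that still reaches the witness holds 2 of 3 votes and keeps running, while the isolated side holds 1 of 3 and must stop. With only two votes and no witness, neither side can reach a majority after a link failure, which is why the third vote is needed.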
#### Active-Passive (Hot-Warm, MetroCluster)
```
DC-A (Primary) DC-B (Standby)
┌────────────────────┐ ┌────────────────────┐
│ App Active │ │ App Standby │
│ DB Primary │──sync──►│ DB Standby │
│ Users → LB → A │ │ ~~ (waiting) ~~ │
│ DNS: A-record │ │ DNS: health check │
└────────────────────┘ └────────────────────┘
```
| Parameter | Value |
|----------|---------|
| **RTO** | tens of seconds–minutes (DNS failover + app startup) |
| **RPO** | 0 (sync) or seconds (async) |
| **Max distance** | sync < 100 km, async unlimited |
| **Operating cost** | 1.5–1.8× (second DC has reduced or idle compute) |
| **MetroCluster** | Specific implementation: FC SAN over DWDM, sync mirror, automatic failover |
**MetroCluster** (NetApp, Dell EMC, HPE):
- Storage-based cluster with synchronous mirroring between DCs
- Automatic failover when an entire DC fails
- Requires a dedicated DWDM or dark fiber link
- Typical distance: up to 50 km (for latency < 1 ms RTT)
- Use case: enterprise storage, primary + secondary DC in a metropolitan area
#### Hot-Cold (Warm Standby → Cold)
```
DC-A (Primary) DC-B (Cold Standby)
┌────────────────────┐ ┌────────────────────┐
│ App Active │ │ ~~~ powered off ~~~│
│ DB Active │──async─►│ Backup storage │
│ Users → A │ │ ~~~ no compute ~~~│
└────────────────────┘ └────────────────────┘
```
| Parameter | Value |
|----------|---------|
| **RTO** | hours–days (buy or rent HW, restore from backup) |
| **RPO** | hours (last backup) |
| **Max distance** | unlimited |
| **Operating cost** | 1.1–1.3× (storage and facility only; compute is added at failover) |
| **Typical use case** | Low-cost DR, compliance, last resort |
#### Pilot Light
```
DC-A (Primary) DC-B (Pilot Light)
┌────────────────────┐ ┌────────────────────┐
│ App Active │ │ ~~~ off ~~~ │
│ DB Active │──async─►│ DB replica (mini) │
│ All services │ │ Core services only │
│ │ │ (DNS, LDAP, mon) │
└────────────────────┘ └────────────────────┘
On DR: spin up compute
from IaC, rest from backup
```
- DC-B runs with minimal compute (only core services and a DB replica)
- The application layer is spun up from IaC (Terraform, Ansible) only during DR
- A compromise between cost and RTO
### Comparison table
| Topology | RTO | RPO | Cost (× primary) | Max distance | Failover |
|-----------|-----|-----|-------------------|-------------|----------|
| **Active-Active** | 0–s | 0 | 2.0× | < 100 km | Auto (traffic) |
| **MetroCluster** | s–min | 0 | 1.8–2.0× | < 50 km | Auto (storage) |
| **Active-Passive (sync)** | min | 0 | 1.5–1.8× | < 100 km | Semi-auto |
| **Active-Passive (async)** | min–h | s–min | 1.3–1.5× | unlimited | Semi-auto |
| **Pilot Light** | h | min–h | 1.2–1.4× | unlimited | Manual |
| **Warm Standby** | min–h | s–min | 1.5–1.8× | unlimited | Semi-auto |
| **Cold Standby** | days | h | 1.1–1.3× | unlimited | Manual |
### Stretched Cluster
```
┌──── Site A (50 km) ────┐ ┌──── Site B ──────────┐
│ ┌──────────────────┐ │ │ ┌──────────────────┐ │
│ │ ESXi / Hyper-V │ │ │ │ ESXi / Hyper-V │ │
│ │ VM │ │ │ │ VM (complement) │ │
│ └────────┬─────────┘ │ │ └────────┬─────────┘ │
│ │ │ │ │ │
│ ┌────────▼─────────┐ │ │ ┌────────▼─────────┐ │
│ │ Storage (SAN) │──┼────┼──│ Storage (SAN) │ │
│ │ MetroCluster │ │ │ │ MetroCluster │ │
│ └──────────────────┘ │ │ └──────────────────┘ │
└────────────────────────┘ └────────────────────────┘
┌─────▼──────┐
│ vCenter / │
│ Cluster │
│ (single) │
└────────────┘
```
- A single cluster stretched across two sites (single management domain)
- VMs can live-migrate between sites (long-distance vMotion)
- Storage is synchronously mirrored (MetroCluster, VPLEX, vSAN Stretched Cluster)
- **Requirements**: dark fiber / DWDM, low latency (< 5 ms), highly reliable link
- **Risk**: split-brain, brain drain (split-site cluster), dependence on the network
- **Use case**: enterprises with their own dark fiber between two DCs in a metropolitan area
### Decision tree
```mermaid
flowchart TD
Start(["Secondary DC"]) --> RPO{"Required RPO?"}
RPO -->|"0 (no data loss)"| SYNC{"Sync replication possible?"}
SYNC -->|"Yes, < 100 km"| ACT{"Need zero downtime?"}
ACT -->|"Yes"| AA["Active-Active<br/>RTO=0, RPO=0, 2× cost"]
ACT -->|"No"| AP["Active-Passive<br/>RTO=min, RPO=0, 1.5×"]
SYNC -->|"No, > 100 km"| ASYNC["Active-Passive (async)<br/>RTO=min, RPO=s, 1.3×"]
RPO -->|"minutes–hours"| WARM{"Need fast failover?"}
WARM -->|"Yes"| PILOT["Pilot Light<br/>RTO=h, RPO=min, 1.2×"]
WARM -->|"No"| COLD["Cold Standby<br/>RTO=days, RPO=h, 1.1×"]
Start --> DIST{"Distance between DCs"}
DIST -->|"< 50 km, own fiber"| MC["MetroCluster / Stretched Cluster<br/>Single management, sync storage"]
DIST -->|"50–300 km"| REG["Regional DR<br/>Active-Passive, async replication"]
DIST -->|"> 300 km"| GLOBAL["Global DR<br/>Cold standby, backup & restore"]
```
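The RPO branch of the tree above can be sketched as a plain function. The name `choose_topology` and its parameters are hypothetical, and treating exactly 100 km as "sync not possible" is an assumption, because the diagram only labels "< 100 km" and "> 100 km".

```python
def choose_topology(rpo_zero: bool, distance_km: float,
                    zero_downtime: bool = False,
                    fast_failover: bool = False) -> str:
    """Follow the RPO branch of the decision tree and return the topology name."""
    if rpo_zero:
        if distance_km < 100:  # sync replication is considered feasible
            return "Active-Active" if zero_downtime else "Active-Passive (sync)"
        # Beyond the sync limit the tree falls back to async replication,
        # so the achieved RPO is seconds rather than 0.
        return "Active-Passive (async)"
    # RPO of minutes to hours
    return "Pilot Light" if fast_failover else "Cold Standby"
```

A call such as `choose_topology(rpo_zero=True, distance_km=400)` returns the async variant, which makes the trade-off explicit: at that distance a strict RPO of 0 is not reachable with this tree.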
### Physical infrastructure for DC interconnect
| Technology | Bandwidth | Max distance | Latency | Use case |
|------------|-----------|-------------|---------|----------|
| **Dark fiber** | 100 GbE–800 GbE | 10–80 km (single-mode) | < 0.1 ms | MetroCluster, stretched cluster |
| **DWDM** | 400 GbE–1.6 TbE (per lambda) | 80–120 km (without amplifier) | < 0.5 ms | Metro, metro cluster |
| **CWDM** | 10–25 GbE (per channel) | 10–40 km | < 0.3 ms | Campus, smaller metro |
| **MPLS L2VPN** | 10–100 GbE | unlimited | 1–10 ms | Regional DR, async replication |
| **Internet IPsec** | 1–10 GbE | unlimited | 5–50 ms | Cold standby, backup |
### How individual technologies affect the choice of DC topology
Choosing a secondary DC topology is not purely an infrastructure decision: each layer (DB, hypervisor, orchestration, messaging) brings its own constraints.
#### Databases
| DB technology | Sync replication | Max distance | Auto-failover | Split-brain handling | Note |
|---------------|---------------|-------------|---------------|-------------------|----------|
| **PostgreSQL** | Synchronous commit (synchronous_standby_names) | < 100 km (latency < 10 ms) | Patroni / repmgr + etcd | Quorum (etcd, 3+ nodes) | Streaming replication, requires wal_keep_segments |
| **MySQL** | Group Replication (multi-primary, single-primary) | < 100 km | MySQL InnoDB Cluster + MySQL Router | Paxos (Group Replication, 3+ nodes) | Semi-sync as a compromise |
| **Oracle** | Data Guard (SYNC/FASTSYNC/ASYNC), RAC extended | sync < 100 km, async unlimited | Data Guard Broker / FSFO (Fast-Start Failover) | Observer (3rd node) | Far Sync for remote DCs |
| **MSSQL** | AlwaysOn Availability Groups (SYNCHRONOUS_COMMIT) | < 100 km | AlwaysOn + cluster quorum | File share majority / cloud witness | Multi-site cluster support |
| **MongoDB** | Majority write concern + journaling | < 100 km | Replica set auto-election | Arbitration node (voting member) | Priority-based failover |
| **Cassandra** | N/A (multi-master, eventual consistency) | unlimited | Yes (peer-to-peer) | None (multi-master, gossip protocol) | Snitch-aware topology, NetworkTopologyStrategy |
| **Redis** | Redis Sentinel / Redis Cluster (async) | unlimited (async) | Sentinel / Cluster failover | Quorum (Sentinel, majority) | PSYNC replication, replication lag |
Key limitation for **sync replication**: latency must stay below 5 ms RTT, because every commit waits for confirmation from both DCs. At 100 km the RTT is about 1 ms, which is fine. At 1000 km (about 10 ms RTT), sync replication reduces transaction throughput by 80 % or more.
Suitable for **Active-Active**:
- **Cassandra / ScyllaDB** — native multi-DC, eventual consistency, no split-brain
- **MySQL Group Replication (multi-primary)** — 3+ DC for quorum
- **CockroachDB / TiDB** — native multi-region, ACID across DCs
- **Redis Enterprise** — Active-Active (CRDT-based)
Suitable for **Active-Passive**:
- **PostgreSQL + Patroni** — auto-failover, etcd quorum
- **Oracle Data Guard** — FSFO, far sync for remote DCs
- **MSSQL AlwaysOn** — cloud witness
- **MongoDB Replica Set** — arbitration node in 3rd location
#### Hypervisors
| Hypervisor | Cluster technology | Stretched cluster | Max distance | Split-brain |
|-----------|-------------------|-------------------|-------------|-------------|
| **VMware vSphere** | vSAN Stretched Cluster, Metro vCenter, Site Recovery Manager | Yes (vSAN Stretched Cluster, Metro Cluster) | < 50 km (vSAN Stretched Cluster), < 5 ms RTT | Fencing (STONITH), witness host |
| **Hyper-V** | Storage Replica + Failover Cluster | Yes (Cluster Sets) | < 50 km (sync), unlimited (async) | File share witness / cloud witness |
| **Proxmox VE** | Proxmox HA + Ceph | Limited (Ceph stretch cluster) | < 50 km (Ceph sync) | Ceph monitor quorum (3+ DC) |
| **XCP-ng / XenServer** | Xen Orchestra HA + SR (Storage Repository) replication | Limited | depends on storage replication | — |
| **Nutanix AHV** | Metro Availability (sync), Async DR | Yes (Metro) | < 100 km (sync), unlimited (async) | Witness VM (cloud / 3rd site) |
| **KVM / oVirt** | oVirt HA + GlusterFS / NFS | Limited | depends on storage replication | — |
**vSAN Stretched Cluster specific requirements:**
- Dedicated vSAN network (25 GbE min., < 5 ms RTT)
- Witness host in 3rd location (or cloud witness)
- All VM storage policies set to FTT=1 (mirroring, striped)
- Storage policy: `site-A + site-B + witness`
#### Kubernetes and container platforms
| Platform | Multi-cluster DR | Replication | Max distance | Failover |
|-----------|-----------------|-----------|-------------|----------|
| **Vanilla K8s** | KubeFed, Cluster API, Velero + Restic | Velero (backup/restore), Rook (Ceph) | unlimited | Manual (Velero restore) |
| **OpenShift** | ACM (Advanced Cluster Management), Velero | OADP (OpenShift API for Data Protection) | unlimited | ACM failover (subscription) |
| **Rancher** | Rancher Multi-Cluster App, Velero | Longhorn (sync/async DR), Velero | unlimited | Semi-auto |
| **Google GKE** | Multi-cluster Services, Backup for GKE | Config Sync, Backup for GKE | unlimited | Manual |
| **Azure AKS** | Azure Arc + Velero + Azure Traffic Manager | AKS backup (Velero), Azure Site Recovery | unlimited | Manual (Velero) |
| **AWS EKS** | EKS multi-cluster, Velero + S3 cross-region | Velero (S3), Rook (EBS snapshots) | unlimited | Manual |
**Key K8s DR principles:**
- **Applications must be stateless** (or state externalized to DB/storage)
- **Velero** — backup/restore entire cluster (PV, resources, helm releases)
- **Rook/Ceph** — cross-region mirroring RBD volumes
- **KubeFed / ACM** — subscription-based deploy to multiple clusters
- **Ingress/Gateway API** — traffic routing between clusters
- **External DNS** — DNS failover on cluster outage
#### Messaging / streaming
| Platform | Replication | Topology | DR support | Max distance |
|-----------|-----------|-----------|------------|-------------|
| **Apache Kafka** | MirrorMaker 2, Confluent Cluster Linking, KRaft quorum | Active-Passive (MM2), Active-Active (Cluster Linking) | MM2: async, Cluster Linking: async | unlimited |
| **RabbitMQ** | Classic Queue Mirroring, Quorum Queues | Active-Passive (Warm Standby) | Federation / Shovel (async) | unlimited |
| **Red Hat AMQ** | (Artemis) Cluster + HA | Active-Passive (shared store / replication) | Live-backup pair | < 100 km (sync) |
| **NATS** | NATS JetStream (cluster + cross-account) | Active-Active (Leaf nodes, cross-account) | Super-cluster, failover | unlimited |
| **Apache Pulsar** | BookKeeper (bookie rack-aware), geo-replication | Active-Active (geo-replication) | Built-in (cluster-level) | unlimited (async) |
| **AWS SQS/SNS** | Managed, AWS region pairs | Active-Active (multi-region) | Built-in (AWS managed) | unlimited |
| **Azure Service Bus** | Managed, paired region | Active-Passive (paired region) | Built-in (geo-recovery) | unlimited |
| **Oracle Service Bus (OSB)** | Oracle WebLogic Cluster + JDBC store + AQ | Active-Passive (WebLogic Cluster + Data Guard) | OSB/WLS cluster + Oracle RAC/Data Guard sync | < 100 km (Data Guard sync), unlimited (async) |
**Messaging DR recommendations:**
- **Kafka**: use Cluster Linking for Active-Active, or MirrorMaker 2 for Active-Passive; replicate only critical topics
- **RabbitMQ**: Quorum Queues + Federation upstream for DR; avoid Classic Queue Mirroring (deprecated)
- **Pulsar**: native geo-replication, bookie rack-aware for stretched cluster; easiest DR among messaging platforms
- **OSB**: WebLogic cluster + Oracle RAC/Data Guard; DR depends on DB layer, not on OSB itself
### Per-layer limitations summary table
| Layer | Limiting factor for secondary DC | Max distance for sync | Impact on topology selection |
|--------|-----------------------------------|----------------------|--------------------------|
| **Storage** | Sync mirror latency, DWDM cost | < 50 km (MetroCluster) | Stretched cluster only in metro |
| **Databases** | Commit wait for sync replication | < 100 km (5 ms RTT) | Active-Active only with multi-master DB |
| **Hypervisor** | Stretched cluster quorum + fencing | < 50 km (vSAN, 5 ms) | MetroCluster / stretched cluster |
| **Kubernetes** | Velero restore time, Rook mirror latency | unlimited (async) | Active-Passive, cold standby |
| **Messaging** | Replication lag, offset management | unlimited (async) | Active-Active (Kafka, Pulsar, NATS) or Active-Passive |
| **Network** | Dark fiber/DWDM cost, latency | < 100 km (metro fiber) | Limits sync replication options |
| **Application** | Stateful/stateless, connection draining | depends on architecture | Stateless app → any topology |
## Disk monitoring — S.M.A.R.T.
Self-Monitoring, Analysis and Reporting Technology — predictive monitoring of HDD/SSD.
@@ -785,4 +1060,4 @@ OpenStack přináší do DC softwarovou abstrakční vrstvu, která umožňuje m
- Academic / HPC clusters (Ironic, Cyborg, Manila)
- Government / regulated environments (on-prem, audit trail)
*Last revision: 2026-06-03*
*Last revision: 2026-06-12*

246
DC-MIGRATION.en.md Normal file

@@ -0,0 +1,246 @@
# 🏗️ Data Center Migration
## Migration strategies
| Strategy | RTO | RPO | Risk | Cost | Duration | Description |
|-----------|-----|-----|--------|---------|-------------|-------|
| **Cold / Big Bang** | hours–days | days | High | Low | days | Shut everything down, move, power up |
| **Phased / Wave** | minutes (per wave) | minutes | Medium | Medium | weeks–months | Workloads moved in waves |
| **Rolling** | 0 (live) | 0 | Low | High | months | Live migration per VM/service |
| **Parallel Run** | 0 | 0 | Low | Very high | months | Both DCs operational, gradual cutover |
| **Pilot Light** | hours | minutes | Medium | Low | weeks | Critical services in new DC, rest migrates |
| **Lift & Shift** | hours | minutes | Medium | Low | weeks | VMs/servers moved without configuration changes |
| **Re-platform** | hours | minutes | Low | Medium | months | Optimization during migration (OS upgrade, resize) |
| **Re-architect** | 0 | 0 | Low | High | months–years | Application redesigned for new platform |
---
## Decision tree
```mermaid
flowchart TD
Start(["DC Migration"]) --> APP{"Application\nstateful?"}
APP -->|"Yes"| DOWNTIME{"Tolerates\ndowntime?"}
APP -->|"No"| ROLLING["Rolling / Parallel Run"]
DOWNTIME -->|"Yes, hours+"| COLD["Cold / Big Bang\nSimplest, cheapest\nRisk: all at once"]
DOWNTIME -->|"Yes, minutes"| PHASED["Phased / Wave\nBy application / business unit"]
DOWNTIME -->|"No (zero downtime)"| SYNC{"Sync replication\npossible?"}
SYNC -->|"Yes, < 100 km"| ROLLING
SYNC -->|"No"| PARALLEL["Parallel Run\nBoth DCs active, gradual cutover"]
ROLLING --> ROLL_HA{"VMware,\nHyper-V?"}
ROLL_HA -->|"Yes"| VMOTION["vMotion / Storage vMotion\nLive migration, 0 downtime"]
ROLL_HA -->|"No"| ROLL_REPL["Storage + DB replication\nGradual workload migration"]
```
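The same tree can be written as a function, which is handy when tagging each workload in an inventory. This is only a sketch: `choose_migration_strategy` and the tolerance values `"hours"`, `"minutes"`, and `"none"` are hypothetical names, and the returned strings are the node labels from the diagram.

```python
def choose_migration_strategy(stateful: bool,
                              downtime_tolerance: str = "none",
                              sync_replication_possible: bool = False) -> str:
    """Walk the migration decision tree for one workload."""
    if downtime_tolerance not in ("hours", "minutes", "none"):
        raise ValueError("downtime_tolerance must be 'hours', 'minutes' or 'none'")
    if not stateful:
        return "Rolling / Parallel Run"
    if downtime_tolerance == "hours":
        return "Cold / Big Bang"
    if downtime_tolerance == "minutes":
        return "Phased / Wave"
    # Zero downtime required: rolling needs sync replication (< 100 km).
    if sync_replication_possible:
        return "Rolling / Parallel Run"
    return "Parallel Run"
```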
---
## Migration phases
### 1. Discovery and assessment
| Task | Tools | Output |
|------|----------|--------|
| HW and SW inventory | RVTools, NetBox, CMDB | Server, VM, and service list |
| Dependency mapping | ServiceNow, AppDynamics, manual | Application dependency graph |
| Traffic analysis | NetFlow, sFlow, vRNI | Bandwidth, latency, peak usage |
| Performance baseline | Prometheus, Zabbix, vRealize | CPU/RAM/disk/network per workload |
| License audit | Flexera, SAM | Licenses, support, compliance |
**Output:** workload list with RTO/RPO, dependencies, and criticality.
### 2. Planning
- **Wave plan** — workload division into migration waves (10–50 VMs per wave)
- **Dependency ordering** — DNS, NTP, LDAP, PKI first
- **Cutover window** — time window for switching (typically weekend)
- **Rollback plan** — conditions and procedure for reversal
- **Test plan** — what and how to test post-migration
- **Communication plan** — who is informed, when, and how
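Dependency ordering and wave sizing can be combined in a short planning script. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+); the function `plan_waves` and the service names in the note are hypothetical, and a real plan would also weigh criticality and cutover windows.

```python
from graphlib import TopologicalSorter


def plan_waves(depends_on: dict[str, set[str]], max_per_wave: int) -> list[list[str]]:
    """Group workloads into waves so that every dependency migrates in an earlier wave.

    depends_on maps a workload to the workloads it needs in the target DC first.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    if max_per_wave < 1:
        raise ValueError("max_per_wave must be at least 1")
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves: list[list[str]] = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies already migrated.
        ready = sorted(sorter.get_ready())
        for start in range(0, len(ready), max_per_wave):
            waves.append(ready[start:start + max_per_wave])
        sorter.done(*ready)
    return waves
```

Given a map where `ldap` depends on `dns` and `ntp`, and `pki` depends on `ldap`, the first wave contains `dns` and `ntp`, followed by `ldap` and then `pki`, which matches the rule that core services go first.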
### 3. New DC preparation
- **Infrastructure** — DNS, NTP, DHCP, LDAP/AD, PKI, monitoring (see [DATACENTERS.en.md](DATACENTERS.en.md) — deployment order)
- **Network** — BGP peering, VXLAN/VLAN, firewall rules, load balancers
- **Storage** — SAN zoning, NAS exports, Ceph cluster
- **Virtualization** — vCenter, Hyper-V cluster, Proxmox
### 4. Replication and synchronization
| Layer | Method | Tools |
|--------|--------|----------|
| **Storage (block)** | SAN sync/async mirror, LUN replication | NetApp SnapMirror, Dell EMC RecoverPoint, Pure ActiveCluster |
| **Storage (file)** | DFS-R, rsync, robocopy | Windows DFS, Rsync |
| **Storage (object)** | Cross-region replication | MinIO replication, S3 CRR |
| **Databases** | Log shipping, CDC, streaming replication | PostgreSQL Patroni, Oracle Data Guard, MSSQL AlwaysOn, MySQL Group Replication |
| **VM** | Storage vMotion, replication | VMware vSphere Replication, Hyper-V Replica, Zerto |
| **Kubernetes** | Velero + Restic, Rook Ceph mirror | Velero, Rook |
### 5. Workload migration
#### Wave migration (recommended for medium/large DCs)
```mermaid
gantt
title Wave migration
dateFormat YYYY-MM-DD
section Wave 1 - Core
DNS, NTP, LDAP :done, w1a, 2026-07-01, 3d
Monitoring + logging :done, w1b, after w1a, 2d
section Wave 2 - Network
Load balancers :active, w2a, 2026-07-06, 2d
Firewalls :active, w2b, 2026-07-08, 2d
section Wave 3 - Storage
NAS migration :w3a, 2026-07-10, 5d
SAN replication :w3b, 2026-07-10, 3d
section Wave 4 - Dev/Test
Dev VMs :w4a, 2026-07-15, 5d
section Wave 5 - Prod tier 3
Internal apps :w5a, 2026-07-22, 5d
section Wave 6 - Prod tier 2
Business apps :w6a, 2026-07-29, 5d
section Wave 7 - Prod tier 1
Critical apps :w7a, 2026-08-05, 5d
```
#### Typical single wave procedure:
1. **Day -7**: Sync data replication (initial seed)
2. **Day -1**: Incremental sync, final test
3. **Day 0 (cutover)**:
- Stop application in source DC
- Final sync (last delta)
- Start application in target DC
- DNS/Traffic switch
- Smoke test
4. **Day +1**: Monitoring (performance, errors, lag)
5. **Day +7**: Rollback window end (success confirmation)
### 6. Network strategies
#### IP re-addressing
| Approach | Description | Pros | Cons |
|---------|-------|--------|----------|
| **Keep IP** | Same IPs, BGP anycast or stretch VLAN | No application config changes | Stretched VLAN/L2 limitations |
| **Change IP** | New IP range, DNS/BGP routing change | Clean architecture | Config changes, DNS TTL |
| **NAT translation** | NAT between old and new IP space | No application changes | Latency, troubleshooting complexity |
**Keep IP** is only possible with:
- L2 stretch between DCs (VXLAN, OTV) — distance limited
- BGP anycast for VIPs (load balancers)
- Applications tolerant to ARP cache changes
#### DNS cutover
```
1. Lower TTL to 60–300 s (one week ahead)
2. At cutover, change A/AAAA records to new IPs
3. Wait for propagation (per TTL)
4. Monitor traffic
```
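The timing of the TTL change matters because resolvers may cache the old record for the full old TTL. The sketch below computes the latest safe moment to lower the TTL and the time by which TTL-honoring resolvers should see the new records; the function names and the safety factor of 2 are assumptions, and resolvers that ignore TTLs are not covered.

```python
from datetime import datetime, timedelta


def latest_ttl_lowering(cutover: datetime, old_ttl_s: int,
                        safety_factor: float = 2.0) -> datetime:
    """Latest moment to lower the TTL so caches holding the old TTL expire before cutover."""
    return cutover - timedelta(seconds=old_ttl_s * safety_factor)


def propagation_deadline(cutover: datetime, lowered_ttl_s: int) -> datetime:
    """Time by which resolvers that honor the TTL should return the new records."""
    return cutover + timedelta(seconds=lowered_ttl_s)
```

With an old TTL of 86400 s and a cutover on 2026-07-04 at 22:00, the TTL has to be lowered by 2026-07-02 at 22:00 at the latest. Lowering it a full week ahead, as the steps suggest, leaves additional margin.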
#### Traffic steering
| Technique | Use case |
|----------|----------|
| **BGP** | Change AS path / local pref for traffic steering |
| **DNS** | Lower TTL, change A records |
| **Load balancer** | Change pool members, health check |
| **GSLB** | Global Server Load Balancing (F5 GTM, NSX ALB) |
| **Cloud DNS** | AWS Route53, Azure Traffic Manager, Google Cloud DNS |
### 7. Database migration
See individual DB files for details. Summary table:
| DB | Method | RPO | RTO | Note |
|----|--------|-----|-----|----------|
| **PostgreSQL** | Streaming replication + Patroni switchover | 0 (sync) / ~MB (async) | min | Patroni auto-failover |
| **MySQL** | Group Replication / async replication | 0 (sync) / seconds | min | InnoDB Cluster |
| **Oracle** | Data Guard switchover | 0 (sync) | min | Far sync for remote DCs |
| **MSSQL** | AlwaysOn AG failover | 0 (sync) | min | Cloud witness |
| **MongoDB** | Replica set election | seconds | < 1 min | Priority-based failover |
| **Cassandra** | Multi-DC replication | eventual | 0 | Native multi-master |
### 8. Testing
| Phase | What to test | Method |
|------|-------------|--------|
| **Pre-migration** | Application in new DC (isolated) | Dry run on replicated data |
| **Cutover** | Functionality, availability, latency | Smoke test, synthetic transactions |
| **Post-migration** | Performance, integration, monitoring | A/B comparison with baseline, canary traffic |
| **Rollback** | Return to old DC | Tested rollback plan |
### 9. Rollback plan
Each wave must have a defined rollback:
| Condition | Action |
|----------|------|
| Application fails to start in new DC | DNS switch back, stop replication |
| Performance worse than baseline (> 20 %) | Rollback, root cause analysis |
| Integration failure (API timeout, DB connection) | Rollback, dependency check |
| Security incident | Rollback, forensic analysis |
Rollback must be tested **before** the real cutover.
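The rollback conditions in the table can be evaluated mechanically after each cutover. This is a minimal sketch, assuming latency in milliseconds as the performance metric; `rollback_reasons` and its parameters are hypothetical, and only the 20 % threshold comes from the table.

```python
def rollback_reasons(app_started: bool, baseline_ms: float, current_ms: float,
                     integrations_ok: bool, security_incident: bool,
                     max_regression_pct: float = 20.0) -> list[str]:
    """Return the rollback conditions that fired; an empty list means the wave stays."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    reasons = []
    if not app_started:
        reasons.append("application failed to start")
    # Higher latency than baseline counts as a regression.
    regression_pct = (current_ms - baseline_ms) / baseline_ms * 100.0
    if regression_pct > max_regression_pct:
        reasons.append(f"performance {regression_pct:.0f} % worse than baseline")
    if not integrations_ok:
        reasons.append("integration failure")
    if security_incident:
        reasons.append("security incident")
    return reasons
```

Returning the list of reasons rather than a single boolean keeps the follow-up action (root cause analysis, dependency check, forensic analysis) tied to the condition that triggered it.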
---
## Special cases
### Mainframe migration
- **IBM z/OS** — GDPS (Geographically Dispersed Parallel Sysplex)
- HyperSwap for storage mirroring
- Cross-system coupling facility (XCF)
- Often the last migrated component
### COTS applications (Oracle EBS, SAP)
- Require vendor-specific migration procedures
- Oracle EBS: Autoconfig, cloning (ADXLC)
- SAP: System Copy (Homogeneous / Heterogeneous), SWPM, SUM
- Re-licensing may be required on HW change
### Cloud migration (On-prem → Cloud)
See [CLOUD.en.md](CLOUD.en.md) — migration strategies (6 Rs):
| Strategy | Description |
|-----------|-------|
| **Re-host (Lift & Shift)** | VM → Cloud VM (AWS MGN, Azure Migrate) |
| **Re-platform** | OS upgrade, managed DB (RDS, Cloud SQL) |
| **Re-architect** | Application rewritten as cloud-native |
| **Retire** | Decommission unnecessary applications |
| **Retain** | Application stays on-prem (review later) |
| **Repurchase** | SaaS replacement |
---
## Recommended approach per DC size
| DC Size | VM Count | Recommended strategy | Duration | Team |
|-------------|----------|---------------------|-------------|-----|
| **Small** | < 50 | Big Bang (weekend) | 2–4 days | 3–5 people |
| **Medium** | 50–500 | Phased (5–10 waves) | 2–8 weeks | 5–10 people |
| **Large** | 500–5000 | Phased + Rolling | 3–12 months | 10–30 people |
| **Enterprise** | 5000+ | Parallel Run / Rolling | 12–36 months | 30+ people |
---
## Related
- [DATACENTERS.en.md](DATACENTERS.en.md) — DC topologies, secondary DC, deployment order
- [CLOUD.en.md](CLOUD.en.md) — cloud migration strategies (6 Rs)
- [DR.en.md](DR.en.md) — disaster recovery, RTO/RPO
- [NETWORKING.en.md](NETWORKING.en.md) — BGP, DNS, VXLAN, traffic steering
- [STORAGE.en.md](STORAGE.en.md) — storage replication
## Sources
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-12*

246
DC-MIGRATION.md Normal file

@@ -0,0 +1,246 @@
# 🏗️ Data Center Migration
## Migration strategies
| Strategy | RTO | RPO | Risk | Cost | Duration | Description |
|-----------|-----|-----|--------|---------|-------------|-------|
| **Cold / Big Bang** | hours–days | days | High | Low | days | Shut everything down, move, power up |
| **Phased / Wave** | minutes (per wave) | minutes | Medium | Medium | weeks–months | Workloads moved in waves |
| **Rolling** | 0 (live) | 0 | Low | High | months | Live migration per VM/service |
| **Parallel Run** | 0 | 0 | Low | Very high | months | Both DCs operational, gradual cutover |
| **Pilot Light** | hours | minutes | Medium | Low | weeks | Critical services in new DC, rest migrates |
| **Lift & Shift** | hours | minutes | Medium | Low | weeks | VMs/servers moved without configuration changes |
| **Re-platform** | hours | minutes | Low | Medium | months | Optimization during migration (OS upgrade, resize) |
| **Re-architect** | 0 | 0 | Low | High | months–years | Application redesigned for new platform |
---
## Decision tree
```mermaid
flowchart TD
Start(["DC Migration"]) --> APP{"Application\nstateful?"}
APP -->|"Yes"| DOWNTIME{"Tolerates\ndowntime?"}
APP -->|"No"| ROLLING["Rolling / Parallel Run"]
DOWNTIME -->|"Yes, hours+"| COLD["Cold / Big Bang\nSimplest, cheapest\nRisk: all at once"]
DOWNTIME -->|"Yes, minutes"| PHASED["Phased / Wave\nBy application / business unit"]
DOWNTIME -->|"No (zero downtime)"| SYNC{"Sync replication\npossible?"}
SYNC -->|"Yes, < 100 km"| ROLLING
SYNC -->|"No"| PARALLEL["Parallel Run\nBoth DCs active, gradual cutover"]
ROLLING --> ROLL_HA{"VMware,\nHyper-V?"}
ROLL_HA -->|"Yes"| VMOTION["vMotion / Storage vMotion\nLive migration, 0 downtime"]
ROLL_HA -->|"No"| ROLL_REPL["Storage + DB replication\nGradual workload migration"]
```
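The decision tree above can also be read as a small function. The following is a minimal sketch rather than part of the original document: the name `choose_strategy`, its parameters, and the returned labels are hypothetical, and the branch order simply follows the diagram.

```python
def choose_strategy(stateful, tolerated_downtime="none",
                    sync_replication_possible=False,
                    live_migration_hypervisor=False):
    """Return the leaf of the decision tree for one application.

    tolerated_downtime is one of "hours", "minutes" or "none"
    (hypothetical labels for the three branches of the diagram).
    """
    if stateful:
        if tolerated_downtime == "hours":
            return "Cold / Big Bang"
        if tolerated_downtime == "minutes":
            return "Phased / Wave"
        if tolerated_downtime != "none":
            raise ValueError(f"unknown downtime tolerance: {tolerated_downtime!r}")
        if not sync_replication_possible:
            return "Parallel Run"
    # Stateless apps, and stateful apps with sync replication, take the rolling branch.
    if live_migration_hypervisor:
        return "Rolling: live migration (vMotion / Storage vMotion)"
    return "Rolling: storage + DB replication"


print(choose_strategy(stateful=True, tolerated_downtime="hours"))
print(choose_strategy(stateful=True, sync_replication_possible=True,
                      live_migration_hypervisor=True))
```

Encoding the tree this way makes it easy to classify a whole inventory in one pass, but the inputs (especially whether sync replication is feasible) still have to come from the assessment phase.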
---
## Migration phases
### 1. Discovery and assessment
| Task | Tools | Output |
|------|-------|--------|
| HW and SW inventory | RVTools, NetBox, CMDB | List of all servers, VMs, and services |
| Dependency mapping | ServiceNow, AppDynamics, manual | Application dependency graph |
| Traffic analysis | NetFlow, sFlow, vRNI | Bandwidth, latency, peak usage |
| Performance baseline | Prometheus, Zabbix, vRealize | CPU/RAM/disk/network per workload |
| License audit | Flexera, SAM | Licenses, support, compliance |
**Output:** a list of workloads with RTO/RPO, dependencies, and criticality. Without it, the migration cannot be planned.
### 2. Planning
- **Wave plan**: split workloads into migration waves (10–50 VMs per wave)
- **Dependency ordering**: DNS, NTP, LDAP, and PKI must go first
- **Cutover window**: time window for the switch (typically a weekend)
- **Rollback plan**: conditions and procedure for reverting
- **Test plan**: what to test after migration and how
- **Communication plan**: who is informed, when, and how
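Dependency ordering and wave sizing can be derived mechanically once the dependency graph from the discovery phase exists. The sketch below uses Python's standard `graphlib.TopologicalSorter`; the function `plan_waves`, the `max_per_wave` parameter, and the sample inventory are assumptions made for illustration, not something the original plan prescribes.

```python
from graphlib import TopologicalSorter


def plan_waves(depends_on, max_per_wave=50):
    """Group workloads into ordered migration waves.

    depends_on maps each workload to the set of workloads that must
    already run in the new DC before it can be moved.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        # Split a large layer so that no wave exceeds the size limit.
        for start in range(0, len(ready), max_per_wave):
            waves.append(ready[start:start + max_per_wave])
        sorter.done(*ready)
    return waves


# Hypothetical inventory; the names are illustrative only.
inventory = {
    "dns": set(),
    "ntp": set(),
    "ldap": {"dns", "ntp"},
    "pki": {"ldap"},
    "erp-db": {"dns", "ldap"},
    "erp-app": {"erp-db", "pki"},
}
for number, wave in enumerate(plan_waves(inventory), start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Core services with no dependencies naturally land in the first wave. A real plan would additionally weigh business criticality and cutover windows, which this sketch ignores.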
### 3. Preparing the new DC
- **Infrastructure**: DNS, NTP, DHCP, LDAP/AD, PKI, monitoring (see [DATACENTERS.md](DATACENTERS.md), deployment order)
- **Network**: BGP peering, VXLAN/VLAN, firewall rules, load balancers
- **Storage**: SAN zoning, NAS exports, Ceph cluster
- **Virtualization**: vCenter, Hyper-V cluster, Proxmox
### 4. Replication and synchronization
| Layer | Method | Tools |
|-------|--------|-------|
| **Storage (block)** | SAN sync/async mirror, LUN replication | NetApp SnapMirror, Dell EMC RecoverPoint, Pure ActiveCluster |
| **Storage (file)** | DFS-R, rsync, robocopy | Windows DFS, rsync |
| **Storage (object)** | Cross-region replication | MinIO replication, S3 CRR |
| **Database** | Log shipping, CDC, streaming replication | PostgreSQL Patroni, Oracle Data Guard, MSSQL AlwaysOn, MySQL Group Replication |
| **VM** | Storage vMotion, replication | VMware vSphere Replication, Hyper-V Replica, Zerto |
| **Kubernetes** | Velero + Restic, Rook Ceph mirror | Velero, Rook |
### 5. Workload migration
#### Wave migration (recommended for medium and larger DCs)
```mermaid
gantt
title Wave migration
dateFormat YYYY-MM-DD
section Wave 1 - Core
DNS, NTP, LDAP :done, w1a, 2026-07-01, 3d
Monitoring + logging :done, w1b, after w1a, 2d
section Wave 2 - Network
Load balancers :active, w2a, 2026-07-06, 2d
Firewalls :active, w2b, 2026-07-08, 2d
section Wave 3 - Storage
NAS migration :w3a, 2026-07-10, 5d
SAN replication :w3b, 2026-07-10, 3d
section Wave 4 - Dev/Test
Dev VMs :w4a, 2026-07-15, 5d
section Wave 5 - Prod tier 3
Internal apps :w5a, 2026-07-22, 5d
section Wave 6 - Prod tier 2
Business apps :w6a, 2026-07-29, 5d
section Wave 7 - Prod tier 1
Critical apps :w7a, 2026-08-05, 5d
```
#### Typical procedure for one wave:
1. **Day -7**: Data replication sync (initial seed)
2. **Day -1**: Incremental sync, final test
3. **Day 0 (cutover)**:
   - Stop the application in the source DC
   - Final sync (last delta)
   - Start the application in the target DC
   - DNS/traffic switch
   - Smoke test
4. **Day +1**: Monitoring (performance, errors, lag)
5. **Day +7**: Rollback window ends (success confirmed)
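The day offsets in the procedure above translate directly into calendar dates once a cutover date is fixed. This is a small illustrative sketch; `MILESTONES`, `wave_schedule`, and the example date are assumptions, and only the offsets come from the list.

```python
from datetime import date, timedelta

# Day offsets relative to cutover, taken from the procedure above.
MILESTONES = [
    (-7, "initial seed of replicated data"),
    (-1, "incremental sync and final test"),
    (0, "cutover: stop app, final sync, start app, DNS/traffic switch, smoke test"),
    (1, "monitoring: performance, errors, lag"),
    (7, "rollback window ends"),
]


def wave_schedule(cutover):
    """Return (date, activity) pairs for one wave with the given cutover date."""
    return [(cutover + timedelta(days=offset), activity)
            for offset, activity in MILESTONES]


for day, activity in wave_schedule(date(2026, 7, 11)):  # example date only
    print(day.isoformat(), activity)
```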
### 6. Network strategies
#### IP re-addressing
| Approach | Description | Advantages | Disadvantages |
|----------|-------------|------------|---------------|
| **Keep IP** | Same IPs, BGP anycast or stretched VLAN | No need to change application configuration | Stretched VLAN/L2 limitations |
| **Change IP** | New IP range, DNS/BGP routing change | Clean architecture | Configuration changes, DNS TTL |
| **NAT translation** | NAT between the old and the new IP space | No application changes | Latency, troubleshooting complexity |
**Keep IP** is only possible with:
- An L2 stretch between DCs (VXLAN, OTV), which is limited by distance
- BGP anycast for VIPs (load balancers)
- Applications that tolerate ARP cache changes
#### DNS cutover
```
1. Lower the TTL to 60–300 s (a week in advance)
2. At cutover, change the A/AAAA records to the new IPs
3. Wait for propagation (per TTL)
4. Monitor traffic
```
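The timing behind these steps is simple arithmetic: the lowered TTL only takes effect once the old TTL has expired in resolver caches, and after the record change the old address can be served for at most one new TTL. A hedged sketch follows; `dns_cutover_timeline` and all timestamps are hypothetical, and resolvers that ignore TTLs are not modeled.

```python
from datetime import datetime, timedelta


def dns_cutover_timeline(ttl_lowered_at, old_ttl_s, new_ttl_s, cutover_at):
    """Return (earliest_safe_cutover, propagation_complete).

    earliest_safe_cutover: when every cache has picked up the lowered TTL.
    propagation_complete: when caches should have dropped the old address.
    """
    earliest = ttl_lowered_at + timedelta(seconds=old_ttl_s)
    if cutover_at < earliest:
        raise ValueError("cutover is planned before the old TTL has expired")
    return earliest, cutover_at + timedelta(seconds=new_ttl_s)


earliest, done = dns_cutover_timeline(
    ttl_lowered_at=datetime(2026, 7, 1, 0, 0),
    old_ttl_s=86_400,   # assumed previous TTL of one day
    new_ttl_s=300,
    cutover_at=datetime(2026, 7, 8, 2, 0),
)
print("TTL change effective from:", earliest)
print("Old IP should be gone by: ", done)
```

Lowering the TTL a week ahead, as the steps suggest, leaves a wide safety margin over a one-day TTL.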
#### Traffic steering
| Technique | Use case |
|-----------|----------|
| **BGP** | Change AS path / local pref to redirect traffic |
| **DNS** | Lower the TTL, change A records |
| **Load balancer** | Change pool members, health checks |
| **GSLB** | Global Server Load Balancing (F5 GTM, NSX ALB) |
| **Cloud DNS** | AWS Route53, Azure Traffic Manager, Google Cloud DNS |
### 7. Database migration
See the individual DB files for details. Summary table:
| DB | Method | RPO | RTO | Note |
|----|--------|-----|-----|------|
| **PostgreSQL** | Streaming replication + Patroni switchover | 0 (sync) / ~MB (async) | min | Patroni auto-failover |
| **MySQL** | Group Replication / async replication | 0 (sync) / seconds | min | InnoDB Cluster |
| **Oracle** | Data Guard switchover | 0 (sync) | min | Far sync for distant DCs |
| **MSSQL** | AlwaysOn AG failover | 0 (sync) | min | Cloud witness |
| **MongoDB** | Replica set election | seconds | < 1 min | Priority-based failover |
| **Cassandra** | Multi-DC replication | eventual | 0 | Native multi-master |
### 8. Testing
| Phase | What to test | Method |
|-------|--------------|--------|
| **Pre-migration** | Application in the new DC (isolated) | Dry run on replicated data |
| **Cutover** | Functionality, availability, latency | Smoke test, synthetic transactions |
| **Post-migration** | Performance, integrations, monitoring | A/B comparison with baseline, canary traffic |
| **Rollback** | Return to the old DC | Tested rollback plan |
### 9. Rollback plan
Every wave must have a defined rollback:
| Condition | Action |
|-----------|--------|
| Application does not start in the new DC | Switch DNS back, stop replication |
| Performance worse than baseline (by > 20 %) | Rollback, root cause analysis |
| Integration failure (API timeout, DB connection) | Rollback, dependency check |
| Security incident | Rollback, forensic analysis |
The rollback should be tested **before** the real cutover.
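The rollback conditions in the table can be evaluated automatically after each cutover. The sketch below is one possible shape under stated assumptions: the `WaveHealth` fields are hypothetical, and latency stands in for "performance" only because it is easy to compare against a baseline. Only the 20 % threshold comes from the table.

```python
from dataclasses import dataclass


@dataclass
class WaveHealth:
    """Post-cutover observations for one wave (hypothetical fields)."""
    app_started: bool
    baseline_latency_ms: float
    current_latency_ms: float
    integration_errors: int
    security_incident: bool


def rollback_reasons(health, max_degradation=0.20):
    """Return the list of rollback conditions that are currently met."""
    reasons = []
    if not health.app_started:
        reasons.append("application does not start in the new DC")
    if health.current_latency_ms > health.baseline_latency_ms * (1 + max_degradation):
        reasons.append("performance worse than baseline by more than 20 %")
    if health.integration_errors > 0:
        reasons.append("integration failure")
    if health.security_incident:
        reasons.append("security incident")
    return reasons


observed = WaveHealth(True, baseline_latency_ms=100, current_latency_ms=130,
                      integration_errors=0, security_incident=False)
print(rollback_reasons(observed) or "no rollback condition met")
```

An empty list means the wave can stay in the new DC. Any entry should trigger the rollback action defined for that condition.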
---
## Special cases
### Mainframe migration
- **IBM z/OS**: GDPS (Geographically Dispersed Parallel Sysplex)
- HyperSwap for storage mirroring
- Cross-system coupling facility (XCF)
- Often the last component to be migrated
### COTS applications (Oracle EBS, SAP)
- Require vendor-specific migration procedures
- Oracle EBS: Autoconfig, cloning (ADXLC)
- SAP: System Copy (Homogeneous / Heterogeneous), SWPM, SUM
- Re-licensing when the HW changes
### Cloud migration (on-prem → cloud)
See [CLOUD.md](CLOUD.md) for the migration strategies (6 Rs):
| Strategy | Description |
|----------|-------------|
| **Re-host (Lift & Shift)** | VM → Cloud VM (AWS MGN, Azure Migrate) |
| **Re-platform** | OS upgrade, managed DB (RDS, Cloud SQL) |
| **Re-architect** | Application rewritten to be cloud-native |
| **Retire** | Decommission applications that are no longer needed |
| **Retain** | Application stays on-prem (to be reviewed later) |
| **Repurchase** | SaaS replacement |
---
## Recommended approach by DC size
| DC size | VM count | Recommended strategy | Duration | Team |
|---------|----------|----------------------|----------|------|
| **Small** | < 50 | Big Bang (weekend) | 2–4 days | 3–5 people |
| **Medium** | 50–500 | Phased (5–10 waves) | 2–8 weeks | 5–10 people |
| **Large** | 500–5000 | Phased + Rolling | 3–12 months | 10–30 people |
| **Enterprise** | 5000+ | Parallel Run / Rolling | 12–36 months | 30+ people |
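As a quick lookup, the size table can be expressed as a function of the VM count. This is only a sketch: `recommend_by_size` is a hypothetical name, and since the table's ranges share their boundary values, the code assumes half-open intervals (500 VMs counts as Large, 5000 as Enterprise).

```python
def recommend_by_size(vm_count):
    """Map a VM count to (size class, recommended strategy) per the table above."""
    if vm_count < 0:
        raise ValueError("vm_count must not be negative")
    if vm_count < 50:
        return "Small", "Big Bang"
    if vm_count < 500:
        return "Medium", "Phased"
    if vm_count < 5000:
        return "Large", "Phased + Rolling"
    return "Enterprise", "Parallel Run / Rolling"


for count in (30, 120, 2_000, 12_000):
    size, strategy = recommend_by_size(count)
    print(f"{count:>6} VMs -> {size}: {strategy}")
```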
---
## Related
- [DATACENTERS.md](DATACENTERS.md): DC topologies, secondary DC, deployment order
- [CLOUD.md](CLOUD.md): cloud migration strategies (6 Rs)
- [DR.md](DR.md): disaster recovery, RTO/RPO
- [NETWORKING.md](NETWORKING.md): BGP, DNS, VXLAN, traffic steering
- [STORAGE.md](STORAGE.md): storage replication
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-12*

DR.en.md Normal file

@@ -0,0 +1,336 @@
# 🔄 Disaster Recovery and Business Continuity
## Terminology
| Abbreviation | Meaning | Description |
|---------|--------|-------|
| **RTO** | Recovery Time Objective | Maximum time from outage to service recovery |
| **RPO** | Recovery Point Objective | Maximum acceptable data loss (time since last backup) |
| **MTD** | Maximum Tolerable Downtime | Total outage duration an organization can survive |
| **WRT** | Work Recovery Time | Time needed for full operations recovery after IT restoration |
| **MTBF** | Mean Time Between Failures | Average operating time between two consecutive failures |
| **MTTR** | Mean Time To Repair | Average time needed to repair a failed component and restore it to service |
| **SLA** | Service Level Agreement | Contractual availability commitment |
| **SLO** | Service Level Objective | Internal availability target |
| **SLI** | Service Level Indicator | Measured availability value |
### Relationship between RTO, RPO, MTD, WRT
```
Outage ──── RPO ────► Data restored ──── RTO ────► Service running ──── WRT ────► Full operations
             │                            │                              │
             ▼                            ▼                              ▼
         Lost data              Time without service           Time to full capacity

MTD = RTO + WRT (max. time the business tolerates)
```
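Because MTD = RTO + WRT, the work recovery time directly limits how long the technical recovery may take. A minimal worked example with made-up numbers follows; the function names are illustrative.

```python
def max_rto_minutes(mtd_minutes, wrt_minutes):
    """Largest RTO that still fits into the MTD, from MTD = RTO + WRT."""
    if wrt_minutes > mtd_minutes:
        raise ValueError("WRT alone already exceeds the MTD")
    return mtd_minutes - wrt_minutes


def fits_mtd(rto_minutes, wrt_minutes, mtd_minutes):
    """True if technical recovery plus work recovery stays within the MTD."""
    return rto_minutes + wrt_minutes <= mtd_minutes


# Example: the business tolerates 4 h in total, work recovery takes 1 h.
print(max_rto_minutes(240, 60))  # 180
print(fits_mtd(200, 60, 240))    # False
```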
---
## Uptime calculation
### Nines table
| Level | Uptime | Downtime / year | Downtime / month | Downtime / week |
|--------|--------|---------------|------------------|------------------|
| 90 % (one nine) | 0.9 | 36.5 days | 72 h | 16.8 h |
| 99 % (two nines) | 0.99 | 3.65 days | 7.2 h | 1.68 h |
| 99.5 % | 0.995 | 1.83 days | 3.6 h | 50.4 min |
| 99.9 % (three nines) | 0.999 | 8.76 h | 43.2 min | 10.1 min |
| 99.95 % | 0.9995 | 4.38 h | 21.6 min | 5.04 min |
| 99.99 % (four nines) | 0.9999 | 52.6 min | 4.32 min | 1.01 min |
| 99.995 % | 0.99995 | 26.3 min | 2.16 min | 30.2 s |
| 99.999 % (five nines) | 0.99999 | 5.26 min | 25.9 s | 6.05 s |
| 99.9999 % (six nines) | 0.999999 | 31.6 s | 2.59 s | 0.605 s |
### Calculation
```
Availability = (Total time - Downtime) / Total time × 100 %
Example:
Year = 365 × 24 × 60 = 525,600 minutes
Target: 99.9 % → allowed downtime = 525,600 × (1 - 0.999) = 525.6 minutes = 8.76 h
Combined availability (chain of dependencies):
A_web = 99.9 % (3 nines)
A_api = 99.99 % (4 nines)
A_db = 99.999 % (5 nines)
A_total = 0.999 × 0.9999 × 0.99999 = 0.99889 ≈ 99.89 % (less than 3 nines!)
Parallel availability (redundancy):
A_total = 1 - (1 - A_1) × (1 - A_2) × ... × (1 - A_n)
Example: 2 servers with 99% availability
A_total = 1 - (1-0.99) × (1-0.99) = 1 - 0.01 × 0.01 = 0.9999 (99.99 %)
```
### Calculator
```python
def uptime_percent_to_downtime(pct, period_days=365):
    """Convert an uptime percentage (e.g. 99.9) to allowed downtime in minutes for the period."""
    total_minutes = period_days * 24 * 60
    allowed_downtime = total_minutes * (1 - pct / 100)
    return allowed_downtime  # minutes


def downtime_to_uptime_percent(downtime_minutes, period_days=365):
    """Convert downtime in minutes to an uptime percentage."""
    total_minutes = period_days * 24 * 60
    return (1 - downtime_minutes / total_minutes) * 100


def combined_availability(availabilities):
    """Combined availability of series-connected components (fractions, e.g. 0.999)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def redundant_availability(availabilities):
    """Availability of redundant (parallel) components (fractions, e.g. 0.99)."""
    result = 1.0
    for a in availabilities:
        result *= (1 - a)  # probability that every component is down at once
    return 1 - result
```
### Calculation fallacies
- **Combined availability is not a sum** — adding another dependency always reduces total availability
- **Redundancy is not free** — adding a standby component requires failure detection + failover (MTTR does not improve automatically)
- **SLA is not a guarantee** — providers often calculate SLA as a monthly average, not per-incident
- **Measurement is key** — without SLI, SLO cannot be verified; "unmeasured availability does not exist"
- **Planned maintenance** — sometimes counted as uptime, sometimes not (depends on SLA definition)
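The point that redundancy is not free can be made concrete with a deliberately simplified model. Assume (this is not from the original) that a primary/standby pair is up when the primary is up, or when the primary is down, failover succeeds with some probability, and the standby is up. The function name and the 90 % failover success rate are illustrative.

```python
def pair_availability(a, failover_success=1.0):
    """Availability of a primary/standby pair under a simplified model.

    a: availability of each node as a fraction (0..1)
    failover_success: probability that failure detection and failover work
    """
    return a + (1 - a) * failover_success * a


ideal = pair_availability(0.99)           # about 0.9999, same as 1 - (1 - a)^2
realistic = pair_availability(0.99, 0.9)  # about 0.99891
print(f"perfect failover: {ideal:.5f}")
print(f"90 % failover:    {realistic:.5f}")
```

With perfect failover the formula reduces to the parallel formula above. Once failover sometimes fails, the pair falls noticeably short of four nines, which is why failover needs to be tested and measured.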
---
## DR scenarios
### Classification
| Category | Scenario | Typical RTO | Typical RPO | Frequency |
|-----------|--------|-------------|-------------|-----------|
| **Site** | Entire DC / region outage | hours | minutes | Low |
| **Infrastructure** | HW failure (storage, switch, server) | minutes–hours | seconds | Medium |
| **Software** | OS, application, DB failure | minutes | seconds | High |
| **Data** | Data corruption, deletion, cryptolocker | hours | backup point | Low–medium |
| **Human** | Wrong deployment, config change | minutes–hours | seconds | Medium |
| **Security** | Attack, breach, ransomware | days | before attack | Low |
| **Network** | Connectivity outage, DDoS | minutes–hours | N/A | Medium |
| **Cloud provider** | Regional outage (AWS, Azure, GCP) | hours | minutes | Very low |
### Scenario details
#### Site / Region failure
| Aspect | Description |
|--------|-------|
| **Cause** | Blackout, fire, flood, earthquake, cloud provider outage |
| **Prevention** | Multi-AZ architecture, multi-region deployment, active-active |
| **Mitigation** | Automatic DNS failover (Route53, Azure Traffic Manager), replica in DR region |
| **Testing** | Game day: shut down primary region, verify automatic failover |
#### Data corruption / human error
| Aspect | Description |
|--------|-------|
| **Cause** | Wrong SQL command (DELETE without WHERE), accidentally deleted bucket, bad migration |
| **Prevention** | RBAC, MFA for destructive operations, change management, SQL peer review |
| **Mitigation** | Point-in-time recovery (PITR), transaction log replay, immutable backups |
| **Testing** | Restore backup to isolated environment, verify data integrity |
#### Ransomware / cyber attack
| Aspect | Description |
|--------|-------|
| **Cause** | Attack on production systems, data encryption, exfiltration |
| **Prevention** | Immutable backups (object lock), air-gapped backups, network segmentation |
| **Mitigation** | Restore from clean backup, rebuild infrastructure from IaC |
| **Testing** | Regular restore in isolated network, verify backup is not infected |
---
## Prevention — strategies
### Backup strategies
| Approach | Description | Use case |
|---------|-------|----------|
| **3-2-1 rule** | 3 copies, 2 different media, 1 off-site | Universal |
| **3-2-1-0** | + 0 errors after restore (testing) | Enterprise, compliance |
| **GFS (Grandfather-Father-Son)** | Daily, weekly, monthly rotation | Long-term archive |
| **Incremental forever** | Full backup 1×, then only changes | Large data volumes |
| **Reverse incremental** | Full + incremental, full is always current | Fast recovery |
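The 3-2-1 rule is easy to check against an inventory of copies. In this sketch the `BackupCopy` type and its fields are hypothetical, and the production data set is assumed to count as one of the three copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    media: str     # e.g. "disk", "tape", "object-storage"
    offsite: bool  # stored outside the primary site


def satisfies_3_2_1(copies):
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 off-site."""
    return (len(copies) >= 3
            and len({c.media for c in copies}) >= 2
            and any(c.offsite for c in copies))


copies = [
    BackupCopy("disk", offsite=False),           # production data
    BackupCopy("disk", offsite=False),           # local backup repository
    BackupCopy("object-storage", offsite=True),  # off-site copy
]
print(satisfies_3_2_1(copies))  # True
```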
### Backup methods
| Method | RPO | RTO | Storage | Suitable for |
|--------|-----|-----|----------|------------|
| **Full backup** | Last full | Full restore time | Large | Small data, weekly |
| **Incremental** | Last incremental | Full + all incrementals | Small | Large data, daily |
| **Differential** | Last diff | Full + last diff | Medium | Compromise |
| **Snapshot** | Snapshot point-in-time | seconds | Copy-on-write | VM, storage array |
| **Continuous (CDC)** | < 1 s | Seconds | Log stream | DB (binlog, WAL) |
| **PITR** | Any point in time | Depends on volume | Full + WAL | RDS, PostgreSQL, SQL Server |
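The RTO column for incremental and differential backups follows from how many backup sets a restore has to read. The sketch below computes that restore chain. It assumes the common definitions (an incremental holds changes since the previous backup of any type, a differential holds changes since the last full), and `restore_chain` is an illustrative name.

```python
def restore_chain(kinds, target):
    """Return indices of the backups needed to restore to kinds[target].

    kinds: chronological list of "full", "incr" or "diff".
    """
    chain = []
    i = target
    while True:
        if i < 0:
            raise ValueError("no full backup before the target")
        chain.append(i)
        kind = kinds[i]
        if kind == "full":
            break
        if kind == "diff":
            # A differential only needs the most recent full backup.
            i = max((j for j in range(i) if kinds[j] == "full"), default=-1)
        elif kind == "incr":
            # An incremental needs everything its predecessor needs.
            i -= 1
        else:
            raise ValueError(f"unknown backup kind: {kind!r}")
    return chain[::-1]


print(restore_chain(["full", "incr", "incr", "incr"], 3))  # [0, 1, 2, 3]
print(restore_chain(["full", "diff", "diff", "diff"], 3))  # [0, 3]
```

The incremental chain grows with every backup since the last full, while the differential chain always has two elements. That is the trade-off between storage and restore time shown in the table.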
### Backup immutability
Key protection against ransomware:
| Technique | Description |
|----------|-------|
| **Object Lock (WORM)** | Backup cannot be deleted or overwritten for a defined retention period (S3 Object Lock, Azure Blob Immutable) |
| **Air gap** | Backup is physically separated from the production network (offline disk, tape, cloud without VPN) |
| **Isolated backup network** | Backup traffic goes through a dedicated network without access from production VLAN |
| **Out-of-band access** | Backup management console is not accessible from the production network |
---
## DR architectures
### Multi-AZ (Single region)
```
Region ┌────────────────────────────────────┐
       │       AZ-1             AZ-2        │
       │   ┌──────────┐     ┌──────────┐    │
       │   │   App    │     │   App    │    │
       │   └─────┬────┘     └─────┬────┘    │
       │         │                │         │
       │   ┌─────▼────────────────▼─────┐   │
       │   │  Load Balancer (cross-AZ)  │   │
       │   └─────────────┬──────────────┘   │
       │                 │                  │
       │   ┌─────────────▼──────────────┐   │
       │   │  DB Primary (AZ-1)         │   │
       │   │  DB Standby (AZ-2)         │   │
       │   │  Synchronous replication   │   │
       │   └────────────────────────────┘   │
       └────────────────────────────────────┘
```
- RTO: minutes (automatic failover)
- RPO: 0 (sync replication)
- Protection: against AZ failure, not region failure
### Multi-Region
```
Region A (Primary)                   Region B (DR)
┌─────────────────────┐              ┌─────────────────────┐
│  ┌───────────────┐  │              │  ┌───────────────┐  │
│  │   App + DB    │  │              │  │   App + DB    │  │
│  │    Active     │──┼──Async───────┼─►│    Standby    │  │
│  └───────────────┘  │ replication  │  └───────────────┘  │
│         │           │              │         │           │
│  ┌──────▼───────┐   │              │  ┌──────▼───────┐   │
│  │  DNS / GSLB  │   │              │  │  DNS / GSLB  │   │
│  └──────┬───────┘   │              │  └──────┬───────┘   │
└─────────┼───────────┘              └─────────┼───────────┘
          │                                    │
          └──────────── Traffic Manager ───────┘
```
| Variant | RTO | RPO | Cost | Failover |
|----------|-----|-----|---------|----------|
| **Active-Passive** | minutes–hours | seconds | Medium | Manual / auto |
| **Active-Active** | seconds | < 1 s | High | Automatic (DNS) |
| **Pilot Light** | tens of minutes | minutes | Low | Manual scaling |
| **Warm Standby** | minutes | seconds | High | Auto (reduced copy) |
| **Backup & Restore** | hours | 24 h | Low | Manual |
### On-prem → Cloud DR (Hybrid)
```
On-prem DC                           Cloud (DR)
┌─────────────────────┐              ┌─────────────────────┐
│  ┌───────────────┐  │              │  ┌───────────────┐  │
│  │  Application  │  │              │  │   VM / App    │  │
│  │     + DB      │  │              │  │ + DB replica  │  │
│  └───────┬───────┘  │              │  └───────┬───────┘  │
│          │          │              │          │          │
│  ┌───────▼───────┐  │ site-to-site │  ┌───────▼───────┐  │
│  │ Backup proxy  │──┼────VPN───────┼─►│ Backup store  │  │
│  └───────────────┘  │              │  └───────────────┘  │
│                     │              │                     │
│  ┌───────────────┐  │              │  ┌───────────────┐  │
│  │  Tape / NAS   │  │              │  │ Veeam / Zerto │  │
│  └───────────────┘  │              │  └───────────────┘  │
└─────────────────────┘              └─────────────────────┘
```
- **RTO**: tens of minutes (depends on VM startup)
- **RPO**: minutes–hours (depends on replication tool)
- **Tools**: Veeam, Zerto, Azure Site Recovery, AWS MGN, Commvault
- **Use case**: enterprise with on-prem DC that needs DR without a second DC
---
## DR testing
### Test types
| Type | Description | Frequency | Risk |
|-----|-------|-----------|--------|
| **Tabletop exercise** | Manual scenario walkthrough, no impact on production | Monthly | None |
| **Walkthrough** | Runbook verification, ensure everyone knows what to do | Quarterly | None |
| **Component test** | Test of a single component (e.g., restore one DB) | Monthly | Low |
| **Integrated test** | Test of the entire stack in isolated environment | Quarterly | Low |
| **Full failover test** | Production failover to DR site | Annually | High |
| **Chaos experiment** | Targeted fault injection into production | Continuous | Medium |
### Runbook structure
Each DR scenario should have a runbook:
```yaml
scenario: "Region A failure"
triggers:
  - "CloudWatch alarm: Region A health check 5× timeout"
  - "PagerDuty incident P0"
decision_tree: |
  1. Verify: is Region A really unavailable? (check from 3 different locations)
  2. Decide: is RTO at risk? If < 30 % RTO remaining → failover
  3. Failover: run playbook `dr-failover-region-b`
  4. Verification: smoke tests in Region B
  5. Communication: status page + stakeholders
rollback: |
  1. After Region A recovery → replicate changes from B back to A
  2. Repoint DNS to A
  3. Verify data consistency
  4. Shut down Region B (or keep as hot standby)
contacts:
  primary: "on-call@example.com"
  escalation: "infra-lead@example.com"
  management: "vp-engineering@example.com"
```
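The first two steps of the runbook's decision tree are mechanical enough to script. The following sketch assumes hypothetical helper names and treats the outage as confirmed only when at least three independent probes all fail. The 30 % threshold is the one from the runbook.

```python
def outage_confirmed(probe_failures, min_probes=3):
    """True if at least `min_probes` locations were checked and all of them failed."""
    return len(probe_failures) >= min_probes and all(probe_failures)


def should_fail_over(rto_minutes, elapsed_minutes, threshold=0.30):
    """True once less than `threshold` of the RTO budget remains."""
    remaining = max(rto_minutes - elapsed_minutes, 0)
    return remaining / rto_minutes < threshold


# Example: RTO of 60 minutes, outage seen from three locations.
probes = [True, True, True]  # True means the probe could not reach Region A
for elapsed in (30, 45):
    decision = outage_confirmed(probes) and should_fail_over(60, elapsed)
    print(f"{elapsed} min into the outage -> fail over: {decision}")
```

Keeping the decision rule in code next to the runbook makes game days repeatable, but the final call in a real incident typically stays with the on-call engineer.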
---
## Best practices
- **Test recovery, not backup** — a backup without tested recovery is not a backup
- **Automate DR** — Terraform / Ansible for DR environment spin-up, DNS failover
- **Document runbooks** — every scenario, contact, decision tree
- **Expect failure** — design for failure, don't expect everything to work
- **Don't underestimate WRT** — service recovery does not mean full operations (data warming, cache, connections)
- **Align RTO/RPO with business** — technical capabilities must match business requirements
- **Monitor SLI** — without data, SLO cannot be verified
- **DR is not just IT** — communication, PR, legal, compliance
---
## Related
- [CLOUD.en.md](CLOUD.en.md) — cloud DR strategy, AWS/Azure/GCP specific
- [DATACENTERS.en.md](DATACENTERS.en.md) — DC redundancy, Tier classification
- [MONITORING.en.md](MONITORING.en.md) — alerting, SLI/SLO/SLA
- [CICD.en.md](CICD.en.md) — deployment strategy, rollback
- [STORAGE.en.md](STORAGE.en.md) — backup storage, replication
## Sources
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revised: 2026-06-11*

DR.md Normal file

@@ -0,0 +1,336 @@
# 🔄 Disaster Recovery and Business Continuity
## Terminology
| Abbreviation | Meaning | Description |
|---------|--------|-------|
| **RTO** | Recovery Time Objective | Maximum time from outage to service recovery |
| **RPO** | Recovery Point Objective | Maximum acceptable data loss (time since last backup) |
| **MTD** | Maximum Tolerable Downtime | Total outage duration an organization can survive |
| **WRT** | Work Recovery Time | Time needed for full operations recovery after IT restoration |
| **MTBF** | Mean Time Between Failures | Average operating time between two consecutive failures |
| **MTTR** | Mean Time To Repair | Average time needed to repair a failed component and restore it to service |
| **SLA** | Service Level Agreement | Contractual availability commitment |
| **SLO** | Service Level Objective | Internal availability target |
| **SLI** | Service Level Indicator | Measured availability value |
### Relationship between RTO, RPO, MTD, WRT
```
Outage ──── RPO ────► Data restored ──── RTO ────► Service running ──── WRT ────► Full operations
             │                            │                              │
             ▼                            ▼                              ▼
         Lost data              Time without service           Time to full capacity

MTD = RTO + WRT (max. time the business tolerates)
```
---
## Uptime calculation
### Nines table
| Level | Uptime | Downtime / year | Downtime / month | Downtime / week |
|--------|--------|---------------|------------------|------------------|
| 90 % (one nine) | 0.9 | 36.5 days | 72 h | 16.8 h |
| 99 % (two nines) | 0.99 | 3.65 days | 7.2 h | 1.68 h |
| 99.5 % | 0.995 | 1.83 days | 3.6 h | 50.4 min |
| 99.9 % (three nines) | 0.999 | 8.76 h | 43.2 min | 10.1 min |
| 99.95 % | 0.9995 | 4.38 h | 21.6 min | 5.04 min |
| 99.99 % (four nines) | 0.9999 | 52.6 min | 4.32 min | 1.01 min |
| 99.995 % | 0.99995 | 26.3 min | 2.16 min | 30.2 s |
| 99.999 % (five nines) | 0.99999 | 5.26 min | 25.9 s | 6.05 s |
| 99.9999 % (six nines) | 0.999999 | 31.6 s | 2.59 s | 0.605 s |
### Calculation
```
Availability = (Total time - Downtime) / Total time × 100 %
Example:
Year = 365 × 24 × 60 = 525,600 minutes
Target: 99.9 % → allowed downtime = 525,600 × (1 - 0.999) = 525.6 minutes = 8.76 h
Combined availability (chain of dependencies):
A_web = 99.9 % (3 nines)
A_api = 99.99 % (4 nines)
A_db = 99.999 % (5 nines)
A_total = 0.999 × 0.9999 × 0.99999 = 0.99889 ≈ 99.89 % (less than 3 nines!)
Parallel availability (redundancy):
A_total = 1 - (1 - A_1) × (1 - A_2) × ... × (1 - A_n)
Example: 2 servers with 99% availability
A_total = 1 - (1-0.99) × (1-0.99) = 1 - 0.01 × 0.01 = 0.9999 (99.99 %)
```
### Calculator
```python
def uptime_percent_to_downtime(pct, period_days=365):
    """Convert an uptime percentage (e.g. 99.9) to allowed downtime in minutes for the period."""
    total_minutes = period_days * 24 * 60
    allowed_downtime = total_minutes * (1 - pct / 100)
    return allowed_downtime  # minutes


def downtime_to_uptime_percent(downtime_minutes, period_days=365):
    """Convert downtime in minutes to an uptime percentage."""
    total_minutes = period_days * 24 * 60
    return (1 - downtime_minutes / total_minutes) * 100


def combined_availability(availabilities):
    """Combined availability of series-connected components (fractions, e.g. 0.999)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def redundant_availability(availabilities):
    """Availability of redundant (parallel) components (fractions, e.g. 0.99)."""
    result = 1.0
    for a in availabilities:
        result *= (1 - a)  # probability that every component is down at once
    return 1 - result
```
### Calculation fallacies
- **Combined availability is not a sum**: adding another dependency always reduces total availability
- **Redundancy is not free**: adding a standby component requires failure detection + failover (MTTR does not improve automatically)
- **SLA is not a guarantee**: providers often calculate SLA as a monthly average, not per-incident
- **Measurement is key**: without SLI, SLO cannot be verified; "unmeasured availability does not exist"
- **Planned maintenance**: sometimes counted as uptime, sometimes not (depends on the SLA definition)
---
## DR scenarios
### Classification
| Category | Scenario | Typical RTO | Typical RPO | Frequency |
|-----------|--------|-------------|-------------|-----------|
| **Site** | Entire DC / region outage | hours | minutes | Low |
| **Infrastructure** | HW failure (storage, switch, server) | minutes–hours | seconds | Medium |
| **Software** | OS, application, DB failure | minutes | seconds | High |
| **Data** | Data corruption, deletion, cryptolocker | hours | backup point | Low–medium |
| **Human** | Wrong deployment, config change | minutes–hours | seconds | Medium |
| **Security** | Attack, breach, ransomware | days | before attack | Low |
| **Network** | Connectivity outage, DDoS | minutes–hours | N/A | Medium |
| **Cloud provider** | Regional outage (AWS, Azure, GCP) | hours | minutes | Very low |
### Scenario details
#### Site / Region failure
| Aspect | Description |
|--------|-------|
| **Cause** | Blackout, fire, flood, earthquake, cloud provider outage |
| **Prevention** | Multi-AZ architecture, multi-region deployment, active-active |
| **Mitigation** | Automatic DNS failover (Route53, Azure Traffic Manager), replica in DR region |
| **Testing** | Game day: shut down primary region, verify automatic failover |
#### Data corruption / human error
| Aspect | Description |
|--------|-------|
| **Cause** | Wrong SQL command (DELETE without WHERE), accidentally deleted bucket, bad migration |
| **Prevention** | RBAC, MFA for destructive operations, change management, SQL peer review |
| **Mitigation** | Point-in-time recovery (PITR), transaction log replay, immutable backups |
| **Testing** | Restore backup to isolated environment, verify data integrity |
#### Ransomware / cyber attack
| Aspect | Description |
|--------|-------|
| **Cause** | Attack on production systems, data encryption, exfiltration |
| **Prevention** | Immutable backups (object lock), air-gapped backups, network segmentation |
| **Mitigation** | Restore from clean backup, rebuild infrastructure from IaC |
| **Testing** | Regular restore in isolated network, verify backup is not infected |
---
## Prevention: strategies
### Backup strategies
| Approach | Description | Use case |
|---------|-------|----------|
| **3-2-1 rule** | 3 copies, 2 different media, 1 off-site | Universal |
| **3-2-1-0** | + 0 errors after restore (testing) | Enterprise, compliance |
| **GFS (Grandfather-Father-Son)** | Daily, weekly, monthly rotation | Long-term archive |
| **Incremental forever** | Full backup 1×, then only changes | Large data volumes |
| **Reverse incremental** | Full + incremental, full is always current | Fast recovery |
### Backup methods
| Method | RPO | RTO | Storage | Suitable for |
|--------|-----|-----|----------|------------|
| **Full backup** | Last full | Full restore time | Large | Small data, weekly |
| **Incremental** | Last incremental | Full + all incrementals | Small | Large data, daily |
| **Differential** | Last diff | Full + last diff | Medium | Compromise |
| **Snapshot** | Snapshot point-in-time | seconds | Copy-on-write | VM, storage array |
| **Continuous (CDC)** | < 1 s | Seconds | Log stream | DB (binlog, WAL) |
| **PITR** | Any point in time | Depends on volume | Full + WAL | RDS, PostgreSQL, SQL Server |
### Backup immutability
Key protection against ransomware:
| Technique | Description |
|----------|-------|
| **Object Lock (WORM)** | Backup cannot be deleted or overwritten for a defined retention period (S3 Object Lock, Azure Blob Immutable) |
| **Air gap** | Backup is physically separated from the production network (offline disk, tape, cloud without VPN) |
| **Isolated backup network** | Backup traffic goes through a dedicated network without access from the production VLAN |
| **Out-of-band access** | Backup management console is not accessible from the production network |
---
## DR architectures
### Multi-AZ (Single region)
```
Region ┌────────────────────────────────────┐
       │       AZ-1             AZ-2        │
       │   ┌──────────┐     ┌──────────┐    │
       │   │   App    │     │   App    │    │
       │   └─────┬────┘     └─────┬────┘    │
       │         │                │         │
       │   ┌─────▼────────────────▼─────┐   │
       │   │  Load Balancer (cross-AZ)  │   │
       │   └─────────────┬──────────────┘   │
       │                 │                  │
       │   ┌─────────────▼──────────────┐   │
       │   │  DB Primary (AZ-1)         │   │
       │   │  DB Standby (AZ-2)         │   │
       │   │  Synchronous replication   │   │
       │   └────────────────────────────┘   │
       └────────────────────────────────────┘
```
- RTO: minutes (automatic failover)
- RPO: 0 (sync replication)
- Protection: against AZ failure, not region failure
### Multi-Region
```
Region A (Primary)                   Region B (DR)
┌─────────────────────┐              ┌─────────────────────┐
│  ┌───────────────┐  │              │  ┌───────────────┐  │
│  │   App + DB    │  │              │  │   App + DB    │  │
│  │    Active     │──┼──Async───────┼─►│    Standby    │  │
│  └───────────────┘  │ replication  │  └───────────────┘  │
│         │           │              │         │           │
│  ┌──────▼───────┐   │              │  ┌──────▼───────┐   │
│  │  DNS / GSLB  │   │              │  │  DNS / GSLB  │   │
│  └──────┬───────┘   │              │  └──────┬───────┘   │
└─────────┼───────────┘              └─────────┼───────────┘
          │                                    │
          └──────────── Traffic Manager ───────┘
```
| Variant | RTO | RPO | Cost | Failover |
|----------|-----|-----|---------|----------|
| **Active-Passive** | minutes–hours | seconds | Medium | Manual / auto |
| **Active-Active** | seconds | < 1 s | High | Automatic (DNS) |
| **Pilot Light** | tens of minutes | minutes | Low | Manual scaling |
| **Warm Standby** | minutes | seconds | High | Auto (reduced copy) |
| **Backup & Restore** | hours | 24 h | Low | Manual |
### On-prem → Cloud DR (Hybrid)
```
On-prem DC                           Cloud (DR)
┌─────────────────────┐              ┌─────────────────────┐
│  ┌───────────────┐  │              │  ┌───────────────┐  │
│  │  Application  │  │              │  │   VM / App    │  │
│  │     + DB      │  │              │  │ + DB replica  │  │
│  └───────┬───────┘  │              │  └───────┬───────┘  │
│          │          │              │          │          │
│  ┌───────▼───────┐  │ site-to-site │  ┌───────▼───────┐  │
│  │ Backup proxy  │──┼────VPN───────┼─►│ Backup store  │  │
│  └───────────────┘  │              │  └───────────────┘  │
│                     │              │                     │
│  ┌───────────────┐  │              │  ┌───────────────┐  │
│  │  Tape / NAS   │  │              │  │ Veeam / Zerto │  │
│  └───────────────┘  │              │  └───────────────┘  │
└─────────────────────┘              └─────────────────────┘
```
- **RTO**: tens of minutes (depends on VM startup)
- **RPO**: minutes–hours (depends on replication tool)
- **Tools**: Veeam, Zerto, Azure Site Recovery, AWS MGN, Commvault
- **Use case**: enterprise with an on-prem DC that needs DR without a second DC
---
## DR testing
### Test types
| Type | Description | Frequency | Risk |
|-----|-------|-----------|--------|
| **Tabletop exercise** | Manual scenario walkthrough, no impact on production | Monthly | None |
| **Walkthrough** | Runbook verification, ensure everyone knows what to do | Quarterly | None |
| **Component test** | Test of a single component (e.g., restore one DB) | Monthly | Low |
| **Integrated test** | Test of the entire stack in an isolated environment | Quarterly | Low |
| **Full failover test** | Production failover to DR site | Annually | High |
| **Chaos experiment** | Targeted fault injection into production | Continuous | Medium |
### Runbook structure
Each DR scenario should have a runbook:
```yaml
scenario: "Region A failure"
triggers:
  - "CloudWatch alarm: Region A health check 5× timeout"
  - "PagerDuty incident P0"
decision_tree: |
  1. Verify: is Region A really unavailable? (check from 3 different locations)
  2. Decide: is RTO at risk? If < 30 % RTO remaining → failover
  3. Failover: run playbook `dr-failover-region-b`
  4. Verification: smoke tests in Region B
  5. Communication: status page + stakeholders
rollback: |
  1. After Region A recovery → replicate changes from B back to A
  2. Repoint DNS to A
  3. Verify data consistency
  4. Shut down Region B (or keep as hot standby)
contacts:
  primary: "on-call@example.com"
  escalation: "infra-lead@example.com"
  management: "vp-engineering@example.com"
```
---
## Best practices
- **Test recovery, not backup**: a backup without tested recovery is not a backup
- **Automate DR**: Terraform / Ansible for DR environment spin-up, DNS failover
- **Document runbooks**: every scenario, contact, decision tree
- **Expect failure**: design for failure, don't expect everything to work
- **Don't underestimate WRT**: service recovery does not mean full operations (data warming, cache, connections)
- **Align RTO/RPO with business**: technical capabilities must match business requirements
- **Monitor SLI**: without data, SLO cannot be verified
- **DR is not just IT**: communication, PR, legal, compliance
---
## Related
- [CLOUD.md](CLOUD.md): cloud DR strategy, AWS/Azure/GCP specific
- [DATACENTERS.md](DATACENTERS.md): DC redundancy, Tier classification
- [MONITORING.md](MONITORING.md): alerting, SLI/SLO/SLA
- [CICD.md](CICD.md): deployment strategy, rollback
- [STORAGE.md](STORAGE.md): backup storage, replication
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revised: 2026-06-11*


@@ -112,6 +112,12 @@ NVLink topologie (GPU direct) PCIe topologie (CPU mediated)
- **Denoising**: AI-accelerated denoising on GPU
- **Farm rendering**: Deadline, Qube! (job scheduler)
## GPU pricing
For detailed pricing comparisons (purchase price, cloud on-demand, inference cost in $/M tokens, $/GB HBM, price trends 2024→2026), see:
- [AI-INFRASTRUCTURE.en.md — GPU pricing and price/performance](AI-INFRASTRUCTURE.en.md#gpu-pricing-and-priceperformance)
## GPU server form factors
| Form factor | GPU count | Power | Cooling | Example |
@@ -144,6 +150,6 @@ Cyborg is an OpenStack service for managing accelerators (GPU, FPGA, DPU, NPU).
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-03*

GPU.md

@@ -112,6 +112,12 @@ NVLink topologie (GPU direct) PCIe topologie (CPU mediated)
- **Denoising**: AI-accelerated denoising on GPU
- **Farm rendering**: Deadline, Qube! (job scheduler)
## GPU pricing
For detailed pricing comparisons (purchase price, cloud on-demand, inference cost in $/M tokens, $/GB HBM, price trends 2024→2026), see:
- [AI-INFRASTRUCTURE.md — GPU pricing and price/performance](AI-INFRASTRUCTURE.md#ceny-gpu-a-poměr-cenavýkon)
## GPU server form factors
| Form factor | GPU count | Power | Cooling | Example |


@@ -4,9 +4,9 @@ This file has been split into separate areas:
| Area | File |
|--------|--------|
| 🔧 Server hardware — components and architecture | [SERVER-HW.md](SERVER-HW.md) |
| 🎮 GPU — architecture, models, virtualization | [GPU.md](GPU.md) |
| ⚙️ Server configuration — best practices by workload | [SERVER-CONFIG.md](SERVER-CONFIG.md) |
| 📦 Provisioning — boot, installation, server management | [PROVISIONING.md](PROVISIONING.md) |
| 🔧 Server hardware — components and architecture | [SERVER-HW.en.md](SERVER-HW.en.md) |
| 🎮 GPU — architecture, models, virtualization | [GPU.en.md](GPU.en.md) |
| ⚙️ Server configuration — best practices by workload | [SERVER-CONFIG.en.md](SERVER-CONFIG.en.md) |
| 📦 Provisioning — boot, installation, server management | [PROVISIONING.en.md](PROVISIONING.en.md) |
*Last revision: 2026-06-03*


@@ -24,7 +24,7 @@
- **VM — Virtual Machine** — full virtualization, own kernel
- **Container** — shared host kernel, lighter (Docker, LXC)
- **Paravirtualization** — guest OS knows it runs in a VM (better I/O performance)
- **NUMA** — Non-Uniform Memory Access, CPU/memory allocation optimization (see [SERVER-HW.md](SERVER-HW.md#numa))
- **NUMA** — Non-Uniform Memory Access, CPU/memory allocation optimization (see [SERVER-HW.en.md](SERVER-HW.en.md#numa))
- **Overcommit** — allocating more vCPU/RAM than physically available (ratio management)
- **Live Migration** — moving a running VM between hosts (vSphere vMotion, Hyper-V Live Migration)
- **HA (High Availability)** — VM restart on another host upon failure
@@ -86,20 +86,22 @@ According to Foundry/CIO.com survey (2025): **56%** of organizations plan to red
#### Target Platforms — Comparison
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization |
|-----------|-----------|-------------|-------------------|----------------------------------|
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) |
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30–60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) |
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) |
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) |
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO |
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP |
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000–60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) |
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000–200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) |
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) |
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) |
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) |
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator |
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization | **Sangfor aSV (HCI)** |
|-----------|-----------|-------------|-------------------|----------------------------------|----------------------|
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) | **KVM (aSV)** |
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30–60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) | **Per node (Enterprise Pro), all-inclusive** |
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) | **Yes** |
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) | **Built-in HA** |
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO | **aSAN (distributed SDS, locality-aware)** |
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP | **Built-in backup + CDP** |
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000–60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) | **~$15,000–25,000** |
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000–200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) | **~$50,000–80,000** |
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) | **Low (VMware import tool)** |
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) | **Excellent (KVM-based)** |
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) | **Good (VirtIO drivers)** |
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator | **vGPU support (standard)** |
| **Integrated security** | — | — | — | — | **Yes (NGFW, IPS, WAF, EDR — aSEC)** |
| **Min. cluster (3 copies)** | 3 (Ceph) | 3 | 2–3 | 3 | **3** |
#### Migration Tools
@@ -112,8 +114,47 @@ According to Foundry/CIO.com survey (2025): **56%** of organizations plan to red
| **virt-v2v** | VMware ESXi, Xen, Hyper-V | KVM (libvirt) | Open source CLI tool, disk + driver conversion (virtio), suitable for bulk migration |
| **Windows Admin Center VM Conversion Extension** | VMware ESXi | Hyper-V | Microsoft WAC extension, free, GUI-based, bulk migration |
| **Platform9 vJailbreak** | VMware ESXi | OpenStack / KVM | In-place migration (no swing gear), open source |
| **Sangfor VMware Import Tool** | VMware ESXi | Sangfor aSV (HCI) | VMware import tool, disk + driver conversion, can retain network config |
#### Cross-Hypervisor Migration Matrix
Comprehensive overview of all source→target pairs with methods, tools, limitations, and complexity.
| Source → Target | Method | Tools | Complexity | Limitations |
|-------------|--------|----------|-----------|---------|
| **VMware → Proxmox** | Disk conversion VMDK→QCOW2, driver reinstall | Proxmox Import Wizard, Veeam, StarWind, virt-v2v | Medium | VirtIO drivers required, UEFI not supported in Import Wizard (< 8.1), snapshots must be removed |
| **VMware → Hyper-V** | Disk conversion VMDK→VHDX, driver reinstall | StarWind, WAC Converter, SCVMM, Microsoft MTC | Medium | Integration Services required, network config differences (VMXNET3 → Hyper-V Synthetic) |
| **VMware → KVM/XCP-ng** | Disk conversion VMDK→raw/QCOW2, driver swap | virt-v2v, StarWind | Medium | VirtIO drivers, UEFI support (OVMF), host passthrough compatibility |
| **VMware → Nutanix AHV** | Automated migration via Move appliance | Nutanix Move, Veeam | Low | AHV is also KVM-based: minimal issues, retains IP/MAC, supports UEFI |
| **VMware → Sangfor aSV** | Import via VMware Import Tool, disk + driver conversion | Sangfor VMware Import Tool | Low | Built-in tool, retains network config, supports UEFI |
| **VMware → OpenStack** | In-place or swing | Platform9 vJailbreak, virt-v2v + Glance | High | Network redesign (Neutron), storage (Cinder), image format (Glance) required |
| **Hyper-V → VMware** | Disk conversion VHDX→VMDK, driver reinstall | StarWind, virt-v2v, VMware vCenter Converter (standalone) | Medium | VMware Tools required, network driver change (VMXNET3), UEFI/secure boot issues |
| **Hyper-V → Proxmox** | Disk conversion VHDX→QCOW2, driver swap | StarWind, virt-v2v, qemu-img | Medium–High | VirtIO drivers, integration services → guest agent, secure boot issues |
| **Hyper-V → KVM/XCP-ng** | Disk conversion VHDX→raw/QCOW2 | virt-v2v, qemu-img | Medium | VirtIO drivers, Linux generic drivers usually work |
| **Hyper-V → Nutanix AHV** | Automated migration | Nutanix Move | Low–Medium | Similar to VMware→Nutanix, UEFI support, retain IP |
| **Proxmox → VMware** | Export OVF/OVA, qemu-img convert | qemu-img (QCOW2→VMDK), ovftool, manual OVF export | High | VMware Tools required, storage format differences, no live migration, downtime required |
| **Proxmox → Hyper-V** | qemu-img convert, driver reinstall | qemu-img, manual VHDX conversion | High | Hyper-V Integration Services required, no automated tool, edge case |
| **Proxmox → KVM/XCP-ng** | Direct QCOW2 (same format), XML edit | libvirt, virsh dumpxml/define | Medium | libvirt XML/QEMU args differences (storage pool, network), validation required |
| **Proxmox → Nutanix AHV** | qemu-img + manual import | qemu-img, Nutanix Image Service CLI | High | No hot (online) migration tool, conversion + manual VM reconfiguration required |
| **XCP-ng → VMware** | Disk conversion VHD→VMDK | qemu-img, StarWind, virt-v2v | High | VMware Tools required, paravirtualization differences (Xen PV vs VMware) |
| **XCP-ng → Proxmox** | Disk conversion or direct VHD | qemu-img, manual import | Medium | Disk conversion, VHD format not native in Proxmox |
| **XCP-ng → Hyper-V** | Disk conversion VHD→VHDX (direct) | StarWind, qemu-img | Medium | VHD/VHDX compatible, Integration Services required |
| **Nutanix AHV → VMware** | Export + conversion | qemu-img, Nutanix Export, VMware vCenter Converter | High | VMware Tools, AHV is KVM → usually easier than Hyper-V→VMware |
| **Nutanix AHV → Proxmox** | qemu-img + manual import | qemu-img, Nutanix self-service restore | Medium | AFS disks → QCOW2, metadata must be reconstructed |
| **Nutanix AHV → Hyper-V** | qemu-img + manual | qemu-img, StarWind | High | Edge case, no hot (online) migration tool |
| **OpenStack → (any)** | Glance export + qemu-img | glance image-download, qemu-img, ovftool | Medium–High | Image format (raw/QCOW2), metadata (flavor, security groups) must be recreated |
| **Sangfor aSV → (any)** | qemu-img conversion + manual | qemu-img, manual OVF/OVA export | Medium–High | KVM-based → conversion to QCOW2/VMDK/VHDX via qemu-img, metadata must be recreated |
| **(any) → Sangfor aSV** | aSV API import + VMware Import Tool | Sangfor VMware Import Tool (for VMware), manual qemu-img import for others | Medium | KVM-based → standard formats supported, import tool for VMware only |
**Keys to a successful migration:**
- **Drivers** — each platform requires its own paravirtual drivers (VMware Tools, VirtIO, Hyper-V Integration Services, Xen Tools). Always swap after migration.
- **UEFI / Secure Boot** — not all combinations support UEFI (Proxmox Import Wizard < 8.1 does not). Test UEFI VMs before migration.
- **Snapshots** — snapshots must be removed (merged) before migration. Most tools only migrate flat disks.
- **Network** — MAC addresses, IP addresses, VLAN tagging — verify after migration. Some tools (Nutanix Move, VMware Converter) can retain MAC.
- **Storage format** — VMDK ↔ VHDX ↔ QCOW2 ↔ raw are inter-convertible via `qemu-img`, but metadata differs (snapshots, backing files).
- **Live migration** — no live migration exists between different hypervisors. Downtime is always required (minutes to hours depending on VM size).
- **Migration temperature** — the "colder" the VM (fewer changes), the easier the migration. Real-time database applications require a separate DB migration plan.
#### TCO Comparison — Example: 3 hosts (2× 20C CPU), 50 VMs
| Platform | Year 1 | 3 Years Total | Note |
|-----------|--------|---------------|----------|
@@ -123,6 +164,7 @@ According to Foundry/CIO.com survey (2025): **56%** of organizations plan to red
| **Nutanix AHV** (average) | ~$18,000 | ~$54,000 | Per node subscription, estimate |
| **Hyper-V** (Windows Server Datacenter) | $12,400 | $37,200 | One-time license per core, without SA |
| **Hyper-V** (Azure Stack HCI) | ~$14,400 | ~$43,200 | ~$10/core/month, 120 cores |
| **Sangfor HCI** (Enterprise Pro) | ~$5,000–8,000 | ~$15,000–25,000 | Per node, all-inclusive, 3 nodes |
**Real-world example from Spiceworks (2026)**: A user reports VMware Essentials+ increasing from $1,900/year to $14,000/year (VVF) — a 7.4× increase.
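The arithmetic behind several of these figures can be reproduced with a few lines of Python. This is a minimal sketch with hypothetical function names; it assumes a flat yearly subscription with no discounts or price increases, which is a simplification rather than any vendor's actual pricing model.

```python
def total_cores(hosts: int, sockets_per_host: int, cores_per_socket: int) -> int:
    """Physical cores in the cluster (the licensing unit for per-core pricing)."""
    return hosts * sockets_per_host * cores_per_socket


def subscription_total(cost_per_year: float, years: int = 3) -> float:
    """Multi-year total for a flat subscription (assumption: same price every year)."""
    return cost_per_year * years


def increase_factor(old_price: float, new_price: float) -> float:
    """How many times more expensive the new price is compared with the old one."""
    return new_price / old_price


print(total_cores(3, 2, 20))                      # 3 hosts, 2× 20C CPU each → 120
print(subscription_total(18_000))                 # Nutanix AHV estimate → 54000
print(round(increase_factor(1_900, 14_000), 1))   # Essentials+ → VVF → 7.4
```

One-time licenses do not fit `subscription_total` and need their own model.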
@@ -142,8 +184,9 @@ According to Foundry/CIO.com survey (2025): **56%** of organizations plan to red
3. Select target platform (1-2 candidates)
├─ Proxmox: lowest TCO, Linux-heavy shops
├─ Nutanix: enterprise HCI, low migration difficulty
├─ Hyper-V: Windows-centric, Azure hybrid
└─ OpenShift: Kubernetes-first, platform engineering
├─ Hyper-V: Windows-centric, Azure hybrid
├─ Sangfor: HCI all-in-one, security-first, VMware exit (SMB/mid-market)
└─ OpenShift: Kubernetes-first, platform engineering
4. Plan migration phases
├─ Wave 1: non-critical (dev/test, 1-2 months)
@@ -269,9 +312,71 @@ Hardware ──> QEMU (I/O emulation) + KVM (kernel module, virtualization)
- Load KVM modules: `kvm`, `kvm_intel`/`kvm_amd`, `vfio-pci`
- Optimize storage: raw/LVM (avoid qcow2 for performance workloads)
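The module checklist above can be verified against the kernel's own module list. This is a minimal Python sketch, assuming a Linux host; the function names and the `cpu_vendor` parameter are illustrative and not part of any tool named here. The kernel reports module names with underscores, so `vfio-pci` appears as `vfio_pci`, and modules built into the kernel are not listed in `/proc/modules` at all.

```python
def loaded_modules(proc_modules_text: str) -> set:
    """Parse /proc/modules content; the first field of each line is the module name."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def missing_kvm_modules(proc_modules_text: str, cpu_vendor: str) -> list:
    """Return the required modules that are not loaded.

    cpu_vendor is "intel" or "amd" (hypothetical parameter for this sketch).
    """
    if cpu_vendor not in ("intel", "amd"):
        raise ValueError("cpu_vendor must be 'intel' or 'amd'")
    required = ["kvm", f"kvm_{cpu_vendor}", "vfio_pci"]
    loaded = loaded_modules(proc_modules_text)
    return [name for name in required if name not in loaded]


# Sample text in the /proc/modules layout (values are illustrative).
SAMPLE = """\
kvm_intel 487424 0 - Live 0x0000000000000000
kvm 1404928 1 kvm_intel, Live 0x0000000000000000
irqbypass 12288 1 kvm, Live 0x0000000000000000
"""

print(missing_kvm_modules(SAMPLE, "intel"))  # → ['vfio_pci']
```

On a real host, pass the content of `/proc/modules` instead of `SAMPLE`.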
## Sangfor aSV (HCI)
[Chinese vendor](https://www.sangfor.com) — KVM-based hypervisor, part of the Sangfor HCI stack (aSV + aSAN + aNet + aSEC). Distributed through partners in EMEA.
### Stack architecture
| Component | Role |
|-----------|------|
| **aSV** | Hypervisor (KVM-based) |
| **aSAN** | Distributed SDS (locality-aware, data tiering, dedup, compression) |
| **aNet** | Network virtualization (distributed switches and routers, WYDIWYG "what you draw is what you get" visual editor) |
| **aSEC** | Security (NGFW, IPS, WAF, EDR, east-west segmentation) |
| **Sangfor Cloud Platform** | Management orchestrator, unified dashboard |
### Key features
| Feature | Detail |
|-----------|--------|
| **Hypervisor** | KVM (aSV) — custom fork with HCI extensions |
| **License** | Enterprise Pro — per node, all-inclusive (compute + storage + network + security) |
| **Min. cluster** | 3 nodes (3 data copies) |
| **Live Migration** | Yes |
| **HA** | Built-in HA |
| **Storage** | aSAN — locality-aware, data tiering (SSD + HDD), dedup, compression, erasure coding |
| **Backup** | Built-in backup + CDP — no 3rd party needed |
| **Security** | Integrated NGFW, IPS, WAF, EDR — no external appliances |
| **VDI** | aDesk — integrated VDI solution |
| **Kubernetes** | SKE (Sangfor Kubernetes Engine) |
| **Migration** | Sangfor VMware Import Tool (from vCenter), qemu-img for others |
| **vGPU** | Standard support (no extra license) |
### Comparison with VMware
| Feature | Sangfor | VMware |
|---------|---------|--------|
| **License** | Per node, all-inclusive | Multi-tier (vSphere + vSAN + NSX + Aria) |
| **vGPU** | Included (standard) | Enterprise Plus only |
| **Backup + CDP** | Built-in | 3rd party or extra license |
| **Security (NGFW, IPS, WAF)** | Built-in (aSEC) | NSX + 3rd party |
| **Network management** | WYDIWYG visual editor | NSX Manager (more complex) |
| **Min. cluster (3 copies)** | 3 nodes | 5 nodes (vSAN) |
| **Data locality** | Yes | No |
| **SSD life prediction** | Yes | No |
### Use case
- **VMware exit** — VMware replacement for SMB and mid-market
- **Greenfield HCI** — new DCs, branch offices, remote sites
- **VDI** — aDesk integrated with HCI
- **Security-first** — organizations requiring integrated security
- **Asia-Pacific / EMEA** — strongest in Asia, expanding to Europe
### Risks and limitations
| Risk | Detail |
|--------|--------|
| **Geopolitical** | Chinese vendor — possible regulatory restrictions (GDPR, EU, NATO, government) |
| **Ecosystem** | Smaller community than VMware/Proxmox, less documentation and ISV certifications |
| **Support** | Primary support from Asia, local partner critical |
| **Vendor lock-in** | Closed ecosystem (aSV + aSAN + aNet + aSEC), harder to mix with 3rd party |
| **References in CZ/EU** | Very limited — pilot required before production |
## Storage in Hypervisors
See also: [STORAGE.md](STORAGE.md) — detailed overview of storage protocols and configurations.
See also: [STORAGE.en.md](STORAGE.en.md) — detailed overview of storage protocols and configurations.
| Type | Description | Protocols |
|-----|-------|-----------|
@@ -443,7 +548,7 @@ For telco, large private clouds, MANO/NFVI environments.
## Resources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
### Recommended Reading


@@ -86,20 +86,22 @@ Dle Foundry/CIO.com průzkumu (2025): **56 %** organizací plánuje snížit vyu
#### Target Platforms — Comparison
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization |
|-----------|-----------|-------------|-------------------|----------------------------------|
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) |
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30–60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) |
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) |
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) |
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO |
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP |
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000–60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) |
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000–200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) |
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) |
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) |
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) |
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator |
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization | **Sangfor aSV (HCI)** |
|-----------|-----------|-------------|-------------------|----------------------------------|----------------------|
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) | **KVM (aSV)** |
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30–60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) | **Per node (Enterprise Pro), all-inclusive** |
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) | **Yes** |
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) | **Built-in HA** |
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO | **aSAN (distributed SDS, locality-aware)** |
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP | **Built-in backup + CDP (Continuous Data Protection)** |
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000–60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) | **~$15,000–25,000** |
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000–200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) | **~$50,000–80,000** |
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) | **Low (VMware import tools)** |
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) | **Excellent (KVM-based)** |
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) | **Good (VirtIO drivers)** |
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator | **vGPU support (standard)** |
| **Integrated security** | — | — | — | — | **Yes (NGFW, IPS, WAF, EDR — aSEC)** |
| **Min. cluster (3 copies)** | 3 (Ceph) | 3 | 2–3 | 3 | **3** |
#### Migration Tools
@@ -112,6 +114,47 @@ Dle Foundry/CIO.com průzkumu (2025): **56 %** organizací plánuje snížit vyu
| **virt-v2v** | VMware ESXi, Xen, Hyper-V | KVM (libvirt) | Open source CLI tool, disk + driver conversion (virtio), suitable for bulk migration |
| **Windows Admin Center VM Conversion Extension** | VMware ESXi | Hyper-V | Microsoft WAC extension, free, GUI-based, bulk migration |
| **Platform9 vJailbreak** | VMware ESXi | OpenStack / KVM | In-place migration (no swing gear), open source |
| **Sangfor VMware Import Tool** | VMware ESXi | Sangfor aSV (HCI) | Tool for importing VMs from vCenter, disk + driver conversion, can retain network config |
#### Cross-Hypervisor Migration Matrix
Comprehensive overview of all source → target pairs with methods, tools, limitations, and complexity.
| Source → Target | Method | Tools | Complexity | Limitations |
|-------------|--------|----------|-----------|---------|
| **VMware → Proxmox** | Disk conversion VMDK→QCOW2, driver reinstall | Proxmox Import Wizard, Veeam, StarWind, virt-v2v | Medium | VirtIO drivers required, UEFI not supported in Import Wizard (< 8.1), snapshots must be removed |
| **VMware → Hyper-V** | Disk conversion VMDK→VHDX, driver reinstall | StarWind, WAC Converter, SCVMM, Microsoft MTC | Medium | Integration Services required, network config differences (VMXNET3 → Hyper-V Synthetic) |
| **VMware → KVM/XCP-ng** | Disk conversion VMDK→raw/QCOW2, driver swap | virt-v2v, StarWind | Medium | VirtIO drivers, UEFI support (OVMF), host passthrough must be compatible |
| **VMware → Nutanix AHV** | Automated migration via Move appliance | Nutanix Move, Veeam | Low | AHV is also KVM-based: minimal issues, retains IP/MAC, supports UEFI |
| **VMware → Sangfor aSV** | Import via VMware Import Tool, disk + driver conversion | Sangfor VMware Import Tool | Low | Built-in tool, retains network config, supports UEFI |
| **VMware → OpenStack** | In-place or swing | Platform9 vJailbreak, virt-v2v + Glance | High | Requires redesign of networking (Neutron), storage (Cinder), image format (Glance) |
| **Hyper-V → VMware** | Disk conversion VHDX→VMDK, driver reinstall | StarWind, virt-v2v, VMware vCenter Converter (standalone) | Medium | VMware Tools required, network driver change (VMXNET3), UEFI/secure boot issues |
| **Hyper-V → Proxmox** | Disk conversion VHDX→QCOW2, driver swap | StarWind, virt-v2v, qemu-img | Medium–High | VirtIO drivers, integration services → guest agent, secure boot issues |
| **Hyper-V → KVM/XCP-ng** | Disk conversion VHDX→raw/QCOW2 | virt-v2v, qemu-img | Medium | VirtIO drivers, Linux generic drivers usually work |
| **Hyper-V → Nutanix AHV** | Automated migration | Nutanix Move | Low–Medium | Similar to VMware→Nutanix, UEFI support, retain IP |
| **Proxmox → VMware** | Export OVF/OVA, qemu-img convert | qemu-img (QCOW2→VMDK), ovftool, manual OVF export | High | VMware Tools required, storage format differences, no live migration, downtime required |
| **Proxmox → Hyper-V** | qemu-img convert, driver reinstall | qemu-img, manual VHDX conversion | High | Hyper-V Integration Services required, no automated tool, edge case |
| **Proxmox → KVM/XCP-ng** | Direct QCOW2 (same format), XML edit | libvirt, virsh dumpxml/define | Medium | libvirt XML/QEMU args differences (storage pool, network), validation required |
| **Proxmox → Nutanix AHV** | qemu-img + manual import | qemu-img, Nutanix Image Service CLI | High | No hot (online) migration tool, conversion + manual VM reconfiguration required |
| **XCP-ng → VMware** | Disk conversion VHD→VMDK | qemu-img, StarWind, virt-v2v | High | VMware Tools required, paravirtualization differences (Xen PV vs VMware) |
| **XCP-ng → Proxmox** | Disk conversion or direct VHD | qemu-img, manual import | Medium | Disk conversion, VHD format not native in Proxmox |
| **XCP-ng → Hyper-V** | Disk conversion VHD→VHDX (direct) | StarWind, qemu-img | Medium | VHD/VHDX compatible, Integration Services required |
| **Nutanix AHV → VMware** | Export + conversion | qemu-img, Nutanix Export, VMware vCenter Converter | High | VMware Tools, AHV is KVM → usually easier than Hyper-V→VMware |
| **Nutanix AHV → Proxmox** | qemu-img + manual import | qemu-img, Nutanix self-service restore | Medium | AFS disks → QCOW2, metadata must be reconstructed |
| **Nutanix AHV → Hyper-V** | qemu-img + manual | qemu-img, StarWind | High | Edge case, no hot (online) migration tool |
| **OpenStack → (any)** | Glance export + qemu-img | glance image-download, qemu-img, ovftool | Medium–High | Image format (raw/QCOW2), metadata (flavor, security groups) must be recreated |
| **Sangfor aSV → (any)** | qemu-img conversion + manual | qemu-img, manual OVF/OVA export | Medium–High | KVM-based → conversion to QCOW2/VMDK/VHDX via qemu-img, metadata must be recreated |
| **(any) → Sangfor aSV** | aSV API import + VMware Import Tool | Sangfor VMware Import Tool (for VMware), manual qemu-img import for others | Medium | KVM-based → standard formats supported, import tool for VMware only |
**Keys to a successful migration:**
- **Drivers** — each platform requires its own paravirtual drivers (VMware Tools, VirtIO, Hyper-V Integration Services, Xen Tools). Always swap them after migration.
- **UEFI / Secure Boot** — not all combinations support UEFI (Proxmox Import Wizard < 8.1 does not). Test UEFI VMs before migrating them.
- **Snapshots** — snapshots must be removed (merged) before migration. Most tools only migrate flat disks.
- **Network** — MAC addresses, IP addresses, VLAN tagging — verify after migration. Some tools (Nutanix Move, VMware Converter) can retain the MAC.
- **Storage format** — VMDK ↔ VHDX ↔ QCOW2 ↔ raw are inter-convertible via `qemu-img`, but the metadata differs (snapshots, backing files).
- **Live migration** — no live migration exists between different hypervisors. Downtime is always required (minutes to hours depending on VM size).
- **Migration temperature** — the "colder" the VM (fewer changes), the easier the migration. Real-time database applications require a separate DB migration plan.
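The storage-format note above says the disk formats are inter-convertible via `qemu-img`. As a minimal sketch, the Python helper below only builds the `qemu-img convert` argument list and does not run it. The helper name, the file names, and the label-to-format mapping are illustrative assumptions. `-f` sets the source format, `-O` the output format, and `-p` shows progress; `qemu-img` refers to the legacy VHD format as `vpc`.

```python
import shlex

# qemu-img format names for the disk formats named in the migration matrix.
QEMU_IMG_FORMATS = {
    "VMDK": "vmdk",
    "VHDX": "vhdx",
    "VHD": "vpc",     # qemu-img's name for the legacy VHD format
    "QCOW2": "qcow2",
    "raw": "raw",
}


def qemu_img_convert_cmd(src: str, dst: str, src_format: str, dst_format: str) -> list:
    """Build (but do not execute) a qemu-img convert command line."""
    try:
        f_in = QEMU_IMG_FORMATS[src_format]
        f_out = QEMU_IMG_FORMATS[dst_format]
    except KeyError as exc:
        raise ValueError(f"unsupported disk format: {exc.args[0]}") from None
    return ["qemu-img", "convert", "-p", "-f", f_in, "-O", f_out, src, dst]


# Hypothetical file names for a VMware → Proxmox/KVM disk conversion.
cmd = qemu_img_convert_cmd("web01.vmdk", "web01.qcow2", "VMDK", "QCOW2")
print(shlex.join(cmd))  # qemu-img convert -p -f vmdk -O qcow2 web01.vmdk web01.qcow2
```

Returning an argument list rather than a shell string lets the command be reviewed or logged first and later passed to `subprocess.run` without shell quoting issues. Snapshots and backing files still have to be merged beforehand, as noted above.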
#### TCO Comparison — Example: 3 hosts (2× 20C CPU), 50 VMs
@@ -123,6 +166,7 @@ Dle Foundry/CIO.com průzkumu (2025): **56 %** organizací plánuje snížit vyu
| **Nutanix AHV** (average) | ~$18,000 | ~$54,000 | Per node subscription, estimate |
| **Hyper-V** (Windows Server Datacenter) | $12,400 | $37,200 | One-time license per core, without SA |
| **Hyper-V** (Azure Stack HCI) | ~$14,400 | ~$43,200 | ~$10/core/month, 120 cores |
| **Sangfor HCI** (Enterprise Pro) | ~$5,000–8,000 | ~$15,000–25,000 | Per node, all-inclusive, 3 nodes |
**Real-world example from Spiceworks (2026)**: A user reports VMware Essentials+ increasing from $1,900/year to $14,000/year (VVF) — a 7.4× increase.
@@ -142,8 +186,9 @@ Dle Foundry/CIO.com průzkumu (2025): **56 %** organizací plánuje snížit vyu
3. Select target platform (1-2 candidates)
├─ Proxmox: lowest TCO, Linux-heavy shops
├─ Nutanix: enterprise HCI, low migration difficulty
├─ Hyper-V: Windows-centric, Azure hybrid
└─ OpenShift: Kubernetes-first, platform engineering
├─ Hyper-V: Windows-centric, Azure hybrid
├─ Sangfor: HCI all-in-one, security-first, VMware exit (SMB/mid-market)
└─ OpenShift: Kubernetes-first, platform engineering
4. Plan migration phases
├─ Wave 1: non-critical (dev/test, 1-2 months)
@@ -269,6 +314,72 @@ Hardware ──> QEMU (emulace I/O) + KVM (kernel module, virtualization)
- Naložit KVM moduly: `kvm`, `kvm_intel`/`kvm_amd`, `vfio-pci`
- Optimalizovat storage: raw/LVM (vyhnout se qcow2 u výkonových workloadů)
## Sangfor aSV (HCI)
[Čínský vendor](https://www.sangfor.com) — KVM-based hypervisor, součást Sangfor HCI stacku (aSV + aSAN + aNet + aSEC). V ČR distribuován přes partnery.
### Architektura stacku
| Komponenta | Role |
|-----------|------|
| **aSV** | Hypervisor (KVM-based) |
| **aSAN** | Distributed SDS (locality-aware, data tiering, dedup, compression) |
| **aNet** | Network virtualization (distribuované switche a routery, WYDIWYG vizuální editor) |
| **aSEC** | Bezpečnost (NGFW, IPS, WAF, EDR, east-west segmentation) |
| **Sangfor Cloud Platform** | Management orchestrator, unified dashboard |
### Klíčové vlastnosti
| Feature | Detail |
|---------|--------|
| **Hypervisor** | KVM (aSV), a custom fork with HCI extensions |
| **License** | Enterprise Pro: per node, everything included (compute + storage + network + security) |
| **Min. cluster** | 3 nodes (3 data copies) |
| **Live Migration** | Yes |
| **HA** | Built-in HA |
| **Storage** | aSAN: locality-aware (data locality), data tiering (SSD + HDD), dedup, compression, erasure coding |
| **Backup** | Built-in backup + CDP (Continuous Data Protection), no third-party product needed |
| **Security** | Integrated NGFW, IPS, WAF, EDR, no external appliances |
| **VDI** | aDesk, an integrated VDI solution |
| **Kubernetes** | SKE (Sangfor Kubernetes Engine) |
| **Migration** | Sangfor VMware Import Tool (from vCenter), qemu-img for others |
| **vGPU** | Standard support (no extra license) |
### Comparison with VMware
| Feature | Sangfor | VMware |
|---------|---------|--------|
| **License** | Per node, everything included | Multi-tier (vSphere + vSAN + NSX + Aria) |
| **vGPU** | Included (standard) | Only in Enterprise Plus |
| **Backup + CDP** | Built-in | Third party or extra license |
| **Security (NGFW, IPS, WAF)** | Built-in (aSEC) | NSX + third party (Palo Alto, Check Point) |
| **Network management** | WYDIWYG visual editor | NSX Manager (more complex) |
| **Min. cluster (3 copies)** | 3 nodes | 5 nodes (vSAN) |
| **Data locality** | Yes | No |
| **SSD life prediction** | Yes | No |
### Use case
- **VMware exit**: VMware replacement for SMB and mid-market
- **Greenfield HCI**: new data centers, branch offices, remote sites
- **VDI**: aDesk integrated with HCI
- **Security-first**: organizations that require integrated security (NGFW, IPS, WAF)
- **Asia-Pacific / EMEA**: strongest in Asia, expanding into Europe
### Risks and limitations
| Risk | Detail |
|------|--------|
| **Geopolitical** | Chinese vendor; possible regulatory restrictions (GDPR, EU, NATO, government) |
| **Ecosystem** | Smaller community than VMware/Proxmox, less documentation and fewer ISV certifications |
| **Support** | Support primarily from Asia; a local partner is critical |
| **Vendor lock-in** | Closed ecosystem (aSV + aSAN + aNet + aSEC), harder to mix with third-party products |
| **References in the Czech Republic** | Very limited; a pilot is needed before production |
### Migration to/from Sangfor
See the migration matrix earlier in this section. A dedicated import tool exists for VMware → Sangfor; for other hypervisors, use standard qemu-img.
## Storage in hypervisors
See also: [STORAGE.md](STORAGE.md) for a detailed overview of storage protocols and configurations.


@@ -4,9 +4,9 @@ This file has been split into separate areas:
| Area | File |
|--------|--------|
| 🖥️ Hypervisors and virtualization | [HYPERVISORS.md](HYPERVISORS.md) |
| 🏭 Data centers | [DATACENTERS.md](DATACENTERS.md) |
| 💾 Storage | [STORAGE.md](STORAGE.md) |
| 🔧 Hardware and servers | [HARDWARE.md](HARDWARE.md) |
| 🖥️ Hypervisors and virtualization | [HYPERVISORS.en.md](HYPERVISORS.en.md) |
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) |
| 💾 Storage | [STORAGE.en.md](STORAGE.en.md) |
| 🔧 Hardware and servers | [HARDWARE.en.md](HARDWARE.en.md) |
*Last revision: 2026-06-03*

299
KUBERNETES.en.md Normal file

@@ -0,0 +1,299 @@
# ☸ Kubernetes — architecture, platforms, Cluster API
## Overview
Kubernetes (K8s) is an open-source container orchestrator and the de facto standard for deploying, scaling, and managing containerized applications. It is built on declarative configuration and control loops (reconciliation).
## Kubernetes deployment methods
| Method | Description | Control plane | Best for |
|--------|-------------|--------------|----------|
| **kubeadm** | Official K8s cluster bootstrap tool | Self-managed (stacked/external etcd) | On-prem, lab, learning |
| **K3s** | Lightweight K8s (Rancher), single binary, embedded etcd/SQLite | Self-managed | Edge, IoT, low-resource, HA with embedded etcd |
| **RKE2** | Rancher Kubernetes Engine 2, CIS-hardened, FIPS-ready | Self-managed | Enterprise on-prem, air-gapped, regulatory |
| **OpenShift** | Red Hat enterprise K8s + operator lifecycle + SDN + routing | Self-managed (RHCOS) | Enterprise, multicluster, platform engineering |
| **Vanilla K8s (CAPI)** | Cluster API — declarative provisioning and lifecycle management | Self-managed (CAPI managed) | Fleet management, GitOps, multi-provider |
| **EKS** (AWS) | Managed K8s | AWS managed | AWS cloud-native, least ops |
| **AKS** (Azure) | Managed K8s | Azure managed | Azure cloud-native |
| **GKE** (GCP) | Managed K8s, Standard and Autopilot modes | GCP managed | GCP cloud-native |
| **SKE** (Sangfor) | Managed K8s on Sangfor HCI | Vendor managed | Sangfor HCI ecosystem |
---
## Cluster API (CAPI)
### What is Cluster API
Cluster API is a Kubernetes sub-project (SIG Cluster-Lifecycle) that brings declarative APIs for provisioning, upgrading, and operating Kubernetes clusters. Instead of Terraform scripts or manual `kubeadm`, you define clusters as Kubernetes Custom Resources — `Cluster`, `Machine`, `MachineDeployment`, etc.
Core principle: **A Kubernetes cluster that manages Kubernetes clusters.**
### Architecture
```
┌─────────────────────────────────────────┐
│           Management Cluster            │
│                                         │
│  ┌──────────────────────────────────┐   │
│  │         CAPI Controllers         │   │
│  │ ┌───────┐ ┌─────────┐ ┌────────┐ │   │
│  │ │ Infra │ │Bootstrap│ │Control │ │   │
│  │ │ Prov  │ │  Prov   │ │Plane Pr│ │   │
│  │ └───────┘ └─────────┘ └────────┘ │   │
│  └──────────────────────────────────┘   │
│                                         │
│ CR: Cluster, Machine, MachineDeployment │
│ ...                                     │
└────────────────┬────────────────────────┘
                 │ CAPI controller
                 │ creates / manages
        ┌────────┴────────┐
        ▼                 ▼
┌───────────────┐ ┌───────────────┐
│ Workload      │ │ Workload      │
│ Cluster (dev) │ │ Cluster (prod)│
│ ┌───┐ ┌───┐   │ │ ┌───┐ ┌───┐   │
│ │ CP│ │ W │   │ │ │ CP│ │ W │   │
│ └───┘ └───┘   │ │ └───┘ └───┘   │
└───────────────┘ └───────────────┘
```
- **Management cluster**: a Kubernetes cluster running the CAPI controllers. It can be a small, dedicated admin cluster.
- **Workload (managed) cluster**: a Kubernetes cluster managed by CAPI; each one is represented by custom resources (a `Cluster` object and its related objects) in the management cluster.
- **Machine**: abstraction of a compute unit (VM, bare metal) that becomes a K8s node.
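
The control-loop idea behind these components can be illustrated with a minimal sketch. It is not CAPI code: `MachineDeployment` here is a hypothetical, heavily simplified stand-in for the real resource, and the naming and scale-down order are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MachineDeployment:
    """Hypothetical, simplified stand-in for the MachineDeployment resource."""
    name: str
    replicas: int                                       # desired state (spec)
    machines: list[str] = field(default_factory=list)   # observed state


def reconcile(md: MachineDeployment) -> list[str]:
    """Run one reconciliation pass and return the actions taken."""
    actions = []
    # Scale up: create machines until the observed count matches the desired one.
    while len(md.machines) < md.replicas:
        name = f"{md.name}-{len(md.machines)}"
        md.machines.append(name)
        actions.append(f"create {name}")
    # Scale down: remove the newest machine first (an arbitrary choice in this sketch).
    while len(md.machines) > md.replicas:
        actions.append(f"delete {md.machines.pop()}")
    return actions


md = MachineDeployment(name="md-0", replicas=2)
print(reconcile(md))  # first pass creates two machines
print(reconcile(md))  # second pass does nothing: state has converged
md.replicas = 1
print(reconcile(md))  # desired state changed, so one machine is deleted
```

The point of the pattern is that the controller never receives "scale up" commands; it only compares desired and observed state and acts on the difference, so running it again is harmless.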
### Key CRDs (Custom Resource Definitions)
| CRD | API group | Purpose |
|-----|-----------|---------|
| **Cluster** | `cluster.x-k8s.io` | Cluster representation (infra ref, control plane ref, networking) |
| **Machine** | `cluster.x-k8s.io` | Individual node (VM/BM instance) |
| **MachineDeployment** | `cluster.x-k8s.io` | Declarative scaling and rolling update of workers |
| **MachineSet** | `cluster.x-k8s.io` | Replica set for Machines (lower-level) |
| **MachineHealthCheck** | `cluster.x-k8s.io` | Auto-remediation (replace unhealthy nodes) |
| **ClusterClass** | `cluster.x-k8s.io` | Cluster template for reuse |
| **KubeadmControlPlane** | `controlplane.cluster.x-k8s.io` | Kubeadm-managed control plane (stacked/external etcd) |
| **KubeadmConfig / KubeadmConfigTemplate** | `bootstrap.cluster.x-k8s.io` | Bootstrap configuration (kubeadm init/join) |
### Provider model
CAPI uses a three-layer provider model:
#### 1. Infrastructure Provider
Creates and manages infrastructure (VM, networks, LB, storage).
| Provider | Platform | Status |
|----------|----------|--------|
| **AWS (CAPA)** | AWS EC2, VPC, ELB, EKS | Stable, SIG-sponsored |
| **Azure (CAPZ)** | Azure VM, VNet, LB, AKS | Stable, SIG-sponsored |
| **GCP (CAPG)** | GCP Compute, VPC, GKE | Beta |
| **vSphere (CAPV)** | VMware vSphere | Stable |
| **OpenStack (CAPO)** | OpenStack compute/network | Stable |
| **Metal3** | Bare metal (Ironic) | Stable |
| **Docker (CAPD)** | Docker containers (development) | Tilt/Dev only |
| **Akamai (Linode)** | Linode | Community |
| **Azure Stack HCI** | Azure Stack HCI | Community |
| **cloudscale** | cloudscale.ch | Community |
| **Exoscale** | Exoscale | Community |
| **IBM Cloud** | IBM Cloud | Community |
| **Equinix Metal** | Equinix (ex Packet) | Community |
| **Hetzner** | Hetzner Cloud | Community |
| **OpenNebula** | OpenNebula | Community |
#### 2. Bootstrap Provider
Handles K8s initialization on a node (kubeadm init/join, TLS certs, tokens).
| Provider | Description |
|----------|-------------|
| **Kubeadm** (built-in) | Standard kubeadm init/join, supports stacked/external etcd |
| **EKS** | Bootstraps worker nodes that join an EKS managed control plane (AWS) |
| **K3s** | Lightweight K8s bootstrap (edge, IoT) |
| **RKE2** | Rancher K8s bootstrap, CIS-hardened |
| **Talos** | API-driven bootstrap (Sidero Labs), immutable OS |
| **k0smotron** | K0s-based bootstrap + hosted control plane |
| **MicroK8s** | Canonical MicroK8s bootstrap |
| **Canonical Kubernetes** | Canonical K8s (snap-based) |
#### 3. Control Plane Provider
Manages control plane nodes.
| Provider | Description |
|----------|-------------|
| **KubeadmControlPlane** (built-in) | Kubeadm-managed CP, stacked/external etcd |
| **EKS** | AWS EKS managed control plane |
| **Kamaji** | Hosted control plane (CP runs as deployment in management cluster) |
| **K3s** | K3s control plane (edge-optimized) |
| **RKE2** | RKE2 control plane |
| **Talos** | Talos control plane, API-based management |
| **k0smotron** | Hosted control plane (k0s-based) |
| **Nested** | Control plane components run as pods inside an existing cluster (Cluster API Provider Nested) |
### ClusterClass and Managed Topologies
ClusterClass (part of the `v1beta1` API since CAPI v1.0; depending on the release it may have to be enabled with the `ClusterTopology` feature gate) allows defining a **cluster template**:
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: standard-aws-cluster
spec:
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: aws-cp-tmpl
    machineInfrastructure:
      ref:
        kind: AWSMachineTemplate
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        name: aws-cp-machine-tmpl
  workers:
    machineDeployments:
      - class: default-worker
        template:
          bootstrap:
            ref:
              apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
              kind: KubeadmConfigTemplate
              name: aws-worker-bootstrap-tmpl
          infrastructure:
            ref:
              apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
              kind: AWSMachineTemplate
              name: aws-worker-machine-tmpl
  variables:
    - name: instanceType
      required: true
      schema:
        openAPIV3Schema:
          type: string
          enum: ["t3.large", "m5.large", "m5.xlarge"]
```
Then create a cluster with variable overrides:
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: dev-team-alpha
  namespace: clusters
spec:
  topology:
    class: standard-aws-cluster
    version: v1.30.2
    controlPlane:
      replicas: 1
    workers:
      machineDeployments:
        - class: default-worker
          name: md-0
          replicas: 2
    variables:
      - name: instanceType
        value: "m5.xlarge"
```
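
To make the role of the `variables` schema concrete, here is a small sketch of the kind of check that rejects an override outside the declared `enum`. It handles only `type: string`, `enum`, and `required`, and the function names are hypothetical; the real validation is done by CAPI's admission webhooks against the full OpenAPI v3 schema.

```python
def validate_variable(definition: dict, value) -> list[str]:
    """Check one value against a tiny subset of an OpenAPI v3 schema."""
    schema = definition["schema"]["openAPIV3Schema"]
    name = definition["name"]
    errors = []
    if schema.get("type") == "string" and not isinstance(value, str):
        errors.append(f"{name}: expected a string")
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{name}: {value!r} is not one of {schema['enum']}")
    return errors


def validate_topology(definitions: list[dict], overrides: dict) -> list[str]:
    """Validate the overrides of a Cluster against the ClusterClass definitions."""
    errors = []
    for definition in definitions:
        name = definition["name"]
        if name not in overrides:
            if definition.get("required"):
                errors.append(f"{name}: required variable is missing")
            continue
        errors.extend(validate_variable(definition, overrides[name]))
    return errors


# Mirrors the `instanceType` variable from the ClusterClass above.
DEFINITIONS = [{
    "name": "instanceType",
    "required": True,
    "schema": {"openAPIV3Schema": {
        "type": "string",
        "enum": ["t3.large", "m5.large", "m5.xlarge"],
    }},
}]

print(validate_topology(DEFINITIONS, {"instanceType": "m5.xlarge"}))  # no errors
print(validate_topology(DEFINITIONS, {"instanceType": "c5.large"}))   # enum violation
```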
### Cluster lifecycle with CAPI
| Phase | Action | CAPI mechanism |
|-------|--------|----------------|
| **Create** | `kubectl apply -f cluster.yaml` | Controller creates infra (VM, network), runs kubeadm init/join bootstrap |
| **Scale** | Update `replicas` in MachineDeployment | Controller creates/removes Machine → VM → node join/drain |
| **Upgrade** | Change `version` in KubeadmControlPlane / MachineDeployment | Rolling update: new CP node → upgrade → old drain & delete. Workers: MachineDeployment rolling update |
| **Health check** | MachineHealthCheck | If node unhealthy > timeout, controller creates replacement Machine |
| **Delete** | `kubectl delete cluster` | Controller drains, deletes VMs, cleans up infrastructure |
| **Template update** | Change AWSMachineTemplate / KubeadmConfigTemplate | New Machines use the new template; existing Machines only affected via rolling update |
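
The rolling update in the Upgrade row can be sketched as a surge-then-remove loop. This is a simplified simulation, not the controller's algorithm: machines are reduced to their version strings, and `max_surge` is a parameter named after the rollout-strategy field; health checks and draining are left out.

```python
def rolling_upgrade(versions: list[str], target: str, max_surge: int = 1):
    """Simulate a surge-style rolling upgrade.

    Returns the final fleet and a log of fleet sizes after each step.
    """
    fleet = list(versions)
    desired = len(fleet)
    log = []
    while any(v != target for v in fleet):
        outdated = sum(v != target for v in fleet)
        step = min(max_surge, outdated)
        # 1. Create replacement machines on the new version first.
        fleet.extend([target] * step)
        log.append(f"surge to {len(fleet)}")
        # 2. Only then drain and delete the same number of outdated machines.
        for _ in range(step):
            fleet.remove(next(v for v in fleet if v != target))
        log.append(f"back to {len(fleet)}")
        assert len(fleet) == desired  # capacity never drops below the desired count
    return fleet, log


fleet, log = rolling_upgrade(["v1.29.6"] * 3, "v1.30.2")
print(fleet)
print(log)
```

Creating the replacement before deleting the old machine trades temporary extra capacity for never running below the desired replica count.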
### Auto-remediation (MachineHealthCheck)
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: prod-mhc
  namespace: clusters
spec:
  clusterName: prod-us-east
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: prod-us-east-workers
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 5m
    - type: Ready
      status: Unknown
      timeout: 5m
  maxUnhealthy: "40%"
  nodeStartupTimeout: 10m
```
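
The `maxUnhealthy` field acts as a circuit breaker: remediation is skipped when too many machines are unhealthy at once, since that usually points to a wider outage rather than individual node failures. A rough sketch of the arithmetic follows; rounding the percentage down is an assumption of this example, and the function names are hypothetical.

```python
def allowed_unhealthy(max_unhealthy, total: int) -> int:
    """Translate maxUnhealthy (an int or a percentage string) into a machine count."""
    if isinstance(max_unhealthy, str) and max_unhealthy.endswith("%"):
        percent = int(max_unhealthy[:-1])
        return total * percent // 100  # assumption: round down
    return int(max_unhealthy)


def should_remediate(unhealthy: int, total: int, max_unhealthy="40%") -> bool:
    """Remediate only if something is unhealthy and the limit is not exceeded."""
    return 0 < unhealthy <= allowed_unhealthy(max_unhealthy, total)


for unhealthy in (1, 4, 5):
    print(unhealthy, "of 10 unhealthy ->", should_remediate(unhealthy, 10))
```

With ten selected machines and `maxUnhealthy: "40%"`, up to four unhealthy machines are replaced; at five the check stops remediating until the count drops again.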
### CAPI + GitOps
CAPI integrates naturally with GitOps:
- **ArgoCD** — Cluster and MachineDeployment manifests in Git repo, ArgoCD applies them to the management cluster
- **Flux** — `Kustomization` + `OCIRepository` for CAPI objects
- **Crossplane** — can be combined: Crossplane provisions cloud resources (VPC, subnets), CAPI manages K8s clusters on top
Pattern: a dedicated "fleet management" cluster running CAPI + ArgoCD. All workload clusters are defined as YAML in Git.
### CAPI for on-prem
| Provider | Use case | Note |
|----------|----------|------|
| **Metal3** (Ironic) | Bare metal provisioning (PXE, IPMI, Redfish) | Automatically provisions BM servers as K8s nodes |
| **CAPV (vSphere)** | VMware VMs as K8s nodes | Most common enterprise on-prem |
| **CAPO (OpenStack)** | OpenStack VMs as K8s nodes | OpenStack-native |
| **Nutanix (CAPNX)** | Nutanix AHV/Prism | Community provider |
### CAPI for edge
| Provider | Use case | Note |
|----------|----------|------|
| **K3s bootstrap + control plane** | Lightweight K8s on edge devices | Single binary, SQLite/embedded etcd |
| **RKE2 bootstrap + control plane** | Enterprise edge, air-gapped | CIS-hardened, FIPS |
| **Talos** | Immutable OS, API-driven | Minimal footprint, no SSH |
| **k0smotron** | Hosted control plane for edge clusters | CP runs in management cluster, worker on edge |
### CAPI vs alternatives
| Tool | Approach | CAPI advantage | CAPI disadvantage |
|------|----------|----------------|-------------------|
| **Terraform/Pulumi** | Imperative/declarative IaC | CAPI is K8s-native — same tool for apps and clusters; GitOps ready | Terraform has broader non-K8s resource support |
| **kubeadm** | Manual or scripted | CAPI automates full lifecycle including upgrades and remediation | Higher complexity, requires management cluster |
| **Rancher** | Web UI + API for K8s cluster management | CAPI is open-source, vendor-neutral | Rancher has GUI, monitoring, app catalog |
| **OpenShift Hive/ACM** | Red Hat Advanced Cluster Management | CAPI is standard (SIG) — wider provider ecosystem | ACM has governance, policy, compliance |
### Limitations and maturity
- **Management cluster is a SPOF**: it needs its own HA and backup (etcd snapshots, certificates)
- **CAPI is not a cluster autoscaler**: it handles cluster lifecycle, not automatic node scaling driven by pending pods (run Cluster Autoscaler with its Cluster API provider for that)
- **Provider maturity varies**: per the provider table above, AWS, Azure, vSphere, and OpenStack are stable, GCP is beta, and some community providers are alpha
- **etcd backup is not built-in**: it must be handled externally (etcd snapshots; Velero for Kubernetes API objects and volumes)
- **CAPI does not handle applications**: only the K8s cluster lifecycle (monitoring, logging, and ingress are user-managed)
- **Learning curve**: requires understanding the management cluster, the provider model, and the CRDs
- **API maturity (CAPI v1.13+, 2026)**: the examples here use `cluster.x-k8s.io/v1beta1`, which is a beta API version rather than GA (recent releases also serve `v1beta2`); check the release notes of your CAPI version for the status of ClusterClass and of EKS/AKS/GKE managed control plane support
### Recommended production CAPI stack
| Component | Recommendation |
|-----------|---------------|
| **Management cluster** | K3s (small footprint) or kubeadm (3 nodes HA) |
| **Infra provider** | CAPA (AWS) / CAPV (vSphere) / CAPO (OpenStack) — based on platform |
| **Bootstrap/CP provider** | Kubeadm or RKE2 |
| **GitOps** | ArgoCD or Flux |
| **Backup** | Velero + restic/Ceph |
| **Cluster autoscaler** | Cluster Autoscaler (via CAPI integration) |
| **Network** | Cilium (installed into each workload cluster, e.g. via ClusterResourceSet or GitOps) |
| **Secrets** | External Secrets Operator / Sealed Secrets |
| **Monitoring** | Prometheus + Grafana (kube-prometheus-stack) |
| **Ingress** | ingress-nginx / Kong / Traefik |
## Sources
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-18*

299
KUBERNETES.md Normal file

@@ -0,0 +1,299 @@
# ☸ Kubernetes: architecture, platforms, Cluster API
## Overview
Kubernetes (K8s) is an open-source container orchestrator and the de facto standard for deploying, scaling, and managing containerized applications. It is built on declarative configuration and control loops (reconciliation).
## Kubernetes deployment methods
| Method | Description | Control plane | Best for |
|--------|-------------|---------------|----------|
| **kubeadm** | Official K8s cluster bootstrap tool | Self-managed (stacked/external etcd) | On-prem, lab, learning |
| **K3s** | Lightweight K8s (Rancher), single binary, embedded etcd/SQLite | Self-managed | Edge, IoT, low-resource, HA with embedded etcd |
| **RKE2** | Rancher Kubernetes Engine 2, CIS-hardened, FIPS-ready | Self-managed | Enterprise on-prem, air-gapped, regulatory |
| **OpenShift** | Red Hat enterprise K8s + operator lifecycle + SDN + routing | Self-managed (RHCOS) | Enterprise, multicluster, platform engineering |
| **Vanilla K8s (CAPI)** | Cluster API: declarative provisioning and lifecycle management | Self-managed (CAPI managed) | Fleet management, GitOps, multi-provider |
| **EKS** (AWS) | Managed K8s | AWS managed | AWS cloud-native, least ops |
| **AKS** (Azure) | Managed K8s | Azure managed | Azure cloud-native |
| **GKE** (GCP) | Managed K8s, Standard and Autopilot modes | GCP managed | GCP cloud-native |
| **SKE** (Sangfor) | Managed K8s on Sangfor HCI | Vendor managed | Sangfor HCI ecosystem |
---
## Cluster API (CAPI)
### What is Cluster API
Cluster API is a Kubernetes sub-project (SIG Cluster-Lifecycle) that brings declarative APIs for provisioning, upgrading, and operating Kubernetes clusters. Instead of Terraform scripts or manual `kubeadm`, you define a cluster as Kubernetes Custom Resources: `Cluster`, `Machine`, `MachineDeployment`, etc.
Core principle: **A Kubernetes cluster that manages Kubernetes clusters.**
### Architecture
```
┌─────────────────────────────────────────┐
│           Management Cluster            │
│                                         │
│  ┌──────────────────────────────────┐   │
│  │         CAPI Controllers         │   │
│  │ ┌───────┐ ┌─────────┐ ┌────────┐ │   │
│  │ │ Infra │ │Bootstrap│ │Control │ │   │
│  │ │ Prov  │ │  Prov   │ │Plane Pr│ │   │
│  │ └───────┘ └─────────┘ └────────┘ │   │
│  └──────────────────────────────────┘   │
│                                         │
│ CR: Cluster, Machine, MachineDeployment │
│ ...                                     │
└────────────────┬────────────────────────┘
                 │ CAPI controller
                 │ creates / manages
        ┌────────┴────────┐
        ▼                 ▼
┌───────────────┐ ┌───────────────┐
│ Workload      │ │ Workload      │
│ Cluster (dev) │ │ Cluster (prod)│
│ ┌───┐ ┌───┐   │ │ ┌───┐ ┌───┐   │
│ │ CP│ │ W │   │ │ │ CP│ │ W │   │
│ └───┘ └───┘   │ │ └───┘ └───┘   │
└───────────────┘ └───────────────┘
```
- **Management cluster**: a Kubernetes cluster running the CAPI controllers. It can be a dedicated "admin" cluster (often very small).
- **Workload (managed) cluster**: a Kubernetes cluster managed by CAPI; each one is represented by custom resources (a `Cluster` object and its related objects) in the management cluster.
- **Machine**: abstraction of a compute unit (VM, bare metal) that becomes a K8s node.
### Key CRDs (Custom Resource Definitions)
| CRD | API group | Purpose |
|-----|-----------|---------|
| **Cluster** | `cluster.x-k8s.io` | Cluster representation (infra ref, control plane ref, networking) |
| **Machine** | `cluster.x-k8s.io` | Individual node (VM/BM instance) |
| **MachineDeployment** | `cluster.x-k8s.io` | Declarative scaling and rolling update of workers |
| **MachineSet** | `cluster.x-k8s.io` | Replica set for Machines (lower-level) |
| **MachineHealthCheck** | `cluster.x-k8s.io` | Auto-remediation (automatic replacement of unhealthy nodes) |
| **ClusterClass** | `cluster.x-k8s.io` | Template for creating clusters |
| **KubeadmControlPlane** | `controlplane.cluster.x-k8s.io` | Kubeadm-managed control plane (stacked/external etcd) |
| **KubeadmConfig / KubeadmConfigTemplate** | `bootstrap.cluster.x-k8s.io` | Bootstrap configuration (kubeadm init/join) |
### Provider model
CAPI uses a three-layer provider model:
#### 1. Infrastructure Provider
Creates and manages infrastructure (VM, networks, LB, storage).
| Provider | Platform | Status |
|----------|-----------|--------|
| **AWS (CAPA)** | AWS EC2, VPC, ELB, EKS | Stable, SIG-sponsored |
| **Azure (CAPZ)** | Azure VM, VNet, LB, AKS | Stable, SIG-sponsored |
| **GCP (CAPG)** | GCP Compute, VPC, GKE | Beta |
| **vSphere (CAPV)** | VMware vSphere | Stable |
| **OpenStack (CAPO)** | OpenStack compute/network | Stable |
| **Metal3** | Bare metal (Ironic) | Stable |
| **Docker (CAPD)** | Docker containers (development) | Tilt/Dev only |
| **Akamai (Linode)** | Linode | Community |
| **Azure Stack HCI** | Azure Stack HCI | Community |
| **cloudscale** | cloudscale.ch | Community |
| **Exoscale** | Exoscale | Community |
| **IBM Cloud** | IBM Cloud | Community |
| **Equinix Metal** | Equinix (ex Packet) | Community |
| **Hetzner** | Hetzner Cloud | Community |
| **OpenNebula** | OpenNebula | Community |
#### 2. Bootstrap Provider
Handles K8s initialization on a node (kubeadm init/join, TLS certs, tokens).
| Provider | Description |
|----------|-------------|
| **Kubeadm** (built-in) | Standard kubeadm init/join, supports stacked/external etcd |
| **EKS** | Bootstraps worker nodes that join an EKS managed control plane (AWS) |
| **K3s** | Lightweight K8s bootstrap (edge, IoT) |
| **RKE2** | Rancher K8s bootstrap, CIS-hardened |
| **Talos** | API-driven bootstrap (Sidero Labs), immutable OS |
| **k0smotron** | K0s-based bootstrap + hosted control plane |
| **MicroK8s** | Canonical MicroK8s bootstrap |
| **Canonical Kubernetes** | Canonical K8s (snap-based) |
#### 3. Control Plane Provider
Manages control plane nodes.
| Provider | Description |
|----------|-------------|
| **KubeadmControlPlane** (built-in) | Kubeadm-managed CP, stacked/external etcd |
| **EKS** | AWS EKS managed control plane |
| **Kamaji** | Hosted control plane (CP runs as a deployment in the management cluster) |
| **K3s** | K3s control plane (edge-optimized) |
| **RKE2** | RKE2 control plane |
| **Talos** | Talos control plane, API-based management |
| **k0smotron** | Hosted control plane (k0s-based) |
| **Nested** | Control plane components run as pods inside an existing cluster (Cluster API Provider Nested) |
### ClusterClass and Managed Topologies
ClusterClass (part of the `v1beta1` API since CAPI v1.0; depending on the release it may have to be enabled with the `ClusterTopology` feature gate) allows defining a **cluster template**:
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: standard-aws-cluster
spec:
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: aws-cp-tmpl
    machineInfrastructure:
      ref:
        kind: AWSMachineTemplate
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        name: aws-cp-machine-tmpl
  workers:
    machineDeployments:
      - class: default-worker
        template:
          bootstrap:
            ref:
              apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
              kind: KubeadmConfigTemplate
              name: aws-worker-bootstrap-tmpl
          infrastructure:
            ref:
              apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
              kind: AWSMachineTemplate
              name: aws-worker-machine-tmpl
  variables:
    - name: instanceType
      required: true
      schema:
        openAPIV3Schema:
          type: string
          enum: ["t3.large", "m5.large", "m5.xlarge"]
```
Then create a cluster with variable overrides:
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: dev-team-alpha
  namespace: clusters
spec:
  topology:
    class: standard-aws-cluster
    version: v1.30.2
    controlPlane:
      replicas: 1
    workers:
      machineDeployments:
        - class: default-worker
          name: md-0
          replicas: 2
    variables:
      - name: instanceType
        value: "m5.xlarge"
```
### Cluster lifecycle with CAPI
| Phase | Action | CAPI mechanism |
|-------|--------|----------------|
| **Create** | `kubectl apply -f cluster.yaml` | Controller creates infra (VM, network), runs kubeadm init/join bootstrap |
| **Scale** | Update `replicas` in MachineDeployment | Controller creates/removes Machine → VM → node join/drain |
| **Upgrade** | Change `version` in KubeadmControlPlane / MachineDeployment | Rolling update: new CP node → upgrade → old node drained and deleted. Workers: MachineDeployment rolling update |
| **Health check** | MachineHealthCheck | If a node is unhealthy for longer than the timeout, the controller creates a replacement Machine |
| **Delete** | `kubectl delete cluster` | Controller drains, deletes VMs, cleans up infrastructure |
| **Template update** | Change AWSMachineTemplate / KubeadmConfigTemplate | New Machines use the new template; existing Machines are only affected via a rolling update |
### Auto-remediation (MachineHealthCheck)
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: prod-mhc
  namespace: clusters
spec:
  clusterName: prod-us-east
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: prod-us-east-workers
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 5m
    - type: Ready
      status: Unknown
      timeout: 5m
  maxUnhealthy: "40%"
  nodeStartupTimeout: 10m
```
### CAPI + GitOps
CAPI integrates naturally with GitOps:
- **ArgoCD**: Cluster and MachineDeployment manifests live in a Git repository, and ArgoCD applies them to the management cluster
- **Flux**: `Kustomization` + `OCIRepository` for CAPI objects
- **Crossplane**: can be combined, with Crossplane provisioning cloud resources (VPC, subnets) and CAPI managing the K8s clusters on top
Pattern: a dedicated "fleet management" cluster running CAPI + ArgoCD. All workload clusters are defined as YAML in Git.
### CAPI for on-prem
| Provider | Use case | Note |
|----------|----------|------|
| **Metal3** (Ironic) | Bare metal provisioning (PXE, IPMI, Redfish) | Automatically provisions BM servers as K8s nodes |
| **CAPV (vSphere)** | VMware VMs as K8s nodes | Most common enterprise on-prem |
| **CAPO (OpenStack)** | OpenStack VMs as K8s nodes | OpenStack-native |
| **Nutanix (CAPNX)** | Nutanix AHV/Prism | Community provider |
### CAPI for edge
| Provider | Use case | Note |
|----------|----------|------|
| **K3s bootstrap + control plane** | Lightweight K8s on edge devices | Single binary, SQLite/embedded etcd |
| **RKE2 bootstrap + control plane** | Enterprise edge, air-gapped | CIS-hardened, FIPS |
| **Talos** | Immutable OS, API-driven | Minimal footprint, no SSH |
| **k0smotron** | Hosted control plane for edge clusters | CP runs in the management cluster, workers on the edge |
### CAPI vs alternatives
| Tool | Approach | CAPI advantage | CAPI disadvantage |
|------|----------|----------------|-------------------|
| **Terraform/Pulumi** | Imperative/declarative IaC | CAPI is K8s-native: same tool for apps and clusters; GitOps ready | Terraform has broader non-K8s resource support |
| **kubeadm** | Manual or scripted | CAPI automates the full lifecycle including upgrades and remediation | Higher complexity, requires a management cluster |
| **Rancher** | Web UI + API for K8s cluster management | CAPI is open-source, vendor-neutral | Rancher has a GUI, monitoring, app catalog |
| **OpenShift Hive/ACM** | Red Hat Advanced Cluster Management | CAPI is a standard (SIG) with a wider provider ecosystem | ACM has governance, policy, compliance |
### Limitations and maturity
- **Management cluster is a SPOF**: it needs its own HA and backup (etcd snapshots, certificates)
- **CAPI is not a cluster autoscaler**: it handles cluster lifecycle, not automatic node scaling driven by pending pods (run Cluster Autoscaler with its Cluster API provider for that)
- **Provider maturity varies**: per the provider table above, AWS, Azure, vSphere, and OpenStack are stable, GCP is beta, and some community providers are alpha
- **etcd backup is not built-in**: it must be handled externally (etcd snapshots; Velero for Kubernetes API objects and volumes)
- **CAPI does not handle applications**: only the K8s cluster lifecycle (monitoring, logging, and ingress are user-managed)
- **Learning curve**: requires a management cluster and an understanding of the provider model and the CRDs
- **API maturity (CAPI v1.13+, 2026)**: the examples here use `cluster.x-k8s.io/v1beta1`, which is a beta API version rather than GA (recent releases also serve `v1beta2`); check the release notes of your CAPI version for the status of ClusterClass and of EKS/AKS/GKE managed control plane support
### Recommended production CAPI stack
| Component | Recommendation |
|-----------|----------------|
| **Management cluster** | K3s (small footprint) or kubeadm (3 nodes HA) |
| **Infra provider** | CAPA (AWS) / CAPV (vSphere) / CAPO (OpenStack), based on platform |
| **Bootstrap/CP provider** | Kubeadm or RKE2 |
| **GitOps** | ArgoCD or Flux |
| **Backup** | Velero + restic/Ceph |
| **Cluster autoscaler** | Cluster Autoscaler (via CAPI integration) |
| **Network** | Cilium (installed into each workload cluster, e.g. via ClusterResourceSet or GitOps) |
| **Secrets** | External Secrets Operator / Sealed Secrets |
| **Monitoring** | Prometheus + Grafana (kube-prometheus-stack) |
| **Ingress** | ingress-nginx / Kong / Traefik |
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-18*

275
MESSAGING.en.md Normal file

@@ -0,0 +1,275 @@
# 📨 Messaging and streaming platforms
## Platform overview
| Platform | Type | Language | Protocol | Persistence | Use case |
|-----------|-----|-------|----------|-------------|----------|
| **Apache Kafka** | Distributed event store | Java/Scala | Binary (TCP) | Disk (log) | Event streaming, data pipeline, log aggregation |
| **RabbitMQ** | Message broker | Erlang | AMQP 0-9-1, MQTT, STOMP | Disk / RAM | Application messaging, task queue, RPC |
| **Apache Pulsar** | Distributed messaging + streaming | Java | Binary (TCP) + REST | Disk (segmented log) | Streaming + queue in one, multi-tenant |
| **NATS** | Lightweight messaging | Go | NATS protocol (TCP) | Memory / JetStream (disk) | Microservices, IoT, edge, low-latency |
| **AWS SQS** | Managed queue | — | HTTPS | Managed | Decoupling services, serverless |
| **AWS SNS** | Managed pub/sub | — | HTTPS, SQS, Lambda, email | Managed | Push notifications, fanout |
| **Azure Service Bus** | Managed messaging | — | AMQP, HTTPS | Managed | Enterprise messaging, sessions, transactions |
| **Google Pub/Sub** | Managed streaming | — | gRPC, REST | Managed | Event-driven, data pipeline |
| **Red Hat AMQ 7** (Artemis) | Message broker | Java | AMQP, MQTT, STOMP, OpenWire | Disk | Enterprise, JMS, high-availability |
| **Oracle Service Bus (OSB)** | Enterprise ESB | Java | HTTP/S, JMS, SOAP, REST, MQ, FTP, AQ | Managed (WebLogic) | Enterprise integration, SOA, protocol mediation, routing |
---
## Platform details
### Apache Kafka
**Architecture:**
```
Producer ──► Topic ──► Partition ──► Consumer Group
              ├── Partition 0 (leader) ──► Broker 1
              ├── Partition 1 (leader) ──► Broker 2
              └── Partition 2 (leader) ──► Broker 3
              (follower replicas of each partition live on the other brokers)
```
| Concept | Description |
|---------|-------|
| **Topic** | Logical message category |
| **Partition** | Append-only log, ordered sequence of messages |
| **Broker** | Server in Kafka cluster |
| **Producer** | Publishes messages to topic |
| **Consumer** | Reads messages from partition (within consumer group) |
| **Consumer Group** | Group of consumers sharing topic reading |
| **Offset** | Position in partition (tracked by consumer) |
| **KRaft** | Raft-based controller quorum that replaces ZooKeeper (production-ready since Kafka 3.3; ZooKeeper mode removed in Kafka 4.0) |
**Replication and HA:**
| Parameter | Value |
|----------|---------|
| Replication factor | 2-3 (typically 3 for production) |
| ISR (In-Sync Replicas) | Number of replicas keeping up with leader |
| Min ISR | Minimum ISR for acknowledging writes (acks=all) |
| acks=0 | Fire-and-forget (fastest, possible data loss) |
| acks=1 | Write acknowledged by leader (compromise) |
| acks=all | Write acknowledged by all ISR (safest) |
| Leader failover | Automatic election of new leader from ISR |
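
How `acks` and the minimum ISR interact can be shown with a short sketch. It models only the broker's accept/reject decision for a write and is not Kafka client code; the function names are hypothetical.

```python
def write_accepted(acks: str, isr_size: int, min_insync_replicas: int) -> bool:
    """Would the partition leader accept a produce request?

    isr_size counts the in-sync replicas including the leader itself.
    """
    if isr_size < 1:
        return False  # no leader available for the partition
    if acks == "all":
        # The write is rejected when too few replicas are in sync.
        return isr_size >= min_insync_replicas
    # acks=0 and acks=1 depend on the leader only.
    return True


def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Brokers that can fail while acks=all writes still succeed."""
    return replication_factor - min_insync_replicas


print(write_accepted("all", isr_size=2, min_insync_replicas=2))  # still writable
print(write_accepted("all", isr_size=1, min_insync_replicas=2))  # rejected
print(tolerated_broker_failures(3, 2))
```

This is why replication factor 3 with a minimum ISR of 2 is a common production pairing: one broker can be down without blocking `acks=all` producers, and every acknowledged write exists on at least two brokers.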
**Important configuration:**
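```properties
# Production (broker defaults; the replication factor itself is chosen per topic at creation)
default.replication.factor=3
min.insync.replicas=2
# Retention: 168 hours = 7 days; -1 = no size limit; segment size = 1 GiB
# (comments must be on their own line: a trailing "# ..." would become part of the value)
log.retention.hours=168
log.retention.bytes=-1
log.segment.bytes=1073741824
# Performance: adjust the partition count per need (scale-out)
num.partitions=3
# Compression codec: snappy, gzip, lz4, zstd
compression.type=snappy
```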
**Partitioning strategies:**
| Strategy | Key | Advantage | Disadvantage |
|----------|------|--------|----------|
| Round-robin | null | Even distribution | Per-key ordering lost |
| Key-based | user_id, order_id | Same key → same partition | Uneven distribution (hot keys) |
| Custom partitioner | Custom logic | Per use-case optimization | More complex maintenance |
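
The first two strategies can be sketched in a few lines. Two deliberate simplifications, both assumptions of this example: `zlib.crc32` stands in for the murmur2 hash used by the Java client, and plain round-robin stands in for the sticky batching that newer clients apply to keyless messages.

```python
import zlib
from collections import Counter
from itertools import count


def make_partitioner(num_partitions: int):
    """Return a function that maps a message key to a partition number."""
    round_robin = count()

    def partition(key):
        if key is None:
            # No key: spread messages evenly, giving up per-key ordering.
            return next(round_robin) % num_partitions
        # Same key -> same hash -> same partition, so per-key order is kept.
        return zlib.crc32(key.encode("utf-8")) % num_partitions

    return partition


partition = make_partitioner(3)
print([partition(None) for _ in range(6)])             # even spread
print(partition("user-42") == partition("user-42"))    # stable mapping

# A hot key sends all of its traffic to a single partition.
keys = ["user-1"] * 8 + ["user-2", "user-3"]
print(Counter(partition(k) for k in keys))
```

Note that the key-to-partition mapping depends on `num_partitions`, so adding partitions later changes where existing keys land.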
### RabbitMQ
**Architecture:**
```
Producer ──► Exchange ──► Binding ──► Queue ──► Consumer
┌───────────┼───────────┐
▼ ▼ ▼
Direct Topic Fanout
Exchange Exchange Exchange
```
| Concept | Description |
|---------|-------|
| **Exchange** | Receives messages from producer, routes to queue |
| **Binding** | Exchange → queue link with routing key |
| **Queue** | FIFO message queue (consumed by consumer) |
| **Virtual Host (vhost)** | Tenant isolation within a single cluster |
| **Publisher Confirm** | Broker acknowledges message receipt |
| **Consumer Ack** | Consumer acknowledges message processing |
**Exchange types:**
| Type | Routing | Use case |
|-----|---------|----------|
| **Direct** | routing_key = binding_key | Task queue, point-to-point |
| **Topic** | routing_key match binding pattern (wildcard `*`, `#`) | Pub/sub with filtering |
| **Fanout** | All bound queues | Broadcast, event notification |
| **Headers** | AMQP headers match | Complex routing (not routing key dependent) |
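
The wildcard rules of the Topic exchange can be illustrated with a small matcher. This is a sketch of the matching semantics only (`*` matches exactly one dot-separated word, `#` matches zero or more words), not RabbitMQ's implementation. The function name `topic_matches` is hypothetical.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if routing_key matches an AMQP topic binding pattern."""

    def match(p, k):
        if not p:
            return not k  # pattern exhausted: match only if the key is too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words of the key
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))
```

For example, `orders.*.created` matches `orders.eu.created` but not `orders.created`, while `orders.#` matches both as well as plain `orders`.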
**Queue types:**
```properties
# Classic Queue (single replica; mirrored classic queues were removed in RabbitMQ 4.0)
x-queue-type: classic
# Quorum Queue (recommended for production)
x-queue-type: quorum
x-quorum-initial-group-size: 3
x-dead-letter-exchange: dlx
# Stream Queue (for large backlogs)
x-queue-type: stream
x-max-length-bytes: 1073741824
```
**HA and clustering:**
| Mode | Description | Use case |
|-------|-------|----------|
| **Quorum Queues** | Raft-based replication (3–5 nodes), auto failover | Production, HA messaging |
| **Federation** | Async message forwarding between independent RabbitMQ clusters | Multi-region, DR |
| **Shovel** | Point-to-point message forwarding (Federation at queue level) | Migration, specific routing |
| **Warm Standby (DR)** | Secondary cluster, started on failover | Cold DR |
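
The node counts recommended for quorum queues follow from majority voting in Raft. A minimal sketch of the arithmetic, with hypothetical helper names `quorum_size` and `tolerated_failures`:

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a replica group."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Members that can fail while a majority remains available."""
    return (members - 1) // 2


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

A 4-member group tolerates the same single failure as a 3-member group, which is why odd sizes such as 3 or 5 are the usual choice.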
### Apache Pulsar
**Unique architecture (compute/storage separation):**
```
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Producer │ │ Consumer │ │ Consumer │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
┌──────▼───────────────────▼───────────────────▼──────┐
│ Broker (stateless) │
│ Subscription: Exclusive / Shared / Failover │
└──────────────────────┬──────────────────────────────┘
┌──────────────────────▼──────────────────────────────┐
│ BookKeeper (stateful storage) │
│ ├── Bookie 1 ├── Bookie 2 ├── Bookie 3 ├── ... │
│ └── Ledger (append-only, segmented log) │
└─────────────────────────────────────────────────────┘
```
| Concept | Description |
|---------|-------|
| **Topic** | Logical category (partitioned or non-partitioned) |
| **Subscription** | Delivery mode (Exclusive, Shared, Failover, Key_Shared) |
| **Ledger** | Storage unit in BookKeeper (append-only) |
| **Bookie** | Storage node (BookKeeper) |
| **Managed Ledger** | Segmented log with cache and retention |
**Advantages over Kafka:**
- Compute/storage separation — independent scaling
- Geo-replication built-in (native)
- Multi-tenant (namespaces, isolation)
- TTL, retry, dead letter topic (built-in)
- At-least-once and effectively-once delivery semantics
### NATS
| Feature | Description |
|---------|-------|
| **Core NATS** | Pub/sub, request-reply, < 1 ms latency |
| **JetStream** | Persistence, exactly-once, key-value store, object store |
| **Leaf nodes** | Hierarchical cluster connection |
| **Super-cluster** | Multi-region clustering (global) |
**Use case:** IoT, edge computing, microservices communication, low-latency messaging.
### Oracle Service Bus (OSB)
Part of Oracle SOA Suite, runs on WebLogic Server. Enterprise service bus for integration in Oracle-heavy environments.
| Concept | Description |
|---------|-------|
| **Proxy Service** | Inbound endpoint (HTTP, JMS, MQ, SOAP, REST) |
| **Business Service** | Target backend service |
| **Pipeline** | Message processing — routing, transformation, validation |
| **Split-Join** | Parallel/sequential orchestration of multiple services |
| **Reporting** | Message tracking, SLA monitoring |
**Key features:**
- **Protocol mediation** — translation between SOAP/REST/JMS/MQ/FTP
- **Message transformation** — XSLT, XQuery, MFL (non-XML)
- **Throttling, SLA, alerting** — built-in
- **Oracle AQ (Advanced Queuing)** — integration with Oracle DB queues
- **XPath, XQuery, XSLT 2.0/3.0** — native support
- **Error handling** — fault policies, error queues, retry
**Use case:** Enterprise SOA, Oracle DB → Kafka bridging, legacy mainframe wrapping, B2B integration.
**Alternatives:** IBM Integration Bus (IIB), MuleSoft Anypoint, WSO2 EI, Apache Camel / ServiceMix.
---
## Platform comparison
### Performance and scaling
| Platform | Max throughput | Latency (P99) | Messages/s (1 broker) | Scaling |
|-----------|--------------|---------------|-------------------------|-----------|
| **Kafka** | > 1 GB/s | 2–10 ms | ~1,000,000 | Partitions (horizontal) |
| **Pulsar** | > 1 GB/s | 5–15 ms | ~1,000,000 | Brokers + Bookies |
| **RabbitMQ** | ~100 MB/s | < 1 ms (RAM) | ~100,000 | Clustering (node) |
| **NATS** | > 10 GB/s | < 0.5 ms | ~10,000,000 | Clustering + Leaf nodes |
| **OSB** | < 1 GB/s | 10–100 ms | ~10,000 | Vertical (WebLogic cluster) |
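
Throughput and message rate are linked by the average message size, so the two columns can be cross-checked. The sketch below assumes decimal units (1 GB = 10^9 bytes) and treats the table's approximate figures as exact. The function name `implied_message_size` is hypothetical.

```python
def implied_message_size(throughput_bytes_per_s: int, messages_per_s: int) -> float:
    """Average message size in bytes implied by a throughput and a message rate."""
    return throughput_bytes_per_s / messages_per_s


print(implied_message_size(1_000_000_000, 1_000_000))    # Kafka row
print(implied_message_size(100_000_000, 100_000))        # RabbitMQ row
print(implied_message_size(10_000_000_000, 10_000_000))  # NATS row
```

The Kafka, RabbitMQ, and NATS rows all work out to messages of roughly 1 KB. Results with much larger or smaller payloads can differ substantially.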
### Delivery guarantees
| Platform | At most once | At least once | Exactly once | Ordering |
|-----------|-------------|---------------|-------------|----------|
| **Kafka** | Yes | Yes (acks=all + min.insync.replicas) | Yes (idempotent + transactional) | Per partition |
| **Pulsar** | Yes | Yes | Yes (dedup + transactional) | Per partition |
| **RabbitMQ** | Yes | Yes (Publisher Confirm + Consumer Ack) | Limited | Per queue |
| **NATS** | Yes | Yes (JetStream) | Limited | Per subject |
| **OSB** | Yes | Yes (XA transactions, exactly-once delivery) | Yes (XA + WS-AT) | Per pipeline |
### When to use what
| Use case | Recommended platform | Reasoning |
|----------|---------------------|------------|
| **Event sourcing / audit log** | Kafka, Pulsar | Append-only log, high throughput, replay |
| **CDC (Change Data Capture)** | Kafka (Kafka Connect + Debezium) | Connector ecosystem |
| **Task queue (job processing)** | RabbitMQ, SQS | Dead letter, retry, priority, scheduling |
| **API messaging / microservices** | NATS, RabbitMQ | Low latency, simplicity |
| **Data pipeline (ETL)** | Kafka (ksqlDB, Kafka Streams) | Stream processing in platform |
| **IoT / Edge** | NATS, MQTT (RabbitMQ) | Lightweight, leaf nodes |
| **Enterprise SOA / EAI** | OSB, IBM IIB, MuleSoft | Protocol mediation, XA, B2B, legacy wrapping |
| **Multi-tenant cloud** | Pulsar | Native multi-tenant, geo-replication |
| **Serverless / event-driven** | SQS/SNS, Pub/Sub | Managed, auto-scaling |
---
## DR and high availability
See [DATACENTERS.en.md](DATACENTERS.en.md) — section "Impact of individual technologies on DC topology selection" for detailed DR mapping per platform.
### Best practices
- **Don't lose messages in queue** — prefer acknowledgement-based consumption (not auto-ack)
- **Dead letter queue** — every main queue has a DLQ for undeliverable messages
- **Monitor lag** — consumer lag is a key metric (Kafka: `records-lag-max` in the `kafka.consumer:type=consumer-fetch-manager-metrics` MBean)
- **Idempotent consumer** — same message may be delivered twice
- **Retry with backoff** — exponential backoff on processing failure
- **Schema registry** — avoid deserialization errors (Avro, Protobuf, JSON Schema)
- **Encryption** — TLS in transit, encryption at rest (Kafka has no built-in at-rest encryption; use disk/volume or client-side encryption)
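
Two of the practices above, the idempotent consumer and retry with backoff, can be sketched together. This is a minimal in-memory illustration: the names `backoff_delays` and `IdempotentConsumer` are hypothetical, and a real deployment would keep the set of processed message IDs in a durable store and usually add jitter to the delays.

```python
def backoff_delays(base_s=0.5, factor=2.0, max_s=30.0, attempts=5):
    """Exponential backoff schedule (no jitter): base, base*factor, ... capped at max_s."""
    return [min(max_s, base_s * factor ** i) for i in range(attempts)]


class IdempotentConsumer:
    """Skips messages whose ID has already been processed successfully."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: in-memory only, lost on restart

    def handle(self, message_id, payload) -> bool:
        """Return True if processed, False if skipped as a duplicate."""
        if message_id in self._seen:
            return False
        self._handler(payload)
        self._seen.add(message_id)  # record only after successful processing
        return True
```

Recording the ID only after the handler returns means a failed attempt can be retried, while a redelivery of an already processed message is ignored.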
---
## Related
- [DATACENTERS.en.md](DATACENTERS.en.md) — DR topology, per-platform mapping
- [CLOUD.en.md](CLOUD.en.md) — managed messaging (SQS, SNS, Service Bus, Pub/Sub)
## Sources
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-12*

275
MESSAGING.md Normal file

@@ -0,0 +1,275 @@
# 📨 Messaging and streaming platforms
## Platform overview
| Platform | Type | Language | Protocol | Persistence | Use case |
|-----------|-----|-------|----------|-------------|----------|
| **Apache Kafka** | Distributed event store | Java/Scala | Binary (TCP) | Disk (log) | Event streaming, data pipeline, log aggregation |
| **RabbitMQ** | Message broker | Erlang | AMQP 0-9-1, MQTT, STOMP | Disk / RAM | Application messaging, task queue, RPC |
| **Apache Pulsar** | Distributed messaging + streaming | Java | Binary (TCP) + REST | Disk (segmented log) | Streaming + queue in one, multi-tenant |
| **NATS** | Lightweight messaging | Go | NATS protocol (TCP) | Memory / JetStream (disk) | Microservices, IoT, edge, low-latency |
| **AWS SQS** | Managed queue | — | HTTPS | Managed | Decoupling services, serverless |
| **AWS SNS** | Managed pub/sub | — | HTTPS, SQS, Lambda, email | Managed | Push notifications, fanout |
| **Azure Service Bus** | Managed messaging | — | AMQP, HTTPS | Managed | Enterprise messaging, sessions, transactions |
| **Google Pub/Sub** | Managed streaming | — | gRPC, REST | Managed | Event-driven, data pipeline |
| **Red Hat AMQ 7** (Artemis) | Message broker | Java | AMQP, MQTT, STOMP, OpenWire | Disk | Enterprise, JMS, high-availability |
| **Oracle Service Bus (OSB)** | Enterprise ESB | Java | HTTP/S, JMS, SOAP, REST, MQ, FTP, AQ | Managed (WebLogic) | Enterprise integration, SOA, protocol mediation, routing |
---
## Platform details
### Apache Kafka
**Architecture:**
```
Producer ──► Topic ──► Partition ──► Consumer Group
├── Partition 0 (Leader) ──► Broker 1
├── Partition 1 (Follower) ──► Broker 2
└── Partition 2 (Follower) ──► Broker 3
```
| Concept | Description |
|---------|-------|
| **Topic** | Logical category of messages |
| **Partition** | Append-only log, ordered sequence of messages |
| **Broker** | Server in a Kafka cluster |
| **Producer** | Publishes messages to a topic |
| **Consumer** | Reads messages from a partition (within a consumer group) |
| **Consumer Group** | Group of consumers sharing the reading of a topic |
| **Offset** | Position in a partition (tracked by the consumer) |
| **KRaft** | Controller quorum (replaces ZooKeeper since Kafka 3.x) |
**Replication and HA:**
| Parameter | Value |
|----------|---------|
| Replication factor | 2–3 (typically 3 for production) |
| ISR (In-Sync Replicas) | Number of replicas keeping up with the leader |
| Min ISR | Minimum ISR count for acknowledging a write (acks=all) |
| acks=0 | Fire-and-forget (fastest, possible data loss) |
| acks=1 | Write acknowledged by the leader (compromise) |
| acks=all | Write acknowledged by all ISR (safest) |
| Leader failover | Automatic election of a new leader from the ISR |
**Important configuration:**
```properties
# Production (broker defaults for newly created topics)
# The per-topic replication factor is set at creation time,
# e.g. kafka-topics.sh --replication-factor 3
default.replication.factor=3
min.insync.replicas=2
# Retention: 7 days, no size limit (-1), 1 GiB per segment
log.retention.hours=168
log.retention.bytes=-1
log.segment.bytes=1073741824
# Performance: adjust the partition count per need (scale-out)
num.partitions=3
# One of: snappy, gzip, lz4, zstd
compression.type=snappy
```
**Partitioning strategies:**
| Strategy | Key | Advantage | Disadvantage |
|----------|------|--------|----------|
| Round-robin | null | Even distribution | Per-key ordering lost |
| Key-based | user_id, order_id | Messages with the same key → same partition | Uneven distribution (hot keys) |
| Custom partitioner | Custom logic | Per use-case optimization | More complex to maintain |
### RabbitMQ
**Architecture:**
```
Producer ──► Exchange ──► Binding ──► Queue ──► Consumer
┌───────────┼───────────┐
▼ ▼ ▼
Direct Topic Fanout
Exchange Exchange Exchange
```
| Concept | Description |
|---------|-------|
| **Exchange** | Receives messages from the producer, routes them to queues |
| **Binding** | Exchange → queue link with a routing key |
| **Queue** | FIFO message queue (read by the consumer) |
| **Virtual Host (vhost)** | Tenant isolation within a single cluster |
| **Publisher Confirm** | Confirmation that the broker received the message |
| **Consumer Ack** | Confirmation that the consumer processed the message |
**Exchange types:**
| Type | Routing | Use case |
|-----|---------|----------|
| **Direct** | routing_key = binding_key | Task queue, point-to-point |
| **Topic** | routing_key matches binding pattern (wildcards `*`, `#`) | Pub/sub with filtering |
| **Fanout** | To all bound queues | Broadcast, event notification |
| **Headers** | AMQP headers match | Complex routing (independent of the routing key) |
**Queue types:**
```properties
# Classic Queue (single replica; mirrored classic queues were removed in RabbitMQ 4.0)
x-queue-type: classic
# Quorum Queue (recommended for production)
x-queue-type: quorum
x-quorum-initial-group-size: 3
x-dead-letter-exchange: dlx
# Stream Queue (for large backlogs)
x-queue-type: stream
x-max-length-bytes: 1073741824
```
**HA and clustering:**
| Mode | Description | Use case |
|-------|-------|----------|
| **Quorum Queues** | Raft-based replication (3–5 nodes), auto failover | Production, HA messaging |
| **Federation** | Async message forwarding between independent RabbitMQ clusters | Multi-region, DR |
| **Shovel** | Point-to-point message forwarding (Federation at queue level) | Migration, specific routing |
| **Warm Standby (DR)** | Second cluster, started only on failover | Cold DR |
### Apache Pulsar
**Unique architecture (compute/storage separation):**
```
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Producer │ │ Consumer │ │ Consumer │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
┌──────▼───────────────────▼───────────────────▼──────┐
│ Broker (stateless) │
│ Subscription: Exclusive / Shared / Failover │
└──────────────────────┬──────────────────────────────┘
┌──────────────────────▼──────────────────────────────┐
│ BookKeeper (stateful storage) │
│ ├── Bookie 1 ├── Bookie 2 ├── Bookie 3 ├── ... │
│ └── Ledger (append-only, segmented log) │
└─────────────────────────────────────────────────────┘
```
| Concept | Description |
|---------|-------|
| **Topic** | Logical category (partitioned or non-partitioned) |
| **Subscription** | Delivery mode (Exclusive, Shared, Failover, Key_Shared) |
| **Ledger** | Storage unit in BookKeeper (append-only) |
| **Bookie** | Storage node (BookKeeper) |
| **Managed Ledger** | Segmented log with cache and retention |
**Advantages over Kafka:**
- Compute/storage separation — independent scaling
- Geo-replication built-in (native)
- Multi-tenant (namespaces, isolation)
- TTL, retry, dead letter topic (built-in)
- At-least-once and effectively-once delivery semantics
### NATS
| Feature | Description |
|---------|-------|
| **Core NATS** | Pub/sub, request-reply, < 1 ms latency |
| **JetStream** | Persistence, exactly-once, key-value store, object store |
| **Leaf nodes** | Hierarchical interconnection of clusters |
| **Super-cluster** | Multi-region clustering (global) |
**Use case:** IoT, edge computing, microservices communication, low-latency messaging.
### Oracle Service Bus (OSB)
Part of Oracle SOA Suite, runs on WebLogic Server. Enterprise service bus for integration in Oracle-heavy environments.
| Concept | Description |
|---------|-------|
| **Proxy Service** | Inbound endpoint (HTTP, JMS, MQ, SOAP, REST) |
| **Business Service** | Target backend service |
| **Pipeline** | Message processing — routing, transformation, validation |
| **Split-Join** | Parallel/sequential orchestration of multiple services |
| **Reporting** | Message tracking, SLA monitoring |
**Key features:**
- **Protocol mediation** — translation between SOAP/REST/JMS/MQ/FTP
- **Message transformation** — XSLT, XQuery, MFL (non-XML)
- **Throttling, SLA, alerting** — built-in
- **Oracle AQ (Advanced Queuing)** — integration with Oracle DB queues
- **XPath, XQuery, XSLT 2.0/3.0** — native support
- **Error handling** — fault policies, error queues, retry
**Use case:** Enterprise SOA, Oracle DB → Kafka bridging, legacy mainframe wrapping, B2B integration.
**Alternatives:** IBM Integration Bus (IIB), MuleSoft Anypoint, WSO2 EI, Apache Camel / ServiceMix.
---
## Platform comparison
### Performance and scaling
| Platform | Max throughput | Latency (P99) | Messages/s (1 broker) | Scaling |
|-----------|--------------|---------------|-------------------------|-----------|
| **Kafka** | > 1 GB/s | 2–10 ms | ~1,000,000 | Partitions (horizontal) |
| **Pulsar** | > 1 GB/s | 5–15 ms | ~1,000,000 | Brokers + Bookies |
| **RabbitMQ** | ~100 MB/s | < 1 ms (RAM) | ~100,000 | Clustering (node) |
| **NATS** | > 10 GB/s | < 0.5 ms | ~10,000,000 | Clustering + Leaf nodes |
| **OSB** | < 1 GB/s | 10–100 ms | ~10,000 | Vertical (WebLogic cluster) |
### Delivery guarantees
| Platform | At most once | At least once | Exactly once | Ordering |
|-----------|-------------|---------------|-------------|----------|
| **Kafka** | Yes | Yes (acks=all + min.insync.replicas) | Yes (idempotent + transactional) | Per partition |
| **Pulsar** | Yes | Yes | Yes (dedup + transactional) | Per partition |
| **RabbitMQ** | Yes | Yes (Publisher Confirm + Consumer Ack) | Limited | Per queue |
| **NATS** | Yes | Yes (JetStream) | Limited | Per subject |
| **OSB** | Yes | Yes (XA transactions, exactly-once delivery) | Yes (XA + WS-AT) | Per pipeline |
### When to use what
| Use case | Recommended platform | Reasoning |
|----------|---------------------|------------|
| **Event sourcing / audit log** | Kafka, Pulsar | Append-only log, high throughput, replay |
| **CDC (Change Data Capture)** | Kafka (Kafka Connect + Debezium) | Connector ecosystem |
| **Task queue (job processing)** | RabbitMQ, SQS | Dead letter, retry, priority, scheduling |
| **API messaging / microservices** | NATS, RabbitMQ | Low latency, simplicity |
| **Data pipeline (ETL)** | Kafka (ksqlDB, Kafka Streams) | Stream processing in platform |
| **IoT / Edge** | NATS, MQTT (RabbitMQ) | Lightweight, leaf nodes |
| **Enterprise SOA / EAI** | OSB, IBM IIB, MuleSoft | Protocol mediation, XA, B2B, legacy wrapping |
| **Multi-tenant cloud** | Pulsar | Native multi-tenant, geo-replication |
| **Serverless / event-driven** | SQS/SNS, Pub/Sub | Managed, auto-scaling |
---
## DR and high availability
See [DATACENTERS.md](DATACENTERS.md), section "Vliv jednotlivých technologií na výběr DC topologie" (Impact of individual technologies on DC topology selection), for detailed DR mapping per platform.
### Best practices
- **Don't lose messages in the queue** — prefer acknowledgement-based consumption (not auto-ack)
- **Dead letter queue** — every main queue has a DLQ for undeliverable messages
- **Monitor lag** — consumer lag is a key metric (Kafka: `records-lag-max` in the `kafka.consumer:type=consumer-fetch-manager-metrics` MBean)
- **Idempotent consumer** — the same message may be delivered twice
- **Retry with backoff** — exponential backoff on processing failure
- **Schema registry** — avoid deserialization errors (Avro, Protobuf, JSON Schema)
- **Encryption** — TLS in transit, encryption at rest (Kafka has no built-in at-rest encryption; use disk/volume or client-side encryption)
---
## Related
- [DATACENTERS.md](DATACENTERS.md) — DR topology, per-platform mapping
- [CLOUD.md](CLOUD.md) — managed messaging (SQS, SNS, Service Bus, Pub/Sub)
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-12*


@@ -111,6 +111,6 @@ MongoDB changed its license in 2018 from GNU AGPL v3 to **SSPL** (Server Side Pu
## Sources
-References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
+References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
*Last revision: 2026-06-03*


@@ -497,6 +497,6 @@ OpenStack provides several services for telemetry and monitoring:
## Sources
-Links, books and standards: [sources/monitoring/sources.md](sources/monitoring/sources.md)
+Links, books and standards: [sources/monitoring/sources.en.md](sources/monitoring/sources.en.md)
*Last revision: 2026-06-03*


@@ -131,7 +131,7 @@ ProxySQL is an advanced proxy for MySQL with sophisticated routing:
## Sources
-References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
+References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
### Recommended reading


@@ -302,7 +302,7 @@ Anycast detail:
## Cloud Networking Resilience (2026)
-See also: [CLOUD.md](CLOUD.md) — cloud architecture, multi-AZ, hybrid cloud connectivity.
+See also: [CLOUD.en.md](CLOUD.en.md) — cloud architecture, multi-AZ, hybrid cloud connectivity.
### Cell-based Architectures
@@ -577,7 +577,7 @@ In a private DC, Zero Trust is deployed via:
## Resources
-Links, books and standards: [sources/networking/sources.md](sources/networking/sources.md)
+Links, books and standards: [sources/networking/sources.en.md](sources/networking/sources.en.md)
- **MTU alignment** — consistent MTU across the entire path, check ICMP blocking for PMTUD
- **IP planning** — RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), avoid overlaps for peering


@@ -195,7 +195,7 @@ Tip: For RAC, consider smaller CPUs (e.g., 64C instead of 96C) — license cost
## Sources
-References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
+References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
### Recommended reading

337
OS.en.md Normal file

@@ -0,0 +1,337 @@
# Operating Systems
> Overview of Linux distributions and Microsoft Windows for server, container, and AI/GPU workloads, including support lifecycle, EOL dates, and comparison.
---
## Distribution overview
| Distribution | Family | Package manager | Init | Security | Reference platform |
|-------------|--------|----------------|------|----------|-------------------|
| **Ubuntu LTS** | Debian | apt (deb) | systemd | AppArmor | NVIDIA DGX, widest AI/GPU support |
| **Debian** | Debian | apt (deb) | systemd | AppArmor | General-purpose server, stability |
| **RHEL** | Red Hat | dnf (rpm) | systemd | SELinux | Enterprise standard, SAP, Oracle DB |
| **Rocky Linux** | Red Hat | dnf (rpm) | systemd | SELinux | RHEL binary compatible (free) |
| **AlmaLinux** | Red Hat | dnf (rpm) | systemd | SELinux | RHEL binary compatible (free) |
| **SLES** | SUSE | zypper (rpm) | systemd | AppArmor | HPC, SAP, mainframe |
| **OpenSUSE Leap** | SUSE | zypper (rpm) | systemd | AppArmor | Desktop, development |
| **OpenSUSE Tumbleweed** | SUSE | zypper (rpm) | systemd | AppArmor | Rolling release, bleeding edge |
| **Fedora** | Red Hat | dnf (rpm) | systemd | SELinux | Desktop, technology preview |
| **Arch Linux** | Independent | pacman | systemd | — | Rolling, power users |
| **Alpine Linux** | Independent | apk | OpenRC | — | Container image, embedded |
| **Flatcar Container Linux** | Independent | — (image-based) | systemd | — | K8s worker node, minimal footprint |
| **Bottlerocket** | Independent | — (image-based) | systemd | SELinux | AWS K8s, minimal footprint |
---
## Support lifecycle and EOL dates
> **Standard:** base support (bug fixes, security). **LTS/ELS:** extended support (security only).
> ESM = Ubuntu Extended Security Maintenance, EUS = RHEL Extended Update Support, LTSS = SUSE Long Term Service Pack Support.
### Ubuntu LTS
| Version | Release | Standard support | ESM / Ubuntu Pro | Note |
|---------|---------|-----------------|------------------|------|
| **20.04 LTS** (Focal) | 2020-04 | End 2025-04 | End 2030-04 | Last release with Python 2 |
| **22.04 LTS** (Jammy) | 2022-04 | End 2027-04 | End 2032-04 | NVIDIA DGX standard |
| **24.04 LTS** (Noble) | 2024-04 | End 2029-04 | End 2034-04 | Latest GPU/CUDA support |
| **26.04 LTS** (planned) | 2026-04 | End 2031-04 | End 2036-04 | — |
### RHEL
| Version | Release | Full support | Maintenance support | Extended life cycle |
|---------|---------|-------------|-------------------|-------------------|
| **7** | 2014-06 | End 2019-08 | End 2024-06 | End 2028-06 (ELS) |
| **8** | 2019-05 | End 2024-05 | End 2029-05 | End 2034-06 (ELS) |
| **9** | 2022-05 | End 2027-05 | End 2032-05 | End 2037-06 (ELS) |
| **10** (planned) | 2025 | End 2029 | End 2034 | — |
### Rocky Linux / AlmaLinux
| Version | Release | Support until | RHEL compatible | Note |
|---------|---------|-------------|-----------------|------|
| **8** | 2021-06 | 2029-05 | Yes (since RHEL 8.4) | Alma/Rocky |
| **9** | 2022-07 | 2032-05 | Yes (since RHEL 9.0) | Alma/Rocky |
### Debian
| Version | Release | Full support | LTS support | ELTS (paid) |
|---------|---------|-------------|-------------|-------------|
| **11** (Bullseye) | 2021-08 | 2024-08 | End 2026-08 | End 2028-08 |
| **12** (Bookworm) | 2023-06 | 2026-06 | End 2028-06 | End 2030-06 |
| **13** (Trixie) | 2025 (expected) | ~3 years post-release | ~5 years post-release | — |
### SLES
| Version | Release | General support | LTSS | Note |
|---------|---------|---------------|------|------|
| **15 SP3** | 2021-06 | End 2024-12 | End 2027-12 | — |
| **15 SP4** | 2022-06 | End 2025-12 | End 2028-12 | — |
| **15 SP5** | 2023-06 | End 2026-12 | End 2029-12 | Current SP |
| **15 SP6** | 2024-10 | End 2027-12 | End 2030-12 | — |
### Fedora
| Version | Release | EOL | Note |
|---------|---------|-----|------|
| **38** | 2023-04 | 2024-05 | — |
| **39** | 2023-11 | 2024-12 | — |
| **40** | 2024-04 | 2025-05 | — |
| **41** | 2024-11 | 2025-12 | — |
Fedora releases a new version roughly every 6 months, and each release reaches EOL about 13 months after it ships. Fedora serves as the upstream for RHEL.
### Alpine Linux
| Version | Release | EOL |
|---------|---------|-----|
| **3.18** | 2023-05 | 2025-05 |
| **3.19** | 2023-12 | 2025-12 |
| **3.20** | 2024-05 | 2026-05 |
| **3.21** | 2024-12 | 2026-12 |
---
## Kernel version per distribution
| Distribution | Kernel (default) | Kernel (HWE/enhanced) | Note |
|------------|-----------------|----------------------|------|
| Ubuntu 22.04 LTS | 5.15 (GA) | 6.5+ (HWE) | HWE from 22.04.2 |
| Ubuntu 24.04 LTS | 6.8 | — | — |
| RHEL 8 | 4.18 | — | Backported features |
| RHEL 9 | 5.14 | — | Backported features |
| RHEL 10 | 6.11+ (expected) | — | — |
| Rocky/Alma 8 | 4.18 | — | Same as RHEL 8 |
| Rocky/Alma 9 | 5.14 | — | Same as RHEL 9 |
| Debian 11 | 5.10 | 6.1 (backports) | — |
| Debian 12 | 6.1 | — | — |
| SLES 15 SP5 | 5.14 | — | — |
| SLES 15 SP6 | 6.4 | — | — |
| Fedora 40 | 6.8+ | — | Rolling upstream |
| Alpine 3.20 | 6.6 | — | — |
---
## Use case comparison
| Use case | Recommended distribution | Rationale |
|----------|------------------------|-----------|
| **AI/GPU cluster (DGX)** | Ubuntu 22.04 LTS / DGX OS | NVIDIA standard, CUDA, MLNX_OFED |
| **Enterprise K8s (OpenShift)** | RHEL 9 / RHCOS | Red Hat support, GPU Operator |
| **Vanilla K8s (on-prem)** | Ubuntu 22.04 LTS + Flatcar (workers) | Community support, minimal worker image |
| **HPC cluster (Slurm)** | Rocky Linux 9 / Ubuntu 22.04 | EL ecosystem + Lustre, or Ubuntu |
| **Traditional enterprise DB (Oracle, SAP)** | RHEL 9 / SLES 15 | Vendor certification |
| **Container host** | Ubuntu 22.04 / Alpine | Broad image compatibility / min size |
| **Development / desktop** | Fedora / Ubuntu 24.04 / OpenSUSE Tumbleweed | Latest packages, HW support |
| **Embedded / IoT** | Debian / Alpine / Yocto | Minimal footprint, stability |
| **Edge inference** | Ubuntu (ARM) / NVIDIA JetPack | Jetson, GPU support |
| **Mainframe (IBM z/Arch)** | SLES 15 / RHEL 9 | IBM certification |
---
## Package management comparison
| Feature | apt (Debian/Ubuntu) | dnf (RHEL/Rocky/Alma/Fedora) | zypper (SUSE) | pacman (Arch) | apk (Alpine) |
|---------|--------------------|------------------------------|---------------|---------------|-------------|
| **Package format** | .deb | .rpm | .rpm | .pkg.tar.zst | .apk |
| **Repo management** | /etc/apt/sources.list | /etc/yum.repos.d/ | /etc/zypp/repos.d/ | /etc/pacman.conf | /etc/apk/repositories |
| **Lock file** | — (apt-mark hold) | — (exclude) | — (lock) | — (IgnorePkg) | — |
| **Transactional update** | No | Yes (dnf history) | Yes (zypper history) | No | No |
| **Rollback** | No (manual) | Yes (dnf history rollback) | Yes (snapper + zypper) | No | No |
| **Delta updates** | Partial (index pdiffs; debdelta is third-party) | Yes (deltarpm) | Yes (deltarpm) | No | No |
| **Version (as of 2025)** | apt 2.7+ | dnf 4.18+ | zypper 1.14+ | pacman 6.1+ | apk 2.14+ |
---
## Security model comparison
| Feature | SELinux (RHEL derivatives) | AppArmor (Ubuntu/Debian/SUSE) |
|---------|--------------------------|-------------------------------|
| **Type** | Mandatory Access Control (MAC) | Mandatory Access Control (MAC) |
| **Labeling** | Context-based (user:role:type) | Path-based (profile per executable) |
| **Configuration** | Policy (modules, booleans) | Profiles (text, in /etc/apparmor.d/) |
| **Modes** | Enforcing / Permissive / Disabled | Enforce / Complain / Disabled |
| **Learning curve** | Steep (complex policies) | Moderate (simpler profiles) |
| **Default in** | RHEL, Rocky, Alma, Fedora | Ubuntu, Debian, SLES, OpenSUSE |
| **Use case** | Enterprise multi-tenant, regulated | General-purpose server, app containment |
| **Container integration** | SELinux labels on container | AppArmor profile on container |
Additional layers:
- **seccomp** — syscall filtering (default in containerd, Docker)
- **Capabilities** — Linux capabilities (drop all except required)
- **cgroups v2** — resource isolation (CPU, memory, IO, PID)
- **User namespaces** — rootless containers (Podman, Docker rootless)
---
## Recommended migration path for EOL distributions
| From | To | Recommended approach |
|------|-----|---------------------|
| Ubuntu 20.04 (EOL 2025) | Ubuntu 22.04 or 24.04 | `do-release-upgrade` or fresh install |
| RHEL 7 (EOL 2024) | RHEL 8 or 9 | `leapp` upgrade, or fresh install |
| Rocky/Alma 8 | Rocky/Alma 9 | Fresh install (Rocky does not support in-place major upgrades), or AlmaLinux ELevate (`leapp`-based) |
| Debian 11 (EOL LTS 2026) | Debian 12 | `apt full-upgrade` + new sources.list |
| SLES 15 SP4 (EOL 2025) | SLES 15 SP6 | `zypper migration` |
| Fedora 40 (EOL 2025) | Fedora 42+ | `dnf system-upgrade` |
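
A migration plan usually starts from a list of releases that are at or near the end of support. The sketch below uses the standard-support dates from the Ubuntu LTS table above at month precision, with the first day of the month as an approximation. The helper name `needs_migration` and the 180-day lead time are assumptions for illustration.

```python
from datetime import date

# End of standard support, from the Ubuntu LTS table (first of month as approximation)
STANDARD_SUPPORT_END = {
    "ubuntu-20.04": date(2025, 4, 1),
    "ubuntu-22.04": date(2027, 4, 1),
    "ubuntu-24.04": date(2029, 4, 1),
}


def needs_migration(release: str, today: date, lead_time_days: int = 180) -> bool:
    """True if standard support has ended or ends within lead_time_days."""
    end = STANDARD_SUPPORT_END[release]
    return (end - today).days <= lead_time_days


today = date(2026, 6, 18)
for release in STANDARD_SUPPORT_END:
    print(release, needs_migration(release, today))
```

The same lookup can be extended with the RHEL, Debian, and SLES dates from the other tables.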
---
## Microsoft Windows
### Windows Server — editions
| Edition | Price (approx) | Core limits | VM rights | Use case |
|---------|---------------|-------------|-----------|----------|
| **Datacenter** | ~$6,155 (2025) | Unlimited | Unlimited Windows VMs per host | Virtualization, SDDC, S2D, HCI |
| **Standard** | ~$1,069 (2025) | 2 CPU, unlimited cores | 2 Windows VMs + Hyper-V host | General server, AD, file server |
| **Essentials** | ~$501 (2025) | 1 CPU, max 10 cores | — | Small business (≤25 users) |
| **Azure Edition** | Pay-as-you-go | Per Azure VM | Per Azure | Azure-only, hotpatching |
Licensing: Windows Server Standard and Datacenter are licensed **per core** (minimum 16 core licenses per server; minimum 8 per VM when licensing by virtual machine).
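
The per-server minimum can be expressed as simple arithmetic. This sketch covers only the 16-core-per-server floor stated above. The function name `core_licenses_required` is hypothetical, and actual quotes depend on the current Microsoft licensing terms.

```python
def core_licenses_required(physical_cores: int, minimum_per_server: int = 16) -> int:
    """Core licenses needed for one server under per-core licensing."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be positive")
    return max(physical_cores, minimum_per_server)


for cores in (8, 16, 64):
    print(cores, core_licenses_required(cores))
```

A small 8-core server is still billed for 16 cores, while a 64-core host needs 64 core licenses.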
### Windows Server — support lifecycle
> **Mainstream:** regular updates (bug fixes, security, features). **Extended:** security updates only (free).
> **ESU:** Extended Security Updates (paid tier, ~$45–300/core/year).
| Version | Release | Mainstream support | Extended support | ESU | Note |
|---------|---------|------------------|-----------------|-----|------|
| **2012 R2** | 2013-11 | 2018-10 | 2023-10 | End 2026-10 (year 3) | ESU paid, final year |
| **2016** | 2016-10 | 2022-01 | 2027-01 | — | Last with Desktop Experience |
| **2019** | 2019-01 | 2024-01 | 2029-01 | — | Last with Nano Server (1803 only) |
| **2022** | 2021-09 | 2026-10 | 2031-10 | — | Current, TPM 2.0, Credential Guard |
| **2025** | 2024-11 | 2029-10 | 2034-10 | — | Hotpatching, PowerShell 7, SMB over QUIC |
### Windows Server — version vs edition feature grid
| Version | Hyper-V | Storage Spaces Direct | Software-defined networking | Containers | GPU DDA / vGPU | WSL2 |
|---------|---------|---------------------|---------------------------|------------|---------------|------|
| 2016 Standard | Yes | No (DC only) | No (DC only) | Windows only | Yes | No |
| 2016 Datacenter | Yes | Yes | Yes | Windows | Yes | No |
| 2019 Standard | Yes | No | No | Windows | Yes | No |
| 2019 Datacenter | Yes | Yes | Yes | Windows | Yes | No |
| 2022 Standard | Yes | No | No | Windows + Linux | Yes | No |
| 2022 Datacenter | Yes | Yes | Yes | Windows + Linux (2022.2+) | Yes | No |
| 2025 Datacenter | Yes | Yes | Yes | Windows + Linux | Yes | Yes |
### Windows Desktop — support lifecycle
> **E = Enterprise, Pro = Professional, Home = Consumer**
> LTSC = Long Term Servicing Channel (stable, no feature updates)
| Version | Release | EOL (Home/Pro) | EOL (Enterprise) | LTSC EOL | Note |
|---------|---------|---------------|-----------------|----------|------|
| **10 21H2** | 2021-11 | — | 2024-06 | — | — |
| **10 22H2** | 2022-10 | 2025-10 | 2025-10 | — | Final Windows 10 |
| **10 LTSC 2021** | 2021-11 | — | — | 2032-01 | IoT Enterprise LTSC |
| **11 22H2** | 2022-09 | 2024-10 | 2025-10 | — | — |
| **11 23H2** | 2023-10 | 2025-11 | 2026-11 | — | — |
| **11 24H2** | 2024-10 | 2026-10 | 2027-10 | — | First with Recall, Copilot+ |
| **11 LTSC 2024** | 2024-10 | — | — | 2029-10 | Enterprise LTSC |
Windows 10 support **ended 2025-10-14** — last version with classic Control Panel.
### Windows vs Linux — comparison
| Feature | Windows Server | RHEL / Ubuntu |
|---------|---------------|---------------|
| **License (server)** | $500–6,000 (per core) + CAL | $0–800 (per node subscription) |
| **License (desktop)** | $100–200 (OEM/retail) | Free |
| **Support cost** | Included in license (SA/ESU) | $200–1,300/node/year (RHEL) |
| **Package management** | MSI, AppX, winget, NuGet | APT, DNF, Zypper |
| **Package count** | ~10,000 (chocolatey) | ~60,000+ (Ubuntu repo) |
| **Desktop GUI** | Windows Shell (mandatory) | Optional (GNOME, KDE, XFCE…) |
| **Server GUI** | Windows Shell (core-only since 2022) | CLI-only (standard) |
| **Kernel** | NT hybrid kernel (kernel-mode Win32) | Monolithic Linux kernel |
| **Device support** | OEM driver model (WHQL) | Open source + vendor drivers |
| **Container types** | Windows + Linux (WSL2) | Linux (Docker, Podman, containerd) |
| **Container registry** | Docker Hub, ACR, Nexus | Docker Hub, Quay, GHCR, Nexus… |
| **Container image size** | ~4–8 GB (Windows Server Core) | ~100 MB – 1 GB (Alpine/Ubuntu) |
| **GPU passthrough** | DDA (Discrete Device Assignment) | GPU Direct, VFIO, SR-IOV |
| **AI/ML support** | WSL2 (CUDA), Azure ML | Native CUDA, ROCm, oneAPI |
| **CUDA support** | Yes (via WSL2 or Docker) | Native (nvidia-container-toolkit) |
| **Orchestration** | AD / GPO / SCCM / WAC | Ansible, Puppet, Salt, Foreman |
| **RBAC/AAA** | Active Directory (+ Kerberos) | LDAP, FreeIPA, SSSD, AD |
| **Remote management** | RDP, WinRM, PowerShell Remoting | SSH, Cockpit, Webmin |
| **Filesystem** | NTFS, ReFS, CSVFS | ext4, XFS, Btrfs, ZFS |
| **Max file system size** | 256 TB (NTFS), 1.2 YB (ReFS) | 1 EB (XFS), 16 EB (ZFS) |
| **Hypervisor** | Hyper-V (Type 1) | KVM (Type 2-like), Xen |
| **Dynamic memory** | Hyper-V Dynamic Memory | KSM, virtio-balloon (KVM) |
| **Live migration** | Hyper-V Live Migration | KVM Live Migration, vMotion |
### Windows specific features
| Feature | Description | Linux alternative |
|---------|------------|-------------------|
| **Active Directory** | Identity, auth, GPO, DNS, DHCP | FreeIPA, Samba AD DC, 389-ds, SSSD |
| **Group Policy** | Central desktop/server configuration | Ansible, Puppet, Salt (agent-based) |
| **Hyper-V + S2D** | Hyper-converged storage and virtualization (HCI) | Proxmox Ceph / oVirt + Gluster |
| **Failover Clustering** | Cluster-aware apps (SQL, File Server) | Pacemaker + Corosync + DRBD |
| **IIS** | Web server, ASP.NET host | Nginx, Apache (.NET host possible) |
| **PowerShell** | Scripting, Desired State Configuration | Bash, Python, Ansible |
| **Windows Admin Center** | GUI management | Cockpit, Webmin |
| **BitLocker** | Full disk encryption | LUKS + cryptsetup |
| **Windows Defender** | Antivirus + EDR | ClamAV, Wazuh, Osquery |
| **SQL Server** | Relational database | PostgreSQL, MySQL, MariaDB |
### Recommended OS per use case (including Windows)
| Use case | OS | Rationale |
|----------|-----|-------|
| **Active Directory / GPO / hybrid ID** | Windows Server 2022/2025 | AD is Windows-only |
| **SQL Server (failover cluster)** | Windows Server Datacenter + SQL EE | Always On FCI, ReFS |
| **Exchange / SharePoint** | Windows Server 2022 | Windows-only |
| **Enterprise desktop management** | Windows 11 Enterprise + Intune/SCCM | GPO, AD, enterprise MDM |
| **.NET / ASP.NET apps** | Windows Server / Linux (.NET Core) | .NET 6+ runs on Linux |
| **HCI (Microsoft stack)** | Windows Server Datacenter + S2D + Hyper-V | Azure Stack HCI |
| **Virtualization (mixed workload)** | Windows Server Datacenter (Hyper-V) | Linux + Windows VMs under one hypervisor |
| **AI/GPU inference** | Linux (Ubuntu) + CUDA | NVIDIA optimal; WSL2 alternative |
| **Container orchestration (Windows nodes)** | Windows Server 2022/2025 + containerd | Windows Pods in AKS on-prem |
| **Tier 2 apps / web / API** | Ubuntu or RHEL (Linux) | Lower TCO, smaller footprint |
### Windows Server migration paths
| From | To | Recommended approach |
|------|-----|---------------------|
| Windows Server 2012 R2 (EOL 2023) | Windows Server 2022/2025 | In-place upgrade or fresh + migration |
| Windows Server 2016 (EOL 2027) | Windows Server 2022/2025 | In-place upgrade or fresh |
| Windows Server 2019 | Windows Server 2022/2025 | In-place upgrade (`Setup.exe /auto upgrade`) |
| Windows Server 2022 | Windows Server 2025 | In-place upgrade or fresh |
| Windows Server → Cloud | Azure VM / Azure Stack HCI | Azure Migrate, Storage Migration Service |
| Windows Server → Linux | Ubuntu / RHEL (re-platform) | Migrate app to .NET Core or alternative |
### Windows — API and operational limits
| Limit | Windows Server | Windows Desktop |
|-------|---------------|----------------|
| **Max RAM** | 24 TB (2025 Datacenter) | 2 TB (Pro/Enterprise), 128 GB (Home) |
| **Max CPU sockets** | 64 (Datacenter), 2 (Standard) | 2 |
| **Max CPU cores** | Unlimited | 128 (Pro), 64 (Home) |
| **Max file size (NTFS)** | 256 TB | 256 TB |
| **Max file size (ReFS)** | 18.4 EB (2025) | — |
| **Max volume size (NTFS)** | 256 TB | 256 TB |
| **Max volume size (ReFS)** | 1.2 YB (theoretical) | — |
| **Max dedup volume** | 64 TB (Data Deduplication) | — |
| **Max cluster nodes** | 64 (Failover Cluster) | — |
| **Max VM per host** | Unlimited (Datacenter) | — |
| **VM memory per VM** | 12 TB (2022+) | — |
| **VM vCPU per VM** | 240 (2022+) | — |
| **Concurrent RDP** | 2 (admin), 200+ (RDS CAL) | 1 (Home), more (RDP host) |
| **PowerShell Remoting** | Unlimited (WinRM) | Yes (WinRM) |
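The per-VM limits in the table lend themselves to a pre-deployment sanity check. Below is a minimal sketch, assuming the 12 TB and 240 vCPU figures from the table and treating 1 TB as 1024 GB; `VmSpec` and `limit_violations` are hypothetical helpers, not a Hyper-V API.

```python
from dataclasses import dataclass

# Per-VM limits copied from the table above (Windows Server 2022+).
# Assumption for this sketch: 1 TB is treated as 1024 GB.
MAX_VM_MEMORY_GB = 12 * 1024
MAX_VM_VCPUS = 240


@dataclass
class VmSpec:
    name: str
    vcpus: int
    memory_gb: int


def limit_violations(vm: VmSpec) -> list:
    """Return a human-readable list of limits the VM spec exceeds (empty if none)."""
    problems = []
    if vm.vcpus > MAX_VM_VCPUS:
        problems.append(f"{vm.name}: {vm.vcpus} vCPUs exceeds {MAX_VM_VCPUS}")
    if vm.memory_gb > MAX_VM_MEMORY_GB:
        problems.append(f"{vm.name}: {vm.memory_gb} GB RAM exceeds {MAX_VM_MEMORY_GB} GB")
    return problems


if __name__ == "__main__":
    for spec in (VmSpec("db01", 64, 512), VmSpec("hpc01", 256, 16 * 1024)):
        print(spec.name, limit_violations(spec) or "within limits")
```

A check like this would typically run in a provisioning pipeline before the VM definition is submitted, so that oversize requests fail early.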
---
## Related
- [AI-INFRASTRUCTURE.en.md](AI-INFRASTRUCTURE.en.md) — OS for AI workloads, GPU drivers, kernel parameters
- [KUBERNETES.en.md](KUBERNETES.en.md) — container runtime, orchestration
- [HYPERVISORS.en.md](HYPERVISORS.en.md) — hypervisors, VM host OS
- [DATACENTERS.en.md](DATACENTERS.en.md) — DC layout, HW platforms
## Sources
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-18*

333
OS.md Normal file

@@ -0,0 +1,333 @@
# Operating systems
> Overview of Linux distributions and Microsoft Windows for server, container, and AI/GPU workloads, including support lifecycle, EOL dates, and comparisons.
---
## Distribution overview
| Distribution | Family | Package manager | Init | Security | Reference platform |
|-----------|--------|----------------|------|----------|-------------------|
| **Ubuntu LTS** | Debian | apt (deb) | systemd | AppArmor | NVIDIA DGX, broadest AI/GPU support |
| **Debian** | Debian | apt (deb) | systemd | AppArmor | General-purpose server, stability |
| **RHEL** | Red Hat | dnf (rpm) | systemd | SELinux | Enterprise standard, SAP, Oracle DB |
| **Rocky Linux** | Red Hat | dnf (rpm) | systemd | SELinux | RHEL binary compatible (free) |
| **AlmaLinux** | Red Hat | dnf (rpm) | systemd | SELinux | RHEL binary compatible (free) |
| **SLES** | SUSE | zypper (rpm) | systemd | AppArmor | HPC, SAP, mainframe |
| **OpenSUSE Leap** | SUSE | zypper (rpm) | systemd | AppArmor | Desktop, development |
| **OpenSUSE Tumbleweed** | SUSE | zypper (rpm) | systemd | AppArmor | Rolling release, bleeding edge |
| **Fedora** | Red Hat | dnf (rpm) | systemd | SELinux | Desktop, technology preview |
| **Arch Linux** | Independent | pacman | systemd | — | Rolling, power users |
| **Alpine Linux** | Independent | apk | OpenRC | — | Container image, embedded |
| **Flatcar Container Linux** | Independent | — (image-based) | systemd | — | K8s worker node, minimal footprint |
| **Bottlerocket** | Independent | — (image-based) | systemd | — | AWS K8s, minimal footprint |
---
## Support lifecycle and EOL dates
> **Standard:** base support (bug fixes, security). **LTS/ELS:** extended support (security only).
> ESM = Ubuntu Extended Security Maintenance, EUS = RHEL Extended Update Support, LTSS = SUSE Long Term Service Pack Support.
### Ubuntu LTS
| Version | Release | Standard support | ESM / Ubuntu Pro | Note |
|-------|---------|-----------------|------------------|----------|
| **20.04 LTS** (Focal) | 2020-04 | Ends 2025-04 | Ends 2030-04 | Last version with Python 2 |
| **22.04 LTS** (Jammy) | 2022-04 | Ends 2027-04 | Ends 2032-04 | NVIDIA DGX standard |
| **24.04 LTS** (Noble) | 2024-04 | Ends 2029-04 | Ends 2034-04 | Latest GPU/CUDA support |
| **26.04 LTS** (planned) | 2026-04 | Ends 2031-04 | Ends 2036-04 | — |
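The two end dates per release map naturally onto a three-state check (standard support, ESM only, end of life). The sketch below is one way to express that, assuming the months from the table and, since the table gives no day, the first day of each month as the cutoff; `UBUNTU_LTS` and `support_phase` are hypothetical names.

```python
from datetime import date

# (end of standard support, end of ESM / Ubuntu Pro), taken from the table above.
# Assumption: the table lists months only, so the 1st of the month is used as cutoff.
UBUNTU_LTS = {
    "20.04": (date(2025, 4, 1), date(2030, 4, 1)),
    "22.04": (date(2027, 4, 1), date(2032, 4, 1)),
    "24.04": (date(2029, 4, 1), date(2034, 4, 1)),
}


def support_phase(version: str, today: date) -> str:
    """Classify a release as 'standard', 'esm', or 'eol' on the given day."""
    standard_end, esm_end = UBUNTU_LTS[version]
    if today < standard_end:
        return "standard"
    if today < esm_end:
        return "esm"
    return "eol"


if __name__ == "__main__":
    for version in UBUNTU_LTS:
        print(version, support_phase(version, date.today()))
```

The same structure would work for the RHEL, Debian, and SLES tables by swapping in their phase names and dates.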
### RHEL
| Verze | Release | Full support | Maintenance support | Extended life cycle |
|-------|---------|-------------|-------------------|-------------------|
| **7** | 2014-06 | Ends 2019-08 | Ends 2024-06 | Ends 2028-06 (ELS) |
| **8** | 2019-05 | Ends 2024-05 | Ends 2029-05 | Ends 2034-06 (ELS) |
| **9** | 2022-05 | Ends 2027-05 | Ends 2032-05 | Ends 2037-06 (ELS) |
| **10** (planned) | 2025 | Ends 2029 | Ends 2034 | — |
### Rocky Linux / AlmaLinux
| Version | Release | Supported until | RHEL compatible | Note |
|-------|---------|-----------|-------------------|----------|
| **8** | 2021-06 | 2029-05 | Yes (since RHEL 8.4) | Alma/Rocky |
| **9** | 2022-07 | 2032-05 | Yes (since RHEL 9.0) | Alma/Rocky |
### Debian
| Version | Release | Full support | LTS support | ELTS (paid) |
|-------|---------|-------------|-------------|-------------|
| **11** (Bullseye) | 2021-08 | 2024-08 | Ends 2026-08 | Ends 2028-08 |
| **12** (Bookworm) | 2023-06 | 2026-06 | Ends 2028-06 | Ends 2030-06 |
| **13** (Trixie) | 2025 (expected) | ~3 years after release | ~5 years after release | — |
### SLES
| Version | Release | General support | LTSS | Note |
|-------|---------|---------------|------|----------|
| **15 SP3** | 2021-06 | Ends 2024-12 | Ends 2027-12 | — |
| **15 SP4** | 2022-06 | Ends 2025-12 | Ends 2028-12 | — |
| **15 SP5** | 2023-06 | Ends 2026-12 | Ends 2029-12 | Current SP |
| **15 SP6** | 2024-10 | Ends 2027-12 | Ends 2030-12 | — |
### Fedora
| Version | Release | EOL | Note |
|-------|---------|-----|----------|
| **38** | 2023-04 | 2024-05 | — |
| **39** | 2023-11 | 2024-12 | — |
| **40** | 2024-04 | 2025-05 | — |
| **41** | 2024-11 | 2025-12 | — |
Fedora ships a new release roughly every 6 months, with EOL about 13 months after release. It serves as the upstream for RHEL.
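The "about 13 months" rule can be checked against the table with simple month arithmetic. The following sketch assumes the approximation holds exactly at month granularity (actual Fedora EOL dates are set per release and can shift); `add_months` and `fedora_eol_estimate` are hypothetical helper names.

```python
def add_months(year: int, month: int, months: int) -> tuple:
    """Add a number of months to a (year, month) pair and return the new pair."""
    index = year * 12 + (month - 1) + months
    return index // 12, index % 12 + 1


def fedora_eol_estimate(release: str, lifetime_months: int = 13) -> str:
    """Estimate EOL as 'YYYY-MM' from a release given as 'YYYY-MM'."""
    year, month = (int(part) for part in release.split("-"))
    eol_year, eol_month = add_months(year, month, lifetime_months)
    return f"{eol_year:04d}-{eol_month:02d}"


if __name__ == "__main__":
    # Release months taken from the Fedora table above.
    for version, release in {"38": "2023-04", "39": "2023-11", "40": "2024-04", "41": "2024-11"}.items():
        print(version, release, "->", fedora_eol_estimate(release))
```

For the four releases listed, the estimate matches the EOL column of the table, which makes the rule a reasonable planning shortcut for versions not yet listed.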
### Alpine Linux
| Version | Release | EOL |
|-------|---------|-----|
| **3.18** | 2023-05 | 2025-05 |
| **3.19** | 2023-12 | 2025-12 |
| **3.20** | 2024-05 | 2026-05 |
| **3.21** | 2024-12 | 2026-12 |
---
## Kernel versions per distribution
| Distribution | Kernel (default) | Kernel (HWE/enhanced) | Note |
|-----------|-----------------|----------------------|----------|
| Ubuntu 22.04 LTS | 5.15 (GA) | 6.5+ (HWE) | HWE since 22.04.2 |
| Ubuntu 24.04 LTS | 6.8 | — | — |
| RHEL 8 | 4.18 | — | Backported features |
| RHEL 9 | 5.14 | — | Backported features |
| RHEL 10 | 6.11+ (expected) | — | — |
| Rocky/Alma 8 | 4.18 | — | Same as RHEL 8 |
| Rocky/Alma 9 | 5.14 | — | Same as RHEL 9 |
| Debian 11 | 5.10 | 6.1 (backports) | — |
| Debian 12 | 6.1 | — | — |
| SLES 15 SP5 | 5.14 | — | — |
| SLES 15 SP6 | 6.4 | — | — |
| Fedora 40 | 6.8+ | — | Rolling upstream |
| Alpine 3.20 | 6.6 | — | — |
---
## Comparison by use case
| Use case | Recommended distribution | Rationale |
|----------|---------------------|-------|
| **AI/GPU cluster (DGX)** | Ubuntu 22.04 LTS / DGX OS | NVIDIA standard, CUDA, MLNX_OFED |
| **Enterprise K8s (OpenShift)** | RHEL 9 / RHCOS | Red Hat support, GPU Operator |
| **Vanilla K8s (on-prem)** | Ubuntu 22.04 LTS + Flatcar (workers) | Community support, minimal worker image |
| **HPC cluster (Slurm)** | Rocky Linux 9 / Ubuntu 22.04 | EL ecosystem + Lustre, or Ubuntu |
| **Traditional enterprise DB (Oracle, SAP)** | RHEL 9 / SLES 15 | Vendor certification |
| **Container host** | Ubuntu 22.04 / Alpine | Broad image compatibility / minimal size |
| **Development / desktop** | Fedora / Ubuntu 24.04 / OpenSUSE Tumbleweed | Current packages, HW support |
| **Embedded / IoT** | Debian / Alpine / Yocto | Minimal footprint, stability |
| **Edge inference** | Ubuntu (ARM) / NVIDIA JetPack | Jetson, GPU support |
| **Mainframe (IBM z/Arch)** | SLES 15 / RHEL 9 | IBM certification |
---
## Package management comparison
| Feature | apt (Debian/Ubuntu) | dnf (RHEL/Rocky/Alma/Fedora) | zypper (SUSE) | pacman (Arch) | apk (Alpine) |
|-----------|--------------------|------------------------------|---------------|---------------|-------------|
| **Package format** | .deb | .rpm | .rpm | .pkg.tar.zst | .apk |
| **Repo management** | /etc/apt/sources.list | /etc/yum.repos.d/ | /etc/zypp/repos.d/ | /etc/pacman.conf | /etc/apk/repositories |
| **Lock file** | — (apt-mark hold) | — (exclude) | — (lock) | — (IgnorePkg) | — |
| **Transactional update** | No | Yes (dnf history) | Yes (zypper history) | No | No |
| **Rollback** | No (manual) | Yes (dnf history rollback) | Yes (snapper + zypper) | No | No |
| **Delta updates** | No (debdelta is third-party) | Yes (deltarpm) | Yes (zsync) | No | No |
| **Version (as of 2025)** | apt 2.7+ | dnf 4.18+ | zypper 1.14+ | pacman 6.1+ | apk 2.14+ |
---
## Security model comparison
| Feature | SELinux (RHEL derivatives) | AppArmor (Ubuntu/Debian/SUSE) |
|-----------|----------------------|------------------------------|
| **Type** | Mandatory Access Control (MAC) | Mandatory Access Control (MAC) |
| **Labeling** | Context-based (user:role:type) | Path-based (profile per executable) |
| **Configuration** | Policy (modules, booleans) | Profiles (plain text, in /etc/apparmor.d/) |
| **Modes** | Enforcing / Permissive / Disabled | Enforce / Complain / Disabled |
| **Learning curve** | Steep (complex policies) | Gentle (simpler profiles) |
| **Default in** | RHEL, Rocky, Alma, Fedora | Ubuntu, Debian, SLES, OpenSUSE |
| **Use case** | Enterprise multi-tenant, regulated environments | General-purpose server, application containment |
| **Container integration** | SELinux labels per container | AppArmor profile per container |
Additional layers:
- **seccomp** — syscall filtering (default in containerd, Docker)
- **Capabilities** — Linux capabilities (drop everything except what is needed)
- **cgroups v2** — resource isolation (CPU, memory, IO, PID)
- **User namespaces** — rootless containers (Podman, Docker rootless)
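Two of the layers listed above, seccomp and capabilities, are visible per process in `/proc/<pid>/status` through the `Seccomp` and `CapEff` fields. The sketch below parses that text to report the seccomp mode and whether `CAP_SYS_ADMIN` (bit 21) is in the effective set. The function names are hypothetical, and the sample status text is illustrative rather than captured from a specific runtime.

```python
# Bit index of CAP_SYS_ADMIN in the capability bitmask (linux/capability.h).
CAP_SYS_ADMIN = 21
# Values of the "Seccomp" field in /proc/<pid>/status.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def parse_status(text: str) -> dict:
    """Parse 'Key:<tab>value' lines from /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def hardening_summary(status_text: str) -> dict:
    """Summarise the seccomp mode and whether CAP_SYS_ADMIN is effective."""
    fields = parse_status(status_text)
    cap_eff = int(fields.get("CapEff", "0"), 16)  # CapEff is a hex bitmask
    seccomp = int(fields.get("Seccomp", "0"))
    return {
        "seccomp": SECCOMP_MODES.get(seccomp, "unknown"),
        "has_cap_sys_admin": bool((cap_eff >> CAP_SYS_ADMIN) & 1),
    }


if __name__ == "__main__":
    # Illustrative sample: a restricted process with a seccomp filter attached.
    sample = "Name:\tsh\nCapEff:\t00000000a80425fb\nSeccomp:\t2\n"
    print(hardening_summary(sample))
```

On a Linux host the same function can be fed the real data with `hardening_summary(open("/proc/self/status").read())`, for example from inside a container to confirm that the runtime's defaults are in effect.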
---
## Recommended migration path for EOL distributions
| From | To | Recommended approach |
|----------------|-----|-------------------|
| Ubuntu 20.04 (EOL 2025) | Ubuntu 22.04 or 24.04 | `do-release-upgrade` or fresh install |
| RHEL 7 (EOL 2024) | RHEL 8 or 9 | `leapp` upgrade, or fresh install |
| Rocky/Alma 8 | Rocky/Alma 9 | Fresh install, or ELevate (`leapp`-based); in-place `dnf upgrade --releasever=9` is unsupported |
| Debian 11 (EOL LTS 2026) | Debian 12 | `apt full-upgrade` + updated sources.list |
| SLES 15 SP4 (EOL 2025) | SLES 15 SP6 | `zypper migration` |
| Fedora 40 (EOL 2025) | Fedora 42+ | `dnf system-upgrade` |
---
## Microsoft Windows
### Windows Server — editions
| Edition | Price (approx.) | Core limits | VM rights | Use case |
|-------|--------------|-------------|-----------|----------|
| **Datacenter** | ~$6,155 (2025) | Unlimited | Unlimited Windows VMs per host | Virtualization, SDDC, S2D, HCI |
| **Standard** | ~$1,069 (2025) | 2 CPUs, unlimited cores | 2 Windows VMs + Hyper-V host | General-purpose server, AD, file server |
| **Essentials** | ~$501 (2025) | 1 CPU, max 10 cores | — | Small businesses (up to 25 users) |
| **Azure Edition** | Pay-as-you-go | Per Azure VM size | Per Azure terms | Azure-only, hotpatching |
Licensing: Windows Server Standard and Datacenter are licensed **per core** (minimum 16 cores/server + 8 cores/VM).
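The per-core minimums stated above can be turned into a quick sizing check. The sketch below is a minimal illustration, assuming the 16-core and 8-core minimums apply exactly as written in this document. The function names are hypothetical, and real licensing terms (pack sizes, per-processor minimums) should be checked against Microsoft's current licensing guide.

```python
# Minimums as stated in this document; treated as assumptions in this sketch.
MIN_CORES_PER_SERVER = 16
MIN_CORES_PER_VM = 8


def licensed_cores_for_host(physical_cores: int) -> int:
    """Cores to license when licensing the whole physical server."""
    return max(physical_cores, MIN_CORES_PER_SERVER)


def licensed_cores_for_vm(vcpus: int) -> int:
    """Cores to license when licensing an individual VM."""
    return max(vcpus, MIN_CORES_PER_VM)


if __name__ == "__main__":
    for cores in (8, 16, 64):
        print(f"{cores}-core host -> license {licensed_cores_for_host(cores)} cores")
    print(f"4-vCPU VM -> license {licensed_cores_for_vm(4)} cores")
```

The practical consequence is that small hosts pay for cores they do not have: an 8-core server is licensed as if it had 16.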
### Windows Server — support lifecycle
> **Mainstream:** regular updates (bug fixes, security, features). **Extended:** security updates only (free).
> **ESU:** Extended Security Updates (an additional paid tier, approx. $45–300/core/year).
| Version | Release | Mainstream support | Extended support | ESU | Note |
|-------|---------|------------------|-----------------|-----|----------|
| **2012 R2** | 2013-11 | 2018-10 | 2023-10 | Ends 2026-10 (year 3) | Paid ESU, final year |
| **2016** | 2016-10 | 2022-01 | 2027-01 | — | Last with Desktop Experience |
| **2019** | 2019-01 | 2024-01 | 2029-01 | — | Last with Nano Server (1803 only) |
| **2022** | 2021-09 | 2026-10 | 2031-10 | — | Current, TPM 2.0, Credential Guard |
| **2025** | 2024-11 | 2029-10 | 2034-10 | — | Hotpatching, PowerShell 7, SMB over QUIC |
### Windows Server — version vs edition feature grid
| Version | Hyper-V | Storage Spaces Direct | Software-defined networking | Containers | GPU DDA / vGPU | WSL2 |
|-------|---------|---------------------|---------------------------|------------|---------------|------|
| 2016 Standard | Yes | No (Datacenter only) | No (Datacenter only) | Windows only | Yes | No |
| 2016 Datacenter | Yes | Yes | Yes | Windows | Yes | No |
| 2019 Standard | Yes | No | No | Windows | Yes | No |
| 2019 Datacenter | Yes | Yes | Yes | Windows | Yes | No |
| 2022 Standard | Yes | No | No | Windows + Linux | Yes | No |
| 2022 Datacenter | Yes | Yes | Yes | Windows + Linux (2022.2+) | Yes | No |
| 2025 Datacenter | Yes | Yes | Yes | Windows + Linux | Yes | Yes |
### Windows Desktop — support lifecycle
> **E = Enterprise, Pro = Professional, Home = Consumer**
> LTSC = Long Term Servicing Channel (stable, no feature updates)
| Version | Release | EOL (Home/Pro) | EOL (Enterprise) | LTSC EOL | Note |
|-------|---------|---------------|-----------------|----------|----------|
| **10 21H2** | 2021-11 | — | 2024-06 | — | — |
| **10 22H2** | 2022-10 | 2025-10 | 2025-10 | — | Final Windows 10 |
| **10 LTSC 2021** | 2021-11 | — | — | 2032-01 | IoT Enterprise LTSC |
| **11 22H2** | 2022-09 | 2024-10 | 2025-10 | — | — |
| **11 23H2** | 2023-10 | 2025-11 | 2026-11 | — | — |
| **11 24H2** | 2024-10 | 2026-10 | 2027-10 | — | First with Recall, Copilot+ |
| **11 LTSC 2024** | 2024-10 | — | — | 2029-10 | Enterprise LTSC |
Windows 10 support **ended 2025-10-14**; it is the last version with the classic Control Panel.
### Windows vs Linux — comparison
| Feature | Windows Server | RHEL / Ubuntu |
|-----------|---------------|---------------|
| **License (server)** | $500–6,000 (per core) + CAL | $0–800 (per node subscription) |
| **License (desktop)** | $100–200 (OEM/retail) | Free |
| **Support cost** | Included in license (SA/ESU) | $200–1,300/node/year (RHEL) |
| **Package management** | MSI, AppX, winget, NuGet | APT, DNF, Zypper |
| **Package count** | ~10,000 (chocolatey) | ~60,000+ (Ubuntu repo) |
| **Desktop GUI** | Windows Shell (mandatory) | Optional (GNOME, KDE, XFCE…) |
| **Server GUI** | Windows Shell (core-only since 2022) | CLI-only (standard) |
| **Kernel** | NT hybrid kernel (kernel-mode Win32) | Monolithic Linux kernel |
| **Device support** | OEM driver model (WHQL) | Open source + vendor drivers |
| **Container types** | Windows + Linux (WSL2) | Linux (Docker, Podman, containerd) |
| **Container registry** | Docker Hub, ACR, Nexus | Docker Hub, Quay, GHCR, Nexus… |
| **Container image size** | ~4–8 GB (Windows Server Core) | ~100 MB – 1 GB (Alpine/Ubuntu) |
| **GPU passthrough** | DDA (Discrete Device Assignment) | GPU Direct, VFIO, SR-IOV |
| **AI/ML support** | WSL2 (CUDA), Azure ML | Native CUDA, ROCm, oneAPI |
| **CUDA support** | Yes (via WSL2 or Docker) | Native (nvidia-container-toolkit) |
| **Orchestration** | AD / GPO / SCCM / WAC | Ansible, Puppet, Salt, Foreman |
| **RBAC/AAA** | Active Directory (+ Kerberos) | LDAP, FreeIPA, SSSD, AD |
| **Remote management** | RDP, WinRM, PowerShell Remoting | SSH, Cockpit, Webmin |
| **Filesystem** | NTFS, ReFS, CSVFS | ext4, XFS, Btrfs, ZFS |
| **Max file system size** | 256 TB (NTFS), 1.2 YB (ReFS) | 1 EB (XFS), 16 EB (ZFS) |
| **Hypervisor** | Hyper-V (Type 1) | KVM (Type 2-ish), Xen |
| **Dynamic memory** | Hyper-V Dynamic Memory | KSM, virtio-balloon (KVM) |
| **Live migration** | Hyper-V Live Migration | KVM Live Migration, vMotion |
### Windows specific features
| Feature | Description | Linux alternative |
|---------|-------|------------------------|
| **Active Directory** | Identity, auth, GPO, DNS, DHCP | FreeIPA, Samba AD DC, 389-ds, SSSD |
| **Group Policy** | Central desktop/server configuration | Ansible, Puppet, Salt (agent-based) |
| **Hyper-V + S2D** | Hyper-converged storage and virtualization (HCI) | Proxmox Ceph / oVirt + Gluster |
| **Failover Clustering** | Cluster-aware apps (SQL, File Server) | Pacemaker + Corosync + DRBD |
| **IIS** | Web server, ASP.NET host | Nginx, Apache (no ASP.NET, or as a .NET host) |
| **PowerShell** | Scripting, Desired State Configuration | Bash, Python, Ansible |
| **Windows Admin Center** | GUI management | Cockpit, Webmin |
| **BitLocker** | Full disk encryption | LUKS + cryptsetup |
| **Windows Defender** | Antivirus + EDR | ClamAV, Wazuh, Osquery |
| **SQL Server** | Relational database | PostgreSQL, MySQL, MariaDB |
### Recommended OS per use case (including Windows)
| Use case | OS | Rationale |
|----------|-----|-------|
| **Active Directory / GPO / hybrid ID** | Windows Server 2022/2025 | AD is Windows-only |
| **SQL Server (failover cluster)** | Windows Server Datacenter + SQL EE | Always On FCI, ReFS |
| **Exchange / SharePoint** | Windows Server 2022 | Windows-only |
| **Enterprise desktop management** | Windows 11 Enterprise + Intune/SCCM | GPO, AD, enterprise MDM |
| **.NET / ASP.NET apps** | Windows Server / Linux (.NET Core) | .NET 6+ runs on Linux |
| **HCI (Microsoft stack)** | Windows Server Datacenter + S2D + Hyper-V | Azure Stack HCI |
| **Virtualization (mixed workload)** | Windows Server Datacenter (Hyper-V) | Linux and Windows VMs under one hypervisor |
| **AI/GPU inference** | Linux (Ubuntu) + CUDA | Best NVIDIA support; WSL2 as an alternative |
| **Container orchestration (Windows nodes)** | Windows Server 2022/2025 + containerd | Windows Pods in AKS on-prem |
| **Tier 2 apps / web / API** | Ubuntu or RHEL (Linux) | Lower TCO, smaller footprint |
### Windows Server migration paths
| From | To | Recommended approach |
|---------------|-----|-------------------|
| Windows Server 2012 R2 (EOL 2023) | Windows Server 2022/2025 | In-place upgrade or fresh + migration |
| Windows Server 2016 (EOL 2027) | Windows Server 2022/2025 | In-place upgrade or fresh |
| Windows Server 2019 | Windows Server 2022/2025 | In-place upgrade (`Setup.exe /auto upgrade`) |
| Windows Server 2022 | Windows Server 2025 | In-place upgrade or fresh |
| Windows Server → Cloud | Azure VM / Azure Stack HCI | Azure Migrate, Storage Migration Service |
| Windows Server → Linux | Ubuntu / RHEL (re-platform) | Migrate app to .NET Core or alternative |
### Windows — API and operational limits
| Limit | Windows Server | Windows Desktop |
|-------|---------------|----------------|
| **Max RAM** | 24 TB (2025 Datacenter) | 2 TB (Pro/Enterprise), 128 GB (Home) |
| **Max CPU sockets** | 64 (Datacenter), 2 (Standard) | 2 |
| **Max CPU cores** | Unlimited | 128 (Pro), 64 (Home) |
| **Max file size (NTFS)** | 256 TB | 256 TB |
| **Max file size (ReFS)** | 18.4 EB (2025) | — |
| **Max volume size (NTFS)** | 256 TB | 256 TB |
| **Max volume size (ReFS)** | 1.2 YB (theoretical) | — |
| **Max dedup volume** | 64 TB (Data Deduplication) | — |
| **Max cluster nodes** | 64 (Failover Cluster) | — |
| **Max VM per host** | Unlimited (Datacenter) | — |
| **VM memory per VM** | 12 TB (2022+) | — |
| **VM vCPU per VM** | 240 (2022+) | — |
| **Concurrent RDP** | 2 (admin), 200+ (RDS CAL) | 1 (Home), more (RDP host) |
| **PowerShell Remoting** | Unlimited (WinRM) | Yes (WinRM) |
---
## Related
- [AI-INFRASTRUCTURE.md](AI-INFRASTRUCTURE.md) — OS for AI workloads, GPU drivers, kernel parameters
- [KUBERNETES.md](KUBERNETES.md) — container runtime, orchestration
- [HYPERVISORS.md](HYPERVISORS.md) — hypervisors, VM host OS
- [DATACENTERS.md](DATACENTERS.md) — DC layout, HW platforms
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
*Last revision: 2026-06-18*


@@ -166,7 +166,7 @@ LIMIT 10;
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
### Recommended reading


@@ -167,7 +167,7 @@ resource "vsphere_virtual_machine" "web" {
}
```
More in [CICD.md](CICD.md#infrastructure-as-code-iac).
More in [CICD.en.md](CICD.en.md#infrastructure-as-code-iac).
## Firmware management
@@ -188,7 +188,7 @@ More in [CICD.md](CICD.md#infrastructure-as-code-iac).
| **Chef** | Ruby DSL | Pull (agent) | Compliance, infrastructure automation |
| **SaltStack** | YAML/Python | Both (salt-minion) | High-speed config, event-driven |
More in [CICD.md](CICD.md).
More in [CICD.en.md](CICD.en.md).
## OpenStack Provisioning
@@ -223,6 +223,6 @@ OpenStack offers several methods for provisioning infrastructure:
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-03*


@@ -35,11 +35,11 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
│ PCIe,BM) │ │(BIOS, │ │ AMD) │ │ Terraform) │
└──────────┘ │ NUMA) │ └────────┘ └──────────────┘
└──────────┘
┌──────────┐ ┌──────────┐ ┌────────┐
│HYPERVISOR│ │ MONITOR │ │ CICD │
│(VMware, │ │(Prom, │ │(GitOps, │
│ KVM, ...)│ │ Grafana) │ │ IaC) │
└──────────┘ └──────────┘ └────────┘
┌──────────┐ ┌──────────┐ ┌────────┐ ┌────────────┐
│HYPERVISOR│ │ MONITOR │ │ CICD │ │ ☸ K8s │
│(VMware, │ │(Prom, │ │(GitOps, │ │(CAPI, K3s, │
│ KVM, ...)│ │ Grafana) │ │ IaC) │ │ RKE2...) │
└──────────┘ └──────────┘ └────────┘ └────────────┘
```
---
@@ -52,15 +52,22 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| 🌐 Network architecture | [NETWORKING.md](NETWORKING.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
| 📊 Monitoring & observability | [MONITORING.md](MONITORING.md) | Prometheus, Grafana, OTel, logging, alerting | — |
| 🔄 CI/CD & DevOps | [CICD.md](CICD.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
| 💻 Operating systems | [OS.md](OS.md) | Linux distributions, Windows Server, lifecycle, EOL, kernel | KUBERNETES, HYPERVISORS, AI-INFRASTRUCTURE |
| 🔄 Disaster Recovery | [DR.md](DR.md) | RTO, RPO, scenarios, prevention, uptime calculation | CLOUD, DATACENTERS, MONITORING |
| 🗄️ Database architecture | [DATABASES.md](DATABASES.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VEKTOROVE-DB, DATABAZOVE-ENGINY |
| 🗄️ Big Data | [BIG-DATA.md](BIG-DATA.md) | HDFS, Spark, Flink, Trino, Iceberg, Delta Lake, Lakehouse | DATABASES, CLOUD, MESSAGING, KUBERNETES |
| 🖥️ Hypervisors | [HYPERVISORS.md](HYPERVISORS.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
| 🏭 Data centers | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC services | MONITORING |
| 🏭 Data centers | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC services, secondary DC topologies | MONITORING, MESSAGING |
| 💾 Storage | [STORAGE.md](STORAGE.md) | SAN/NAS/object, RAID, SDS, Ceph, OpenStack Cinder/Swift/Manila | — |
| 🔌 Server connectivity | [CONNECTIVITY.md](CONNECTIVITY.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
| 🔧 Server hardware | [SERVER-HW.md](SERVER-HW.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
| 🎮 GPU | [GPU.md](GPU.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
| ⚙️ Server config | [SERVER-CONFIG.md](SERVER-CONFIG.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
| 📦 Provisioning | [PROVISIONING.md](PROVISIONING.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
| ☸ Kubernetes | [KUBERNETES.md](KUBERNETES.md) | K8s architecture, deployment, Cluster API (CAPI) | CICD, CLOUD, NETWORKING |
| 📨 Messaging & streaming | [MESSAGING.md](MESSAGING.md) | Kafka, RabbitMQ, Pulsar, NATS, managed queue/pubsub | DATACENTERS, CLOUD |
| 🏗️ DC Migration | [DC-MIGRATION.md](DC-MIGRATION.md) | Strategies, phases, network, DB, rollback | DATACENTERS, CLOUD, DR, NETWORKING, STORAGE |
| 🧠 AI Infrastructure | [AI-INFRASTRUCTURE.md](AI-INFRASTRUCTURE.md) | GPU, AI networking, storage, cluster, cooling, training/inference | GPU, NETWORKING, STORAGE, DATACENTERS, CLOUD |
| 📋 Legacy index | [HARDWARE.md](HARDWARE.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
| 📋 Legacy infra | [INFRASTRUCTURE.md](INFRASTRUCTURE.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
| 📋 Review workflow | [REVIEW.md](REVIEW.md) | Review and content control process | — |
@@ -70,12 +77,12 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| File | Description |
|------|-------------|
| [POSTGRESQL.md](POSTGRESQL.md) | PostgreSQL — architecture, replication, tuning |
| [MYSQL.md](MYSQL.md) | MySQL & MariaDB |
| [ORACLE.md](ORACLE.md) | Oracle Database — RAC, Data Guard, tuning |
| [MONGODB.md](MONGODB.md) | MongoDB — document DB, sharding, replica sets |
| [REDIS.md](REDIS.md) | Redis — cache, session store, streams |
| [CASSANDRA.md](CASSANDRA.md) | Cassandra & ScyllaDB — wide-column, nosql |
| [POSTGRESQL.en.md](POSTGRESQL.en.md) | PostgreSQL — architecture, replication, tuning |
| [MYSQL.en.md](MYSQL.en.md) | MySQL & MariaDB |
| [ORACLE.en.md](ORACLE.en.md) | Oracle Database — RAC, Data Guard, tuning |
| [MONGODB.en.md](MONGODB.en.md) | MongoDB — document DB, sharding, replica sets |
| [REDIS.en.md](REDIS.en.md) | Redis — cache, session store, streams |
| [CASSANDRA.en.md](CASSANDRA.en.md) | Cassandra & ScyllaDB — wide-column, nosql |
| [VEKTOROVE-DB.md](VEKTOROVE-DB.md) | Vector databases — Pinecone, Qdrant, Milvus, pgvector |
| [DATABAZOVE-ENGINY.md](DATABAZOVE-ENGINY.md) | Common DB concepts — transactions, indexes, locking |
@@ -89,15 +96,22 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| 🌐 Network architecture | [NETWORKING.en.md](NETWORKING.en.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
| 📊 Monitoring & observability | [MONITORING.en.md](MONITORING.en.md) | Prometheus, Grafana, OTel, logging, alerting | — |
| 🔄 CI/CD & DevOps | [CICD.en.md](CICD.en.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
| 💻 Operating systems | [OS.en.md](OS.en.md) | Linux distributions, Windows Server, lifecycle, EOL, kernel | KUBERNETES, HYPERVISORS, AI-INFRASTRUCTURE |
| 🔄 Disaster Recovery | [DR.en.md](DR.en.md) | RTO, RPO, scenarios, prevention, uptime calculation | CLOUD, DATACENTERS, MONITORING |
| 🗄️ Database architecture | [DATABASES.en.md](DATABASES.en.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VECTOR-DBS, DATABASE-ENGINES |
| 🗄️ Big Data | [BIG-DATA.en.md](BIG-DATA.en.md) | HDFS, Spark, Flink, Trino, Iceberg, Delta Lake, Lakehouse | DATABASES, CLOUD, MESSAGING, KUBERNETES |
| 🖥️ Hypervisors | [HYPERVISORS.en.md](HYPERVISORS.en.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services | MONITORING |
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services, secondary DC topologies | MONITORING, MESSAGING |
| 💾 Storage | [STORAGE.en.md](STORAGE.en.md) | SAN/NAS/object, RAID, SDS, Ceph | — |
| 🔌 Server connectivity | [CONNECTIVITY.en.md](CONNECTIVITY.en.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
| 🔧 Server hardware | [SERVER-HW.en.md](SERVER-HW.en.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
| 🎮 GPU | [GPU.en.md](GPU.en.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
| ⚙️ Server config | [SERVER-CONFIG.en.md](SERVER-CONFIG.en.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
| 📦 Provisioning | [PROVISIONING.en.md](PROVISIONING.en.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
| ☸ Kubernetes | [KUBERNETES.en.md](KUBERNETES.en.md) | K8s architecture, deployment, Cluster API (CAPI) | CICD, CLOUD, NETWORKING |
| 📨 Messaging & streaming | [MESSAGING.en.md](MESSAGING.en.md) | Kafka, RabbitMQ, Pulsar, NATS, managed queue/pubsub | DATACENTERS, CLOUD |
| 🏗️ DC Migration | [DC-MIGRATION.en.md](DC-MIGRATION.en.md) | Strategies, phases, network, DB, rollback | DATACENTERS, CLOUD, DR, NETWORKING, STORAGE |
| 🧠 AI Infrastructure | [AI-INFRASTRUCTURE.en.md](AI-INFRASTRUCTURE.en.md) | GPU, AI networking, storage, cluster, cooling, training/inference | GPU, NETWORKING, STORAGE, DATACENTERS, CLOUD |
| 📋 Legacy index | [HARDWARE.en.md](HARDWARE.en.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
| 📋 Legacy infra | [INFRASTRUCTURE.en.md](INFRASTRUCTURE.en.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
| 📋 Review workflow | [REVIEW.en.md](REVIEW.en.md) | Review and content control process | — |
@@ -122,7 +136,7 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| File | Description |
|------|-------------|
| [case-studies/proxmox-demo/README.md](case-studies/proxmox-demo/README.md) | Proxmox VE demo cluster — design (CZ) |
| [case-studies/proxmox-demo/README.md](case-studies/proxmox-demo/README.md) | Proxmox VE demo cluster — návrh (CZ) |
| [case-studies/proxmox-demo/README.en.md](case-studies/proxmox-demo/README.en.md) | Proxmox VE demo cluster — design (EN) |
---
@@ -131,21 +145,28 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| File | References |
|------|------------|
| `CLOUD.md` / `CLOUD.en.md` | [`GPU.md`](GPU.md), [`NETWORKING.md`](NETWORKING.md), [`sources/cloud/sources.md`](sources/cloud/sources.md) |
| `NETWORKING.md` / `NETWORKING.en.md` | [`CLOUD.md`](CLOUD.md), [`sources/networking/sources.md`](sources/networking/sources.md) |
| `DATACENTERS.md` / `DATACENTERS.en.md` | [`MONITORING.md`](MONITORING.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `MONITORING.md` / `MONITORING.en.md` | [`sources/monitoring/sources.md`](sources/monitoring/sources.md) |
| `CICD.md` / `CICD.en.md` | [`sources/cicd/sources.md`](sources/cicd/sources.md) |
| `PROVISIONING.md` / `PROVISIONING.en.md` | [`CICD.md`](CICD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `STORAGE.md` / `STORAGE.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `GPU.md` / `GPU.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `SERVER-HW.md` / `SERVER-HW.en.md` | [`CONNECTIVITY.md`](CONNECTIVITY.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `SERVER-CONFIG.md` / `SERVER-CONFIG.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `CONNECTIVITY.md` / `CONNECTIVITY.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `HYPERVISORS.md` / `HYPERVISORS.en.md` | [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `DATABASES.md` / `DATABASES.en.md` | [`POSTGRESQL.md`](POSTGRESQL.md), [`MYSQL.md`](MYSQL.md), [`ORACLE.md`](ORACLE.md), [`MONGODB.md`](MONGODB.md), [`REDIS.md`](REDIS.md), [`CASSANDRA.md`](CASSANDRA.md), [`VEKTOROVE-DB.md`](VEKTOROVE-DB.md), [`DATABAZOVE-ENGINY.md`](DATABAZOVE-ENGINY.md), [`sources/databases/sources.md`](sources/databases/sources.md) |
| `HARDWARE.md` / `HARDWARE.en.md` | [`SERVER-HW.md`](SERVER-HW.md), [`GPU.md`](GPU.md), [`SERVER-CONFIG.md`](SERVER-CONFIG.md), [`PROVISIONING.md`](PROVISIONING.md) |
| `INFRASTRUCTURE.md` / `INFRASTRUCTURE.en.md` | [`HYPERVISORS.md`](HYPERVISORS.md), [`DATACENTERS.md`](DATACENTERS.md), [`STORAGE.md`](STORAGE.md), [`HARDWARE.md`](HARDWARE.md) |
| `CLOUD.md` / `CLOUD.en.md` | [`GPU.en.md`](GPU.en.md), [`NETWORKING.en.md`](NETWORKING.en.md), [`sources/cloud/sources.en.md`](sources/cloud/sources.en.md) |
| `NETWORKING.md` / `NETWORKING.en.md` | [`CLOUD.en.md`](CLOUD.en.md), [`sources/networking/sources.en.md`](sources/networking/sources.en.md) |
| `DATACENTERS.md` / `DATACENTERS.en.md` | [`MONITORING.en.md`](MONITORING.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `MONITORING.md` / `MONITORING.en.md` | [`sources/monitoring/sources.en.md`](sources/monitoring/sources.en.md) |
| `CICD.md` / `CICD.en.md` | [`sources/cicd/sources.en.md`](sources/cicd/sources.en.md) |
| `DR.md` / `DR.en.md` | [`CLOUD.en.md`](CLOUD.en.md), [`DATACENTERS.en.md`](DATACENTERS.en.md), [`MONITORING.en.md`](MONITORING.en.md), [`CICD.en.md`](CICD.en.md), [`STORAGE.en.md`](STORAGE.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `MESSAGING.md` / `MESSAGING.en.md` | [`DATACENTERS.en.md`](DATACENTERS.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `DC-MIGRATION.md` / `DC-MIGRATION.en.md` | [`DATACENTERS.en.md`](DATACENTERS.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`DR.en.md`](DR.en.md), [`NETWORKING.en.md`](NETWORKING.en.md), [`STORAGE.en.md`](STORAGE.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `AI-INFRASTRUCTURE.md` / `AI-INFRASTRUCTURE.en.md` | [`GPU.en.md`](GPU.en.md), [`NETWORKING.en.md`](NETWORKING.en.md), [`STORAGE.en.md`](STORAGE.en.md), [`DATACENTERS.en.md`](DATACENTERS.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `PROVISIONING.md` / `PROVISIONING.en.md` | [`CICD.en.md`](CICD.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `STORAGE.md` / `STORAGE.en.md` | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `GPU.md` / `GPU.en.md` | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `SERVER-HW.md` / `SERVER-HW.en.md` | [`CONNECTIVITY.en.md`](CONNECTIVITY.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `SERVER-CONFIG.md` / `SERVER-CONFIG.en.md` | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `CONNECTIVITY.md` / `CONNECTIVITY.en.md` | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `HYPERVISORS.md` / `HYPERVISORS.en.md` | [`STORAGE.en.md`](STORAGE.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `DATABASES.md` / `DATABASES.en.md` | [`POSTGRESQL.en.md`](POSTGRESQL.en.md), [`MYSQL.en.md`](MYSQL.en.md), [`ORACLE.en.md`](ORACLE.en.md), [`MONGODB.en.md`](MONGODB.en.md), [`REDIS.en.md`](REDIS.en.md), [`CASSANDRA.en.md`](CASSANDRA.en.md), [`VEKTOROVE-DB.md`](VEKTOROVE-DB.md), [`DATABAZOVE-ENGINY.md`](DATABAZOVE-ENGINY.md), [`sources/databases/sources.en.md`](sources/databases/sources.en.md) |
| `HARDWARE.md` / `HARDWARE.en.md` | [`SERVER-HW.en.md`](SERVER-HW.en.md), [`GPU.en.md`](GPU.en.md), [`SERVER-CONFIG.en.md`](SERVER-CONFIG.en.md), [`PROVISIONING.en.md`](PROVISIONING.en.md) |
| `OS.md` / `OS.en.md` | [`AI-INFRASTRUCTURE.en.md`](AI-INFRASTRUCTURE.en.md), [`KUBERNETES.en.md`](KUBERNETES.en.md), [`HYPERVISORS.en.md`](HYPERVISORS.en.md), [`DATACENTERS.en.md`](DATACENTERS.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `KUBERNETES.md` / `KUBERNETES.en.md` | [`CICD.en.md`](CICD.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`NETWORKING.en.md`](NETWORKING.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `BIG-DATA.md` / `BIG-DATA.en.md` | [`DATABASES.en.md`](DATABASES.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`MESSAGING.en.md`](MESSAGING.en.md), [`KUBERNETES.en.md`](KUBERNETES.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
| `INFRASTRUCTURE.md` / `INFRASTRUCTURE.en.md` | [`HYPERVISORS.en.md`](HYPERVISORS.en.md), [`DATACENTERS.en.md`](DATACENTERS.en.md), [`STORAGE.en.md`](STORAGE.en.md), [`HARDWARE.en.md`](HARDWARE.en.md) |
---
@@ -187,4 +208,4 @@ Raw reference data (documentation, books, standards) by area:
---
*This index is automatically maintained by the `kb-index` agent. Last updated: 2026-06-11.*
*This index is automatically maintained by the `kb-index` agent. Last updated: 2026-06-18.*

View File

@@ -35,11 +35,11 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
│ PCIe,BM) │ │(BIOS, │ │ AMD) │ │ Terraform) │
└──────────┘ │ NUMA) │ └────────┘ └──────────────┘
└──────────┘
┌──────────┐ ┌──────────┐ ┌────────┐
│HYPERVISOR│ │ MONITOR │ │ CICD │
│(VMware, │ │(Prom, │ │(GitOps, │
│ KVM, ...)│ │ Grafana) │ │ IaC) │
└──────────┘ └──────────┘ └────────┘
┌──────────┐ ┌──────────┐ ┌────────┐ ┌────────────┐
│HYPERVISOR│ │ MONITOR │ │ CICD │ │ ☸ K8s │
│(VMware, │ │(Prom, │ │(GitOps, │ │(CAPI, K3s, │
│ KVM, ...)│ │ Grafana) │ │ IaC) │ │ RKE2...) │
└──────────┘ └──────────┘ └────────┘ └────────────┘
```
---
@@ -52,15 +52,22 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| 🌐 Síťová architektura | [NETWORKING.md](NETWORKING.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
| 📊 Monitoring a observabilita | [MONITORING.md](MONITORING.md) | Prometheus, Grafana, OTel, logging, alerting, SLO | — |
| 🔄 CI/CD a DevOps | [CICD.md](CICD.md) | Pipelines, GitOps, IaC (Terraform), deployment strategie | — |
| 💻 Operační systémy | [OS.md](OS.md) | Linux distribuce, Windows Server, lifecycle, EOL, kernel | KUBERNETES, HYPERVISORS, AI-INFRASTRUCTURE |
| 🔄 Disaster Recovery | [DR.md](DR.md) | RTO, RPO, scénáře, prevence, výpočet uptimu | CLOUD, DATACENTERS, MONITORING |
| 🗄️ Databázová architektura | [DATABASES.md](DATABASES.md) | Klasifikace, sharding, replikace, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VEKTOROVE-DB, DATABAZOVE-ENGINY |
| 🗄️ Big Data | [BIG-DATA.md](BIG-DATA.md) | HDFS, Spark, Flink, Trino, Iceberg, Delta Lake, Lakehouse | DATABASES, CLOUD, MESSAGING, KUBERNETES |
| 🖥️ Hypervisory | [HYPERVISORS.md](HYPERVISORS.md) | VMware, Hyper-V, KVM, Proxmox, migrace | STORAGE, SERVER-HW |
| 🏭 Datová centra | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC služby | MONITORING |
| 🏭 Datová centra | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC služby, sekundární DC topologie | MONITORING, MESSAGING |
| 💾 Storage | [STORAGE.md](STORAGE.md) | SAN/NAS/object, RAID, SDS, Ceph, OpenStack Cinder/Swift/Manila | — |
| 🔌 Server connectivity | [CONNECTIVITY.md](CONNECTIVITY.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
| 🔧 Server hardware | [SERVER-HW.md](SERVER-HW.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
| 🎮 GPU | [GPU.md](GPU.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
| ⚙️ Server config | [SERVER-CONFIG.md](SERVER-CONFIG.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
| 📦 Provisioning | [PROVISIONING.md](PROVISIONING.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
| ☸ Kubernetes | [KUBERNETES.md](KUBERNETES.md) | K8s architektura, deployment, Cluster API (CAPI) | CICD, CLOUD, NETWORKING |
| 📨 Messaging & streaming | [MESSAGING.md](MESSAGING.md) | Kafka, RabbitMQ, Pulsar, NATS, managed queue/pubsub | DATACENTERS, CLOUD |
| 🏗️ Migrace DC | [DC-MIGRATION.md](DC-MIGRATION.md) | Strategie, fáze, network, DB, rollback | DATACENTERS, CLOUD, DR, NETWORKING, STORAGE |
| 🧠 AI infrastruktura | [AI-INFRASTRUCTURE.md](AI-INFRASTRUCTURE.md) | GPU, AI networking, storage, cluster, cooling, training/inference | GPU, NETWORKING, STORAGE, DATACENTERS, CLOUD |
| 📋 Původní rozcestník | [HARDWARE.md](HARDWARE.md) | Legacy index → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
| 📋 Původní infrastruktura | [INFRASTRUCTURE.md](INFRASTRUCTURE.md) | Legacy index → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
| 📋 Review workflow | [REVIEW.md](REVIEW.md) | Proces oponentury a kontroly obsahu | — |
@@ -89,15 +96,22 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| 🌐 Network architecture | [NETWORKING.en.md](NETWORKING.en.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
| 📊 Monitoring & observability | [MONITORING.en.md](MONITORING.en.md) | Prometheus, Grafana, OTel, logging, alerting | — |
| 🔄 CI/CD & DevOps | [CICD.en.md](CICD.en.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
| 💻 Operating systems | [OS.en.md](OS.en.md) | Linux distributions, Windows Server, lifecycle, EOL, kernel | KUBERNETES, HYPERVISORS, AI-INFRASTRUCTURE |
| 🔄 Disaster Recovery | [DR.en.md](DR.en.md) | RTO, RPO, scenarios, prevention, uptime calculation | CLOUD, DATACENTERS, MONITORING |
| 🗄️ Database architecture | [DATABASES.en.md](DATABASES.en.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VECTOR-DBS, DATABASE-ENGINES |
| 🗄️ Big Data | [BIG-DATA.en.md](BIG-DATA.en.md) | HDFS, Spark, Flink, Trino, Iceberg, Delta Lake, Lakehouse | DATABASES, CLOUD, MESSAGING, KUBERNETES |
| 🖥️ Hypervisors | [HYPERVISORS.en.md](HYPERVISORS.en.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services | MONITORING |
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services, secondary DC topologies | MONITORING, MESSAGING |
| 💾 Storage | [STORAGE.en.md](STORAGE.en.md) | SAN/NAS/object, RAID, SDS, Ceph | — |
| 🔌 Server connectivity | [CONNECTIVITY.en.md](CONNECTIVITY.en.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
| 🔧 Server hardware | [SERVER-HW.en.md](SERVER-HW.en.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
| 🎮 GPU | [GPU.en.md](GPU.en.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
| ⚙️ Server config | [SERVER-CONFIG.en.md](SERVER-CONFIG.en.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
| 📦 Provisioning | [PROVISIONING.en.md](PROVISIONING.en.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
| ☸ Kubernetes | [KUBERNETES.en.md](KUBERNETES.en.md) | K8s architecture, deployment, Cluster API (CAPI) | CICD, CLOUD, NETWORKING |
| 📨 Messaging & streaming | [MESSAGING.en.md](MESSAGING.en.md) | Kafka, RabbitMQ, Pulsar, NATS, managed queue/pubsub | DATACENTERS, CLOUD |
| 🏗️ DC Migration | [DC-MIGRATION.en.md](DC-MIGRATION.en.md) | Strategies, phases, network, DB, rollback | DATACENTERS, CLOUD, DR, NETWORKING, STORAGE |
| 🧠 AI Infrastructure | [AI-INFRASTRUCTURE.en.md](AI-INFRASTRUCTURE.en.md) | GPU, AI networking, storage, cluster, cooling, training/inference | GPU, NETWORKING, STORAGE, DATACENTERS, CLOUD |
| 📋 Legacy index | [HARDWARE.en.md](HARDWARE.en.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
| 📋 Legacy infra | [INFRASTRUCTURE.en.md](INFRASTRUCTURE.en.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
| 📋 Review workflow | [REVIEW.en.md](REVIEW.en.md) | Review and content control process | — |
@@ -136,6 +150,11 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| `DATACENTERS.md` / `DATACENTERS.en.md` | [`MONITORING.md`](MONITORING.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `MONITORING.md` / `MONITORING.en.md` | [`sources/monitoring/sources.md`](sources/monitoring/sources.md) |
| `CICD.md` / `CICD.en.md` | [`sources/cicd/sources.md`](sources/cicd/sources.md) |
| `DR.md` / `DR.en.md` | [`CLOUD.md`](CLOUD.md), [`DATACENTERS.md`](DATACENTERS.md), [`MONITORING.md`](MONITORING.md), [`CICD.md`](CICD.md), [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `MESSAGING.md` / `MESSAGING.en.md` | [`DATACENTERS.md`](DATACENTERS.md), [`CLOUD.md`](CLOUD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `DC-MIGRATION.md` / `DC-MIGRATION.en.md` | [`DATACENTERS.md`](DATACENTERS.md), [`CLOUD.md`](CLOUD.md), [`DR.md`](DR.md), [`NETWORKING.md`](NETWORKING.md), [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `OS.md` / `OS.en.md` | [`AI-INFRASTRUCTURE.md`](AI-INFRASTRUCTURE.md), [`KUBERNETES.md`](KUBERNETES.md), [`HYPERVISORS.md`](HYPERVISORS.md), [`DATACENTERS.md`](DATACENTERS.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `AI-INFRASTRUCTURE.md` / `AI-INFRASTRUCTURE.en.md` | [`GPU.md`](GPU.md), [`NETWORKING.md`](NETWORKING.md), [`STORAGE.md`](STORAGE.md), [`DATACENTERS.md`](DATACENTERS.md), [`CLOUD.md`](CLOUD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `PROVISIONING.md` / `PROVISIONING.en.md` | [`CICD.md`](CICD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `STORAGE.md` / `STORAGE.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `GPU.md` / `GPU.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
@@ -146,6 +165,8 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
| `DATABASES.md` / `DATABASES.en.md` | [`POSTGRESQL.md`](POSTGRESQL.md), [`MYSQL.md`](MYSQL.md), [`ORACLE.md`](ORACLE.md), [`MONGODB.md`](MONGODB.md), [`REDIS.md`](REDIS.md), [`CASSANDRA.md`](CASSANDRA.md), [`VEKTOROVE-DB.md`](VEKTOROVE-DB.md), [`DATABAZOVE-ENGINY.md`](DATABAZOVE-ENGINY.md), [`sources/databases/sources.md`](sources/databases/sources.md) |
| `HARDWARE.md` / `HARDWARE.en.md` | [`SERVER-HW.md`](SERVER-HW.md), [`GPU.md`](GPU.md), [`SERVER-CONFIG.md`](SERVER-CONFIG.md), [`PROVISIONING.md`](PROVISIONING.md) |
| `INFRASTRUCTURE.md` / `INFRASTRUCTURE.en.md` | [`HYPERVISORS.md`](HYPERVISORS.md), [`DATACENTERS.md`](DATACENTERS.md), [`STORAGE.md`](STORAGE.md), [`HARDWARE.md`](HARDWARE.md) |
| `KUBERNETES.md` / `KUBERNETES.en.md` | [`CICD.md`](CICD.md), [`CLOUD.md`](CLOUD.md), [`NETWORKING.md`](NETWORKING.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
| `BIG-DATA.md` / `BIG-DATA.en.md` | [`DATABASES.md`](DATABASES.md), [`CLOUD.md`](CLOUD.md), [`MESSAGING.md`](MESSAGING.md), [`KUBERNETES.md`](KUBERNETES.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
---
@@ -187,4 +208,4 @@ Raw referenční data (dokumentace, knihy, standardy) podle oblastí:
---
*Rozcestník je automaticky udržován agentem `kb-index`. Poslední aktualizace: 2026-06-11.*
*Rozcestník je automaticky udržován agentem `kb-index`. Poslední aktualizace: 2026-06-18.*

View File

@@ -114,6 +114,6 @@ Redis underwent a major license change in 2024:
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
*Last revision: 2026-06-03*

View File

@@ -752,6 +752,6 @@ flowchart TD
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-03*

View File

@@ -230,6 +230,10 @@ Conclusion: 8 DIMMs per CPU (1DPC) = highest performance
| AI training (CPU preprocessing) | 2-4 GB/core | 128-512 GB | 8× 32-64 GB RDIMM, 1DPC |
| HPC | 1-2 GB/core | 64-128 GB | 8× 16 GB RDIMM, 1DPC, high-speed |
| In-memory DB (SAP HANA) | 8-32 GB/core | 1-6 TB+ | 16× 128-256 GB LRDIMM/3DS |
| Big Data — Spark worker | 4-8 GB/core | 128-512 GB | 8-16× 32-64 GB RDIMM, 1DPC, NVMe scratch |
| Big Data — Flink worker | 8-16 GB/core (incl. managed state) | 128-512 GB | 8-16× 32-64 GB RDIMM, 1DPC, RocksDB on NVMe |
| Big Data — Trino worker | 4-8 GB/core | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC |
| Big Data — HDFS DataNode | 1-2 GB/core (metadata cache) | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC, max storage density |
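As a rough illustration of how the GB/core guidance above turns into a DIMM population, the sketch below picks the smallest module size that meets the target when every slot is filled with identical DIMMs. The helper name `size_ram`, the list of available DIMM capacities, and the default of 8 slots (one socket at 1DPC) are assumptions made for this example.

```python
# Hypothetical list of orderable DIMM capacities in GB (assumption for the example).
DIMM_SIZES_GB = [16, 32, 64, 96, 128, 256]

def size_ram(cores, gb_per_core, dimm_slots=8):
    """Return (dimm_size_gb, total_gb) for the smallest DIMM that meets the
    target when all slots are populated with identical modules."""
    target_gb = cores * gb_per_core
    per_dimm_gb = target_gb / dimm_slots
    for size in DIMM_SIZES_GB:
        if size >= per_dimm_gb:
            return size, size * dimm_slots
    raise ValueError("target exceeds the largest assumed DIMM size")

# Spark worker: 64 cores at 4 GB/core, 8 slots
print(size_ram(64, 4))
# Flink worker: 96 cores at 8 GB/core, 16 slots
print(size_ram(96, 8, dimm_slots=16))
```

Populating every channel with the same module keeps memory bandwidth balanced, which is why the sketch rounds up the DIMM size rather than mixing capacities.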
## PCIe
@@ -324,7 +328,7 @@ Socket 0 (NUMA node 0) Socket 1 (NUMA node 1)
## Server connectivity
Detailed chapter on network and storage connectivity: [CONNECTIVITY.md](CONNECTIVITY.md)
Detailed chapter on network and storage connectivity: [CONNECTIVITY.en.md](CONNECTIVITY.en.md)
## Storage controllers
@@ -346,8 +350,51 @@ Detailed chapter on network and storage connectivity: [CONNECTIVITY.md](CONNECTI
| **Use case** | SDS (Ceph, MinIO), ZFS | VMware VMFS, Windows, legacy |
| **Battery/Backup** | Not needed | Write-back cache requires BBU |
## Pricing (2026)
### CPU pricing (2026)
| CPU | Cores | TDP | 1ku price | $/core |
|-----|-------|-----|----------|--------|
| AMD EPYC 9965 (Turin) | 192 | 500 W | ~$11,988 | $62 |
| AMD EPYC 9655 (Turin) | 96 | 400 W | ~$6,500 | $68 |
| AMD EPYC 9475F (Turin) | 48 | 360 W | ~$5,000 | $104 |
| Intel Xeon 6980P (Granite Rapids) | 128 | 500 W | ~$12,460 | $97 |
| Intel Xeon 6980P (Granite Rapids-AP) | 128 | 500 W | $13,955 | $109 |
| Intel Xeon 6767P (Granite Rapids) | 64 | 350 W | ~$7,000 | $109 |
Sources: AMD 1ku pricing, Intel RCP, Newegg verified.
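The $/core column is simply the 1ku price divided by the core count and rounded to the nearest dollar. A minimal sketch that reproduces it, using the approximate prices from the table (the list `CPUS` and the helper `price_per_core` are names chosen for this example):

```python
# (model, cores, approximate 1ku price in USD), copied from the table
CPUS = [
    ("AMD EPYC 9965", 192, 11988),
    ("AMD EPYC 9655", 96, 6500),
    ("AMD EPYC 9475F", 48, 5000),
    ("Intel Xeon 6980P", 128, 12460),
    ("Intel Xeon 6767P", 64, 7000),
]

def price_per_core(price_usd, cores):
    """Price per core in whole US dollars."""
    return round(price_usd / cores)

for model, cores, price in CPUS:
    print(f"{model}: ${price_per_core(price, cores)}/core")
```

The same helper can be reused when list prices change, so the column does not have to be recalculated by hand.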
### DDR5 RDIMM pricing (2026 — AI-driven price surge)
| Capacity | Speed | Price 2025 | Price Q2 2026 | Change |
|----------|---------|-----------|-------------|-------|
| 32 GB (2R×8) | DDR5-5600 | ~$95 | ~$400-550 | +400-500 % |
| 64 GB (2R×4) | DDR5-4800 | ~$180 | ~$700-900 | +400 % |
| 96 GB (2R×4) | DDR5-6400 | ~$300 | ~$1,200-1,600 | +400 % |
| 128 GB (2R×4) | DDR5-6400 | ~$450 | ~$1,800-2,500 | +450 % |
| 256 GB (LRDIMM) | DDR5-6400 | ~$900 | ~$4,000-5,000 | +450 % |
Trend: DDR5 prices have risen ~400-500 % since mid-2025 due to AI-driven demand. Further increases expected in H2 2026. Source: Counterpoint, TrendForce.
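When reading the Change column, it helps to keep in mind that a percentage increase is measured against the old price, so a 5× price is +400 %, not +500 %. The sketch below works this through for the 32 GB row, taking $95 as the 2025 price and $400 to $550 as the Q2 2026 range; the helper name `pct_change` is an assumption for the example, and the table's own figures are rounded, so the computed values will not match them exactly.

```python
def pct_change(old, new):
    """Percentage increase from old to new (a 5x price is +400 %)."""
    return (new - old) / old * 100

low = pct_change(95, 400)
high = pct_change(95, 550)
print(f"32 GB RDIMM: +{low:.0f} % to +{high:.0f} %")
```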
### NVMe SSD pricing (enterprise, 2026)
| Capacity | Type | Price 2024 | Price Q2 2026 | Change |
|----------|-----|-----------|-------------|-------|
| 1.92 TB | NVMe U.3 (read-intensive) | ~$200 | ~$500-600 | +150 % |
| 3.84 TB | NVMe U.3 (mixed-use) | ~$400 | ~$1,000-1,200 | +150 % |
| 7.68 TB | NVMe U.3 (mixed-use) | ~$800 | ~$2,000-2,500 | +150 % |
| 15.36 TB | NVMe U.3 (mixed-use) | ~$1,500 | ~$4,000-5,000 | +170 % |
Trend: NAND flash prices have risen ~100-200 % since 2025; the average enterprise SSD now costs 2-3× more. Source: TrendForce, Xinnor.
### Total server cost (example configurations)
| Configuration | CPU | RAM | Storage | Estimated Price |
|-------------|-----|-----|------|-----------|
| DB server (OLTP) | 2× EPYC 9655 (96C) | 1 TB DDR5 | 6× 1.92 TB NVMe | ~$45,000-60,000 |
| GPU server (AI) | 2× Xeon 6980P | 2 TB DDR5 | 4× 3.84 TB NVMe | ~$80,000-120,000 (w/o GPU) |
| Hypervisor host | 2× EPYC 9475F (48C) | 512 GB DDR5 | 2× 1.92 TB NVMe + 4× 16 TB HDD | ~$25,000-35,000 |
| Storage server (Ceph) | 1× EPYC 9655 (96C) | 256 GB DDR5 | 24× 15.36 TB NVMe | ~$60,000-80,000 |
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
*Last revision: 2026-06-03*

View File

@@ -230,6 +230,10 @@ Závěr: 8 DIMMů na CPU (1DPC) = nejvyšší výkon
| AI training (CPU preprocessing) | 2-4 GB/core | 128-512 GB | 8× 32-64 GB RDIMM, 1DPC |
| HPC | 1-2 GB/core | 64-128 GB | 8× 16 GB RDIMM, 1DPC, high-speed |
| In-memory DB (SAP HANA) | 8-32 GB/core | 1-6 TB+ | 16× 128-256 GB LRDIMM/3DS |
| Big Data — Spark worker | 4-8 GB/core | 128-512 GB | 8-16× 32-64 GB RDIMM, 1DPC, NVMe scratch |
| Big Data — Flink worker | 8-16 GB/core (vč. managed state) | 128-512 GB | 8-16× 32-64 GB RDIMM, 1DPC, RocksDB na NVMe |
| Big Data — Trino worker | 4-8 GB/core | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC |
| Big Data — HDFS DataNode | 1-2 GB/core (metadata cache) | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC, max storage density |
## PCIe
@@ -346,6 +350,49 @@ Detailní kapitola o síťové a storage konektivitě: [CONNECTIVITY.md](CONNECT
| **Use case** | SDS (Ceph, MinIO), ZFS | VMware VMFS, Windows, legacy |
| **Battery/Backup** | Není potřeba | Write-back cache vyžaduje BBU |
## Ceny (2026)
### CPU ceny (2026)
| CPU | Cores | TDP | 1ku cena | $/core |
|-----|-------|-----|----------|--------|
| AMD EPYC 9965 (Turin) | 192 | 500 W | ~$11 988 | $62 |
| AMD EPYC 9655 (Turin) | 96 | 400 W | ~$6 500 | $68 |
| AMD EPYC 9475F (Turin) | 48 | 360 W | ~$5 000 | $104 |
| Intel Xeon 6980P (Granite Rapids) | 128 | 500 W | ~$12 460 | $97 |
| Intel Xeon 6980P (Granite Rapids-AP) | 128 | 500 W | $13 955 | $109 |
| Intel Xeon 6767P (Granite Rapids) | 64 | 350 W | ~$7 000 | $109 |
Sources: AMD 1ku pricing, Intel RCP, Newegg verified.
### DDR5 RDIMM ceny (2026 — AI-driven price surge)
| Kapacita | Rychlost | Cena 2025 | Cena Q2 2026 | Změna |
|----------|---------|-----------|-------------|-------|
| 32 GB (2R×8) | DDR5-5600 | ~$95 | ~$400-550 | +400-500 % |
| 64 GB (2R×4) | DDR5-4800 | ~$180 | ~$700-900 | +400 % |
| 96 GB (2R×4) | DDR5-6400 | ~$300 | ~$1 200-1 600 | +400 % |
| 128 GB (2R×4) | DDR5-6400 | ~$450 | ~$1 800-2 500 | +450 % |
| 256 GB (LRDIMM) | DDR5-6400 | ~$900 | ~$4 000-5 000 | +450 % |
Trend: DDR5 ceny vzrostly ~400-500 % od mid-2025 kvůli AI-driven poptávce. Očekává se další růst v H2 2026. Zdroj: Counterpoint, TrendForce.
### NVMe SSD ceny (enterprise, 2026)
| Kapacita | Typ | Cena 2024 | Cena Q2 2026 | Změna |
|----------|-----|-----------|-------------|-------|
| 1.92 TB | NVMe U.3 (read-intensive) | ~$200 | ~$500-600 | +150 % |
| 3.84 TB | NVMe U.3 (mixed-use) | ~$400 | ~$1 000-1 200 | +150 % |
| 7.68 TB | NVMe U.3 (mixed-use) | ~$800 | ~$2 000-2 500 | +150 % |
| 15.36 TB | NVMe U.3 (mixed-use) | ~$1 500 | ~$4 000-5 000 | +170 % |
Trend: NAND flash ceny vzrostly ~100-200 % od 2025, průměrný enterprise SSD stojí 2-3× více. Zdroj: TrendForce, Xinnor.
### Celková cena serveru (příkladové konfigurace)
| Konfigurace | CPU | RAM | Disk | Odhad ceny |
|-------------|-----|-----|------|-----------|
| DB server (OLTP) | 2× EPYC 9655 (96C) | 1 TB DDR5 | 6× 1.92 TB NVMe | ~$45 000-60 000 |
| GPU server (AI) | 2× Xeon 6980P | 2 TB DDR5 | 4× 3.84 TB NVMe | ~$80 000-120 000 (bez GPU) |
| Hypervisor host | 2× EPYC 9475F (48C) | 512 GB DDR5 | 2× 1.92 TB NVMe + 4× 16 TB HDD | ~$25 000-35 000 |
| Storage server (Ceph) | 1× EPYC 9655 (96C) | 256 GB DDR5 | 24× 15.36 TB NVMe | ~$60 000-80 000 |
## Zdroje
Odkazy, knihy a standardy: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)

View File

@@ -270,9 +270,60 @@ OpenStack offers three main storage services:
Ceph is the most common storage backend for OpenStack: Cinder (RBD), Swift (RGW), Manila (CephFS), Glance (RBD images).
## Big Data storage
### HDFS cluster
HDFS is the primary storage for the Hadoop ecosystem (on-prem). Typical configuration:
| Parameter | Value | Note |
|-----------|-------|------|
| **Disk per DataNode** | 8-24 × HDD (14-22 TB) + 2× NVMe (metadata, cache) | Balance capacity / performance |
| **Replication factor** | 3× | Rack-aware |
| **Network** | 2× 25/100 GbE (data) + 1× 1 GbE (management) | Data + replication traffic |
| **RAM** | 64-256 GB (OS cache + metadata) | HDFS cache + OS buffer cache |
| **CPU** | 16-32 cores | HDFS overhead is low |
| **NameNode HA** | Active + Standby + JN (JournalNode) | Quorum-based HA |
| **Use case** | Sequential read/write, large files, Spark on YARN | |
**Model cluster (~0.7 PB usable):**
- 10× DataNode (12× 18 TB HDD, 2× 1.9 TB NVMe)
- 2× NameNode (HA, 256 GB RAM)
- 3× JournalNode (small VMs)
- Replication 3× → raw ~2.2 PB, usable ~0.72 PB (reaching 1 PB usable would take about 14 such DataNodes)
- Network: 25 GbE for data, 100 GbE for shuffle-heavy Spark
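A quick way to sanity-check a cluster sizing like the model above is to compute raw and usable capacity from the node count, disks per node, disk size, and replication factor. The sketch below does that in decimal units; the function names are hypothetical, and it deliberately ignores non-DFS reserved space and free-space headroom, both of which reduce usable capacity further in practice.

```python
import math

def hdfs_capacity(nodes, disks_per_node, disk_tb, replication=3):
    """Return (raw_pb, usable_pb) in decimal units (1 PB = 1000 TB)."""
    raw_tb = nodes * disks_per_node * disk_tb
    return raw_tb / 1000, raw_tb / replication / 1000

def nodes_for_usable(usable_pb, disks_per_node, disk_tb, replication=3):
    """Smallest DataNode count whose replicated capacity reaches usable_pb."""
    per_node_tb = disks_per_node * disk_tb
    return math.ceil(usable_pb * 1000 * replication / per_node_tb)

raw, usable = hdfs_capacity(10, 12, 18)
print(f"raw {raw:.2f} PB, usable {usable:.2f} PB")
print("DataNodes for 1 PB usable:", nodes_for_usable(1, 12, 18))
```

With 10 DataNodes of 12× 18 TB each, this gives about 2.16 PB raw and 0.72 PB usable at 3× replication.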
### Object storage as Data Lake (S3/GCS/MinIO)
For new projects (Spark on K8s, Iceberg/Delta, lakehouse), object storage is preferred over HDFS:
| Platform | Advantages | Limits |
|----------|-----------|--------|
| **MinIO** (on-prem) | S3 API, erasure coding, NVMe direct, high throughput | Single tenant (per cluster) |
| **Pure //C** (on-prem) | QLC NVMe, dedupe, S3 + NFS | Higher $/TB |
| **AWS S3** (cloud) | Unlimited capacity, Iceberg/Delta support | Egress fees |
| **Azure ADLS** (cloud) | Hierarchical namespace (HNS), POSIX-like ACLs | Vendor lock-in |
| **GCP GCS** (cloud) | Uniform + fine-grained ACLs, object versioning | Region restrictions |
### Comparison: HDFS vs Object Storage for Big Data
| Criteria | HDFS | Object Storage (S3/MinIO) |
|----------|------|-------------------------|
| **Architecture** | Master/worker (NameNode is a SPOF unless HA is configured) | Distributed, no SPOF (erasure coding) |
| **Consistency** | Strong (single writer per file) | Strong (S3 read-after-write since December 2020; MinIO) |
| **Throughput** | High (rack-aware, locality) | High (network-bound) |
| **Scaling** | Horizontal (DataNode) | Horizontal (stateless) |
| **Cost** | Low (HDD) | Medium (S3 API) |
| **Metadata** | NameNode (1M blocks ~ 1 GB RAM) | Object-level (flat namespace) |
| **Spark integration** | Native (locality-optimized) | S3A connector, Hadoop Compatible |
| **2026 trend** | Legacy, declining | Standard for new projects |
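The NameNode rule of thumb in the Metadata row (about 1 GB of RAM per million blocks) can be turned into a rough heap estimate. The sketch below assumes the HDFS default block size of 128 MB and that files fill their blocks completely, which is the best case; many small files produce far more blocks for the same data volume. The function name and the decimal unit conversion are assumptions for the example.

```python
def namenode_heap_gb(logical_data_tb, block_size_mb=128, gb_per_million_blocks=1.0):
    """Estimate NameNode heap from the logical (pre-replication) data volume."""
    blocks = logical_data_tb * 1_000_000 / block_size_mb  # TB -> MB, then blocks
    return blocks / 1_000_000 * gb_per_million_blocks

# 720 TB of logical data stored in full 128 MB blocks
print(f"{namenode_heap_gb(720):.1f} GB")
```

The estimate covers block metadata only, so it is a lower bound rather than a sizing recommendation.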
For more information about Big Data see [BIG-DATA.en.md](BIG-DATA.en.md).
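The S3A connector listed in the comparison is configured through Hadoop properties passed to Spark with the `spark.hadoop.` prefix. The sketch below builds a minimal property set for an S3-compatible endpoint such as MinIO; the helper name `s3a_spark_conf`, the endpoint URL, and the credential strings are placeholders invented for the example, and a real deployment would normally take credentials from a secret store rather than literals.

```python
def s3a_spark_conf(endpoint, access_key, secret_key, use_tls=True):
    """Build Spark properties for the Hadoop S3A connector (hadoop-aws)."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # S3-compatible stores such as MinIO are usually addressed path-style
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_tls).lower(),
    }

# Placeholder values, not a real endpoint or real credentials
conf = s3a_spark_conf("https://minio.example.internal:9000", "EXAMPLE_KEY", "EXAMPLE_SECRET")
for key in sorted(conf):
    print(key)
```

Each key/value pair can then be passed to `SparkSession.builder.config(key, value)` before the session is created.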
## Sources
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
### Recommended reading

View File

@@ -270,6 +270,57 @@ OpenStack nabízí tři hlavní storage služby:
Ceph je nejčastější storage backend pro OpenStack: Cinder (RBD), Swift (RGW), Manila (CephFS), Glance (RBD images).
## Big Data storage
### HDFS cluster
HDFS je primární storage pro Hadoop ekosystém (on-prem). Typická konfigurace:
| Parametr | Hodnota | Poznámka |
|----------|---------|----------|
| **Disk per DataNode** | 8-24 × HDD (14-22 TB) + 2× NVMe (metadata, cache) | Balance capacity / performance |
| **Replication factor** | 3× | Rack-aware |
| **Network** | 2× 25/100 GbE (data) + 1× 1 GbE (management) | Data + replication traffic |
| **RAM** | 64-256 GB (OS cache + metadata) | HDFS cache + OS buffer cache |
| **CPU** | 16-32 cores | HDFS overhead je nízký |
| **NameNode HA** | Active + Standby + JN (JournalNode) | Quorum-based HA |
| **Use case** | Sekvenční čtení/zápis, velké soubory, Spark on YARN | |
**Modelový cluster (~0.7 PB usable):**
- 10× DataNode (12× 18 TB HDD, 2× 1.9 TB NVMe)
- 2× NameNode (HA, 256 GB RAM)
- 3× JournalNode (malé VM)
- Replication 3× → raw ~2.2 PB, usable ~0.72 PB (pro 1 PB usable je potřeba přibližně 14 takových DataNode)
- Network: 25 GbE pro data, 100 GbE pro shuffle-heavy Spark
### Object storage jako Data Lake (S3/GCS/MinIO)
Pro nové projekty (Spark on K8s, Iceberg/Delta, lakehouse) se preferuje object storage před HDFS:
| Platforma | Výhody | Limity |
|-----------|--------|--------|
| **MinIO** (on-prem) | S3 API, erasure coding, NVMe direct, high throughput | Single tenant (per cluster) |
| **Pure //C** (on-prem) | QLC NVMe, dedupe, S3 + NFS | Vyšší cena/TB |
| **AWS S3** (cloud) | Neomezená kapacita, Iceberg/Delta support | Egress fees |
| **Azure ADLS** (cloud) | Hierarchical namespace (HNS), POSIX-like ACLs | Vendor lock-in |
| **GCP GCS** (cloud) | Uniform + fine-grained ACLs, object versioning | Region restrictions |
### Srovnání: HDFS vs Object Storage pro Big Data
| Kritérium | HDFS | Object Storage (S3/MinIO) |
|-----------|------|-------------------------|
| **Architektura** | Master/worker (NameNode je SPOF bez HA) | Distributed, no SPOF (erasure coding) |
| **Konzistence** | Strong (jediný writer per file) | Strong (S3 read-after-write od prosince 2020; MinIO) |
| **Propustnost** | Vysoká (rack-aware, locality) | Vysoká (network-bound) |
| **Škálování** | Horizontální (DataNode) | Horizontální (stateless) |
| **Cena** | Nízká (HDD) | Střední (S3 API) |
| **Metadata** | NameNode (1 mil. bloků ~ 1 GB RAM) | Object-level (flat namespace) |
| **Spark integration** | Native (locality optimalizace) | S3A connector, Hadoop Compatible |
| **2026 trend** | Legacy, klesající | Standard pro nové projekty |
Podrobnější informace o Big Data viz [BIG-DATA.md](BIG-DATA.md).
## Zdroje
Odkazy, knihy a standardy: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)

View File

@@ -94,7 +94,7 @@ Variants:
## Sources
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
### Recommended reading

View File

@@ -1,10 +1,10 @@
# Infrastructure — Sources
Split into separate files:
- [HYPERVISORS.md](../../HYPERVISORS.md) — hypervisors and virtualization
- [DATACENTERS.md](../../DATACENTERS.md) — data centers
- [STORAGE.md](../../STORAGE.md) — storage
- [HARDWARE.md](../../HARDWARE.md) — hardware and servers
- [HYPERVISORS.en.md](../../HYPERVISORS.en.md) — hypervisors and virtualization
- [DATACENTERS.en.md](../../DATACENTERS.en.md) — data centers
- [STORAGE.en.md](../../STORAGE.en.md) — storage
- [HARDWARE.en.md](../../HARDWARE.en.md) — hardware and servers
## Official documentation
@@ -112,7 +112,65 @@ Split into separate files:
| Complete guide to modern vSphere alternatives — Spectro Cloud | https://www.spectrocloud.com/blog/vsphere-alternatives | `[done]` |
| Broadcom VMware Acquisition: What's Next — Sayers | https://www.sayers.com/blog/after-the-deal-whats-next-for-vmware-customers | `[done]` |
| Stanford University migration from VMware to Proxmox | https://itcommunity.stanford.edu/news/enterprise-technology-completes-successful-virtual-infrastructure-migration-vmware-proxmox | `[done]` |
| | **Sangfor** | |
| Sangfor HCI — product page | https://www.sangfor.com/cloud-and-infrastructure/products/hci-hyper-converged-infrastructure | `[done]` |
| Sangfor aSV — hypervisor | https://www.sangfor.com/cloud-and-infrastructure/products/asv-hypervisor-server-virtualization | `[done]` |
| Sangfor vs VMware — feature comparison | https://www.sangfor.com/blog/cloud-and-infrastructure/sangfor-hci-vs-vmware-feature-comparison | `[done]` |
| | **AI infrastructure** | |
| NVIDIA DGX — documentation | https://www.nvidia.com/en-us/data-center/dgx-platform/ | `[done]` |
| InfiniBand — Mellanox/NVIDIA | https://www.nvidia.com/en-us/networking/products/infiniband/ | `[done]` |
| Lustre parallel filesystem | https://www.lustre.org/ | `[done]` |
| WekaFS — AI storage | https://www.weka.io/ | `[done]` |
| vLLM — inference server | https://github.com/vllm-project/vllm | `[done]` |
| Megatron-LM — distributed training | https://github.com/NVIDIA/Megatron-LM | `[done]` |
| | **Kubernetes / Cluster API** | |
| Cluster API (CAPI) — official documentation (The CAPI Book) | https://cluster-api.sigs.k8s.io/ | `[done]` |
| Cluster API — GitHub (kubernetes-sigs/cluster-api) | https://github.com/kubernetes-sigs/cluster-api | `[done]` |
| Cluster API — provider list | https://cluster-api.sigs.k8s.io/reference/providers.html | `[done]` |
| Kubernetes — official documentation | https://kubernetes.io/docs/ | `[done]` |
| K3s — lightweight Kubernetes | https://k3s.io/ | `[done]` |
| RKE2 — Rancher Kubernetes Engine 2 | https://docs.rke2.io/ | `[done]` |
| Talos — API-driven Kubernetes OS | https://www.talos.dev/ | `[done]` |
| Kamaji — hosted control plane provider | https://kamaji.clastix.io/ | `[done]` |
| Metal3 — bare metal provider for CAPI | https://metal3.io/ | `[done]` |
| Cluster API — ClusterClass and topologies | https://kubernetes.io/blog/2021/10/08/capi-clusterclass-and-managed-topologies/ | `[done]` |
| | **Big Data** | |
| Apache Spark — official documentation | https://spark.apache.org/docs/latest/ | `[done]` |
| Apache Flink — official documentation | https://flink.apache.org/ | `[done]` |
| Trino — distributed SQL engine | https://trino.io/docs/current/ | `[done]` |
| Apache Iceberg — table format | https://iceberg.apache.org/ | `[done]` |
| Delta Lake — documentation | https://docs.delta.io/ | `[done]` |
| Apache Hudi | https://hudi.apache.org/ | `[done]` |
| Apache Paimon | https://paimon.apache.org/ | `[done]` |
| Apache Hadoop — documentation | https://hadoop.apache.org/docs/stable/ | `[done]` |
| Apache Airflow — documentation | https://airflow.apache.org/docs/ | `[done]` |
| Dagster — documentation | https://docs.dagster.io/ | `[done]` |
| Prefect — documentation | https://docs.prefect.io/ | `[done]` |
| HDFS architecture (Apache) | https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html | `[done]` |
| | **Operating Systems** | |
| Ubuntu lifecycle — Ubuntu Pro + ESM | https://ubuntu.com/about/release-cycle | `[done]` |
| RHEL lifecycle — Red Hat Enterprise Linux | https://access.redhat.com/support/policy/updates/errata | `[done]` |
| Rocky Linux lifecycle | https://rockylinux.org/download/ | `[done]` |
| AlmaLinux lifecycle | https://almalinux.org/ | `[done]` |
| Debian releases / LTS | https://wiki.debian.org/LTS | `[done]` |
| SLES lifecycle — SUSE | https://www.suse.com/lifecycle/ | `[done]` |
| Alpine Linux releases | https://alpinelinux.org/releases/ | `[done]` |
| Fedora lifecycle | https://docs.fedoraproject.org/en-US/releases/lifecycle/ | `[done]` |
| SELinux — Red Hat docs | https://www.redhat.com/en/topics/linux/what-is-selinux | `[done]` |
| AppArmor — Ubuntu wiki | https://wiki.ubuntu.com/AppArmor | `[done]` |
| | **Windows** | |
| Windows Server lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2022/ | `[done]` |
| Windows Server 2025 lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2025/ | `[done]` |
| Windows 11 lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-11-enterprise/ | `[done]` |
| Windows 10 EOL | https://learn.microsoft.com/en-us/lifecycle/products/windows-10-enterprise/ | `[done]` |
| Windows Server licensing (per core) | https://learn.microsoft.com/en-us/windows-server/get-started/editions-and-support | `[done]` |
| | **GPU pricing** | |
| NVIDIA AI GPU pricing guide (2026) | https://intuitionlabs.ai/articles/nvidia-ai-gpu-pricing-guide | `[done]` |
| GPU cloud pricing comparison (2026) | https://www.spheron.network/blog/gpu-cloud-pricing-comparison-2026/ | `[done]` |
| GPU pricing trends 2026 — CompuX | https://compux.net/docs/guides/gpu-pricing-trends-2026 | `[done]` |
| AMD MI300X pricing (2026) | https://www.thundercompute.com/blog/amd-mi300x-pricing | `[done]` |
| GPU price/performance frontier — Silicon Analysts | https://siliconanalysts.com/tools/frontier | `[done]` |
## Hardware manufacturers
| Manufacturer | Server series | Management |


@@ -111,8 +111,81 @@ Rozděleno do samostatných souborů:
| VMware Migration in 2026: Proxmox, KVM, XCP-ng & Veeam — StarWind | https://starwindsoftware.com/blog/vmware-migration-to-proxmox-kvm-xcp-ng-2026 | `[done]` |
| Complete guide to modern vSphere alternatives — Spectro Cloud | https://www.spectrocloud.com/blog/vsphere-alternatives | `[done]` |
| Broadcom VMware Acquisition: What's Next — Sayers | https://www.sayers.com/blog/after-the-deal-whats-next-for-vmware-customers | `[done]` |
| Stanford University migration from VMware to Proxmox | https://itcommunity.stanford.edu/news/enterprise-technology-completes-successful-virtual-infrastructure-migration-vmware-proxmox | `[done]` |
| | **Messaging / streaming** | |
| Apache Kafka docs | https://kafka.apache.org/documentation/ | `[done]` |
| RabbitMQ docs | https://www.rabbitmq.com/documentation.html | `[done]` |
| Apache Pulsar docs | https://pulsar.apache.org/docs/ | `[done]` |
| NATS docs | https://docs.nats.io/ | `[done]` |
| Designing Event-Driven Systems (Confluent) | https://www.confluent.io/designing-event-driven-systems/ | `[done]` |
| Kafka: The Definitive Guide (2nd ed.) — Confluent | https://www.confluent.io/resources/kafka-the-definitive-guide/ | `[done]` |
| Enterprise Integration Patterns — Hohpe & Woolf | https://www.enterpriseintegrationpatterns.com/ | `[done]` |
| | **DC migration** | |
| AWS Cloud Migration — 6 Strategies for Migrating to the Cloud | https://aws.amazon.com/blogs/enterprise-strategy/6-strategies-for-migrating-applications-to-the-cloud/ | `[done]` |
| Azure Cloud Migration — Microsoft Cloud Adoption Framework | https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ | `[done]` |
| Gartner 5 Rs of Cloud Migration | https://www.gartner.com/en/documents/3984835 | `[done]` |
| VMware Site Recovery Manager — documentation | https://docs.vmware.com/en/Site-Recovery-Manager/ | `[done]` |
| Zerto — Disaster Recovery & Migration | https://www.zerto.com/resources/ | `[done]` |
| The Phoenix Project — IT Ops & Migration patterns | https://itrevolution.com/product/the-phoenix-project/ | `[done]` |
| | **Sangfor** | |
| Sangfor HCI — product page | https://www.sangfor.com/cloud-and-infrastructure/products/hci-hyper-converged-infrastructure | `[done]` |
| Sangfor aSV — hypervisor | https://www.sangfor.com/cloud-and-infrastructure/products/asv-hypervisor-server-virtualization | `[done]` |
| Sangfor vs VMware — feature comparison | https://www.sangfor.com/blog/cloud-and-infrastructure/sangfor-hci-vs-vmware-feature-comparison | `[done]` |
| | **AI infrastructure** | |
| NVIDIA DGX — documentation | https://www.nvidia.com/en-us/data-center/dgx-platform/ | `[done]` |
| InfiniBand — Mellanox/NVIDIA | https://www.nvidia.com/en-us/networking/products/infiniband/ | `[done]` |
| Lustre parallel filesystem | https://www.lustre.org/ | `[done]` |
| WekaFS — AI storage | https://www.weka.io/ | `[done]` |
| vLLM — inference server | https://github.com/vllm-project/vllm | `[done]` |
| Megatron-LM — distributed training | https://github.com/NVIDIA/Megatron-LM | `[done]` |
| | **Kubernetes / Cluster API** | |
| Cluster API (CAPI) — official documentation (The CAPI Book) | https://cluster-api.sigs.k8s.io/ | `[done]` |
| Cluster API — GitHub (kubernetes-sigs/cluster-api) | https://github.com/kubernetes-sigs/cluster-api | `[done]` |
| Cluster API — provider list | https://cluster-api.sigs.k8s.io/reference/providers.html | `[done]` |
| Kubernetes — official documentation | https://kubernetes.io/docs/ | `[done]` |
| K3s — lightweight Kubernetes | https://k3s.io/ | `[done]` |
| RKE2 — Rancher Kubernetes Engine 2 | https://docs.rke2.io/ | `[done]` |
| Talos — API-driven Kubernetes OS | https://www.talos.dev/ | `[done]` |
| Kamaji — hosted control plane provider | https://kamaji.clastix.io/ | `[done]` |
| Metal3 — bare metal provider for CAPI | https://metal3.io/ | `[done]` |
| Cluster API — ClusterClass and topologies | https://kubernetes.io/blog/2021/10/08/capi-clusterclass-and-managed-topologies/ | `[done]` |
| | **Big Data** | |
| Apache Spark — official documentation | https://spark.apache.org/docs/latest/ | `[done]` |
| Apache Flink — official documentation | https://flink.apache.org/ | `[done]` |
| Trino — distributed SQL engine | https://trino.io/docs/current/ | `[done]` |
| Apache Iceberg — table format | https://iceberg.apache.org/ | `[done]` |
| Delta Lake — documentation | https://docs.delta.io/ | `[done]` |
| Apache Hudi | https://hudi.apache.org/ | `[done]` |
| Apache Paimon | https://paimon.apache.org/ | `[done]` |
| Apache Hadoop — documentation | https://hadoop.apache.org/docs/stable/ | `[done]` |
| Apache Airflow — documentation | https://airflow.apache.org/docs/ | `[done]` |
| Dagster — documentation | https://docs.dagster.io/ | `[done]` |
| Prefect — documentation | https://docs.prefect.io/ | `[done]` |
| HDFS architecture (Apache) | https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html | `[done]` |
| | **Operating Systems** | |
| Ubuntu lifecycle — Ubuntu Pro + ESM | https://ubuntu.com/about/release-cycle | `[done]` |
| RHEL lifecycle — Red Hat Enterprise Linux | https://access.redhat.com/support/policy/updates/errata | `[done]` |
| Rocky Linux lifecycle | https://rockylinux.org/download/ | `[done]` |
| AlmaLinux lifecycle | https://almalinux.org/ | `[done]` |
| Debian releases / LTS | https://wiki.debian.org/LTS | `[done]` |
| SLES lifecycle — SUSE | https://www.suse.com/lifecycle/ | `[done]` |
| Alpine Linux releases | https://alpinelinux.org/releases/ | `[done]` |
| Fedora lifecycle | https://docs.fedoraproject.org/en-US/releases/lifecycle/ | `[done]` |
| SELinux — Red Hat docs | https://www.redhat.com/en/topics/linux/what-is-selinux | `[done]` |
| AppArmor — Ubuntu wiki | https://wiki.ubuntu.com/AppArmor | `[done]` |
| | **Windows** | |
| Windows Server lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2022/ | `[done]` |
| Windows Server 2025 lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2025/ | `[done]` |
| Windows 11 lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-11-enterprise/ | `[done]` |
| Windows 10 EOL | https://learn.microsoft.com/en-us/lifecycle/products/windows-10-enterprise/ | `[done]` |
| Windows Server licensing (per core) | https://learn.microsoft.com/en-us/windows-server/get-started/editions-and-support | `[done]` |
| | **GPU pricing** | |
| NVIDIA AI GPU pricing guide (2026) | https://intuitionlabs.ai/articles/nvidia-ai-gpu-pricing-guide | `[done]` |
| GPU cloud pricing comparison (2026) | https://www.spheron.network/blog/gpu-cloud-pricing-comparison-2026/ | `[done]` |
| GPU pricing trends 2026 — CompuX | https://compux.net/docs/guides/gpu-pricing-trends-2026 | `[done]` |
| AMD MI300X pricing (2026) | https://www.thundercompute.com/blog/amd-mi300x-pricing | `[done]` |
| GPU price/performance frontier — Silicon Analysts | https://siliconanalysts.com/tools/frontier | `[done]` |
## Hardware manufacturers
| Manufacturer | Server series | Management |