# 🧠 AI/ML Infrastructure

## Component overview

```mermaid
flowchart TD
    subgraph Compute
        GPU["GPU (H100/B200/Instinct)"]
        CPU["CPU (AMD EPYC / Intel Xeon)"]
        ASIC["ASIC (TPU, Trainium, Inferentia)"]
    end
    subgraph Network
        IB["InfiniBand NDR/XDR"]
        ROCE["RoCEv2"]
        NVL["NVLink / NVSwitch"]
    end
    subgraph Storage
        FS["Parallel FS (Lustre, GPFS, Weka)"]
        OBJ["Object Store (S3, MinIO)"]
        NVME["Local NVMe cache"]
    end
    subgraph Orchestration
        S["Slurm"]
        K["Kubernetes + Volcano/Kueue"]
    end
    subgraph Cooling
        DLC["Direct-to-chip liquid"]
        IMM["Immersion"]
        AIR["Air (high-density)"]
    end

    Compute --> Network --> Storage
    Orchestration --> Compute
    Cooling --> Compute
```

---

## GPU compute

### NVIDIA

| GPU | Architecture | FP8 | FP16/BF16 | FP64 | HBM | NVLink | TDP | Typical system |
|-----|--------------|-----|-----------|------|-----|--------|-----|----------------|
| **H100 SXM** | Hopper | 3,958 TFLOPS | 1,979 TFLOPS | 67 TFLOPS | 80 GB HBM3 | 900 GB/s | 700 W | 8× in DGX H100 |
| **H200 SXM** | Hopper (HBM3e) | 3,958 TFLOPS | 1,979 TFLOPS | 67 TFLOPS | 141 GB HBM3e | 900 GB/s | 700 W | 8× in DGX H200 |
| **B200** | Blackwell | ~9,000 TFLOPS | ~4,500 TFLOPS | ~40 TFLOPS | 192 GB HBM3e | 1,800 GB/s | 1,000 W | 8× in DGX B200 |
| **GB200 Grace Blackwell** (superchip: 2× GPU + 1× Grace CPU) | Blackwell | ~18,000 TFLOPS | ~9,000 TFLOPS | – | 2× 192 GB HBM3e + 480 GB LPDDR5X (Grace) | NVLink-C2C (CPU↔GPU) | 1,000 W (per GPU) + 500 W (CPU) | GB200 NVL72 (72× GPU, 36× Grace CPU) |
| **L40S** | Ada Lovelace | 733 TFLOPS | 367 TFLOPS | – | 48 GB GDDR6 | N/A | 350 W | Inference, enterprise |
| **A100 SXM** | Ampere | – (no FP8; 1,248 TOPS INT8) | 624 TFLOPS | 19.5 TFLOPS | 80 GB HBM2e | 600 GB/s | 400 W | 8× in DGX A100 |

The H100, H200, B200, and A100 tensor-core figures are peak values with structured sparsity; dense throughput is roughly half. FP64 is the tensor-core rate.

### AMD

| GPU | Architecture | FP8 | FP16/BF16 | FP64 | HBM | Infinity Fabric | TDP |
|-----|--------------|-----|-----------|------|-----|-----------------|-----|
| **MI300X** | CDNA 3 | 2,615 TFLOPS | 1,307 TFLOPS | 81 TFLOPS | 192 GB HBM3 | 896 GB/s | 750 W |
| **MI250X** | CDNA 2 | – | 383 TFLOPS | 95.7 TFLOPS | 128 GB HBM2e | 400 GB/s | 500 W |

The MI300X FP8 and FP16 figures are dense (no sparsity), so compare them with the halved NVIDIA numbers above.

### Intel

| Accelerator | Architecture | FP16/BF16 | FP32 | HBM | TDP |
|-------------|--------------|-----------|------|-----|-----|
| **Gaudi 3** | Custom | 1,835 TFLOPS | – | 128 GB HBM2e | 600 W (PCIe) / 900 W (OAM) |
| **Max 1550** | Xe HPC | 600+ TFLOPS | ~52 TFLOPS | 128 GB HBM2e | 600 W |

### Cloud ASIC

| ASIC | Provider | Use case | Performance |
|------|----------|----------|-------------|
| **TPU v5p** | Google | Training | ~459 TFLOPS (BF16) per chip; up to 8,960 chips per pod |
| **Trainium 2** | AWS | Training | ~1,000 TFLOPS (BF16) per chip |
| **Inferentia 2** | AWS | Inference | ~400 TOPS (INT8) per chip |
| **Maia 100** | Microsoft | Training + inference | Custom, 800 W TDP |

---

## AI networking

### Technology comparison

| Technology | Bandwidth per link | Latency | Topology | Use case |
|------------|--------------------|---------|----------|----------|
| **InfiniBand NDR200** | 200 Gb/s | < 1 µs | Fat-tree, Dragonfly+ | Training (NVIDIA) |
| **InfiniBand NDR400** | 400 Gb/s | < 1 µs | Fat-tree, Dragonfly+ | Training (NVIDIA) |
| **InfiniBand XDR** | 800 Gb/s (planned) | < 1 µs | Dragonfly+ | Next-gen training |
| **RoCEv2** (CX-7/8) | 200–400 Gb/s | 1–2 µs | Fat-tree, Spine-leaf | Training (AMD, Intel, open) |
| **NVLink 4.0** | 900 GB/s per GPU | < 0.5 µs | NVSwitch full-mesh | Intra-node GPU comm |
| **NVLink 5.0** | 1,800 GB/s per GPU | < 0.5 µs | NVSwitch full-mesh | Intra-node (Blackwell) |
| **Ethernet (400 GbE)** | 400 Gb/s | 2–5 µs | Spine-leaf | Inference, data pipeline |

Note the units: network links are quoted in Gb/s (bits), NVLink in GB/s (bytes).

### AI fabric principles

- **Rail-optimized topology**: GPUs with the same index on every node connect to the same leaf switch (a "rail"), so same-index traffic stays one switch hop away
- **Fat-tree (Clos)**: standard for InfiniBand and RoCE; gives non-blocking bisection bandwidth when built without oversubscription
- **Dragonfly+**: reduces hop count while maintaining bandwidth (used in the largest clusters)
- **GPUDirect RDMA**: the NIC reads and writes GPU memory directly, so inter-node GPU ↔ GPU transfers bypass host memory and the CPU; works over InfiniBand and RoCE
- **SHARP (Scalable Hierarchical Aggregation and Reduction Protocol)**: in-network reduction for AllReduce (InfiniBand only)

### Bandwidth sizing

```text
Rule of thumb: network bandwidth per GPU ≥ 50 % of GPU HBM bandwidth for scalable training

Example: H100 SXM has 3.35 TB/s of HBM bandwidth
  → Target: ~1.7 TB/s of network bandwidth per GPU
  → DGX H100: 8× NDR 400 Gb/s NICs, one per GPU = 50 GB/s per GPU (400 GB/s per node)
  → Reality: 50 GB/s ÷ 3,350 GB/s ≈ 1.5 % of HBM bandwidth → the network is the bottleneck
  → With 200 Gb/s links (25 GB/s per GPU) the ratio drops to ~0.7 %
```
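
The ratio between per-GPU network bandwidth and HBM bandwidth can be cross-checked with a few lines of Python. This is a minimal sketch: the function name `network_to_hbm_ratio` and its parameters are illustrative assumptions, and the inputs (one 400 Gb/s NIC per GPU, 3.35 TB/s of HBM bandwidth) are the H100 figures quoted in this section. Swapping in 200 Gb/s links or several NICs per GPU only changes the arguments.

```python
def network_to_hbm_ratio(nic_gbit_per_s: float, nics_per_gpu: float, hbm_tb_per_s: float) -> float:
    """Fraction of one GPU's HBM bandwidth that its inter-node network links can deliver."""
    nic_gbyte_per_s = nic_gbit_per_s / 8.0              # Gb/s -> GB/s
    network_gbyte_per_s = nic_gbyte_per_s * nics_per_gpu
    hbm_gbyte_per_s = hbm_tb_per_s * 1000.0             # TB/s -> GB/s
    return network_gbyte_per_s / hbm_gbyte_per_s


# Assumed inputs: one NDR 400 Gb/s NIC per GPU, H100 SXM HBM at 3.35 TB/s.
ratio = network_to_hbm_ratio(nic_gbit_per_s=400, nics_per_gpu=1, hbm_tb_per_s=3.35)
print(f"network / HBM bandwidth per GPU: {ratio:.1%}")
```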

---

## AI storage

### Requirements

| Dataset size | IO pattern | Recommended storage | Bandwidth |
|--------------|------------|---------------------|-----------|
| < 10 TB | Sequential read (data loading) | Local NVMe | > 10 GB/s per node |
| 10–100 TB | Random read (data loading), large sequential writes (checkpointing) | Parallel FS (Lustre, Weka) | > 100 GB/s cluster-wide |
| 100 TB–10 PB | Mixed (training + checkpoint) | Parallel FS + object store | > 500 GB/s |
| 10 PB+ | Multi-modal, video, LLM | Tiered (NVMe cache + parallel FS + object) | > 1 TB/s |

### Storage solution comparison

| Solution | Type | Bandwidth | Max capacity | Scaling | Use case |
|----------|------|-----------|--------------|---------|----------|
| **Lustre** | Parallel FS (POSIX) | > 100 GB/s (cluster) | 100s PB | OST + MDS | HPC, LLM training (standard) |
| **GPFS / IBM Storage Scale** | Parallel FS (POSIX) | > 100 GB/s (cluster) | 100s PB | NSD servers | HPC, AI (IBM) |
| **WekaFS** | Parallel FS (POSIX + NFS/SMB) | ~80 GB/s per 10 nodes | 10s PB | Container-native | AI/ML, often paired with NVIDIA DGX |
| **VAST Data** | Universal storage (NVMe + QLC) | ~100 GB/s per cluster | 10s PB | Scale-out | AI, checkpoint, data lake |
| **Pure Storage //E** | All-flash (NVMe) | ~50 GB/s | ~30 PB | Scale-out | Enterprise AI, database |
| **MinIO / S3** | Object store | ~20 GB/s per gateway | EB scale | Erasure coding | Dataset repository, checkpoint |
| **NetApp AFF** | NAS + S3 | ~10 GB/s per controller | ~50 PB | HA pair | Enterprise, NFS baseline |

### Checkpointing strategies

| Strategy | RPO | Storage impact | Description |
|----------|-----|----------------|-------------|
| **Full checkpoint** | every N steps | High (stops training) | Full model + optimizer state |
| **Async checkpoint** | every N steps | Medium (non-blocking) | Copy to staging buffer, async write |
| **Distributed checkpoint** (NVIDIA NeMo) | every N steps | Low | Each rank writes its own shard |
| **In-memory checkpoint** (IBM) | on failover | Minimal (DRAM) | Replication to another node's DRAM |
| **Continuous checkpoint** (Microsoft) | every 1–5 min | Low (delta) | Changed shards only |

---

## AI cluster architecture

### Physical topology: DGX H100 example

```text
┌──────── DGX H100 (8× GPU) ─────────┐
│  ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐   │
│  │GPU 0│ │GPU 1│ │GPU 2│ │GPU 3│   │
│  └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘   │
│  ┌──┴──┐ ┌──┴──┐ ┌──┴──┐ ┌──┴──┐   │
│  │GPU 4│ │GPU 5│ │GPU 6│ │GPU 7│   │
│  └─────┘ └─────┘ └─────┘ └─────┘   │
│  NVSwitch (NVLink 4.0, 900 GB/s)   │
│  InfiniBand CX-7: 8× NDR400        │
└──────────────────┬─────────────────┘
                   │ 8× IB rails
        ┌──────────┴─────────┐
        │ IB NDR400 Switches │ (rail-optimized)
        └────────────────────┘
```

### Kubernetes for AI

| Component | Role |
|-----------|------|
| **Volcano** | Batch scheduling, gang scheduling, queue management |
| **Kueue** | Multi-tenant admission, resource quotas, fair sharing |
| **NVIDIA GPU Operator** | Driver, container toolkit, MIG, DCGM, monitoring |
| **HAMi** (formerly k8s-vGPU-scheduler) | GPU sharing, MIG partitioning, fractional GPU |
| **Node Feature Discovery** | GPU type detection, NUMA topology |
| **Topology Manager** | NUMA-aware pod placement |
| **DPDK / SR-IOV** | High-performance networking for GPUDirect RDMA |

### Slurm for AI

| Component | Role |
|-----------|------|
| **slurm.conf** | Partitions for GPU nodes, GRES (Generic Resource) definitions |
| **gres.conf** | GPU type and GPU count per node |
| **srun --gres=gpu:8** | Allocate 8 GPUs per node |
| **sbatch --nodes=64 --ntasks=512** | 64 nodes, 512 ranks (8 ranks per node, one per GPU) |
| **Pyxis** | NVIDIA SPANK plugin that runs containerized jobs under Slurm (via Enroot) |

---

## AI cluster cooling

### Power density comparison

| Configuration | Power draw (configuration) | Units per rack | kW/rack | Note |
|---------------|----------------------------|----------------|---------|------|
| Standard server (2U) | 1 kW | 20 | 5–10 | Typical DC |
| GPU server (DGX H100, 6×) | 42 kW | 6 | 45–50 | Air cooling limit |
| GPU server (DGX B200, 6×) | 72 kW | 6 | 90–100 | Liquid cooling required |
| GB200 NVL72 (rack-scale system) | 120 kW | 1 | ~120 | Liquid cooling mandatory (fully liquid cooled) |

### Cooling technologies

| Method | Max kW/rack | CAPEX | OPEX | Complexity |
|--------|-------------|-------|------|------------|
| **Air cooling (CRAC/CRAH)** | < 15 | Low | Medium | Low |
| **Air cooling (in-row)** | 15–30 | Medium | Medium | Low |
| **Rear-door heat exchanger** | 30–50 | Medium | Low | Medium |
| **Direct-to-chip liquid (cold plate)** | 50–150 | High | Low | High |
| **Immersion (single-phase)** | 100–200 | High | Low | High |
| **Immersion (two-phase)** | 200+ | Very high | Low | Very high |

---

## Inference infrastructure

### Inference server comparison

| Tool | Model formats / backends | Optimization | Use case |
|------|--------------------------|--------------|----------|
| **vLLM** | Hugging Face models, AWQ/GPTQ quantized checkpoints (Megatron checkpoints after conversion) | PagedAttention (paged KV cache), continuous batching | LLM inference (open source) |
| **TensorRT-LLM** | TensorRT engines | INT4/INT8/FP8, in-flight batching, attention optimizations | Production (NVIDIA) |
| **Triton Inference Server** | Multiple backends (TensorRT, vLLM, PyTorch) | Model ensemble, model caching, concurrent execution | Enterprise, multi-model |
| **SageMaker** | Managed | Auto-scaling, model parallelism | AWS managed |
| **TGI** (Text Generation Inference, OpenAI-compatible API) | HF Transformers | Continuous batching, flash attention | Hosting |

### Inference optimization

| Technique | Latency improvement | Throughput improvement | Memory reduction |
|-----------|---------------------|------------------------|------------------|
| **FP8/INT8 quantization** | – | ~2× | 2× (weights, vs FP16) |
| **INT4 quantization** | – | up to ~4× | 4× (weights, vs FP16) |
| **Flash Attention 2/3** | 2–4× (attention kernel) | – | Attention activations shrink from O(N²) to O(N) in sequence length; KV cache size is unchanged |
| **PagedAttention** | – | 2–5× | KV cache waste from fragmentation cut to a few percent |
| **Continuous batching** | – | 10–20× | – |
| **Speculative decoding** | 2–3× | – | – |
| **Multi-LoRA / S-LoRA** | – | 8–16× | – |

---

## Distributed training techniques

| Technique | Description | Frameworks |
|-----------|-------------|------------|
| **Data Parallelism (DDP/FSDP)** | Each GPU processes a different batch; DDP replicates the full model, FSDP shards parameters, gradients, and optimizer state | PyTorch DDP, FSDP |
| **Tensor Parallelism (TP)** | Each layer's weight matrices are split across GPUs (usually within a node, over NVLink) | Megatron-LM, DeepSpeed |
| **Pipeline Parallelism (PP)** | Consecutive groups of layers (stages) run on different GPUs or nodes | Megatron-LM, DeepSpeed |
| **Sequence Parallelism (SP)** | Sequence dimension split across GPUs | Megatron-LM |
| **Expert Parallelism (EP)** | Different experts of a Mixture-of-Experts (MoE) model placed on different GPUs | Megatron-LM, DeepSpeed-MoE |
| **3D Parallelism** | TP + PP + DP combination | Megatron-LM, NeMo |
| **ZeRO (1/2/3)** | Shards optimizer state (stage 1), plus gradients (stage 2), plus parameters (stage 3) | DeepSpeed |
| **NCCL / RCCL** | GPU collective communication library | NVIDIA/AMD |

---

## Operating systems for AI

### Distribution comparison

| OS | GPU driver | CUDA | Container toolkit | IB/RoCE | Lustre client | Production support |
|----|------------|------|-------------------|---------|---------------|--------------------|
| **Ubuntu 22.04 LTS** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED, rdma-core | Yes (lustre-client) | NVIDIA DGX standard |
| **Ubuntu 24.04 LTS** | NVIDIA 550+ | 12.5+ | nvidia-container-toolkit | MLNX_OFED, rdma-core | Yes | Latest GPU support |
| **RHEL 9 / Rocky 9** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED | Yes (EL repo) | Red Hat, enterprise |
| **DGX OS** (Ubuntu-based) | NVIDIA custom | 12.x | Pre-installed | Pre-configured | Yes | Supported on NVIDIA DGX only |
| **SLES 15 SP5** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED | Yes | HPC, some Lustre clusters |
| **Debian 12** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | rdma-core | Yes (backports) | Community, research |
| **Flatcar / Bottlerocket** | Container-host | – | nvidia-container-toolkit | Limited | No | K8s-only, minimal footprint |

### Limitations and constraints

#### GPU drivers and CUDA

| Constraint | Detail |
|------------|--------|
| **Driver-CUDA compatibility** | The driver must be at least as new as the CUDA toolkit requires; it does not have to match exactly (driver ≥ CUDA requirement). E.g., the CUDA 12.4 toolkit ships with driver 550, and CUDA 12.x applications run on driver ≥ 525 through minor-version compatibility |
| **Kernel version** | The NVIDIA driver is not compatible with every kernel. A new kernel (6.8+) may require a DKMS rebuild or a wait for driver support |
| **Secure Boot** | The NVIDIA kernel module must be signed (MOK, shim), or Secure Boot must be disabled; a common enterprise issue |
| **Open vs proprietary driver** | `nvidia-open` (since R515) is the open-source kernel module. It supports Turing and newer GPUs and is required for Blackwell; older GPUs (Maxwell, Pascal, Volta) need the proprietary module |
| **nvidia-persistenced** | Keeps the driver state loaded while no process uses the GPU; without it the GPU is re-initialized on next use, which adds startup latency and drops applied settings (legacy alternative: `nvidia-smi -pm 1`) |
| **GPU reset** | After a crashed training job the GPU may hang. Try `nvidia-smi --gpu-reset`, then a node reboot, and in the worst case a power cycle |
| **Multi-instance GPU (MIG)** | Requires a MIG-capable GPU (A100, A30, H100, H200, B200) and a supporting driver. Enabling or disabling MIG mode needs a GPU reset, so it cannot be toggled under a running workload |

#### Network (InfiniBand / RoCE)

| Constraint | Detail |
|------------|--------|
| **MLNX_OFED vs rdma-core** | MLNX_OFED (NVIDIA): full feature support, but it ships its own kernel modules, which must match the kernel version. `rdma-core` (open): no custom modules, but limited feature support |
| **Kernel compatibility** | MLNX_OFED supports only specific kernel versions (major.minor). Kernel upgrade → MLNX_OFED rebuild required |
| **NCCL** | The NCCL version must be compatible with CUDA and the IB firmware. Validate with `nccl-tests` |
| **SHARP** | In-network reduction requires a specific MLNX_OFED + IB switch firmware combination |
| **GPUDirect RDMA** | Requires the `nvidia-peermem` module + MLNX_OFED. Does not work with all GPU and IB card combinations |
| **RoCE PFC/ECN** | RoCE requires a lossless fabric (PFC, ECN, DCQCN). Switch and host configuration need complex tuning |

#### Storage

| Constraint | Detail |
|------------|--------|
| **Lustre client** | Client and server versions must stay within Lustre's supported interoperability range, so a server upgrade usually means upgrading all clients. Client packages target specific distributions and kernels (RHEL derivatives, SLES, Ubuntu/Debian) |
| **POSIX locking** | NFS and Lustre differ in POSIX locking behavior (Lustre needs the `flock` mount option for `flock()` to work). Tools that rely on `flock` can misbehave on mixed filesystems |
| **Filesystem cache** | Page cache can mask IO bottlenecks. Benchmark with `O_DIRECT` or sync IO to see the real storage throughput |
| **Local NVMe vs parallel FS** | Dataset staging on local NVMe eliminates the network dependency but requires space and a pre-fetch pipeline |

#### Container runtime

| Constraint | Detail |
|------------|--------|
| **Docker + GPU** | `nvidia-container-toolkit` (formerly nvidia-docker2). Requires runtime installation and configuration in `/etc/docker/daemon.json` |
| **Podman + GPU** | Requires `nvidia-container-toolkit` + podman hook. Less tested than Docker |
| **containerd + GPU** | Standard for K8s. Requires CDI (Container Device Interface) or `nvidia-container-runtime` |
| **Enroot + Pyxis** | NVIDIA container stack for Slurm (Enroot = daemonless container runtime, Pyxis = Slurm plugin) |
| **User namespace mapping** | Container GPU access needs device cgroup rules for `/dev/nvidia*` (and `/dev/dri`); rootless containers may fail without them |

#### Kernel parameters

```text
# Recommended sysctl settings for AI workloads
net.core.rmem_max = 134217728               # large socket buffers, sufficient for NCCL over TCP
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.netdev_budget = 600                # for high packet rates
vm.max_map_count = 1048576                  # many memory mappings (PyTorch DataLoader workers)
kernel.numa_balancing = 0                   # automatic NUMA balancing breaks pinned locality
kernel.sched_min_granularity_ns = 10000000  # sysctl only on kernels < 5.13 (later moved to debugfs)

# Kernel boot parameters (kernel command line, not sysctl)
mitigations=off                             # disables CPU security mitigations; dedicated AI clusters only
transparent_hugepage=never                  # or madvise; THP may cause latency spikes
intel_idle.max_cstate=1                     # reduce C-state transition latency (Intel CPUs)
```
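
One way to keep such settings from silently drifting is to compare a recommended list against the live values under `/proc/sys`. The following is a minimal Python sketch, not an official tool: the helper names (`parse_sysctl`, `read_current`, `find_drift`) and the shortened `RECOMMENDED` list are illustrative assumptions. It handles only sysctl keys; boot parameters such as `mitigations=off` live on the kernel command line and are out of scope here.

```python
from pathlib import Path
from typing import Dict, Optional, Tuple


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse 'key = value  # comment' lines; runs of whitespace in values collapse to one space."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue  # skip blank lines and anything that is not key = value
        key, value = line.split("=", 1)
        settings[key.strip()] = " ".join(value.split())
    return settings


def read_current(key: str, root: str = "/proc/sys") -> Optional[str]:
    """Read the live value of a sysctl key, or None if it is absent or unreadable."""
    path = Path(root) / key.replace(".", "/")  # dots in the key map to path separators
    try:
        return " ".join(path.read_text().split())
    except OSError:
        return None


def find_drift(
    wanted: Dict[str, str], current: Dict[str, Optional[str]]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {key: (wanted, current)} for keys whose live value differs or is missing."""
    return {k: (v, current.get(k)) for k, v in wanted.items() if current.get(k) != v}


# Shortened, illustrative subset of the recommended settings.
RECOMMENDED = """
net.core.rmem_max = 134217728   # sufficient for NCCL
net.ipv4.tcp_rmem = 4096 87380 134217728
kernel.numa_balancing = 0
"""

if __name__ == "__main__":
    wanted = parse_sysctl(RECOMMENDED)
    live = {key: read_current(key) for key in wanted}
    for key, (want, have) in find_drift(wanted, live).items():
        print(f"{key}: want {want!r}, have {have!r}")
```

The key-to-path mapping is deliberately naive: keys that embed an interface name containing dots would need extra handling.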

#### Firmware and HW

| Constraint | Detail |
|------------|--------|
| **GPU firmware (VBIOS)** | NVIDIA datacenter GPUs (H100, B200) receive VBIOS updates via NVFlash. Without them, partitioning support or newer CUDA features may be missing |
| **InfiniBand firmware** | IB switch and HCA firmware must be compatible. Mixing an old switch with a new HCA degrades performance |
| **NVSwitch firmware** | On DGX systems, NVSwitch firmware can be updated only with NVIDIA DGX tools |
| **Power capping (nvidia-smi)** | `nvidia-smi -pl <watts>` limits board power for power-budget management. Test the impact on training throughput |
| **GPU clock locking** | `nvidia-smi -lgc <min,max>` locks the GPU clock range for stable benchmarks (`-ac <mem,graphics>` sets application clocks). Apply with persistence enabled (`nvidia-persistenced`), otherwise the setting is lost when the driver unloads |
| **PCIe Gen** | A GPU in a PCIe Gen4 slot (instead of Gen5) gets half the CPU↔GPU transfer bandwidth. Matters for data loading and CPU offload (e.g., FSDP with CPU offload) |

### Recommended OS per use case

| Use case | OS | Rationale |
|----------|----|-----------|
| **DGX cluster (production)** | DGX OS / Ubuntu 22.04 LTS | NVIDIA standard, best driver support |
| **Enterprise K8s (OpenShift)** | RHEL 9 / RHCOS | Red Hat support, GPU Operator compatible |
| **Vanilla K8s (on-prem)** | Ubuntu 22.04 LTS + Flatcar (workers) | Widest community support, Flatcar for minimal footprint |
| **Slurm cluster (HPC/AI)** | Rocky Linux 9 / Ubuntu 22.04 LTS | EL ecosystem (Lustre, OFED) or Ubuntu (community) |
| **Research / rapid prototyping** | Ubuntu 24.04 LTS | Latest CUDA, PyTorch, driver support |
| **Edge inference** | NVIDIA JetPack / Ubuntu (ARM) | Embedded GPU (Jetson Orin, AGX) |

---

## AI-ready data center checklist

| Area | Requirement |
|------|-------------|
| **Power** | 30–120 kW/rack, HVDC (400 V DC), UPS supporting GPU spikes |
| **Cooling** | Liquid cooling ready (direct-to-chip), rear-door for 30+ kW |
| **Network** | InfiniBand (NDR/XDR) or RoCEv2, rail-optimized fat-tree |
| **Storage** | Parallel FS (Lustre/Weka), checkpoint bandwidth > 100 GB/s |
| **GPU density** | Max GPU/rack, minimize NVSwitch hops |
| **Physical** | Floor load 1,500+ kg/m², rack 52U–60U |
| **Security** | Tenant isolation, network segmentation, data encryption |
| **Monitoring** | DCGM, NCCL health checks, thermals, power capping |

---

## Model and throughput limitations

### Model size per GPU

The largest model that fits on a single GPU depends on HBM capacity and numeric precision:

| GPU | HBM | FP32 | FP16/BF16 | INT8 | INT4 |
|-----|-----|------|-----------|------|------|
| **H100 80GB** | 80 GB | ~20B | ~40B | ~80B | ~160B |
| **H200 141GB** | 141 GB | ~35B | ~70B | ~140B | ~280B |
| **B200 192GB** | 192 GB | ~48B | ~96B | ~192B | ~384B |
| **MI300X 192GB** | 192 GB | ~48B | ~96B | ~192B | ~384B |
| **A100 80GB** | 80 GB | ~20B | ~40B | ~80B | ~160B |
| **GB200 (192+480)** | 192 GB GPU + 480 GB Grace | – | ~96B + CPU offload | – | – |

*Approximate, weights only: 1B params ≈ 4 GB FP32 ≈ 2 GB FP16 ≈ 1 GB INT8 ≈ 0.5 GB INT4. For inference, reserve at least 10–15 % of HBM for activations and KV cache; training additionally needs gradients and optimizer states, which take several times the weight memory.*
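
The bytes-per-parameter rule above translates directly into a small calculator. This is a sketch under that rule only; `max_params_billion`, `BYTES_PER_PARAM`, and the `reserve_fraction` parameter are names introduced here for illustration, and the result counts weights alone.

```python
# Bytes per parameter for each precision, per the rule of thumb above.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def max_params_billion(hbm_gb: float, precision: str, reserve_fraction: float = 0.0) -> float:
    """Largest parameter count (in billions) whose weights fit in the usable HBM."""
    if not 0.0 <= reserve_fraction < 1.0:
        raise ValueError("reserve_fraction must be in [0, 1)")
    usable_gb = hbm_gb * (1.0 - reserve_fraction)  # HBM left after the reserve
    return usable_gb / BYTES_PER_PARAM[precision]  # GB / (bytes per param) = billions of params


for gpu, hbm in [("H100", 80), ("H200", 141), ("B200", 192)]:
    row = {p: round(max_params_billion(hbm, p)) for p in ("fp32", "fp16", "int8", "int4")}
    print(gpu, row)
```

Passing `reserve_fraction=0.15` models the 10–15 % headroom for activations and KV cache.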

### Memory breakdown inference

| Component | Llama 3 70B (FP16) | Llama 3 8B (FP16) |
|-----------|--------------------|-------------------|
| Model weights | 140 GB | 16 GB |
| KV cache (4K context, batch 1) | ~1.3 GB | ~0.5 GB |
| KV cache (128K context, batch 1) | ~43 GB | ~17 GB |
| Activations (peak) | ~5 GB | ~1 GB |
| **Total 4K ctx** | ~146 GB | ~18 GB |
| **Total 128K ctx** | ~188 GB | ~34 GB |

KV cache values assume the published Llama 3 configurations (80 layers for 70B, 32 layers for 8B, 8 KV heads, head dimension 128) and an FP16 cache.

**Conclusion:** Llama 3 70B in FP16 does not fit on a single H100 (80 GB). Options: INT8 (~70 GB of weights → 1× H200, or 2× H100 to leave room for KV cache), INT4 (~35 GB of weights → 1× H100), or FP16 with tensor parallelism across several GPUs.

### Context length vs memory

| Context | KV cache 70B (FP16) | KV cache 8B (FP16) | Note |
|---------|---------------------|--------------------|------|
| 4K | ~1.3 GB | ~0.5 GB | Typical chat |
| 32K | ~11 GB | ~4.3 GB | Documents |
| 128K | ~43 GB | ~17 GB | Long-context (Claude, Gemini) |
| 1M | ~344 GB | ~137 GB | Experimental (Gemini 1.5 Pro) |

KV cache size grows **linearly with context length**, and also linearly with batch size, layer count, and the number of KV heads (per token: 2 × layers × KV heads × head dimension × bytes per element). It is attention compute, not the cache, that grows quadratically with context length. The values are per sequence and assume Llama 3's grouped-query attention with 8 KV heads. Critical for long-context inference.
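
The linear growth with context length can be made concrete with the standard per-token KV cache formula, 2 × layers × KV heads × head dimension × bytes per element. The sketch below is illustrative: the function name and the two configuration dictionaries are assumptions introduced here (the shapes are the published Llama 3 ones with 8 KV heads), and other models with different attention layouts give different numbers.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int, head_dim: int,
                   bytes_per_elem: int = 2, batch: int = 1) -> int:
    """Bytes needed to cache keys and values for `tokens` positions in each of `batch` sequences."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem  # 2 = one K and one V tensor
    return per_token * tokens * batch


# Assumed model shapes (grouped-query attention with 8 KV heads, FP16 cache).
LLAMA3_70B = dict(layers=80, kv_heads=8, head_dim=128)
LLAMA3_8B = dict(layers=32, kv_heads=8, head_dim=128)

for name, cfg in [("70B", LLAMA3_70B), ("8B", LLAMA3_8B)]:
    for tokens in (4096, 32768, 131072):
        gb = kv_cache_bytes(tokens, **cfg) / 1e9
        print(f"Llama 3 {name}, {tokens} tokens: {gb:.1f} GB")
```

Doubling `tokens` or `batch` doubles the result, which is why long contexts and large batches compete for the same HBM.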

### Throughput inference

| Model | GPU | Precision | Batch size | Tokens/s (aggregate) | QPS (1K output) |
|-------|-----|-----------|------------|----------------------|-----------------|
| Llama 3 8B | H100 | FP16 | 1 | ~800 | ~0.8 |
| Llama 3 8B | H100 | FP16 | 128 | ~4,500 | ~4.5 |
| Llama 3 8B | H100 | INT4 | 128 | ~8,000 | ~8 |
| Llama 3 70B | 4× H100 | FP16 | 1 | ~180 | ~0.18 |
| Llama 3 70B | 4× H100 | INT4 | 64 | ~1,200 | ~1.2 |
| Llama 3 70B | 8× H100 | FP16 (TP=8) | 128 | ~2,500 | ~2.5 |
| DeepSeek-R1 671B | 8× H200 | FP8 (MoE) | 64 | ~500 | ~0.5 |
| GPT-4 class (est.) | – | – | – | ~100–300 | ~0.1–0.3 |

**Notes:**

- QPS (queries per second) = aggregate tokens/s ÷ output tokens per query; the column assumes 1K output tokens per query
- Larger batches increase throughput but also increase TTFT (time to first token) and per-request latency
- Tensor Parallelism (TP) scales throughput, but every transformer layer adds all-reduce traffic, so communication overhead grows with the TP degree
- Batch-1 decoding is bound by memory bandwidth: each generated token reads all weights, so tokens/s ≤ HBM bandwidth ÷ weight bytes (about 210 tokens/s for an 8B FP16 model on one H100). Treat the batch-1 rows as optimistic unless speculative decoding is used
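
Two quick sanity checks help when reading throughput tables like the one above: a memory-bandwidth upper bound for single-stream decoding, and the conversion from aggregate tokens/s to queries per second. The sketch below assumes that every decoded token reads all weights from HBM once and ignores KV cache traffic and communication; the function names are illustrative, not part of any serving framework.

```python
def decode_tokens_per_s_upper_bound(hbm_tb_per_s: float, weight_gb: float, num_gpus: int = 1) -> float:
    """Upper bound on batch-1 decode speed when each token must stream all weights from HBM."""
    total_gb_per_s = hbm_tb_per_s * 1000.0 * num_gpus  # aggregate HBM bandwidth in GB/s
    return total_gb_per_s / weight_gb


def queries_per_s(aggregate_tokens_per_s: float, output_tokens_per_query: int = 1000) -> float:
    """Sustained queries per second for a given aggregate decode throughput."""
    return aggregate_tokens_per_s / output_tokens_per_query


# Assumed inputs: H100 HBM at 3.35 TB/s, FP16 weights of 16 GB (8B) and 140 GB (70B).
print(f"8B FP16, 1x H100:  <= {decode_tokens_per_s_upper_bound(3.35, 16):.0f} tok/s")
print(f"70B FP16, 4x H100: <= {decode_tokens_per_s_upper_bound(3.35, 140, num_gpus=4):.0f} tok/s")
print(f"4,500 tok/s at 1K output tokens: {queries_per_s(4500):.1f} QPS")
```

Batching lifts the bound because one pass over the weights then serves many sequences, which is why the batched rows can exceed it.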

### Training limits

#### Scaling efficiency

| GPU count | Model | Efficiency | Reason |
|-----------|-------|------------|--------|
| 8 (1 node) | Llama 3 8B | ~95 % | NVLink intra-node |
| 64 (8 nodes) | Llama 3 8B | ~85 % | IB inter-node |
| 512 (64 nodes) | Llama 3 70B | ~75 % | Communication overhead |
| 4,096 (512 nodes) | Llama 3 70B | ~60 % | Pipeline bubble, network |
| 16,384 (2,048 nodes) | Llama 3 405B | ~45 % | Synchronous SGD overhead |

**Note:** Efficiency = actual throughput ÷ (single-GPU throughput × GPU count), i.e. the fraction of ideal linear speedup achieved. In this table it decreases roughly logarithmically with GPU count.

#### Memory breakdown training

| Component | Llama 3 70B (BF16) | Llama 3 8B (BF16) |
|-----------|--------------------|-------------------|
| Model weights | 140 GB | 16 GB |
| Optimizer states (Adam) | 280 GB | 32 GB |
| Gradients | 140 GB | 16 GB |
| Activations (peak) | ~30 GB | ~4 GB |
| **Total (DDP)** | ~590 GB | ~68 GB |
| **Total (FSDP shard=8)** | ~74 GB | ~8.5 GB |

**Conclusion:** Sharding (FSDP, Fully Sharded Data Parallelism, or ZeRO) is required for training models > 10B. Weights, gradients, and Adam states together take about 4× the memory of the inference weights (2 + 2 + 4 bytes per parameter here). The table keeps the Adam states in BF16; typical mixed-precision training stores FP32 master weights and FP32 moments, about 16 bytes per parameter (~1.1 TB for 70B).
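
The per-GPU arithmetic behind such a breakdown can be sketched as follows. The defaults mirror the byte counts used in the table (2 bytes for weights, 2 for gradients, 4 for Adam states); the function names are illustrative assumptions. Unlike the table's FSDP row, this sketch divides only the model state by the shard count and takes activations per GPU, since FSDP does not shard activations.

```python
def training_state_gb(params_billion: float, weight_bytes: float = 2.0,
                      grad_bytes: float = 2.0, optimizer_bytes: float = 4.0) -> float:
    """Total model state (weights + gradients + optimizer) in GB, before any sharding."""
    return params_billion * (weight_bytes + grad_bytes + optimizer_bytes)


def per_gpu_memory_gb(params_billion: float, shards: int = 1,
                      activations_gb_per_gpu: float = 0.0, **state_bytes: float) -> float:
    """Per-GPU memory when the model state is evenly sharded across `shards` GPUs."""
    sharded_state = training_state_gb(params_billion, **state_bytes) / shards
    return sharded_state + activations_gb_per_gpu


print(per_gpu_memory_gb(70))                                      # unsharded state, one replica
print(per_gpu_memory_gb(70, shards=8, activations_gb_per_gpu=4))  # 8-way sharding plus local activations
# Mixed precision with FP32 master weights and two FP32 Adam moments: 2 + 2 + 12 bytes per parameter.
print(training_state_gb(70, optimizer_bytes=12.0))
```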

#### Time to train

| Model | GPU count | GPU type | Precision | Time | Cost (on-prem estimate) |
|-------|-----------|----------|-----------|------|-------------------------|
| Llama 3 8B | 64 | H100 | BF16 | ~3 days | ~$5,000 |
| Llama 3 70B | 512 | H100 | BF16 | ~14 days | ~$100,000 |
| Llama 3 405B | 16,384 | H100 | BF16 | ~60 days | ~$14 M |
| DeepSeek-V3 671B (MoE, base model of R1) | 2,048 | H800 | FP8 mixed | ~57 days | ~$5.6 M |
| GPT-4 (est.) | 25,000 | A100/H100 | Mixed | ~90–100 days | ~$100 M |

The 8B and 70B rows are far below full pretraining budgets: Meta's model card reports about 1.3M and 6.4M H100 GPU-hours for Llama 3 8B and 70B, while 64 GPUs × 3 days is under 5,000 GPU-hours. Read those two rows as estimates for much smaller token budgets, such as fine-tuning or continued pretraining.
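
A common back-of-the-envelope estimate for training time uses the approximation that training costs about 6 FLOPs per parameter per token. The sketch below applies it; the function names, the 40 % model FLOPs utilization (MFU), and the ~989 TFLOPS dense BF16 peak for an H100 are assumptions chosen for illustration, and real runs can deviate substantially.

```python
def training_gpu_hours(params: float, tokens: float, peak_flops_per_gpu: float, mfu: float) -> float:
    """GPU-hours for one pass over `tokens`, using total FLOPs ~= 6 * params * tokens."""
    if not 0.0 < mfu <= 1.0:
        raise ValueError("mfu must be in (0, 1]")
    total_flops = 6.0 * params * tokens
    gpu_seconds = total_flops / (peak_flops_per_gpu * mfu)  # sustained FLOP/s = peak * MFU
    return gpu_seconds / 3600.0


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Wall-clock days assuming perfect scaling across `num_gpus`."""
    return gpu_hours / num_gpus / 24.0


# Assumed example: 8e9 parameters, 15e12 tokens, H100 at ~989e12 dense BF16 FLOP/s, 40 % MFU.
hours = training_gpu_hours(params=8e9, tokens=15e12, peak_flops_per_gpu=989e12, mfu=0.4)
print(f"{hours:,.0f} GPU-hours, {wall_clock_days(hours, 64):,.0f} days on 64 GPUs")
```

Because the estimate is linear in the token count, the same model trained on a fraction of the tokens takes the same fraction of the time.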

### Power and thermal limits

| Configuration | Power limit | Throughput loss | Reason |
|---------------|-------------|-----------------|--------|
| H100 SXM | 700 W (default) | 0 % | Nominal |
| H100 SXM | 600 W (−14 %) | ~5–8 % | Power capping |
| H100 SXM | 500 W (−29 %) | ~15–25 % | Significant throttling |
| H100 SXM | 400 W (−43 %) | ~30–50 % | Emergency only |
| DGX H100 (8×) | 5.6 kW GPU power (8 × 700 W) | 0 % | Uncapped; the whole system draws up to ~10.2 kW |
| DGX H100 (8×) | 4.5 kW GPU power (air-limited) | ~10–15 % | Rear-door heat exchanger |

The GPU throttles when it exceeds its power limit or temperature threshold (85 °C+). Power rises faster than linearly with clock frequency, so a cap costs proportionally less throughput than power: in the table, a 14 % lower power limit costs about 5–8 % of throughput.

### API and operational limits

| Limit | Description | Typical value |
|-------|-------------|---------------|
| **Rate limit** | Max requests per minute/hour | 100–10,000 RPM (per tier) |
| **Tokens per minute (TPM)** | Max tokens per minute | 1M–300M (per model) |
| **Context window** | Max input tokens | 4K–2M (per model) |
| **Max output tokens** | Max generated tokens | 4K–32K (per model) |
| **Concurrent requests** | Parallel request count | 10–10,000 (per backend) |
| **Batch window** | Time to accumulate batch | 0–20 s (vLLM, TGI) |
| **TTFT timeout** | Max latency to first token | 30–120 s |
| **Idle timeout** | GPU idle → scale to 0 | 5–15 min (cloud) |

### Limits per deployment model

| Dimension | On-prem HW | Managed cloud (SageMaker, Vertex) | API (OpenAI, Anthropic) |
|-----------|------------|-----------------------------------|-------------------------|
| **Model size** | Limited by HBM (max 192 GB/GPU) | Unlimited (cluster scaling) | Unlimited |
| **Queries** | Limited by GPU count | Auto-scaling | Rate limit (per tier) |
| **Latency** | < 10 ms (same node) | 10–100 ms (network hop) | 100 ms – 10 s |
| **Customization** | Full (fine-tuning, quantization) | Managed (SageMaker, Bedrock) | Prompt engineering only |
| **Data privacy** | Yes (on-prem) | Contractual (region, encryption) | Limited |
| **Cost per 1M tokens** | ~$0.10–0.50 (FP16 inference) | ~$0.20–1.00 | ~$0.15–15.00 |
| **Max context** | 128K+ (depending on GPU count) | 128K+ | 32K–2M |
| **Cold start** | 0 (always-on) | 30 s – 5 min | 0 (shared infra) |

---

## GPU pricing and price/performance (2026)

> Prices are approximate: NVIDIA does not publish official datacenter GPU price lists. Cloud prices are from public providers (Q2 2026). HW purchase prices vary by volume, reseller, and region.

### Purchase price (buy)

| GPU | Price/GPU | Price 8× GPU baseboard | $/PFLOPS (FP16, dense) | Note |
|-----|-----------|------------------------|------------------------|------|
| **H100 SXM** | $27,000–40,000 | ~$200,000 | ~$25,000 | Scarcity 2023–2024, now stabilized |
| **H200 SXM** | $35,000–50,000 | ~$280,000 | ~$35,000 | H100 upgrade, HBM3e |
| **B200** | ~$60,000–70,000 | ~$500,000+ | ~$31,000 | Blackwell, FP4 support |
| **B100** | ~$30,000 | ~$240,000 | ~$20,000 | Lower price than B200, similar FP8 perf |
| **GB200** (Grace+Blackwell) | ~$70,000–100,000 | ~$2,000,000 (rack) | – | CPU+GPU unified, high-density |
| **A100 80GB** | ~$10,000–15,000 | ~$120,000 | ~$40,000 | Previous gen, still relevant |
| **MI300X** | ~$12,000–18,000 | ~$100,000 | ~$9,600 | AMD, 192 GB HBM3 |
| **Gaudi 3** | ~$15,625 | ~$125,000 | **$8,515** | Intel, best $/PFLOPS |
| **L40S** | ~$8,000–10,000 | – | – | Inference, enterprise |

$/PFLOPS is computed from dense FP16 throughput, about half the sparse peak listed in the NVIDIA spec table (e.g., H100: ~$25,000 ÷ ~0.99 PFLOPS; A100: ~$12,500 ÷ 0.312 PFLOPS).

### Cloud pricing (on-demand $/GPU/hr)

| GPU | Cheapest | Mid-range (CoreWeave, Lambda) | Hyperscaler (AWS, GCP, Azure) |
|-----|----------|-------------------------------|-------------------------------|
| **H100 SXM** | $1.38 (Thunder) | $2.89–3.29 | $4.15–6.88 |
| **H100 PCIe** | $2.01 (Spheron) | $2.50 | – |
| **H200 SXM** | $3.89 (Spheron) | $4.54 | $5.00+ |
| **B200** | **$3.39** (Spheron) | $6.02 | $14.24 (AWS) |
| **B200 spot** | **$2.12** (Spheron) | – | – |
| **GB200** | $3.50 (Runcrate) | $5.85 (Oracle) | $6.95 (GCP) |
| **MI300X** | **$1.50** (TensorWave) | $1.85 (Vultr) | $7.86 (Azure) |
| **A100 80GB** | $1.07 (Spheron) | $1.50–2.00 | $3.00+ |
| **Gaudi 3** | ~$1.50–2.50 | – | – |
| **L40S** | $0.91 (Spheron) | $1.50–2.00 | – |

### Inference cost ($/M tokens)

| GPU | Provider | $/hr | Est. tok/s | $/M tok |
|-----|----------|------|------------|---------|
| **B200** | Spheron | $3.39 | ~4,000 | **$0.24** |
| **B200 spot** | Spheron | $2.12 | ~4,000 | **$0.15** |
| **H100 PCIe** | Spheron | $2.01 | ~1,200 | $0.47 |
| **A100 80GB** | Spheron | $1.07 | ~520 | $0.57 |
| **H100 SXM** | AWS | $6.88 | ~1,200 | $1.59 |
| **H200 SXM** | Mid-range | $4.54 | ~1,800 | $0.70 |
| **L40S** | Spheron | $0.91 | ~450 | $0.56 |

*Values for Llama 3 70B (INT8, output 1K tok); $/M tok = $/hr ÷ (tok/s × 3,600) × 10⁶. The throughput estimates are well above what batch-1 decoding can reach, so read them as aggregate batched throughput. Actual values vary by batch size, context, and quantization.*
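
The cost per million tokens follows from the hourly price and the sustained throughput alone, so any row can be recomputed for a different price or throughput estimate. A minimal sketch, with an illustrative function name and example inputs taken from the table above:

```python
def usd_per_million_tokens(usd_per_hour: float, tokens_per_s: float) -> float:
    """Cost of generating one million tokens at a sustained throughput."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    tokens_per_hour = tokens_per_s * 3600.0
    return usd_per_hour / tokens_per_hour * 1_000_000


# Example inputs: hourly price and estimated tokens/s for three rows.
for name, price, tps in [("B200 spot", 2.12, 4000), ("H100 PCIe", 2.01, 1200), ("H100 SXM (AWS)", 6.88, 1200)]:
    print(f"{name}: ${usd_per_million_tokens(price, tps):.2f} per million tokens")
```

Since the result scales inversely with throughput, halving the achieved tokens/s doubles the cost, which is why batch size and quantization matter as much as the hourly price.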

### Cost per GB HBM

| GPU | HBM | Price/hr cloud | $/GB/hr | Best for memory-bound workloads |
|-----|-----|----------------|---------|---------------------------------|
| **MI300X** | 192 GB | $1.50 | **$0.0078** | ✅ Best |
| **B200** | 192 GB | $3.39 | $0.0177 | ✅ Good |
| **H200** | 141 GB | $3.89 | $0.0276 | ⚠️ |
| **H100 SXM** | 80 GB | $1.38 | $0.0173 | ⚠️ Only up to 70B models |
| **GB200** | 384 GB | $3.50 | $0.0091 | ✅✅ (2× MI300X capacity) |

### Price/performance by scenario

| Scenario | Winner | Rationale |
|----------|--------|-----------|
| **Absolute performance** (cost no object) | **GB200 NVL72** | 72× GPU; ~18 PFLOPS FP8 and 384 GB HBM per GB200 superchip |
| **Cloud inference**: best $/token | **B200 spot** | $0.15/M tok; ~3.3× the H100 throughput estimate (4,000 vs 1,200 tok/s) |
| **Cloud inference**: on-demand | **B200** | $0.24/M tok ($3.39/hr at ~4,000 tok/s) |
| **Cloud inference**: budget | **A100 / L40S** | $0.56–0.57/M tok |
| **Training**: price/perf on purchase | **Gaudi 3** | $8,515/PFLOPS, 2.5–3× better than H100 |
| **Training**: cloud | **H100 SXM** | From $1.38/hr, CUDA ecosystem, NCCL |
| **Memory-bound**: long context, 70B+ | **MI300X / GB200** | 192–384 GB, $0.0078–0.0091/GB/hr |
| **Ecosystem + safe choice** | **H100/H200** | CUDA, widest SW support, NVIDIA tools |
| **Spot / preemptible**: lowest cost | **A100 / H100** | $1.07–1.38/hr, 50–90 % off on-demand |

### 2026 Trends

- **H100**: price fell 64% from a peak of $8/hr to $1.38–2.89/hr, then rebounded 40% on inference demand
- **B200**: the new high end at $3.39/hr in the cloud and ~$0.15/M tok on spot, the new inference benchmark
- **MI300X**: supply is growing (TensorWave, Vultr, CoreWeave, Oracle, Azure), from $1.50/hr
- **Gaudi 3**: best $/PFLOPS on purchase, but a narrow ecosystem and limited cloud availability
- **Market bifurcation**: the prior generation (H100, A100) is commoditizing while the new generation (B200, GB200) commands a premium

- [GPU.en.md](GPU.en.md): GPU architecture, NVIDIA/AMD, vGPU, MIG
- [NETWORKING.en.md](NETWORKING.en.md): InfiniBand, RoCE, network topology
- [STORAGE.en.md](STORAGE.en.md): parallel filesystem, object store
- [DATACENTERS.en.md](DATACENTERS.en.md): DC layout, power, cooling
- [CLOUD.en.md](CLOUD.en.md): cloud AI services (SageMaker, Vertex AI)

## Sources

Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)

*Last revision: 2026-06-18*

602
AI-INFRASTRUCTURE.md
Normal file
@@ -0,0 +1,602 @@
|
||||
# 🧠 AI/ML Infrastructure

## Component overview

```mermaid
flowchart TD
    subgraph Compute
        GPU["GPU (H100/B200/Instinct)"]
        CPU["CPU (AMD EPYC / Intel Xeon)"]
        ASIC["ASIC (TPU, Trainium, Inferentia)"]
    end
    subgraph Network
        IB["InfiniBand NDR/XDR"]
        ROCE["RoCEv2"]
        NVL["NVLink / NVSwitch"]
    end
    subgraph Storage
        FS["Parallel FS (Lustre, GPFS, Weka)"]
        OBJ["Object Store (S3, MinIO)"]
        NVME["Local NVMe cache"]
    end
    subgraph Orchestration
        S["Slurm"]
        K["Kubernetes + Volcano/Kueue"]
    end
    subgraph Cooling
        DLC["Direct-to-chip liquid"]
        IMM["Immersion"]
        AIR["Air (high-density)"]
    end

    Compute --> Network --> Storage
    Orchestration --> Compute
    Cooling --> Compute
```

---

## GPU compute

### NVIDIA

| GPU | Architecture | FP8 | FP16/BF16 | FP64 | HBM | NVLink | TDP | Rack config |
|-----|-------------|-----|-----------|------|-----|--------|-----|------|
| **H100 SXM** | Hopper | 3,958 TFLOPS | 1,979 TFLOPS | 67 TFLOPS | 80 GB HBM3 | 900 GB/s | 700 W | 6–8× in DGX H100 |
| **H200 SXM** | Hopper (HBM3e) | 3,958 TFLOPS | 1,979 TFLOPS | 67 TFLOPS | 141 GB HBM3e | 900 GB/s | 700 W | 6–8× in DGX H200 |
| **B200** | Blackwell | ~9,000 TFLOPS | ~4,500 TFLOPS | ~40 TFLOPS | 192 GB HBM3e | 1,800 GB/s | 1,000 W | 6–8× in DGX B200 |
| **GB200 Grace Blackwell** | Blackwell | ~18,000 TFLOPS | ~9,000 TFLOPS | N/A | 192 GB + 480 GB (Grace) | NVLink-C2C | 1,000 W (GPU) + 500 W (CPU) | DGX GB200 (36× GPU) |
| **L40S** | Ada Lovelace | 733 TFLOPS | 367 TFLOPS | N/A | 48 GB GDDR6 | N/A | 350 W | Inference, enterprise |
| **A100 SXM** | Ampere | 1,248 TFLOPS | 624 TFLOPS | 19.5 TFLOPS | 80 GB HBM2e | 600 GB/s | 400 W | DGX A100 |

### AMD

| GPU | Architecture | FP8 | FP16/BF16 | FP64 | HBM | Infinity Fabric | TDP |
|-----|-------------|-----|-----------|------|-----|----------------|-----|
| **MI300X** | CDNA 3 | 2,615 TFLOPS | 1,307 TFLOPS | 81 TFLOPS | 192 GB HBM3 | 896 GB/s | 750 W |
| **MI250** | CDNA 2 | N/A | 383 TFLOPS | 95.7 TFLOPS | 128 GB HBM2e | 400 GB/s | 500 W |

### Intel

| GPU | Architecture | FP16/BF16 | FP32 | HBM | TDP |
|-----|-------------|-----------|------|-----|-----|
| **Gaudi 3** | Custom | 1,835 TFLOPS | N/A | 144 GB HBM2e | 600 W |
| **Max 1550** | Xe HPC | 600+ TFLOPS | 200 TFLOPS | 128 GB HBM2e | 600 W |

### Cloud ASIC

| ASIC | Provider | Use case | Performance |
|------|----------|----------|-------|
| **TPU v5p** | Google | Training | ~4,600 TFLOPS (BF16) per pod |
| **Trainium 2** | AWS | Training | ~1,000 TFLOPS (BF16) per chip |
| **Inferentia 2** | AWS | Inference | ~400 TOPS (INT8) per chip |
| **Maia 100** | Microsoft | Training + inference | Custom, 800 W TDP |

---

## AI networking

### Technology comparison

| Technology | Bandwidth per link | Latency | Topology | Use case |
|-------------|-------------------|---------|-----------|----------|
| **InfiniBand NDR200** | 200 Gb/s | < 1 µs | Fat-tree, Dragonfly+ | Training (NVIDIA) |
| **InfiniBand NDR400** | 400 Gb/s | < 1 µs | Fat-tree, Dragonfly+ | Training (NVIDIA) |
| **InfiniBand XDR** | 800 Gb/s (planned) | < 1 µs | Dragonfly+ | Next-gen training |
| **RoCEv2** (CX-7/8) | 200–400 Gb/s | 1–2 µs | Fat-tree, Spine-leaf | Training (AMD, Intel, open) |
| **NVLink 4.0** | 900 GB/s per GPU | < 0.5 µs | NVSwitch full-mesh | Intra-node GPU comm |
| **NVLink 5.0** | 1,800 GB/s per GPU | < 0.5 µs | NVSwitch full-mesh | Intra-node (Blackwell) |
| **Ethernet (400 GbE)** | 400 Gb/s | 2–5 µs | Spine-leaf | Inference, data pipeline |

### AI fabric principles

- **Rail-optimized topology**: each GPU communicates on a dedicated "rail" (GPUs with the same index across nodes attach to the same switch)
- **Fat-tree (Clos)**: the standard for InfiniBand and RoCE, with non-blocking bisection bandwidth
- **Dragonfly+**: fewer hops while preserving bandwidth (used in the largest clusters)
- **GPU Direct RDMA**: direct GPU ↔ GPU communication without CPU involvement, supported on InfiniBand and RoCE
- **SHARP (Scalable Hierarchical Aggregation and Reduction Protocol)**: in-network reduction for AllReduce (InfiniBand only)

### Bandwidth sizing

```text
Rule of thumb: InfiniBand bandwidth ≥ 50% of GPU HBM bandwidth for scalable training

Example: H100 has 3.35 TB/s of HBM bandwidth
  → Needs at least 1.6 TB/s of bisection bandwidth per GPU
  → 8× H100 in a DGX: 4× NDR400 IB per GPU = 4 × 50 GB/s = 200 GB/s
  → In practice: 200 GB/s per GPU (8× 200 Gb/s links at 25 GB/s each) in a typical configuration = ~6% of HBM → bottleneck
```

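The sizing rule above boils down to a unit conversion (Gb/s to GB/s) and a ratio against HBM bandwidth. The sketch below recomputes the 200 GB/s and ~6% figures; the function names are illustrative and the link count and speeds are the ones used in the example.

```python
def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert link speed from Gb/s to GB/s (8 bits per byte, no protocol overhead)."""
    return gbit_per_s / 8


def network_to_hbm_ratio(links: int, link_gbit_per_s: float, hbm_tb_per_s: float) -> float:
    """Fraction of HBM bandwidth that the per-GPU network links can sustain."""
    network_gb_per_s = links * gbit_to_gbyte(link_gbit_per_s)
    return network_gb_per_s / (hbm_tb_per_s * 1000)


# H100: 3.35 TB/s HBM; 4x NDR400 links per GPU as in the example above
ratio = network_to_hbm_ratio(links=4, link_gbit_per_s=400, hbm_tb_per_s=3.35)
print(f"network/HBM ratio: {ratio:.1%}")  # prints "network/HBM ratio: 6.0%"
print("meets 50% rule" if ratio >= 0.5 else "network-bound: below the 50% rule")
```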
---

## AI storage

### Requirements

| Dataset size | IO pattern | Recommended storage | Bandwidth |
|-------------|-----------|-------------------|-----------|
| < 10 TB | Sequential read (data loading) | Local NVMe | > 10 GB/s per node |
| 10–100 TB | Random read (checkpointing) | Parallel FS (Lustre, Weka) | > 100 GB/s cluster-wide |
| 100 TB–10 PB | Mixed (training + checkpoint) | Parallel FS + object store | > 500 GB/s |
| 10 PB+ | Multi-modal, video, LLM | Tiered (NVMe cache + parallel FS + object) | > 1 TB/s |

### Storage solution comparison

| Solution | Type | Bandwidth per node | Max capacity | Scaling | Use case |
|--------|-----|-------------------|-------------|-----------|----------|
| **Lustre** | Parallel FS (POSIX) | > 100 GB/s (cluster) | 100s PB | OST + MDS | HPC, LLM training (standard) |
| **GPFS / StorageScale** | Parallel FS (POSIX) | > 100 GB/s | 100s PB | NSD servers | HPC, AI (IBM) |
| **WekaFS** | Parallel FS (POSIX + NFS/SMB) | ~80 GB/s per 10 nodes | 10s PB | Container-native | AI/ML, NVIDIA DGX preferred |
| **VAST Data** | Universal storage (NVMe + QLC) | ~100 GB/s per cluster | 10s PB | Scale-out | AI, checkpoint, data lake |
| **Pure Storage//E** | All-flash (NVMe) | ~50 GB/s | ~30 PB | Scale-out | Enterprise AI, database |
| **MinIO / S3** | Object store | ~20 GB/s per gateway | EB | Erasure coding | Dataset repository, checkpoint |
| **NetApp AFF** | NAS + S3 | ~10 GB/s per controller | ~50 PB | HA pair | Enterprise, NFS baseline |

### Checkpointing strategies

| Strategy | RPO | Storage impact | Description |
|-----------|-----|---------------|-------|
| **Full checkpoint** | every N steps | High (pauses training) | Entire model + optimizer state |
| **Async checkpoint** | every N steps | Medium (non-blocking) | Copy to a staging buffer, write in the background |
| **Distributed checkpoint** (NVIDIA NeMo) | every N steps | Low | Each rank writes its own shard |
| **In-memory checkpoint** (IBM) | on failover | Minimal (DRAM) | Replication into another node's DRAM |
| **Continuous checkpoint** (Microsoft) | every 1–5 min | Low (delta) | Only changed shards |

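The async checkpoint row describes a two-phase pattern: a fast synchronous copy into a staging buffer, then a background write. The following is a minimal standard-library sketch of that pattern, not any framework's API. The class name `AsyncCheckpointer`, the pickle file format, and the plain-dict state are all assumptions for illustration; a real training job would snapshot GPU tensors into pinned host memory instead of calling `copy.deepcopy`.

```python
import copy
import os
import pickle
import tempfile
import threading
from typing import Optional


class AsyncCheckpointer:
    """Snapshot state synchronously, then write it to disk in a background thread."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        self._writer: Optional[threading.Thread] = None

    def save(self, step: int, state: dict) -> str:
        self.wait()  # allow at most one write in flight
        snapshot = copy.deepcopy(state)  # staging buffer: later mutations of `state` are not captured
        path = os.path.join(self.directory, f"step_{step:08d}.pkl")
        self._writer = threading.Thread(target=self._write, args=(snapshot, path))
        self._writer.start()
        return path

    @staticmethod
    def _write(snapshot: dict, path: str) -> None:
        tmp_path = path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(snapshot, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # atomic rename: readers never see a partial file

    def wait(self) -> None:
        """Block until the in-flight write (if any) has finished."""
        if self._writer is not None:
            self._writer.join()
            self._writer = None


with tempfile.TemporaryDirectory() as tmp:
    ckpt = AsyncCheckpointer(tmp)
    state = {"step": 100, "weights": [0.1, 0.2, 0.3]}
    path = ckpt.save(100, state)
    state["weights"][0] = 9.9  # training continues and mutates the live state
    ckpt.wait()
    with open(path, "rb") as f:
        restored = pickle.load(f)
    print(restored["weights"][0])  # prints 0.1: the snapshot is unaffected
```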
---

## AI cluster architecture

### Physical topology: DGX H100 example

```
┌──────── DGX H100 (8× GPU) ────────┐
│  ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐  │
│  │GPU 0│ │GPU 1│ │GPU 2│ │GPU 3│  │
│  └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘  │
│  ┌──┴──┐ ┌──┴──┐ ┌──┴──┐ ┌──┴──┐  │
│  │GPU 4│ │GPU 5│ │GPU 6│ │GPU 7│  │
│  └─────┘ └─────┘ └─────┘ └─────┘  │
│  NVSwitch (NVLink 4.0, 900 GB/s)  │
│  InfiniBand CX-7: 8× NDR400       │
└─────────────────┬─────────────────┘
                  │ 8× IB rails
        ┌─────────┴──────────┐
        │ IB NDR400 switches │  (rail-optimized)
        └────────────────────┘
```

### Kubernetes for AI

| Component | Role |
|-----------|------|
| **Volcano** | Batch scheduling, gang scheduling, queue management |
| **Kueue** | Multi-tenant admission, resource quotas, fair sharing |
| **NVIDIA GPU Operator** | Driver, container toolkit, MIG, DCGM, monitoring |
| **HAMi** (formerly k8s-vGPU-scheduler) | GPU sharing, MIG partitioning, fractional GPU |
| **Node Feature Discovery** | Detection of GPU type and NUMA topology |
| **Topology Manager** | NUMA-aware pod placement |
| **DPDK / SR-IOV** | High-performance networking for GPU Direct RDMA |

### Slurm for AI

| Component | Role |
|-----------|------|
| **slurm.conf** | Partitions for GPU nodes, GRES (Generic Resource) |
| **gres.conf** | GPU type, number of GPUs per node |
| **srun --gres=gpu:8** | Allocates 8 GPUs per node for the job |
| **sbatch --nodes=64 --ntasks=512** | 64 nodes, 512 ranks (8 GPUs/node) |
| **Pyxis** | NVIDIA container plugin for Slurm (pairs with Enroot) |

---

## Cooling AI clusters

### Power density comparison

| Configuration | TDP per node | Racks | kW/rack | Note |
|-------------|-------------|-------|---------|----------|
| Standard server (2U) | 1 kW | 20 | 5–10 | Typical DC |
| GPU server (DGX H100, 6×) | 42 kW | 6 | 45–50 | Air cooling limit |
| GPU server (DGX B200, 6×) | 72 kW | 6 | 90–100 | Liquid cooling required |
| GPU server (GB200 NVL72) | 120 kW | N/A | ~120 | Liquid cooling mandatory |
| NVIDIA NVL72 rack | 120 kW | 1 | 120 | Fully liquid cooled |

### Cooling technologies

| Method | Max kW/rack | CAPEX | OPEX | Complexity |
|--------|-------------|-------|------|-----------|
| **Air cooling (CRAC/CRAH)** | < 15 | Low | Medium | Low |
| **Air cooling (in-row)** | 15–30 | Medium | Medium | Low |
| **Rear-door heat exchanger** | 30–50 | Medium | Low | Medium |
| **Direct-to-chip liquid (cold plate)** | 50–150 | High | Low | High |
| **Immersion (single-phase)** | 100–200 | High | Low | High |
| **Immersion (two-phase)** | 200+ | Very high | Low | Very high |

---

## Inference infrastructure

### Inference server comparison

| Tool | Frameworks | Optimizations | Use case |
|---------|-----------|-------------|----------|
| **vLLM** | Megatron, HF, AWQ, GPTQ | PagedAttention, KV cache, continuous batching | LLM inference (open source) |
| **TensorRT-LLM** | TensorRT | INT4/INT8/FP8, inflight batching, attention optimizations | Production (NVIDIA) |
| **Triton Inference Server** | All (TensorRT, vLLM, PyTorch) | Model ensemble, model caching, concurrent execution | Enterprise, multi-model |
| **SageMaker** | Managed | Auto-scaling, model parallelism | AWS managed |
| **OpenAI API / TGI** | HF Transformers | Continuous batching, flash attention | Hosting |

### Inference optimizations

| Technique | Latency improvement | Throughput improvement | Memory reduction |
|----------|-----------------|---------------------|------------------|
| **FP8/INT8 quantization** | N/A | 2× | 2× |
| **INT4 quantization** | N/A | 4× | 4× |
| **Flash Attention 2/3** | 2–4× | N/A | 50% (KV cache) |
| **PagedAttention** | N/A | 2–5× | 95% (KV cache fragmentation) |
| **Continuous batching** | N/A | 10–20× | N/A |
| **Speculative decoding** | 2–3× | N/A | N/A |
| **Multi-LoRA / S-LoRA** | N/A | 8–16× | N/A |

---

## Distributed training techniques

| Technique | Description | Frameworks |
|----------|-------|------------|
| **Data Parallelism (DDP/FSDP)** | Each GPU processes a different batch; DDP keeps a full model copy per GPU, FSDP shards it | PyTorch DDP, FSDP |
| **Tensor Parallelism (TP)** | Each layer's weight tensors are split across GPUs (intra-node) | Megatron-LM, DeepSpeed |
| **Pipeline Parallelism (PP)** | Layers are grouped into stages spread across nodes | Megatron-LM, DeepSpeed |
| **Sequence Parallelism (SP)** | The sequence is split across GPUs | Megatron-LM |
| **Expert Parallelism (EP)** | Different expert subnetworks live on different GPUs | Mixture-of-Experts (MoE) |
| **3D Parallelism** | TP + PP + DP combined | Megatron-LM, NeMo |
| **ZeRO (1/2/3)** | Sharding of optimizer states / gradients / parameters | DeepSpeed |
| **NCCL / RCCL** | GPU collective communication library | NVIDIA/AMD |

---

## Operating systems for AI

### Distribution comparison

| OS | GPU driver | CUDA | Container toolkit | IB/RoCE | Lustre client | Production support |
|----|-----------|------|-------------------|---------|--------------|-------------------|
| **Ubuntu 22.04 LTS** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED, rdma-core | Yes (lustre-client) | NVIDIA DGX standard |
| **Ubuntu 24.04 LTS** | NVIDIA 550+ | 12.5+ | nvidia-container-toolkit | MLNX_OFED, rdma-core | Yes | Latest GPU support |
| **RHEL 9 / Rocky 9** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED | Yes (EL repo) | Red Hat, enterprise |
| **DGX OS** (Ubuntu-based) | NVIDIA custom | 12.x | Pre-installed | Pre-configured | Yes | The only OS supported on NVIDIA DGX |
| **SLES 15 SP5** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | MLNX_OFED | Yes | HPC, some Lustre clusters |
| **Debian 12** | NVIDIA 525+ | 12.x | nvidia-container-toolkit | rdma-core | Yes (backports) | Community, research |
| **Flatcar / Bottlerocket** | Container-host | N/A | nvidia-container-toolkit | Limited | No | K8s-only, minimal footprint |

### Constraints and limits

#### GPU drivers and CUDA

| Constraint | Detail |
|----------|--------|
| **Driver-CUDA compatibility** | The NVIDIA driver must be at least the version the CUDA toolkit requires (driver ≥ CUDA requirement). E.g. CUDA 12.5 requires driver ≥ 550 |
| **Kernel version** | The NVIDIA driver is not compatible with every kernel. A new kernel (6.8+) may need a DKMS build or may only be supported later |
| **Secure Boot** | The NVIDIA driver needs a signed module (MOK, shim) or Secure Boot disabled; a frequent problem in enterprise environments |
| **Open vs proprietary driver** | NVIDIA `nvidia-open` (since R515) is the open-source kernel module. GPU support: data center (H100+) → OK, older GPUs → proprietary driver required |
| **nvidia-persistenced** | Needed to keep GPUs initialized; without it GPUs drop their state after an idle timeout (`nvidia-smi -pm 1`) |
| **GPU reset** | After a training job crashes, the GPU can hang. Use `nvidia-smi --gpu-reset` or reboot the node; sometimes a power cycle is needed |
| **Multi-instance GPU (MIG)** | Requires a specific driver, MIG mode enabled on the GPU, and a GPU reset. Cannot be changed while workloads are running. Supported only on A100, H100, B200 |

#### Network (InfiniBand / RoCE)

| Constraint | Detail |
|----------|--------|
| **MLNX_OFED vs rdma-core** | MLNX_OFED (NVIDIA): full support, but it ships its own kernel modules that must match the kernel version. `rdma-core` (open): limited support, but no extra modules |
| **Kernel compatibility** | MLNX_OFED supports only specific kernel versions (major.minor). Kernel upgrade → MLNX_OFED rebuild required |
| **NCCL** | The NCCL version must be compatible with CUDA and the IB firmware. Use `nccl-tests` for validation |
| **SHARP** | In-network reduction requires a specific combination of MLNX_OFED and IB switch firmware |
| **GPU Direct RDMA** | Requires the `nvidia-peermem` module + MLNX_OFED. Does not work with every GPU and IB adapter |
| **RoCE with PFC/ECN** | RoCE requires a lossless fabric (PFC, ECN, DCQCN). Both switch and host must be configured; tuning is complex |

#### Storage

| Constraint | Detail |
|----------|--------|
| **Lustre client** | The client version must match the server. Server upgrade → upgrade all clients. Compatible only with RHEL/Debian derivatives |
| **POSIX locking** | NFS and Lustre differ in POSIX locking behavior. Distributed training relies on flock → problems on mixed filesystems |
| **Filesystem cache** | The page cache can mask an IO bottleneck. Training jobs often require `O_DIRECT` or `sync` IO |
| **Local NVMe vs parallel FS** | Staging datasets on local NVMe removes the network dependency, but requires capacity and a pre-fetch pipeline |

#### Container runtime

| Constraint | Detail |
|----------|--------|
| **Docker + GPU** | `nvidia-container-toolkit` (formerly nvidia-docker2). The runtime must be installed and configured in `/etc/docker/daemon.json` |
| **Podman + GPU** | Requires `nvidia-container-toolkit` + a Podman hook. Less tested than Docker |
| **containerd + GPU** | The standard for K8s. Requires CDI (Container Device Interface) or `nvidia-container-runtime` |
| **Enroot + Pyxis** | NVIDIA container stack for Slurm (Enroot = daemonless container runtime, Pyxis = Slurm plugin) |
| **User namespace mapping** | GPU access from containers requires device cgroup permissions, and rootless mode can fail (exceptions needed for /dev/dri and /dev/nvidia*) |

#### Kernel parameters

```text
# Recommended sysctl settings for AI workloads
net.core.rmem_max = 134217728            # large enough for NCCL
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.netdev_budget = 600             # for high packet rates
vm.max_map_count = 1048576               # PyTorch DataLoader workers
kernel.numa_balancing = 0                # disable NUMA balancing (it disturbs locality)
kernel.sched_min_granularity_ns = 10000000   # sysctl on older kernels; newer kernels expose it via debugfs

# Kernel boot parameters (kernel command line, not sysctl)
# Disabling security mitigations for performance: dedicated AI clusters only
mitigations=off
transparent_hugepage=never    # or madvise; THP can cause latency spikes
intel_idle.max_cstate=1       # reduces C-state transition latency
```

#### Firmware and HW

| Constraint | Detail |
|----------|--------|
| **GPU firmware (VBIOS)** | NVIDIA data center GPUs (H100, B200) receive VBIOS updates via NVFlash. Without the update → missing support for partitioning or newer CUDA features |
| **InfiniBand firmware** | IB switch and HCA firmware must be compatible. Old switch + new HCA → degraded performance |
| **NVSwitch firmware** | On DGX systems, NVSwitch firmware can be updated only through NVIDIA DGX tools |
| **Power capping (nvidia-smi)** | `nvidia-smi -pl <power>` lowers the power limit for power-budget management. Test the impact on training throughput |
| **GPU clock locking** | `nvidia-smi -ac <mem_clock,graphics_clock>` pins clock frequencies for stable benchmarks. Apply only after `nvidia-persistenced` is running |
| **PCIe Gen** | A GPU in a PCIe Gen4 slot (instead of Gen5) → bottleneck for CPU↔GPU data transfer. Important for FSDP sharding |

### Recommended OS per use case

| Use case | OS | Rationale |
|----------|-----|-------|
| **DGX cluster (production)** | DGX OS / Ubuntu 22.04 LTS | NVIDIA standard, best driver support |
| **Enterprise K8s (OpenShift)** | RHEL 9 / RHCOS | Red Hat support, GPU Operator compatible |
| **Vanilla K8s (on-prem)** | Ubuntu 22.04 LTS + Flatcar (workers) | Broadest community support; Flatcar for a minimal footprint |
| **Slurm cluster (HPC/AI)** | Rocky Linux 9 / Ubuntu 22.04 LTS | EL ecosystem (Lustre, OFED) or Ubuntu (community) |
| **Research / rapid prototyping** | Ubuntu 24.04 LTS | Latest CUDA, PyTorch, and driver support |
| **Edge inference** | NVIDIA JetPack / Ubuntu (ARM) | Embedded GPU (Jetson Orin, AGX) |

---

## AI-ready data center: checklist

| Area | Requirement |
|--------|-----------|
| **Power** | 30–120 kW/rack, HVDC (400 V DC), UPS sized for GPU power spikes |
| **Cooling** | Liquid-cooling ready (direct-to-chip), rear-door heat exchangers for 30+ kW |
| **Network** | InfiniBand (NDR/XDR) or RoCEv2, rail-optimized fat-tree |
| **Storage** | Parallel FS (Lustre/Weka), checkpoint bandwidth > 100 GB/s |
| **GPU density** | Maximize GPUs per rack, minimize NVSwitch hops |
| **Physical** | Floor load capacity 1,500+ kg/m², 52U–60U racks |
| **Security** | Tenant isolation, network segmentation, data encryption |
| **Monitoring** | DCGM, NCCL health checks, thermals, power capping |

---

## Model and throughput limits

### Model size per GPU

The largest model that fits on a single GPU depends on HBM capacity and precision:

| GPU | HBM | FP32 | FP16/BF16 | INT8 | INT4 |
|-----|-----|------|-----------|------|------|
| **H100 80GB** | 80 GB | ~20B | ~40B | ~80B | ~160B |
| **H200 141GB** | 141 GB | ~35B | ~70B | ~140B | ~280B |
| **B200 192GB** | 192 GB | ~48B | ~96B | ~192B | ~384B |
| **MI300X 192GB** | 192 GB | ~48B | ~96B | ~192B | ~384B |
| **A100 80GB** | 80 GB | ~20B | ~40B | ~80B | ~160B |
| **GB200 (192+480)** | 192 GB GPU + 480 GB Grace | N/A | ~96B + CPU offload | N/A | N/A |

*Indicative values: 1B parameters ≈ 2 GB in FP16 ≈ 4 GB in FP32 ≈ 1 GB in INT8 ≈ 0.5 GB in INT4. In practice, subtract ~10–15% of HBM for activations, KV cache, and optimizer states.*

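The bytes-per-parameter rule in the note above translates directly into a capacity estimate. This is a minimal sketch; `max_params_billion` and `reserve_fraction` are illustrative names, and the estimate covers weights only.

```python
# Bytes per parameter, following the note above (1B params = 2 GB in FP16, etc.)
BYTES_PER_PARAM = {"FP32": 4.0, "FP16/BF16": 2.0, "INT8": 1.0, "INT4": 0.5}


def max_params_billion(hbm_gb: float, precision: str, reserve_fraction: float = 0.0) -> float:
    """Largest model (in billions of parameters) whose weights fit in HBM.

    reserve_fraction is the share of HBM held back for activations and KV cache.
    """
    usable_gb = hbm_gb * (1.0 - reserve_fraction)
    return usable_gb / BYTES_PER_PARAM[precision]


print(max_params_billion(80, "FP16/BF16"))         # 40.0 (H100, no reserve)
print(max_params_billion(192, "INT4"))             # 384.0 (B200 / MI300X, no reserve)
print(max_params_billion(80, "FP16/BF16", 0.15))   # 34.0 (H100 with a 15% reserve)
```

Holding back a reserve is what turns the table's upper bounds into numbers that can actually be served, since KV cache and activations compete with weights for the same HBM.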
### Inference memory breakdown

| Component | Llama 3 70B (FP16) | Llama 3 8B (FP16) |
|------------|-------------------|-------------------|
| Model weights | 140 GB | 16 GB |
| KV cache (4K context, batch 1) | ~2 GB | ~0.2 GB |
| KV cache (128K context, batch 1) | ~60 GB | ~6.5 GB |
| Activations (peak) | ~5 GB | ~1 GB |
| **Total, 4K ctx** | ~147 GB | ~17 GB |
| **Total, 128K ctx** | ~205 GB | ~23 GB |

**Conclusion:** Llama 3 70B in FP16 does not fit on a single H100 (80 GB). Options: INT8 (~70 GB of weights → 1× H200, or 2× H100 to leave room for KV cache), INT4 (~35 GB of weights → 1× H100), or tensor parallelism across several GPUs.

### Context length vs memory

| Context | KV cache 70B (FP16) | KV cache 8B (FP16) | Note |
|---------|-------------------|-------------------|----------|
| 4K | ~2.2 GB | ~0.25 GB | Typical chat |
| 32K | ~18 GB | ~2 GB | Documents |
| 128K | ~72 GB | ~8 GB | Long-context (Claude, Gemini) |
| 1M | ~560 GB | ~64 GB | Experimental (Gemini 1.5 Pro) |

KV cache size is **linear in context length**, and also linear in batch size, layer count, and the number of KV heads. For long-context workloads it becomes the dominant memory consumer.

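The linear growth of the KV cache follows from how it is laid out: one key and one value vector per token, per layer, per KV head. The sketch below uses a hypothetical model configuration (32 layers, 8 KV heads, head dimension 128, FP16) chosen only for illustration; it is not meant to reproduce the rounded estimates in the table, which depend on the exact architecture and attention variant.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int, head_dim: int,
                   bytes_per_value: int = 2, batch: int = 1) -> int:
    """KV cache size in bytes for a decoder-only transformer.

    The leading factor 2 accounts for storing both a key and a value tensor per layer.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens * batch


# Hypothetical configuration, for illustration only
config = dict(layers=32, kv_heads=8, head_dim=128)

for tokens in (4 * 1024, 32 * 1024, 128 * 1024):
    gib = kv_cache_bytes(tokens, **config) / 2**30
    print(f"{tokens // 1024}K tokens: {gib:.1f} GiB")
# 4K tokens: 0.5 GiB
# 32K tokens: 4.0 GiB
# 128K tokens: 16.0 GiB
```

Every factor enters the product once, so doubling the context, the batch size, or the number of KV heads each doubles the cache.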
### Inference throughput

| Model | GPU | Precision | Batch size | Tokens/s | QPS (1K output) |
|-------|-----|-----------|-----------|----------|-----------------|
| Llama 3 8B | H100 | FP16 | 1 | ~800 | ~0.8 |
| Llama 3 8B | H100 | FP16 | 128 | ~4,500 | ~4.5 |
| Llama 3 8B | H100 | INT4 | 128 | ~8,000 | ~8 |
| Llama 3 70B | 4× H100 | FP16 | 1 | ~180 | ~0.18 |
| Llama 3 70B | 4× H100 | INT4 | 64 | ~1,200 | ~1.2 |
| Llama 3 70B | 8× H100 | FP16 (TP=8) | 128 | ~2,500 | ~2.5 |
| DeepSeek-R1 671B | 8× H200 | FP8 (MoE) | 64 | ~500 | ~0.5 |
| GPT-4 class (est.) | N/A | N/A | N/A | ~100–300 | ~0.1–0.3 |

**Notes:**
- QPS (queries per second) depends on output length; with 1K output tokens per query, QPS ≈ tokens/s ÷ 1,000
- A larger batch size raises throughput but also raises TTFT (time to first token)
- Tensor Parallelism (TP) scales throughput, but communication overhead grows with the TP degree

### Training limits

#### Scaling efficiency

| GPU count | Model | Efficiency | Reason |
|-----------|-------|-----------|-------|
| 8 (1 node) | Llama 3 8B | ~95% | NVLink intra-node |
| 64 (8 nodes) | Llama 3 8B | ~85% | IB inter-node |
| 512 (64 nodes) | Llama 3 70B | ~75% | Communication overhead |
| 4,096 (512 nodes) | Llama 3 70B | ~60% | Pipeline bubble, network |
| 16,384 (2,048 nodes) | Llama 3 405B | ~45% | Synchronous SGD overhead |

**Note:** Efficiency = (actual throughput) / (throughput under ideal linear scaling). It falls roughly logarithmically with GPU count.

#### Training memory breakdown

| Component | Llama 3 70B (BF16) | Llama 3 8B (BF16) |
|------------|-------------------|-------------------|
| Model weights | 140 GB | 16 GB |
| Optimizer states (Adam) | 280 GB | 32 GB |
| Gradients | 140 GB | 16 GB |
| Activations (peak) | ~30 GB | ~4 GB |
| **Total (DDP)** | ~590 GB | ~68 GB |
| **Total (FSDP shard=8)** | ~74 GB | ~8.5 GB |

**Conclusion:** FSDP (Fully Sharded Data Parallelism) is required to train models above 10B parameters. Adam optimizer states alone take twice the memory of the weights, so weights + optimizer states + gradients come to about 4× the weight memory needed for inference.

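The totals in the table can be reproduced with a few lines of arithmetic. This sketch follows the table's own accounting: gradients equal to the weights, Adam states at twice the weights, and the whole total divided by the shard count. That last step is a simplification taken from the table; in practice FSDP shards weights, gradients, and optimizer states, while activations stay per-GPU. The function name is an illustrative choice.

```python
def training_memory_gb(params_billion: float, activations_gb: float,
                       bytes_per_param: float = 2.0, shards: int = 1) -> float:
    """Per-GPU training memory in GB, using the table's simplified accounting."""
    weights = params_billion * bytes_per_param  # BF16: 2 bytes per parameter
    gradients = weights                         # one gradient per parameter
    optimizer = 2 * weights                     # Adam: two moment tensors per parameter
    total = weights + gradients + optimizer + activations_gb
    return total / shards                       # simplification: activations divided too


print(training_memory_gb(70, 30))             # 590.0 (Llama 3 70B, DDP)
print(training_memory_gb(70, 30, shards=8))   # 73.75 (Llama 3 70B, FSDP across 8 GPUs)
print(training_memory_gb(8, 4))               # 68.0 (Llama 3 8B, DDP)
print(training_memory_gb(8, 4, shards=8))     # 8.5 (Llama 3 8B, FSDP across 8 GPUs)
```

The DDP total for 70B is far beyond any single GPU's HBM, which is why sharding is unavoidable at this model size.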
#### Time to train

| Model | GPU count | GPU type | Precision | Time | Cost (on-prem estimate) |
|-------|-----------|---------|-----------|------|---------------------|
| Llama 3 8B | 64 | H100 | BF16 | ~3 days | ~$5,000 |
| Llama 3 70B | 512 | H100 | BF16 | ~14 days | ~$100,000 |
| Llama 3 405B | 16,384 | H100 | BF16 | ~60 days | ~$14 M |
| DeepSeek-R1 671B (MoE) | 2,048 | H800 | BF16 | ~30 days | ~$6 M |
| GPT-4 (est.) | 25,000 | A100/H100 | Mixed | ~90–100 days | ~$100 M |

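A quick way to sanity-check a time-to-train row is to convert it into GPU-hours and the implied cost per GPU-hour. The sketch below does this for the Llama 3 70B row; the helper names are illustrative, and the inputs are the table's rounded estimates.

```python
def gpu_hours(gpu_count: int, days: float) -> float:
    """Total GPU-hours consumed by a run that keeps every GPU busy for `days`."""
    return gpu_count * days * 24


def implied_cost_per_gpu_hour(total_cost: float, gpu_count: int, days: float) -> float:
    """Cost per GPU-hour implied by a total cost estimate."""
    return total_cost / gpu_hours(gpu_count, days)


# Llama 3 70B row: 512 H100s, ~14 days, ~$100,000
hours = gpu_hours(512, 14)
rate = implied_cost_per_gpu_hour(100_000, 512, 14)
print(f"{hours:,.0f} GPU-hours at ${rate:.2f}/GPU-hour")  # 172,032 GPU-hours at $0.58/GPU-hour
```

Comparing the implied rate with cloud on-demand prices shows how much an on-prem estimate depends on amortization and utilization assumptions.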
### Power and thermal limits

| Configuration | TDP limit | Throughput loss | Reason |
|-------------|-----------|------------------|-------|
| H100 SXM | 700 W (default) | 0% | Nominal |
| H100 SXM | 600 W (-15%) | ~5–8% | Power capping |
| H100 SXM | 500 W (-30%) | ~15–25% | Significant throttling |
| H100 SXM | 400 W (-43%) | ~30–50% | Emergency use only |
| DGX H100 (8×) | 5.6 kW (max) | 0% | Liquid cooling required |
| DGX H100 (8×) | 4.5 kW (air) | ~10–15% | Rear-door heat exchanger |

The GPU throttles when it exceeds its TDP or temperature limit (85°C+). Power capping correlates roughly linearly with clock frequency but non-linearly with throughput.

### API and operational limits

| Limit | Description | Typical value |
|-------|-------|-----------------|
| **Rate limit** | Max requests per minute/hour | 100–10,000 RPM (by tier) |
| **Tokens per minute (TPM)** | Max tokens per minute | 1M–300M (by model) |
| **Context window** | Max input tokens | 4K–2M (by model) |
| **Max output tokens** | Max generated tokens | 4K–32K (by model) |
| **Concurrent requests** | Number of parallel requests | 10–10,000 (by backend) |
| **Batch window** | Time spent collecting a batch | 0–20 s (vLLM, TGI) |
| **TTFT timeout** | Max latency to the first token | 30–120 s |
| **Idle timeout** | GPU idle → scale to 0 | 5–15 min (cloud) |

### Limits per deployment model

| Aspect | Self-hosted HW | Managed cloud (SageMaker, Vertex) | API (OpenAI, Anthropic) |
|-------|--------------|----------------------------------|------------------------|
| **Model size** | Limited by HBM (max 192 GB/GPU) | Unlimited (cluster scaling) | Unlimited |
| **Queries** | Limited by GPU count | Auto-scaling | Rate limit (by tier) |
| **Latency** | < 10 ms (same node) | 10–100 ms (network hop) | 100 ms – 10 s |
| **Customization** | Full (fine-tuning, quantization) | Managed (SageMaker, Bedrock) | Prompt engineering only |
| **Data privacy** | Yes (on-prem) | Contractual (region, encryption) | Limited |
| **Cost per 1M tokens** | ~$0.10–0.50 (FP16 inference) | ~$0.20–1.00 | ~$0.15–15.00 |
| **Max context** | 128K+ (by GPU count) | 128K+ | 32K–2M |
| **Cold start** | 0 (always-on) | 30 s – 5 min | 0 (shared infra) |

---

## GPU pricing and price/performance (2026)

> Prices are indicative: NVIDIA does not publish an official price list for data center GPUs. Cloud prices are taken from public providers (Q2 2026). Hardware purchase prices vary by volume, reseller, and region.

### Purchase price (buy)

| GPU | Price/GPU | Price of 8× GPU baseboard | $/PFLOPS (FP16) | Note |
|-----|---------|----------------------|----------------|----------|
| **H100 SXM** | $27,000–40,000 | ~$200,000 | $25,000 | Scarce in 2023–2024, now stabilizing |
| **H200 SXM** | $35,000–50,000 | ~$280,000 | ~$35,000 | H100 upgrade, HBM3e |
| **B200** | ~$60,000–70,000 | ~$500,000+ | ~$31,000 | Blackwell, FP4 support |
| **B100** | ~$30,000 | ~$240,000 | ~$20,000 | Cheaper than B200, similar FP8 performance |
| **GB200** (Grace+Blackwell) | ~$70,000–100,000 | ~$2,000,000 (rack) | N/A | CPU+GPU unified, high-density |
| **A100 80GB** | ~$10,000–15,000 | ~$120,000 | ~$19,200 | Previous generation, still relevant |
| **MI300X** | ~$12,000–18,000 | ~$100,000 | ~$9,600 | AMD, 192 GB HBM3 |
| **Gaudi 3** | ~$15,625 | ~$125,000 | **$8,515** | Intel, best $/PFLOPS |
| **L40S** | ~$8,000–10,000 | N/A | N/A | Inference, enterprise |

### Cloud ceny (on-demand $/GPU/hr)
|
||||
|
||||
| GPU | Nejdostupnější | Mid-range (CoreWeave, Lambda) | Hyperscaler (AWS, GCP, Azure) |
|
||||
|-----|--------------|-------------------------------|-------------------------------|
|
||||
| **H100 SXM** | $1.38 (Thunder) | $2.89–3.29 | $4.15–6.88 |
|
||||
| **H100 PCIe** | $2.01 (Spheron) | $2.50 | — |
|
||||
| **H200 SXM** | $3.89 (Spheron) | $4.54 | $5.00+ |
|
||||
| **B200** | **$3.39** (Spheron) | $6.02 | $14.24 (AWS) |
|
||||
| **B200** | **$2.12** (spot) | — | — |
|
||||
| **GB200** | $3.50 (Runcrate) | $5.85 (Oracle) | $6.95 (GCP) |
|
||||
| **MI300X** | **$1.50** (TensorWave) | $1.85 (Vultr) | $7.86 (Azure) |
|
||||
| **A100 80GB** | $1.07 (Spheron) | $1.50–2.00 | $3.00+ |
|
||||
| **Gaudi 3** | ~$1.50–2.50 | — | — |
|
||||
| **L40S** | $0.91 (Spheron) | $1.50–2.00 | — |

### Inference cost ($/M tokens)

| GPU | Provider | $/hr | Est. tok/s | $/M tok |
|-----|----------|------|------------|---------|
| **B200** | Spheron | $3.39 | ~4,000 | **$0.42** |
| **B200** (spot) | Spheron | $2.12 | ~4,000 | **$0.15** |
| **H100 PCIe** | Spheron | $2.01 | ~1,200 | $0.47 |
| **A100 80GB** | Spheron | $1.07 | ~520 | $0.57 |
| **H100 SXM** | AWS | $6.88 | ~1,200 | $1.59 |
| **H200 SXM** | Spheron | $4.54 | ~1,800 | $0.70 |
| **L40S** | Spheron | $0.91 | ~450 | $0.56 |

*Values for Llama 3 70B (INT8, batch=1, 1K output tokens). Real-world values vary with batch size, context length, and quantization.*
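The $/M tok column follows from the hourly price and the sustained throughput. A minimal sketch of that arithmetic, assuming the GPU is billed per hour and stays fully utilized for the whole hour (the function name `cost_per_million_tokens` is illustrative and not taken from any library):

```python
def cost_per_million_tokens(price_per_hour: float, tokens_per_second: float) -> float:
    """Return the cost in USD of generating one million tokens.

    Assumes hourly billing and a constant `tokens_per_second` for the
    whole hour (no idle time, no batching gains).
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return price_per_hour / tokens_per_hour * 1_000_000


# Rows taken from the table above: (GPU, $/hr, estimated tok/s)
rows = [
    ("H100 PCIe", 2.01, 1200),
    ("A100 80GB", 1.07, 520),
    ("L40S", 0.91, 450),
]
for gpu, price, tps in rows:
    print(f"{gpu}: ${cost_per_million_tokens(price, tps):.2f}/M tok")
```

Because throughput sits in the denominator, any gain from larger batches lowers the cost per token proportionally, which is why the batch=1 figures should be read as an upper bound.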

### Price per GB of HBM

| GPU | HBM | Cloud price/hr | $/GB/hr | Suitability for memory-bound workloads |
|-----|-----|----------------|---------|----------------------------------------|
| **MI300X** | 192 GB | $1.50 | **$0.0078** | ✅ Best |
| **B200** | 192 GB | $3.39 | $0.0177 | ✅ Good |
| **H200** | 141 GB | $3.89 | $0.0276 | ⚠️ |
| **H100 SXM** | 80 GB | $1.38 | $0.0173 | ⚠️ Only up to 70B models |
| **GB200** | 384 GB | $3.50 | $0.0091 | ✅✅ (2× MI300X capacity) |
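The $/GB/hr column is the hourly price divided by HBM capacity. The sketch below reproduces that division and adds a rough weights-only fit check. Both helper names are hypothetical, and the fit check assumes 1 GB = 10^9 bytes and ignores KV cache, activations, and runtime overhead.

```python
def price_per_gb_hour(price_per_hour: float, hbm_gb: float) -> float:
    """Hourly cloud price divided by HBM capacity, in $/GB/hr."""
    return price_per_hour / hbm_gb


def weights_fit(params_billion: float, bytes_per_param: float, hbm_gb: float) -> bool:
    """Rough check: do the model weights alone fit in HBM?

    Ignores KV cache, activations and framework overhead, so a True
    result is necessary but not sufficient (an assumption of this sketch).
    """
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9
    return weights_gb <= hbm_gb


print(f"MI300X: ${price_per_gb_hour(1.50, 192):.4f}/GB/hr")
print(f"GB200:  ${price_per_gb_hour(3.50, 384):.4f}/GB/hr")
print("70B INT8 on 80 GB:", weights_fit(70, 1, 80))
print("70B FP16 on 80 GB:", weights_fit(70, 2, 80))
```

The fit check shows why an 80 GB card is marked as limited: 70B parameters at one byte each leave only about 10 GB for everything else.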

### Price/performance by scenario

| Scenario | Winner | Rationale |
|----------|--------|-----------|
| **Absolute performance** (price no object) | **GB200 DGX NVL72** | 72× GPU, 18 PFLOPS FP8, 384 GB HBM/GPU |
| **Cloud inference**: best $/token | **B200 spot** | $0.15/M tok; ~3.3× H100 throughput (~4,000 vs ~1,200 tok/s) at a lower cost per token |
| **Cloud inference**: on-demand | **B200** | $0.42/M tok |
| **Cloud inference**: budget | **A100 / L40S** | $0.56–0.57/M tok |
| **Training**: price/performance when buying | **Gaudi 3** | $8,515/PFLOPS, 2.5–3× better than H100 |
| **Training**: cloud | **H100 SXM** | $1.38/hr, CUDA ecosystem, NCCL |
| **Memory-bound**: long context, 70B+ | **MI300X / GB200** | 192–384 GB, $0.0078–0.0091/GB/hr |
| **Ecosystem + safe choice** | **H100/H200** | CUDA, broadest software support, NVIDIA tools |
| **Spot / preemptible**: lowest price | **A100 / H100** | $1.07–1.38/hr, 50–90% discount versus on-demand |

### 2026 trends

- **H100**: price fell 64% from a peak of $8/hr to $1.38–2.89/hr, then rebounded by 40% on the inference boom
- **B200**: the new high end at $3.39/hr in the cloud, down to ~$0.15/M tok on spot; the benchmark for inference
- **MI300X**: supply is growing (TensorWave, Vultr, CoreWeave, Oracle, Azure), with prices from $1.50/hr
- **Gaudi 3**: best $/PFLOPS when buying, but a narrow ecosystem and limited cloud availability
- **The market has bifurcated**: older generations (H100, A100) are commoditizing, while new ones (B200, GB200) hold a premium

## Related

- [GPU.md](GPU.md): GPU architecture, NVIDIA/AMD, vGPU, MIG
- [NETWORKING.md](NETWORKING.md): InfiniBand, RoCE, network topologies
- [STORAGE.md](STORAGE.md): parallel filesystems, object stores
- [DATACENTERS.md](DATACENTERS.md): DC layout, power, cooling
- [CLOUD.md](CLOUD.md): cloud AI services (SageMaker, Vertex AI)

## Sources

Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)

*Last revision: 2026-06-18*
|
||||
232
BIG-DATA.en.md
Normal file
@@ -0,0 +1,232 @@
|
||||
# 🗄️ Big Data — ecosystem, architecture, tools

## Overview

The Big Data ecosystem in 2026: "Hadoop is dead, and yet it's everywhere." HDFS has shrunk, MapReduce is effectively gone, and the Cloudera/Hortonworks era is over. But YARN lives on, the Hive Metastore has resurfaced in a new guise with Iceberg/Delta, and the lakehouse pattern (cheap object storage + table format + distributed engine) is the inheritance Hadoop left behind.

The modern Big Data stack has eight layers:

1. **Storage**: HDFS, S3, GCS, ABFS, MinIO
2. **Table format**: Apache Iceberg, Delta Lake, Apache Hudi, Apache Paimon
3. **Catalog**: Hive Metastore, Unity Catalog, Polaris, Nessie, AWS Glue
4. **Batch processing**: Apache Spark, Trino-on-Spark, Dremio
5. **Stream processing**: Apache Flink, Spark Structured Streaming, Kafka Streams
6. **Distributed SQL**: Trino, Presto, StarRocks, ClickHouse
7. **Transformation**: dbt, SQLMesh
8. **Orchestration**: Apache Airflow 3.0, Dagster, Prefect, Kestra

---

## Storage

### HDFS (Hadoop Distributed File System)

| Feature | Detail |
|---------|--------|
| **Architecture** | Master/worker: NameNode (metadata) + DataNode (data) |
| **Replication** | Default 3×, configurable (rack-aware) |
| **Block size** | Default 128 MB (range 64–256 MB) |
| **Limits** | NameNode memory ~1 GB per 1 million blocks; ~1,000 DataNodes per cluster |
| **Use case** | On-prem clusters, sequential read/write, large files |
| **Status 2026** | Declining: most projects migrate to object storage (S3, GCS, MinIO) |

HDFS remains relevant for on-prem environments where object storage is unavailable, or for specific use cases (YARN clusters, Spark shuffle). For new projects, object storage is recommended.
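The NameNode limit in the table can be turned into a quick heap estimate. This is a back-of-the-envelope sketch built only on the "~1 GB per 1 million blocks" rule of thumb quoted above. The function name is hypothetical, and it assumes files fill their blocks, so workloads with many small files will need considerably more.

```python
import math


def namenode_heap_gb(data_tb: float, block_mb: int = 128,
                     gb_per_million_blocks: float = 1.0) -> float:
    """Estimate NameNode heap from the ~1 GB per 1 million blocks rule.

    Counts logical blocks only and assumes every block is full; many
    small files produce far more blocks than this estimate.
    """
    data_mb = data_tb * 1024 * 1024
    blocks = math.ceil(data_mb / block_mb)
    return blocks / 1_000_000 * gb_per_million_blocks


for tb in (100, 1024):
    print(f"{tb} TB -> ~{namenode_heap_gb(tb):.1f} GB heap")
```

Halving the block size doubles the block count and therefore the estimated heap, which is one reason large blocks are the default.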

### Object storage as Data Lake

| Platform | Service | Use case |
|----------|---------|----------|
| **AWS** | S3 | Primary data lake, Iceberg/Delta on S3 |
| **Azure** | ADLS Gen2 / Blob | Data lake for Azure ecosystem |
| **GCP** | GCS | Data lake for GCP (Dataproc, BigQuery) |
| **On-prem** | MinIO | S3-compatible object storage on own HW |

### HDFS capacity planning

| Data size | Configuration |
|-----------|---------------|
| **< 100 TB** | 3–5 DataNodes, 10 GbE, replication 3× |
| **100 TB – 1 PB** | 5–20 DataNodes, 25/100 GbE, rack-aware, NameNode HA |
| **1 PB+** | 20+ DataNodes, 100 GbE, Federation (multiple NameNodes) |
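The node counts above depend on replication and on how full the disks are allowed to get. A rough sizing sketch follows; the per-node disk capacities and the 75% fill limit are assumptions chosen for illustration, not values from this document, and `datanodes_needed` is a hypothetical helper.

```python
import math


def datanodes_needed(logical_tb: float, disk_tb_per_node: float,
                     replication: int = 3, max_fill: float = 0.75) -> int:
    """Number of DataNodes needed to hold `logical_tb` of data.

    raw = logical * replication; disks are only filled to `max_fill`
    to leave headroom for temporary data and rebalancing.
    """
    raw_tb = logical_tb * replication
    usable_per_node = disk_tb_per_node * max_fill
    return math.ceil(raw_tb / usable_per_node)


print(datanodes_needed(100, disk_tb_per_node=96))    # 100 TB of logical data
print(datanodes_needed(1000, disk_tb_per_node=192))  # 1 PB of logical data
```

With these assumed disk sizes the results land inside the ranges in the table; smaller disks or a lower fill limit push the counts up.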

---

## Open Table Formats

Table formats bring ACID transactions, schema evolution, and time travel to data lake object storage.

| Format | Organization | Engine compatibility | Streaming | Catalog |
|--------|--------------|----------------------|-----------|---------|
| **Apache Iceberg** | Apache Software Foundation | Spark, Flink, Trino, Dremio, Athena, Snowflake | Flink sink, snapshot-based | REST catalog, Polaris, Glue, Hive |
| **Delta Lake** | Linux Foundation (Databricks) | Spark (native), Trino, Flink (limited), Athena | Spark Streaming, DLT | Unity Catalog (proprietary), Hive |
| **Apache Hudi** | Apache Software Foundation | Spark, Flink, Trino (connector) | Built-in CDC, incremental | Hive, Glue (limited) |
| **Apache Paimon** | Apache Software Foundation | Flink (native), Spark | LSM-tree, changelog mode | Hive, REST |

**Recommendation 2026:**
- **Iceberg**: broadest multi-engine support, vendor-neutral, open catalog (Polaris)
- **Delta Lake**: best for the Spark/Databricks ecosystem, UniForm for cross-format reads
- **Hudi**: losing momentum, use only if already in production
- **Paimon**: emerging, Flink-native, LSM architecture
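Time travel in these formats rests on immutable snapshots: each commit publishes a new table version and leaves the older ones readable. The toy in-memory sketch below illustrates only that idea. It is not the Iceberg, Delta Lake, Hudi, or Paimon API; `SnapshotTable` is a hypothetical class that ignores data files, manifests, and concurrent writers.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotTable:
    """Toy table: every commit appends an immutable snapshot of the rows."""
    snapshots: list = field(default_factory=lambda: [()])

    def commit(self, new_rows):
        """Publish a new snapshot; older snapshots stay readable."""
        current = self.snapshots[-1]
        self.snapshots.append(current + tuple(new_rows))
        return len(self.snapshots) - 1  # id of the new snapshot

    def read(self, snapshot_id=None):
        """Read the latest snapshot, or an older one (time travel)."""
        if snapshot_id is None:
            snapshot_id = len(self.snapshots) - 1
        return list(self.snapshots[snapshot_id])


table = SnapshotTable()
v1 = table.commit([{"id": 1}])
v2 = table.commit([{"id": 2}])
print(table.read())    # latest snapshot: both rows
print(table.read(v1))  # time travel: only the first row
```

Readers that hold a snapshot id never see a half-applied commit, which is the property the real formats provide at the scale of files on object storage.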

---

## Processing Engines

### Apache Spark

The dominant batch-processing engine, also used as a unified engine for batch, streaming, SQL, and ML.

| Feature | Detail |
|---------|--------|
| **Version 2026** | Spark 4.x (4.1.0), native Kubernetes support, Structured Streaming, Delta Lake integration |
| **API** | Scala, Java, Python (PySpark), SQL, R (SparkR) |
| **Batch** | DataFrame/Dataset, RDD, SQL queries; 10–100× faster than MapReduce |
| **Streaming** | Structured Streaming (micro-batch), latency ~100 ms – 5 s |
| **SQL** | Spark SQL, ANSI SQL, Hive compatible |
| **ML** | MLlib, SparkML, MLflow integration |
| **Scheduler** | YARN, Kubernetes (production-ready since Spark 3.x), standalone |
| **Fault tolerance** | RDD lineage, checkpointing |

**When to use Spark:**
- Batch ETL/ELT pipelines
- Unified engine for batch + streaming (team preference)
- Machine learning pipelines (MLlib, SparkML)
- SQL analytics on large datasets

### Apache Flink

Highest-performance engine for true streaming (per-event processing).

| Feature | Detail |
|---------|--------|
| **Version 2026** | Flink 2.x (streaming-first, batch as bounded stream) |
| **API** | DataStream API, Table/SQL API, ProcessFunction (low-level) |
| **Latency** | < 100 ms (true streaming, Chandy-Lamport checkpointing) |
| **State management** | Managed state (ValueState, ListState, MapState), RocksDB backend |
| **Event time** | Native, watermarks, out-of-order handling |
| **Batch** | Batch as bounded stream (same runtime) |
| **Deployment** | YARN, Kubernetes, standalone |
| **Economics** | Higher memory requirements (managed state), requires careful tuning |

**When to use Flink:**
- Fraud detection, real-time bidding, IoT (< 100 ms latency)
- Complex stateful stream processing
- CDC pipelines
- Event-driven architectures
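The "Event time" row above refers to watermarks and out-of-order handling. The plain-Python sketch below shows the general idea with a bounded-out-of-orderness watermark and tumbling windows. It is a toy model, not the Flink API; the function name and its parameters are hypothetical.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms=1000, max_out_of_order_ms=200):
    """Count events per tumbling event-time window.

    `events` holds event timestamps (ms) in arrival order. The watermark
    trails the highest timestamp seen by `max_out_of_order_ms`; a window
    is emitted once the watermark passes its end, and events that arrive
    for an already closed window are dropped as late.
    """
    open_windows = defaultdict(int)  # window start -> event count
    emitted, dropped = [], 0
    watermark = float("-inf")

    for ts in events:
        watermark = max(watermark, ts - max_out_of_order_ms)
        start = ts - ts % window_ms
        if start + window_ms <= watermark:
            dropped += 1  # window already closed: late event
        else:
            open_windows[start] += 1
        ready = sorted(w for w in open_windows if w + window_ms <= watermark)
        for w in ready:
            emitted.append((w, open_windows.pop(w)))

    # End of input: flush whatever is still open.
    for w in sorted(open_windows):
        emitted.append((w, open_windows[w]))
    return emitted, dropped


print(tumbling_window_counts([100, 900, 1100, 950, 1300, 800, 2300]))
```

In this run the event at 950 ms arrives after 1100 ms but is still counted, because the watermark has not yet passed the end of its window; the event at 800 ms arrives after the window closed and is dropped.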

### Trino (ex PrestoSQL)

Distributed SQL query engine for federated queries across multiple sources.

| Feature | Detail |
|---------|--------|
| **Architecture** | Coordinator + Worker (no storage, no scheduler) |
| **Connectors** | Iceberg, Delta, Hive, HDFS, S3, GCS, ADLS, PostgreSQL, MySQL, Kafka, Elasticsearch |
| **Use case** | Interactive SQL, federated queries, lakehouse queries |
| **Version 2026** | Trino 470+, Iceberg native, Delta Lake connector |

---

## Spark vs Flink vs Trino comparison

| Criteria | Spark | Flink | Trino |
|----------|-------|-------|-------|
| **Primary use case** | Batch + unified engine | True streaming | Interactive SQL |
| **Streaming latency** | 100 ms – 5 s (micro-batch) | < 100 ms (true streaming) | N/A |
| **Throughput** | High (batch-optimized) | High (pipeline-optimized) | Medium (ad-hoc) |
| **State management** | State store (external) | Managed state (embedded) | N/A |
| **SQL support** | Spark SQL | Flink SQL | ANSI SQL (broadest) |
| **ML/AI** | MLlib, SparkML | — | — |
| **Kubernetes** | Native (production) | Native (production) | Native (production) |
| **Learning curve** | Medium | High | Low |
| **Operational complexity** | Medium | High | Medium |

---

## Orchestration

| Tool | Version 2026 | Use case |
|------|--------------|----------|
| **Apache Airflow** | 3.0+ (taskflow API, dynamic tasks, deferrable operators) | Universal orchestration, largest ecosystem |
| **Dagster** | 1.x (asset-oriented, software-defined assets) | Data pipelines, observability, asset lineage |
| **Prefect** | 3.x (native async, workers, blocks) | Python-native, serverless workers |
| **Kestra** | 1.x (YAML-native, declarative) | Event-driven orchestration |
| **Apache NiFi** | 2.x (flow-based, visual) | Data ingestion, CDC, streaming |

---

## Lakehouse architecture

A lakehouse combines data lake flexibility (object storage) with data warehouse performance and governance.

```
┌──────────────────────────────────────────────────────┐
│                    Query Engines                     │
│   Trino   Spark SQL   Flink SQL   Dremio   Athena    │
└─────────────────────────┬────────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────────┐
│                  Table Format Layer                  │
│          Apache Iceberg / Delta Lake / Hudi          │
│        (ACID, time travel, schema evolution)         │
└─────────────────────────┬────────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────────┐
│                    Storage Layer                     │
│            S3 / GCS / ADLS / MinIO / HDFS            │
│                (Parquet / ORC / Avro)                │
└──────────────────────────────────────────────────────┘
```

For Iceberg details see [DATABASES.en.md: Apache Iceberg Lakehouse](DATABASES.en.md#apache-iceberg-lakehouse).

---

## Big Data Infrastructure

### Cluster sizing

| Component | Spark (batch) | Flink (streaming) | Trino (SQL) |
|-----------|---------------|-------------------|-------------|
| **CPU** | 16–64 cores/node | 16–32 cores/node | 8–32 cores/node |
| **RAM** | 64–256 GB/node | 64–256 GB/node (incl. managed state) | 64–256 GB/node |
| **Storage** | HDFS / object storage | Object storage (checkpoints) | None (stateless) |
| **Network** | 25–100 GbE (shuffle-heavy) | 25–100 GbE (checkpointing) | 25–100 GbE |
| **Disk** | NVMe (scratch, shuffle) | NVMe (RocksDB state backend) | — |
| **Cluster size** | 5–200+ nodes | 3–100+ nodes | 5–50 nodes |

### Network considerations

- **Spark shuffle**: heavy network traffic between nodes; 25–100 GbE recommended, ideally without oversubscription
- **Flink checkpointing**: periodic state writes to object storage; requires stable latency
- **HDFS rack awareness**: optimizes replica placement across racks
- **Data locality**: HDFS reads come from local disk; object storage reads are network-bound

### Kubernetes vs YARN

| Criteria | YARN | Kubernetes |
|----------|------|------------|
| **Resource isolation** | Cgroups (YARN containers) | Cgroups + namespaces (pods) |
| **Ecosystem fit** | Hadoop-native (HDFS, Hive, Spark) | Cloud-native, Spark, Flink, Trino |
| **Operational complexity** | Lower (single cluster manager) | Higher (requires K8s cluster) |
| **Multi-tenant isolation** | YARN queues (Capacity/Fair Scheduler) | Namespaces, ResourceQuotas, LimitRanges |
| **Stateful workloads** | Limited | StatefulSets, PVC, Operators |
| **2026 trend** | Legacy (declining) | Standard for new projects |

---

## Cloud deployment

| Cloud | Batch processing | Streaming | SQL | Managed K8s |
|-------|------------------|-----------|-----|-------------|
| **AWS** | EMR (Spark, Hive, Flink) | Kinesis, MSK (Kafka), EMR Flink | Athena (Trino), Redshift | EKS |
| **Azure** | HDInsight (Spark, Hive), Synapse | Event Hubs, HDInsight Flink | Synapse SQL, Azure Data Explorer | AKS |
| **GCP** | Dataproc (Spark, Flink, Hive, Trino) | Pub/Sub, Dataflow (Beam), Dataproc Flink | BigQuery | GKE |

---

## Sources

Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)

*Last revision: 2026-06-18*
|
||||
232
BIG-DATA.md
Normal file
@@ -0,0 +1,232 @@
|
||||
# 🗄️ Big Data: ecosystem, architecture, tools

## Overview

The Big Data ecosystem in 2026: "Hadoop is dead, and yet it's everywhere." HDFS has shrunk, MapReduce is effectively dead, and the Cloudera/Hortonworks era is over. But YARN lives on, the Hive Metastore has resurfaced in a new guise with Iceberg/Delta, and the lakehouse pattern (cheap object storage + table format + distributed engine) is the inheritance Hadoop left behind.

The modern Big Data stack has eight layers:

1. **Storage**: HDFS, S3, GCS, ABFS, MinIO
2. **Table format**: Apache Iceberg, Delta Lake, Apache Hudi, Apache Paimon
3. **Catalog**: Hive Metastore, Unity Catalog, Polaris, Nessie, AWS Glue
4. **Batch processing**: Apache Spark, Trino-on-Spark, Dremio
5. **Stream processing**: Apache Flink, Spark Structured Streaming, Kafka Streams
6. **Distributed SQL**: Trino, Presto, StarRocks, ClickHouse
7. **Transformation**: dbt, SQLMesh
8. **Orchestration**: Apache Airflow 3.0, Dagster, Prefect, Kestra

---

## Storage

### HDFS (Hadoop Distributed File System)

| Feature | Detail |
|---------|--------|
| **Architecture** | Master/worker: NameNode (metadata) + DataNode (data) |
| **Replication** | Default 3×, configurable (rack-aware) |
| **Block size** | Default 128 MB (can be set to 64–256 MB) |
| **Limits** | NameNode memory ~1 GB per 1 million blocks; ~1,000 DataNodes per cluster |
| **Use case** | On-prem clusters, sequential read/write, large files |
| **Status 2026** | Declining share: most projects migrate to object storage (S3, GCS, MinIO) |

HDFS is still relevant for on-prem environments where object storage is not available, or for specific use cases (YARN clusters, Spark shuffle). For new projects, object storage is recommended.

### Object storage as Data Lake

| Platform | Service | Use case |
|----------|---------|----------|
| **AWS** | S3 | Primary data lake, Iceberg/Delta on S3 |
| **Azure** | ADLS Gen2 / Blob | Data lake for the Azure ecosystem |
| **GCP** | GCS | Data lake for GCP (Dataproc, BigQuery) |
| **On-prem** | MinIO | S3-compatible object storage on own HW |

### HDFS capacity planning

| Data size | Configuration |
|-----------|---------------|
| **< 100 TB** | 3–5 DataNodes, 10 GbE, replication 3× |
| **100 TB – 1 PB** | 5–20 DataNodes, 25/100 GbE, rack-aware, NameNode HA |
| **1 PB+** | 20+ DataNodes, 100 GbE, Federation (multiple NameNodes) |

---

## Open Table Formats

Table formats bring ACID transactions, schema evolution, and time travel to data lake object storage.

| Format | Organization | Engine compatibility | Streaming | Catalog |
|--------|--------------|----------------------|-----------|---------|
| **Apache Iceberg** | Apache Software Foundation | Spark, Flink, Trino, Dremio, Athena, Snowflake | Flink sink, snapshot-based | REST catalog, Polaris, Glue, Hive |
| **Delta Lake** | Linux Foundation (Databricks) | Spark (native), Trino, Flink (limited), Athena | Spark Streaming, DLT | Unity Catalog (proprietary), Hive |
| **Apache Hudi** | Apache Software Foundation | Spark, Flink, Trino (connector) | Built-in CDC, incremental | Hive, Glue (limited) |
| **Apache Paimon** | Apache Software Foundation | Flink (native), Spark | LSM-tree, changelog mode | Hive, REST |

**Recommendation 2026:**
- **Iceberg**: broadest multi-engine support, vendor-neutral, open catalog (Polaris)
- **Delta Lake**: best for the Spark/Databricks ecosystem, UniForm for cross-format reads
- **Hudi**: losing momentum, use only if already in production
- **Paimon**: emerging, Flink-native, LSM architecture

---

## Processing Engines

### Apache Spark

The dominant batch-processing engine, also used as a unified engine for batch, streaming, SQL, and ML.

| Feature | Detail |
|---------|--------|
| **Version 2026** | Spark 4.x (4.1.0), native Kubernetes support, Structured Streaming, Delta Lake integration |
| **API** | Scala, Java, Python (PySpark), SQL, R (SparkR) |
| **Batch** | DataFrame/Dataset, RDD, SQL queries; 10–100× faster than MapReduce |
| **Streaming** | Structured Streaming (micro-batch), latency ~100 ms – 5 s |
| **SQL** | Spark SQL, ANSI SQL, Hive compatible |
| **ML** | MLlib, SparkML, MLflow integration |
| **Scheduler** | YARN, Kubernetes (production-ready since Spark 3.x), standalone |
| **Fault tolerance** | RDD lineage, checkpointing |

**When to use Spark:**
- Batch ETL/ELT pipelines
- Unified engine for batch + streaming (team preference)
- Machine learning pipelines (MLlib, SparkML)
- SQL analytics on large datasets

### Apache Flink

Highest-performance engine for true streaming (per-event processing).

| Feature | Detail |
|---------|--------|
| **Version 2026** | Flink 2.x (streaming-first, batch as a special case of a stream) |
| **API** | DataStream API, Table/SQL API, ProcessFunction (low-level) |
| **Latency** | < 100 ms (true streaming, Chandy-Lamport checkpointing) |
| **State management** | Managed state (ValueState, ListState, MapState), RocksDB backend |
| **Event time** | Native, watermarks, out-of-order handling |
| **Batch** | Batch as bounded stream (same runtime) |
| **Deployment** | YARN, Kubernetes, standalone |
| **Economics** | Higher memory requirements (managed state), requires careful tuning |

**When to use Flink:**
- Fraud detection, real-time bidding, IoT (< 100 ms latency)
- Complex stateful stream processing
- CDC pipelines
- Event-driven architectures

### Trino (ex PrestoSQL)

Distributed SQL query engine for federated queries across multiple sources.

| Feature | Detail |
|---------|--------|
| **Architecture** | Coordinator + Worker (no storage, no scheduler) |
| **Connectors** | Iceberg, Delta, Hive, HDFS, S3, GCS, ADLS, PostgreSQL, MySQL, Kafka, Elasticsearch |
| **Use case** | Interactive SQL, federated queries, lakehouse queries |
| **Version 2026** | Trino 470+, Iceberg native, Delta Lake connector |

---

## Spark vs Flink vs Trino comparison

| Criteria | Spark | Flink | Trino |
|----------|-------|-------|-------|
| **Primary use case** | Batch + unified engine | True streaming | Interactive SQL |
| **Streaming latency** | 100 ms – 5 s (micro-batch) | < 100 ms (true streaming) | N/A |
| **Throughput** | High (batch-optimized) | High (pipeline-optimized) | Medium (ad-hoc) |
| **State management** | State store (external) | Managed state (embedded) | N/A |
| **SQL support** | Spark SQL | Flink SQL | ANSI SQL (broadest) |
| **ML/AI** | MLlib, SparkML | — | — |
| **Kubernetes** | Native (production) | Native (production) | Native (production) |
| **Learning curve** | Medium | High | Low |
| **Operational complexity** | Medium | High | Medium |

---

## Orchestration

| Tool | Version 2026 | Use case |
|------|--------------|----------|
| **Apache Airflow** | 3.0+ (taskflow API, dynamic tasks, deferrable operators) | Universal orchestration, largest ecosystem |
| **Dagster** | 1.x (asset-oriented, software-defined assets) | Data pipelines, observability, asset lineage |
| **Prefect** | 3.x (native async, workers, blocks) | Python-native, serverless workers |
| **Kestra** | 1.x (YAML-native, declarative) | Event-driven orchestration |
| **Apache NiFi** | 2.x (flow-based, visual) | Data ingestion, CDC, streaming |

---

## Lakehouse architecture

A lakehouse combines the flexibility of a data lake (object storage) with the performance and governance of a data warehouse.

```
┌──────────────────────────────────────────────────────┐
│                    Query Engines                     │
│   Trino   Spark SQL   Flink SQL   Dremio   Athena    │
└─────────────────────────┬────────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────────┐
│                  Table Format Layer                  │
│          Apache Iceberg / Delta Lake / Hudi          │
│        (ACID, time travel, schema evolution)         │
└─────────────────────────┬────────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────────┐
│                    Storage Layer                     │
│            S3 / GCS / ADLS / MinIO / HDFS            │
│                (Parquet / ORC / Avro)                │
└──────────────────────────────────────────────────────┘
```

For Iceberg details see [DATABASES.md: Apache Iceberg Lakehouse](DATABASES.md#apache-iceberg-lakehouse).

---

## Big Data Infrastructure

### Cluster sizing

| Component | Spark (batch) | Flink (streaming) | Trino (SQL) |
|-----------|---------------|-------------------|-------------|
| **CPU** | 16–64 cores/node | 16–32 cores/node | 8–32 cores/node |
| **RAM** | 64–256 GB/node | 64–256 GB/node (incl. managed state) | 64–256 GB/node |
| **Storage** | HDFS / object storage | Object storage (checkpoints) | None (stateless) |
| **Network** | 25–100 GbE (shuffle-heavy) | 25–100 GbE (checkpointing) | 25–100 GbE |
| **Disk** | NVMe (scratch, shuffle) | NVMe (RocksDB state backend) | — |
| **Cluster size** | 5–200+ nodes | 3–100+ nodes | 5–50 nodes |

### Network considerations

- **Spark shuffle**: heavy network traffic between nodes; 25–100 GbE recommended, ideally without oversubscription
- **Flink checkpointing**: periodic state writes to object storage; requires stable latency
- **HDFS rack awareness**: optimizes replica placement across racks
- **Data locality**: HDFS reads come from local disk; object storage reads are network-bound

### Kubernetes vs YARN

| Criteria | YARN | Kubernetes |
|----------|------|------------|
| **Resource isolation** | Cgroups (YARN containers) | Cgroups + namespaces (pods) |
| **Ecosystem fit** | Hadoop-native (HDFS, Hive, Spark) | Cloud-native, Spark, Flink, Trino |
| **Operational complexity** | Lower (single cluster manager) | Higher (requires K8s cluster) |
| **Multi-tenant isolation** | YARN queues (Capacity/Fair Scheduler) | Namespaces, ResourceQuotas, LimitRanges |
| **Stateful workloads** | Limited | StatefulSets, PVC, Operators |
| **2026 trend** | Legacy (declining) | Standard for new projects |

---

## Cloud deployment

| Cloud | Batch processing | Streaming | SQL | Managed K8s |
|-------|------------------|-----------|-----|-------------|
| **AWS** | EMR (Spark, Hive, Flink) | Kinesis, MSK (Kafka), EMR Flink | Athena (Trino), Redshift | EKS |
| **Azure** | HDInsight (Spark, Hive), Synapse | Event Hubs, HDInsight Flink | Synapse SQL, Azure Data Explorer | AKS |
| **GCP** | Dataproc (Spark, Flink, Hive, Trino) | Pub/Sub, Dataflow (Beam), Dataproc Flink | BigQuery | GKE |

---

## Sources

Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)

*Last revision: 2026-06-18*
|
||||
@@ -123,7 +123,7 @@ ScyllaDB is advantageous when:
|
||||
|
||||
## Sources
|
||||
|
||||
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
|
||||
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
|
||||
|
||||
### Recommended reading
|
||||
|
||||
|
||||
@@ -637,7 +637,7 @@ New tools: Harness (AI-native CD), GitLab 19.0 (agentic MR workflows, secrets ma
|
||||
|
||||
## Resources
|
||||
|
||||
Links, books and standards: [sources/cicd/sources.md](sources/cicd/sources.md)
|
||||
Links, books and standards: [sources/cicd/sources.en.md](sources/cicd/sources.en.md)
|
||||
|
||||
### Recommended Reading
|
||||
|
||||
|
||||
@@ -144,7 +144,7 @@ Analogues: Azure Well-Architected Framework, GCP Architecture Framework
|
||||
| **Storage optimized** | I4i, im4gn | 1:4 + NVMe | Transactional DB, data warehousing, Kafka | i4i.large ~$0.138/h |
|
||||
| **GPU / ML** | P5, g5, trn1 | GPU attach | AI training (P5), inference (g5), ML (trn1) | g5.xlarge ~$1.006/h |
|
||||
|
||||
See [GPU.md](GPU.md) for GPU model and configuration details.
|
||||
See [GPU.en.md](GPU.en.md) for GPU model and configuration details.
|
||||
|
||||
### Storage
|
||||
|
||||
@@ -287,7 +287,7 @@ Automated checks of architectural characteristics — analogous to tests for arc
|
||||
|
||||
## Hybrid Cloud Connectivity
|
||||
|
||||
See also: [NETWORKING.md](NETWORKING.md) — network architecture (VPN, BGP, VPC design).
|
||||
See also: [NETWORKING.en.md](NETWORKING.en.md) — network architecture (VPN, BGP, VPC design).
|
||||
|
||||
- **Site-to-Site VPN** — IPSec tunnel over the internet
|
||||
- **Direct Connect / ExpressRoute / Dedicated Interconnect** — private physical connection
|
||||
@@ -480,7 +480,7 @@ OpenStack is the dominant open-source platform for building private clouds (IaaS
|
||||
|
||||
## Resources
|
||||
|
||||
Links, books and standards: [sources/cloud/sources.md](sources/cloud/sources.md)
|
||||
Links, books and standards: [sources/cloud/sources.en.md](sources/cloud/sources.en.md)
|
||||
- **Cost tagging** — assign tags for chargeback/showback (Environment, Team, Cost Center, Application)
|
||||
- **Automated compliance** — AWS Config, Azure Policy, GCP Org Policies for guardrails
|
||||
- **Multi-account strategy** — AWS Control Tower, Azure Landing Zones, GCP Resource Hierarchy
|
||||
|
||||
@@ -259,7 +259,7 @@ HPE ProLiant Gen11 (DL360/DL380) supports:
|
||||
|
||||
## Sources
|
||||
|
||||
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
### Recommended literature
|
||||
|
||||
|
||||
@@ -90,7 +90,7 @@ Each transaction sees a snapshot of data as of the start time. Old row versions
|
||||
|
||||
## Resources
|
||||
|
||||
Links, books and standards: [sources/databases/sources.md](sources/databases/sources.md)
|
||||
Links, books and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
|
||||
|
||||
### Recommended Reading
|
||||
|
||||
|
||||
@@ -6,20 +6,20 @@
|
||||
|
||||
| DB | License | Use Case | Details |
|
||||
|----|---------|----------|--------|
|
||||
| **PostgreSQL** | Open source | Universal, geospatial, analytics, AI | [POSTGRESQL.md](POSTGRESQL.md) |
|
||||
| **MySQL / MariaDB** | Open source | Web, LAMP stack, e-commerce | [MYSQL.md](MYSQL.md) |
|
||||
| **PostgreSQL** | Open source | Universal, geospatial, analytics, AI | [POSTGRESQL.en.md](POSTGRESQL.en.md) |
|
||||
| **MySQL / MariaDB** | Open source | Web, LAMP stack, e-commerce | [MYSQL.en.md](MYSQL.en.md) |
|
||||
| **Microsoft SQL Server** | Proprietary | Enterprise .NET, Windows ecosystem | — |
|
||||
| **Oracle DB** | Proprietary | Enterprise, finance, mainframe, RAC cluster | [ORACLE.md](ORACLE.md) |
|
||||
| **Oracle DB** | Proprietary | Enterprise, finance, mainframe, RAC cluster | [ORACLE.en.md](ORACLE.en.md) |
|
||||
| **Amazon Aurora** | Managed | MySQL/PostgreSQL compatible, cloud-native | — |
|
||||
|
||||
### NoSQL
|
||||
|
||||
| Type | DB | Use Case | Details |
|
||||
|-----|----|----------|--------|
|
||||
| **Document** | MongoDB, Couchbase | JSON data, flexible schema | [MONGODB.md](MONGODB.md) |
|
||||
| **Key-Value / Cache** | Redis, Memcached, DynamoDB | Cache, session store, real-time | [REDIS.md](REDIS.md) |
|
||||
| **Wide-column** | Cassandra, ScyllaDB | Time-series, IoT, big data | [CASSANDRA.md](CASSANDRA.md) |
|
||||
| **Vector** | Pinecone, Qdrant, Milvus, pgvector | Embeddings, RAG, semantic search | [VEKTOROVE-DB.md](VEKTOROVE-DB.md) |
|
||||
| **Document** | MongoDB, Couchbase | JSON data, flexible schema | [MONGODB.en.md](MONGODB.en.md) |
|
||||
| **Key-Value / Cache** | Redis, Memcached, DynamoDB | Cache, session store, real-time | [REDIS.en.md](REDIS.en.md) |
|
||||
| **Wide-column** | Cassandra, ScyllaDB | Time-series, IoT, big data | [CASSANDRA.en.md](CASSANDRA.en.md) |
|
||||
| **Vector** | Pinecone, Qdrant, Milvus, pgvector | Embeddings, RAG, semantic search | [VECTOR-DBS.en.md](VECTOR-DBS.en.md) |
|
||||
| **Graph** | Neo4j, Dgraph | Relationships, recommendations, social graphs | — |
|
||||
|
||||
### Storage Engines
|
||||
@@ -258,6 +258,8 @@ Table metadata (.metadata.json)
|
||||
| **Hidden partitioning** | Automatic partition filters (user does not need to specify) |
|
||||
| **Multi-engine** | Spark, Flink, Trino, Dremio, Snowflake over the same data |
|
||||
|
||||
For a broader overview of the Big Data ecosystem (HDFS, Spark, Flink, Trino, Delta Lake, Hudi) see [BIG-DATA.en.md](BIG-DATA.en.md).
|
||||
|
||||
### When to Use Iceberg
|
||||
|
||||
- Multi-tool access to the same governed data
|
||||
@@ -305,7 +307,7 @@ Table metadata (.metadata.json)
|
||||
|
||||
## Resources
|
||||
|
||||
Links, books and standards: [sources/databases/sources.md](sources/databases/sources.md)
|
||||
Links, books and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
|
||||
|
||||
### Recommended Reading
|
||||
|
||||
|
||||
@@ -258,6 +258,8 @@ Table metadata (.metadata.json)
|
||||
| **Hidden partitioning** | Automatic partition filters (user does not need to specify) |
|
||||
| **Multi-engine** | Spark, Flink, Trino, Dremio, Snowflake over the same data |
|
||||
|
||||
For a more detailed overview of the Big Data ecosystem (HDFS, Spark, Flink, Trino, Delta Lake, Hudi), see [BIG-DATA.md](BIG-DATA.md).
|
||||
|
||||
### When to Use Iceberg
|
||||
|
||||
- Multi-tool access to the same governed data
|
||||
|
||||
@@ -950,7 +950,7 @@ Tools: `smartmontools` (smartctl, smartd), Prometheus exporter (`node_exporter`)
|
||||
|
||||
## Sources
|
||||
|
||||
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
### Recommended literature
|
||||
|
||||
@@ -1010,7 +1010,7 @@ Best practices: separate auth and recursive resolvers, DNSSEC, split-horizon (in
|
||||
|
||||
### Monitoring and observability
|
||||
|
||||
See [MONITORING.md](MONITORING.md). Before running the first workloads, the DC must have:
|
||||
See [MONITORING.en.md](MONITORING.en.md). Before running the first workloads, the DC must have:
|
||||
- Metric collection (Prometheus, Zabbix)
|
||||
- Centralized logs (Loki, ELK)
|
||||
- Alerting (Alertmanager, PagerDuty)
|
||||
|
||||
@@ -241,6 +241,6 @@ See [CLOUD.en.md](CLOUD.en.md) — migration strategies (6 Rs):
|
||||
|
||||
## Sources
|
||||
|
||||
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
*Last revision: 2026-06-12*
|
||||
12
DR.en.md
@@ -323,14 +323,14 @@ contacts:
|
||||
|
||||
## Related
|
||||
|
||||
- [CLOUD.md](CLOUD.md) — cloud DR strategy, AWS/Azure/GCP specific
|
||||
- [DATACENTERS.md](DATACENTERS.md) — DC redundancy, Tier classification
|
||||
- [MONITORING.md](MONITORING.md) — alerting, SLI/SLO/SLA
|
||||
- [CICD.md](CICD.md) — deployment strategy, rollback
|
||||
- [STORAGE.md](STORAGE.md) — backup storage, replication
|
||||
- [CLOUD.en.md](CLOUD.en.md) — cloud DR strategy, AWS/Azure/GCP specific
|
||||
- [DATACENTERS.en.md](DATACENTERS.en.md) — DC redundancy, Tier classification
|
||||
- [MONITORING.en.md](MONITORING.en.md) — alerting, SLI/SLO/SLA
|
||||
- [CICD.en.md](CICD.en.md) — deployment strategy, rollback
|
||||
- [STORAGE.en.md](STORAGE.en.md) — backup storage, replication
|
||||
|
||||
## Sources
|
||||
|
||||
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
*Last revised: 2026-06-11*
|
||||
|
||||
@@ -112,6 +112,12 @@ NVLink topologie (GPU direct) PCIe topologie (CPU mediated)
|
||||
- **Denoising**: AI-accelerated denoising on GPU
|
||||
- **Farm rendering**: Deadline, Qube! (job scheduler)
|
||||
|
||||
## GPU pricing
|
||||
|
||||
For detailed pricing comparisons (purchase price, cloud on-demand, $/M token inference cost, $/GB HBM, price trends 2024→2026), see:
|
||||
|
||||
- [AI-INFRASTRUCTURE.en.md — GPU pricing and price/performance](AI-INFRASTRUCTURE.en.md#gpu-pricing-and-priceperformance)
|
||||
|
||||
## GPU server form factors
|
||||
|
||||
| Form factor | GPU count | Power | Cooling | Example |
|
||||
@@ -144,6 +150,6 @@ Cyborg is an OpenStack service for managing accelerators (GPU, FPGA, DPU, NPU).
|
||||
|
||||
## Sources
|
||||
|
||||
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
*Last revision: 2026-06-03*
|
||||
|
||||
6
GPU.md
@@ -112,6 +112,12 @@ NVLink topologie (GPU direct) PCIe topologie (CPU mediated)
|
||||
- **Denoising**: AI-accelerated denoising on GPU
|
||||
- **Farm rendering**: Deadline, Qube! (job scheduler)
|
||||
|
||||
## GPU pricing
|
||||
|
||||
For detailed pricing comparisons (purchase price, cloud on-demand, $/M token inference cost, $/GB HBM, price trends 2024→2026), see:
|
||||
|
||||
- [AI-INFRASTRUCTURE.md — GPU pricing and price/performance](AI-INFRASTRUCTURE.md#ceny-gpu-a-poměr-cenavýkon)
|
||||
|
||||
## GPU server form factors
|
||||
|
||||
| Form factor | GPU count | Power | Cooling | Example |
|
||||
|
||||
@@ -4,9 +4,9 @@ This file has been split into separate areas:
|
||||
|
||||
| Area | File |
|
||||
|--------|--------|
|
||||
| 🔧 Server hardware — components and architecture | [SERVER-HW.md](SERVER-HW.md) |
|
||||
| 🎮 GPU — architecture, models, virtualization | [GPU.md](GPU.md) |
|
||||
| ⚙️ Server configuration — best practices by workload | [SERVER-CONFIG.md](SERVER-CONFIG.md) |
|
||||
| 📦 Provisioning — boot, installation, server management | [PROVISIONING.md](PROVISIONING.md) |
|
||||
| 🔧 Server hardware — components and architecture | [SERVER-HW.en.md](SERVER-HW.en.md) |
|
||||
| 🎮 GPU — architecture, models, virtualization | [GPU.en.md](GPU.en.md) |
|
||||
| ⚙️ Server configuration — best practices by workload | [SERVER-CONFIG.en.md](SERVER-CONFIG.en.md) |
|
||||
| 📦 Provisioning — boot, installation, server management | [PROVISIONING.en.md](PROVISIONING.en.md) |
|
||||
|
||||
*Last revision: 2026-06-03*
|
||||
|
||||
@@ -24,7 +24,7 @@
|
||||
- **VM — Virtual Machine** — full virtualization, own kernel
|
||||
- **Container** — shared host kernel, lighter (Docker, LXC)
|
||||
- **Paravirtualization** — guest OS knows it runs in a VM (better I/O performance)
|
||||
- **NUMA** — Non-Uniform Memory Access, CPU/memory allocation optimization (see [SERVER-HW.md](SERVER-HW.md#numa))
|
||||
- **NUMA** — Non-Uniform Memory Access, CPU/memory allocation optimization (see [SERVER-HW.en.md](SERVER-HW.en.md#numa))
|
||||
- **Overcommit** — allocating more vCPU/RAM than physically available (ratio management)
|
||||
- **Live Migration** — moving a running VM between hosts (vSphere vMotion, Hyper-V Live Migration)
|
||||
- **HA (High Availability)** — VM restart on another host upon failure
|
||||
@@ -86,20 +86,22 @@ According to Foundry/CIO.com survey (2025): **56%** of organizations plan to red
|
||||
|
||||
#### Target Platforms — Comparison
|
||||
|
||||
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization |
|
||||
|-----------|-----------|-------------|-------------------|----------------------------------|
|
||||
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) |
|
||||
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30–60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) |
|
||||
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) |
|
||||
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) |
|
||||
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO |
|
||||
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP |
|
||||
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000–60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) |
|
||||
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000–200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) |
|
||||
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) |
|
||||
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) |
|
||||
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) |
|
||||
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator |
|
||||
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization | **Sangfor aSV (HCI)** |
|
||||
|-----------|-----------|-------------|-------------------|----------------------------------|----------------------|
|
||||
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) | **KVM (aSV)** |
|
||||
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30–60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) | **Per node (Enterprise Pro), all-inclusive** |
|
||||
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) | **Yes** |
|
||||
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) | **Built-in HA** |
|
||||
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO | **aSAN (distributed SDS, locality-aware)** |
|
||||
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP | **Built-in backup + CDP** |
|
||||
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000–60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) | **~$15,000–25,000** |
|
||||
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000–200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) | **~$50,000–80,000** |
|
||||
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) | **Low (VMware import tool)** |
|
||||
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) | **Excellent (KVM-based)** |
|
||||
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) | **Good (VirtIO drivers)** |
|
||||
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator | **vGPU support (standard)** |
|
||||
| **Integrated security** | — | — | — | — | **Yes (NGFW, IPS, WAF, EDR — aSEC)** |
|
||||
| **Min. cluster (3 copies)** | 3 (Ceph) | 3 | 2–3 | 3 | **3** |
|
||||
|
||||
#### Migration Tools
|
||||
|
||||
@@ -112,8 +114,47 @@ According to Foundry/CIO.com survey (2025): **56%** of organizations plan to red
|
||||
| **virt-v2v** | VMware ESXi, Xen, Hyper-V | KVM (libvirt) | Open source CLI tool, disk + driver conversion (virtio), suitable for bulk migration |
|
||||
| **Windows Admin Center VM Conversion Extension** | VMware ESXi | Hyper-V | Microsoft WAC extension, free, GUI-based, bulk migration |
|
||||
| **Platform9 vJailbreak** | VMware ESXi | OpenStack / KVM | In-place migration (no swing gear), open source |
|
||||
| **Sangfor VMware Import Tool** | VMware ESXi | Sangfor aSV (HCI) | Imports VMs from vCenter, converts disks and drivers, can retain network config |
|
||||
|
||||
#### TCO Comparison — Example: 3 hosts (2× 20C CPU), 50 VMs
|
||||
#### Cross-Hypervisor Migration Matrix
|
||||
|
||||
Comprehensive overview of all source→target pairs with methods, tools, limitations, and complexity.
|
||||
|
||||
| Source → Target | Method | Tools | Complexity | Limitations |
|
||||
|-------------|--------|----------|-----------|---------|
|
||||
| **VMware → Proxmox** | Disk conversion VMDK→QCOW2, driver reinstall | Proxmox Import Wizard, Veeam, StarWind, virt-v2v | Medium | VirtIO drivers required, UEFI not supported in Import Wizard (< 8.1), snapshots must be removed |
|
||||
| **VMware → Hyper-V** | Disk conversion VMDK→VHDX, driver reinstall | StarWind, WAC Converter, SCVMM, Microsoft MTC | Medium | Integration Services required, network config differences (VMXNET3 → Hyper-V Synthetic) |
|
||||
| **VMware → KVM/XCP-ng** | Disk conversion VMDK→raw/QCOW2, driver swap | virt-v2v, StarWind | Medium | VirtIO drivers, UEFI support (OVMF), host passthrough compatibility |
|
||||
| **VMware → Nutanix AHV** | Automated migration via Move appliance | Nutanix Move, Veeam | Low | AHV is KVM-based; minimal issues, retains IP/MAC, supports UEFI |
|
||||
| **VMware → Sangfor aSV** | Import via VMware Import Tool, disk + driver conversion | Sangfor VMware Import Tool | Low | Built-in tool, retains network config, supports UEFI |
|
||||
| **VMware → OpenStack** | In-place or swing | Platform9 vJailbreak, virt-v2v + Glance | High | Network redesign (Neutron), storage (Cinder), image format (Glance) required |
|
||||
| **Hyper-V → VMware** | Disk conversion VHDX→VMDK, driver reinstall | StarWind, virt-v2v, VMware vCenter Converter (standalone) | Medium | VMware Tools required, network driver change (VMXNET3), UEFI/secure boot issues |
|
||||
| **Hyper-V → Proxmox** | Disk conversion VHDX→QCOW2, driver swap | StarWind, virt-v2v, qemu-img | Medium–High | VirtIO drivers, integration services → guest agent, secure boot issues |
|
||||
| **Hyper-V → KVM/XCP-ng** | Disk conversion VHDX→raw/QCOW2 | virt-v2v, qemu-img | Medium | VirtIO drivers, Linux generic drivers usually work |
|
||||
| **Hyper-V → Nutanix AHV** | Automated migration | Nutanix Move | Low–Medium | Similar to VMware→Nutanix, UEFI support, retain IP |
|
||||
| **Proxmox → VMware** | Export OVF/OVA, qemu-img convert | qemu-img (QCOW2→VMDK), ovftool, manual OVF export | High | VMware Tools required, storage format differences, no live migration, downtime required |
|
||||
| **Proxmox → Hyper-V** | qemu-img convert, driver reinstall | qemu-img, manual VHDX conversion | High | Hyper-V Integration Services required, no automated tool, edge case |
|
||||
| **Proxmox → KVM/XCP-ng** | Direct QCOW2 (same format), XML edit | libvirt, virsh dumpxml/define | Medium | libvirt XML/QEMU args differences (storage pool, network), validation required |
|
||||
| **Proxmox → Nutanix AHV** | qemu-img + manual import | qemu-img, Nutanix Image Service CLI | High | No automated tool; conversion and manual VM reconfiguration required |
|
||||
| **XCP-ng → VMware** | Disk conversion VHD→VMDK | qemu-img, StarWind, virt-v2v | High | VMware Tools required, paravirtualization differences (Xen PV vs VMware) |
|
||||
| **XCP-ng → Proxmox** | Disk conversion or direct VHD | qemu-img, manual import | Medium | Disk conversion, VHD format not native in Proxmox |
|
||||
| **XCP-ng → Hyper-V** | Disk conversion VHD→VHDX (direct) | StarWind, qemu-img | Medium | VHD/VHDX compatible, Integration Services required |
|
||||
| **Nutanix AHV → VMware** | Export + conversion | qemu-img, Nutanix Export, VMware vCenter Converter | High | VMware Tools, AHV is KVM → usually easier than Hyper-V→VMware |
|
||||
| **Nutanix AHV → Proxmox** | qemu-img + manual import | qemu-img, Nutanix self-service restore | Medium | AFS disks → QCOW2, metadata must be reconstructed |
|
||||
| **Nutanix AHV → Hyper-V** | qemu-img + manual | qemu-img, StarWind | High | Edge case, no automated tool |
|
||||
| **OpenStack → (any)** | Glance export + qemu-img | glance image-download, qemu-img, ovftool | Medium–High | Image format (raw/QCOW2), metadata (flavor, security groups) must be recreated |
|
||||
| **Sangfor aSV → (any)** | qemu-img conversion + manual | qemu-img, manual OVF/OVA export | Medium–High | KVM-based → conversion to QCOW2/VMDK/VHDX via qemu-img, metadata must be recreated |
|
||||
| **(any) → Sangfor aSV** | aSV API import + VMware Import Tool | Sangfor VMware Import Tool (for VMware), manual qemu-img import for others | Medium | KVM-based → standard formats supported, import tool for VMware only |
|
||||
|
||||
**Keys to a successful migration:**
|
||||
|
||||
- **Drivers** — each platform requires its own paravirtual drivers (VMware Tools, VirtIO, Hyper-V Integration Services, Xen Tools). Always swap after migration.
|
||||
- **UEFI / Secure Boot** — not all combinations support UEFI (Proxmox Import Wizard < 8.1 does not). Test UEFI VMs before migration.
|
||||
- **Snapshots** — snapshots must be removed (merged) before migration. Most tools only migrate flat disks.
|
||||
- **Network** — MAC addresses, IP addresses, VLAN tagging — verify after migration. Some tools (Nutanix Move, VMware Converter) can retain MAC.
|
||||
- **Storage format** — VMDK ↔ VHDX ↔ QCOW2 ↔ raw are inter-convertible via `qemu-img`, but metadata differs (snapshots, backing files).
|
||||
- **Live migration** — no live migration exists between different hypervisors. Downtime is always required (minutes to hours depending on VM size).
|
||||
- **Migration temperature** — the "colder" the VM (fewer changes), the easier the migration. Real-time database applications require a separate DB migration plan.
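
Most disk-format conversions in the matrix above come down to a single `qemu-img convert` call. The following is a minimal sketch in Python that only builds the command line and does not run it. The helper name `qemu_img_convert_cmd` and the file names are hypothetical; the format names (`vmdk`, `vhdx`, `vpc` for legacy VHD, `qcow2`, `raw`) are the ones `qemu-img` itself uses.

```python
# Map the disk formats named in the migration matrix to qemu-img format names.
QEMU_FORMATS = {
    "VMDK": "vmdk",
    "VHDX": "vhdx",
    "VHD": "vpc",      # qemu-img calls the legacy VHD format "vpc"
    "QCOW2": "qcow2",
    "raw": "raw",
}


def qemu_img_convert_cmd(src_path, src_format, dst_path, dst_format):
    """Return the argv list for a qemu-img disk conversion (hypothetical helper)."""
    try:
        src = QEMU_FORMATS[src_format]
        dst = QEMU_FORMATS[dst_format]
    except KeyError as exc:
        raise ValueError(f"unsupported disk format: {exc.args[0]}") from exc
    # -p shows progress, -f is the source format, -O is the output format.
    return ["qemu-img", "convert", "-p", "-f", src, "-O", dst, src_path, dst_path]


if __name__ == "__main__":
    # Example: VMware -> Proxmox (VMDK -> QCOW2); the file names are placeholders.
    cmd = qemu_img_convert_cmd("vm-disk.vmdk", "VMDK", "vm-disk.qcow2", "QCOW2")
    print(" ".join(cmd))
```

Building the argument list separately from running it makes it easy to review or log the exact command before any disk is touched. Snapshots still have to be merged and guest drivers swapped as described in the list above.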
|
||||
|
||||
| Platform | Year 1 | 3 Years Total | Note |
|
||||
|-----------|--------|---------------|----------|
|
||||
@@ -123,6 +164,7 @@ According to Foundry/CIO.com survey (2025): **56%** of organizations plan to red
|
||||
| **Nutanix AHV** (average) | ~$18,000 | ~$54,000 | Per node subscription, estimate |
|
||||
| **Hyper-V** (Windows Server Datacenter) | $12,400 | $37,200 | One-time license per core, without SA |
|
||||
| **Hyper-V** (Azure Stack HCI) | ~$14,400 | ~$43,200 | ~$10/core/month, 120 cores |
|
||||
| **Sangfor HCI** (Enterprise Pro) | ~$5,000–8,000 | ~$15,000–25,000 | Per node, all-inclusive, 3 nodes |
|
||||
|
||||
**Real-world example from Spiceworks (2026)**: A user reports VMware Essentials+ increasing from $1,900/year to $14,000/year (VVF) — a 7.4× increase.
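
The arithmetic behind subscription rows and the price-increase factor is simple enough to sketch. This is an illustration only: the function names are hypothetical, and the inputs are the Nutanix AHV annual estimate from the table and the two prices quoted in the Spiceworks example.

```python
def three_year_total(annual_cost):
    """Total cost of a flat annual subscription over three years."""
    return 3 * annual_cost


def price_increase_factor(old_annual, new_annual):
    """How many times more expensive the new annual price is than the old one."""
    return new_annual / old_annual


if __name__ == "__main__":
    # Nutanix AHV row: roughly $18,000 per year.
    print(three_year_total(18_000))
    # Spiceworks example: $1,900/year rising to $14,000/year.
    print(round(price_increase_factor(1_900, 14_000), 1))
```

With these inputs the three-year total comes to $54,000 and the increase factor rounds to 7.4, matching the figures quoted above.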
|
||||
|
||||
@@ -142,8 +184,9 @@ According to Foundry/CIO.com survey (2025): **56%** of organizations plan to red
|
||||
3. Select target platform (1-2 candidates)
|
||||
├─ Proxmox: lowest TCO, Linux-heavy shops
|
||||
├─ Nutanix: enterprise HCI, low migration difficulty
|
||||
├─ Hyper-V: Windows-centric, Azure hybrid
|
||||
└─ OpenShift: Kubernetes-first, platform engineering
|
||||
├─ Hyper-V: Windows-centric, Azure hybrid
|
||||
├─ Sangfor: HCI all-in-one, security-first, VMware exit (SMB/mid-market)
|
||||
└─ OpenShift: Kubernetes-first, platform engineering
|
||||
|
||||
4. Plan migration phases
|
||||
├─ Wave 1: non-critical (dev/test, 1-2 months)
|
||||
@@ -269,9 +312,71 @@ Hardware ──> QEMU (I/O emulation) + KVM (kernel module, virtualization)
|
||||
- Load KVM modules: `kvm`, `kvm_intel`/`kvm_amd`, `vfio-pci`
|
||||
- Optimize storage: raw/LVM (avoid qcow2 for performance workloads)
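
The module prerequisite above can be verified by parsing the contents of `/proc/modules`. The following is a minimal sketch, assuming an Intel host (use `kvm_amd` instead of `kvm_intel` on AMD). The helper names `loaded_modules` and `missing_modules` are hypothetical, and the sample text stands in for the real file.

```python
REQUIRED_MODULES = ("kvm", "kvm_intel", "vfio-pci")  # assumption: Intel host


def loaded_modules(proc_modules_text):
    """Parse the text of /proc/modules into a set of module names."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def missing_modules(proc_modules_text, required=REQUIRED_MODULES):
    """Return the required modules that are not loaded, in the given order."""
    loaded = loaded_modules(proc_modules_text)
    # The kernel lists modules with underscores, so "vfio-pci" appears as "vfio_pci".
    return [name for name in required if name.replace("-", "_") not in loaded]


if __name__ == "__main__":
    # Sample content in the /proc/modules layout: name, size, refcount, users, state, address.
    sample = (
        "kvm_intel 380928 0 - Live 0x0000000000000000\n"
        "kvm 1146880 1 kvm_intel, Live 0x0000000000000000\n"
    )
    print(missing_modules(sample))
```

On a real host the text would come from reading `/proc/modules`. Anything the function reports as missing can then be loaded with `modprobe`.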
|
||||
|
||||
## Sangfor aSV (HCI)
|
||||
|
||||
Sangfor is a [Chinese vendor](https://www.sangfor.com). aSV is its KVM-based hypervisor and part of the Sangfor HCI stack (aSV + aSAN + aNet + aSEC). It is distributed through partners in EMEA.
|
||||
|
||||
### Stack architecture
|
||||
|
||||
| Component | Role |
|
||||
|-----------|------|
|
||||
| **aSV** | Hypervisor (KVM-based) |
|
||||
| **aSAN** | Distributed SDS (locality-aware, data tiering, dedup, compression) |
|
||||
| **aNet** | Network virtualization (distributed switches and routers, WYDIWYG "what you draw is what you get" visual editor) |
|
||||
| **aSEC** | Security (NGFW, IPS, WAF, EDR, east-west segmentation) |
|
||||
| **Sangfor Cloud Platform** | Management orchestrator, unified dashboard |
|
||||
|
||||
### Key features
|
||||
|
||||
| Feature | Detail |
|
||||
|-----------|--------|
|
||||
| **Hypervisor** | KVM (aSV) — custom fork with HCI extensions |
|
||||
| **License** | Enterprise Pro — per node, all-inclusive (compute + storage + network + security) |
|
||||
| **Min. cluster** | 3 nodes (3 data copies) |
|
||||
| **Live Migration** | Yes |
|
||||
| **HA** | Built-in HA |
|
||||
| **Storage** | aSAN — locality-aware, data tiering (SSD + HDD), dedup, compression, erasure coding |
|
||||
| **Backup** | Built-in backup + CDP — no 3rd party needed |
|
||||
| **Security** | Integrated NGFW, IPS, WAF, EDR — no external appliances |
|
||||
| **VDI** | aDesk — integrated VDI solution |
|
||||
| **Kubernetes** | SKE (Sangfor Kubernetes Engine) |
|
||||
| **Migration** | Sangfor VMware Import Tool (from vCenter), qemu-img for others |
|
||||
| **vGPU** | Standard support (no extra license) |
|
||||
|
||||
### Comparison with VMware
|
||||
|
||||
| Feature | Sangfor | VMware |
|
||||
|---------|---------|--------|
|
||||
| **License** | Per node, all-inclusive | Multi-tier (vSphere + vSAN + NSX + Aria) |
|
||||
| **vGPU** | Included (standard) | Enterprise Plus only |
|
||||
| **Backup + CDP** | Built-in | 3rd party or extra license |
|
||||
| **Security (NGFW, IPS, WAF)** | Built-in (aSEC) | NSX + 3rd party |
|
||||
| **Network management** | WYDIWYG visual editor | NSX Manager (more complex) |
|
||||
| **Min. cluster (3 copies)** | 3 nodes | 5 nodes (vSAN) |
|
||||
| **Data locality** | Yes | No |
|
||||
| **SSD life prediction** | Yes | No |
|
||||
|
||||
### Use case
|
||||
|
||||
- **VMware exit** — VMware replacement for SMB and mid-market
|
||||
- **Greenfield HCI** — new DCs, branch offices, remote sites
|
||||
- **VDI** — aDesk integrated with HCI
|
||||
- **Security-first** — organizations requiring integrated security
|
||||
- **Asia-Pacific / EMEA** — strongest in Asia, expanding to Europe
|
||||
|
||||
### Risks and limitations
|
||||
|
||||
| Risk | Detail |
|
||||
|--------|--------|
|
||||
| **Geopolitical** | Chinese vendor — possible regulatory restrictions (GDPR, EU, NATO, government) |
|
||||
| **Ecosystem** | Smaller community than VMware/Proxmox, less documentation and ISV certifications |
|
||||
| **Support** | Primary support from Asia, local partner critical |
|
||||
| **Vendor lock-in** | Closed ecosystem (aSV + aSAN + aNet + aSEC), harder to mix with 3rd party |
|
||||
| **References in CZ/EU** | Very limited — pilot required before production |
|
||||
|
||||
## Storage in Hypervisors
|
||||
|
||||
See also: [STORAGE.md](STORAGE.md) — detailed overview of storage protocols and configurations.
|
||||
See also: [STORAGE.en.md](STORAGE.en.md) — detailed overview of storage protocols and configurations.
|
||||
|
||||
| Type | Description | Protocols |
|
||||
|-----|-------|-----------|
|
||||
@@ -443,7 +548,7 @@ For telco, large private clouds, MANO/NFVI environments.
|
||||
|
||||
## Resources
|
||||
|
||||
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
### Recommended Reading
|
||||
|
||||
|
||||
143
HYPERVISORS.md
@@ -86,20 +86,22 @@ Dle Foundry/CIO.com průzkumu (2025): **56 %** organizací plánuje snížit vyu
|
||||
|
||||
#### Target Platforms — Comparison
|
||||
|
||||
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization |
|
||||
|-----------|-----------|-------------|-------------------|----------------------------------|
|
||||
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) |
|
||||
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30–60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) |
|
||||
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) |
|
||||
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) |
|
||||
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO |
|
||||
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP |
|
||||
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000–60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) |
|
||||
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000–200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) |
|
||||
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) |
|
||||
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) |
|
||||
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) |
|
||||
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator |
|
||||
| Criterion | Proxmox VE | Nutanix AHV | Microsoft Hyper-V | Red Hat OpenShift Virtualization | **Sangfor aSV (HCI)** |
|
||||
|-----------|-----------|-------------|-------------------|----------------------------------|----------------------|
|
||||
| **Hypervisor** | KVM + LXC | KVM (fork) | Hyper-V | KVM (KubeVirt) | **KVM (aSV)** |
|
||||
| **License** | Open source (free), support ~€500/host/year | Per node subscription (30–60% savings vs VCF) | Windows Server license (Standard/Datacenter) | OpenShift subscription (core-based) | **Per node (Enterprise Pro), all-inclusive** |
|
||||
| **Live Migration** | Live Migration (Proxmox 8+) | AHV Live Migration | Live Migration (SMB/RDMA) | KubeVirt (VMI live migration) | **Yes** |
|
||||
| **HA** | Proxmox HA (watchdog, fencing) | Built-in HA (Prism) | Hyper-V HA (WS Failover Cluster) | OpenShift HA (self-healing) | **Built-in HA** |
|
||||
| **Storage** | ZFS, Ceph, LVM | AOS (hybrid/SSD, erasure coding) | S2D, CSV, ReFS | OCS, Ceph, LSO | **aSAN (distributed SDS, locality-aware)** |
|
||||
| **Backup** | Proxmox Backup Server (free) | Native snapshot + DR | Windows Server Backup / Veeam | OpenShift APIs + OADP | **Built-in backup + CDP (Continuous Data Protection)** |
|
||||
| **Price (3 years, 3 hosts)** | $0 + support $1,500 | ~$45,000–60,000 | $0 (Hyper-V Server free) or Windows Server license | ~$90,000+ (OpenShift) | **~$15,000–25,000** |
|
||||
| **Price (3 years, 10 hosts)** | $0 + support $5,000 | ~$150,000–200,000 | Windows Server Datacenter for unlimited VMs | ~$300,000+ (OpenShift) | **~$50,000–80,000** |
|
||||
| **Migration difficulty** | Medium (VMDK → QCOW2, VirtIO drivers) | Low (Nutanix Move tool) | Medium (V2V converter, SCVMM) | High (Kubernetes learning curve) | **Low (VMware import tool)** |
|
||||
| **Linux support** | Excellent (native KVM) | Excellent (KVM-based) | Good (LIS drivers) | Excellent (KVM + OpenShift) | **Excellent (KVM-based)** |
|
||||
| **Windows support** | Good (VirtIO drivers) | Excellent (ALAS drivers, svpd) | Excellent (native) | Good (KubeVirt + VirtIO) | **Good (VirtIO drivers)** |
|
||||
| **GPU passthrough** | VFIO (excellent) | GPU passthrough | DDA (Direct Device Assignment) | VFIO + GPU Operator | **vGPU support (standard)** |
|
||||
| **Integrated security** | — | — | — | — | **Yes (NGFW, IPS, WAF, EDR — aSEC)** |
|
||||
| **Min. cluster (3 copies)** | 3 (Ceph) | 3 | 2–3 | 3 | **3** |
|
||||
|
||||
#### Migration Tools
|
||||
|
||||
@@ -112,6 +114,47 @@ Dle Foundry/CIO.com průzkumu (2025): **56 %** organizací plánuje snížit vyu
|
||||
| **virt-v2v** | VMware ESXi, Xen, Hyper-V | KVM (libvirt) | Open source CLI tool, disk + driver conversion (virtio), suitable for bulk migration |
|
||||
| **Windows Admin Center VM Conversion Extension** | VMware ESXi | Hyper-V | Microsoft WAC extension, free, GUI-based, bulk migration |
|
||||
| **Platform9 vJailbreak** | VMware ESXi | OpenStack / KVM | In-place migration (no swing gear), open source |
|
||||
| **Sangfor VMware Import Tool** | VMware ESXi | Sangfor aSV (HCI) | Imports VMs from vCenter, converts disks and drivers, can retain network config |
|
||||
|
||||
#### Cross-Hypervisor Migration Matrix
|
||||
|
||||
Comprehensive overview of all source → target pairs with methods, tools, limitations, and complexity.
|
||||
|
||||
| Source → Target | Method | Tools | Complexity | Limitations |
|
||||
|-------------|--------|----------|-----------|---------|
|
||||
| **VMware → Proxmox** | Disk conversion VMDK→QCOW2, driver reinstall | Proxmox Import Wizard, Veeam, StarWind, virt-v2v | Medium | VirtIO drivers required, UEFI not supported in Import Wizard (< 8.1), snapshots must be removed |
|
||||
| **VMware → Hyper-V** | Disk conversion VMDK→VHDX, driver reinstall | StarWind, WAC Converter, SCVMM, Microsoft MTC | Medium | Integration Services required, network config differences (VMXNET3 → Hyper-V Synthetic) |
|
||||
| **VMware → KVM/XCP-ng** | Disk conversion VMDK→raw/QCOW2, driver swap | virt-v2v, StarWind | Medium | VirtIO drivers, UEFI support (OVMF), host passthrough must be compatible |
|
||||
| **VMware → Nutanix AHV** | Automated migration via Move appliance | Nutanix Move, Veeam | Low | AHV is KVM-based; minimal issues, retains IP/MAC, supports UEFI |
|
||||
| **VMware → Sangfor aSV** | Import via VMware Import Tool, disk + driver conversion | Sangfor VMware Import Tool | Low | Built-in tool, retains network config, supports UEFI |
|
||||
| **VMware → OpenStack** | In-place or swing | Platform9 vJailbreak, virt-v2v + Glance | High | Network redesign (Neutron), storage (Cinder), image format (Glance) required |
|
||||
| **Hyper-V → VMware** | Disk conversion VHDX→VMDK, driver reinstall | StarWind, virt-v2v, VMware vCenter Converter (standalone) | Medium | VMware Tools required, network driver change (VMXNET3), UEFI/secure boot issues |
|
||||
| **Hyper-V → Proxmox** | Disk conversion VHDX→QCOW2, driver swap | StarWind, virt-v2v, qemu-img | Medium–High | VirtIO drivers, integration services → guest agent, secure boot issues |
|
||||
| **Hyper-V → KVM/XCP-ng** | Disk conversion VHDX→raw/QCOW2 | virt-v2v, qemu-img | Medium | VirtIO drivers, Linux generic drivers usually work |
|
||||
| **Hyper-V → Nutanix AHV** | Automated migration | Nutanix Move | Low–Medium | Similar to VMware→Nutanix, UEFI support, retain IP |
|
||||
| **Proxmox → VMware** | Export OVF/OVA, qemu-img convert | qemu-img (QCOW2→VMDK), ovftool, manual OVF export | High | VMware Tools required, storage format differences, no live migration, downtime required |
|
||||
| **Proxmox → Hyper-V** | qemu-img convert, driver reinstall | qemu-img, manual VHDX conversion | High | Hyper-V Integration Services required, no automated tool, edge case |
|
||||
| **Proxmox → KVM/XCP-ng** | Direct QCOW2 (same format), XML edit | libvirt, virsh dumpxml/define | Medium | libvirt XML/QEMU args differences (storage pool, network), validation required |
|
||||
| **Proxmox → Nutanix AHV** | qemu-img + manuální import | qemu-img, Nutanix Image Service CLI | Vysoká | Žádný hot nástroj, nutná konverze + manuální rekonfigurace VM |
|
||||
| **XCP-ng → VMware** | Disk konverze VHD→VMDK | qemu-img, StarWind, virt-v2v | Vysoká | VMware Tools nutné, rozdíly v paravirtualizaci (Xen PV vs VMware) |
|
||||
| **XCP-ng → Proxmox** | Disk konverze nebo direct VHD | qemu-img, manuální import | Střední | Konverze disků, formát VHD není nativní v Proxmox |
|
||||
| **XCP-ng → Hyper-V** | Disk konverze VHD→VHDX (přímá) | StarWind, qemu-img | Střední | VHD/VHDX kompatibilní, nutné Integration Services |
|
||||
| **Nutanix AHV → VMware** | Export + konverze | qemu-img, Nutanix Export, VMware vCenter Converter | Vysoká | VMware Tools, AHV je KVM → obvykle jednodušší než Hyper-V→VMware |
|
||||
| **Nutanix AHV → Proxmox** | qemu-img + manuální import | qemu-img, Nutanix self-service restore | Střední | Disky z AFS → QCOW2, metadata nutno rekonstruovat |
|
||||
| **Nutanix AHV → Hyper-V** | qemu-img + manuální | qemu-img, StarWind | Vysoká | Edge case, žádný hot nástroj |
|
||||
| **OpenStack → (any)** | Glance export + qemu-img | glance image-download, qemu-img, ovftool | Střední–Vysoká | Image formát (raw/QCOW2), metadata (flavor, security groups) nutno znovu vytvořit |
|
||||
| **Sangfor aSV → (any)** | qemu-img konverze + manuální | qemu-img, manuální OVF/OVA export | Střední–Vysoká | KVM-based → konverze do QCOW2/VMDK/VHDX přes qemu-img, metadata nutno znovu vytvořit |
|
||||
| **(any) → Sangfor aSV** | aSV API import + VMware Import Tool | Sangfor VMware Import Tool (pro VMware), manuální qemu-img import pro ostatní | Střední | KVM-based → podpora standardních formátů, import tool jen pro VMware |
|
||||
|
||||
**Keys to a successful migration:**

- **Drivers**: every platform requires its own paravirtual drivers (VMware Tools, VirtIO, Hyper-V Integration Services, Xen Tools). Always replace them after migration.
- **UEFI / Secure Boot**: not every combination supports UEFI (the Proxmox Import Wizard before 8.1 does not). Test UEFI VMs before migrating them.
- **Snapshots**: snapshots must be removed (consolidated) before migration. Most tools migrate only flat disks.
- **Network**: check MAC addresses, IP addresses, and VLAN tagging after migration. Some tools (Nutanix Move, VMware Converter) can retain the MAC address.
- **Storage format**: VMDK ↔ VHDX ↔ QCOW2 ↔ raw are mutually convertible via `qemu-img`, but they differ in metadata (snapshots, backing files).
- **Live migration**: there is no live migration between different hypervisors. Downtime is always required (minutes to hours, depending on VM size).
- **Migration temperature**: the "colder" the VM (fewer changes), the easier the migration. Applications with real-time databases need a separate DB migration plan.

#### TCO comparison (example: 3 hosts with 2× 20C CPUs each, 50 VMs)

@@ -123,6 +166,7 @@ Dle Foundry/CIO.com průzkumu (2025): **56 %** organizací plánuje snížit vyu
| **Nutanix AHV** (average) | ~$18,000 | ~$54,000 | Per-node subscription, estimate |
| **Hyper-V** (Windows Server Datacenter) | $12,400 | $37,200 | One-time per-core license, without SA |
| **Hyper-V** (Azure Stack HCI) | ~$7,200 | ~$21,600 | ~$10/core/month, 120 cores |
| **Sangfor HCI** (Enterprise Pro) | ~$5,000–8,000 | ~$15,000–25,000 | Per node, all-inclusive, 3 nodes |

**Real-world example from Spiceworks (2026)**: a user reports VMware Essentials+ rising from $1,900/year to $14,000/year (VVF), a 7.4× increase.

@@ -142,8 +186,9 @@ Dle Foundry/CIO.com průzkumu (2025): **56 %** organizací plánuje snížit vyu
3. Choose the target platform (1-2 candidates)
   ├─ Proxmox: lowest TCO, Linux-heavy shops
   ├─ Nutanix: enterprise HCI, low migration effort
   ├─ Hyper-V: Windows-centric, Azure hybrid
   ├─ Sangfor: HCI all-in-one, security-first, VMware exit (SMB/mid-market)
   └─ OpenShift: Kubernetes-first, platform engineering

4. Plan the migration phases
   ├─ Wave 1: non-critical (dev/test, 1-2 months)
@@ -269,6 +314,72 @@ Hardware ──> QEMU (emulace I/O) + KVM (kernel module, virtualization)
- Load the KVM modules: `kvm`, `kvm_intel`/`kvm_amd`, `vfio-pci`
- Optimize storage: raw/LVM (avoid qcow2 for performance-sensitive workloads)

## Sangfor aSV (HCI)

[Chinese vendor](https://www.sangfor.com): a KVM-based hypervisor that is part of the Sangfor HCI stack (aSV + aSAN + aNet + aSEC). Distributed in the Czech Republic through partners.

### Stack architecture

| Component | Role |
|-----------|------|
| **aSV** | Hypervisor (KVM-based) |
| **aSAN** | Distributed SDS (locality-aware, data tiering, dedup, compression) |
| **aNet** | Network virtualization (distributed switches and routers, WYDIWYG "what you draw is what you get" visual editor) |
| **aSEC** | Security (NGFW, IPS, WAF, EDR, east-west segmentation) |
| **Sangfor Cloud Platform** | Management orchestrator, unified dashboard |

### Key features

| Feature | Detail |
|---------|--------|
| **Hypervisor** | KVM (aSV): custom fork with HCI extensions |
| **Licensing** | Enterprise Pro: per node, all-inclusive (compute + storage + network + security) |
| **Min. cluster** | 3 nodes (3 data copies) |
| **Live Migration** | Yes |
| **HA** | Built-in HA |
| **Storage** | aSAN: locality-aware (data locality), data tiering (SSD + HDD), dedup, compression, erasure coding |
| **Backup** | Built-in backup + CDP (Continuous Data Protection), no third-party product required |
| **Security** | Integrated NGFW, IPS, WAF, EDR, no external appliances |
| **VDI** | aDesk: integrated VDI solution |
| **Kubernetes** | SKE (Sangfor Kubernetes Engine) |
| **Migration** | Sangfor VMware Import Tool (from vCenter), qemu-img for other sources |
| **vGPU** | Supported as standard (no extra license) |

### Comparison with VMware

| Feature | Sangfor | VMware |
|---------|---------|--------|
| **Licensing** | Per node, all-inclusive | Multi-tier (vSphere + vSAN + NSX + Aria) |
| **vGPU** | Included (standard) | Enterprise Plus only |
| **Backup + CDP** | Built-in | Third party or extra license |
| **Security (NGFW, IPS, WAF)** | Built-in (aSEC) | NSX + third party (Palo Alto, Check Point) |
| **Network management** | WYDIWYG visual editor | NSX Manager (more complex) |
| **Min. cluster (3 data copies)** | 3 nodes | 5 nodes (vSAN) |
| **Data locality** | Yes | No |
| **SSD life prediction** | Yes | No |

### Use cases

- **VMware exit**: a VMware replacement for SMB and mid-market
- **Greenfield HCI**: new data centers, branch offices, remote sites
- **VDI**: aDesk integrated with HCI
- **Security-first**: organizations that require integrated security (NGFW, IPS, WAF)
- **Asia-Pacific / EMEA**: strongest in Asia, expanding into Europe

### Risks and limitations

| Risk | Detail |
|------|--------|
| **Geopolitical** | Chinese vendor: possible regulatory restrictions (GDPR, EU, NATO, government) |
| **Ecosystem** | Smaller community than VMware/Proxmox, less documentation and fewer ISV certifications |
| **Support** | Support is primarily delivered from Asia; a local partner is critical |
| **Vendor lock-in** | Closed ecosystem (aSV + aSAN + aNet + aSEC), harder to mix with third-party products |
| **References in the Czech Republic** | Very limited; run a pilot before production |

### Migrating to/from Sangfor

See the migration matrix earlier in this section. A dedicated import tool exists for VMware → Sangfor; for other hypervisors, use a standard `qemu-img` conversion.

## Storage in hypervisors

See also: [STORAGE.md](STORAGE.md) for a detailed overview of storage protocols and configurations.

@@ -4,9 +4,9 @@ This file has been split into separate areas:

| Area | File |
|--------|--------|
| 🖥️ Hypervisors and virtualization | [HYPERVISORS.en.md](HYPERVISORS.en.md) |
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) |
| 💾 Storage | [STORAGE.en.md](STORAGE.en.md) |
| 🔧 Hardware and servers | [HARDWARE.en.md](HARDWARE.en.md) |

*Last revision: 2026-06-03*

299
KUBERNETES.en.md
Normal file
@@ -0,0 +1,299 @@
# ☸ Kubernetes: architecture, platforms, Cluster API

## Overview

Kubernetes (K8s) is an open-source container orchestrator and the de facto standard for deploying, scaling, and managing containerized applications. It is built on declarative configuration and control loops (reconciliation).

## Kubernetes deployment methods

| Method | Description | Control plane | Best for |
|--------|-------------|---------------|----------|
| **kubeadm** | Official K8s cluster bootstrap tool | Self-managed (stacked/external etcd) | On-prem, lab, learning |
| **K3s** | Lightweight K8s (Rancher), single binary, embedded etcd/SQLite | Self-managed | Edge, IoT, low-resource, HA with embedded etcd |
| **RKE2** | Rancher Kubernetes Engine 2, CIS-hardened, FIPS-ready | Self-managed | Enterprise on-prem, air-gapped, regulated environments |
| **OpenShift** | Red Hat enterprise K8s + operator lifecycle + SDN + routing | Self-managed (RHCOS) | Enterprise, multicluster, platform engineering |
| **Vanilla K8s (CAPI)** | Cluster API: declarative provisioning and lifecycle management | Self-managed (CAPI-managed) | Fleet management, GitOps, multi-provider |
| **EKS** (AWS) | Managed K8s | AWS-managed | AWS cloud-native, least ops |
| **AKS** (Azure) | Managed K8s | Azure-managed | Azure cloud-native |
| **GKE** (GCP) | Managed K8s, Standard and Autopilot modes | GCP-managed | GCP cloud-native |
| **SKE** (Sangfor) | Managed K8s on Sangfor HCI | Vendor-managed | Sangfor HCI ecosystem |

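
To illustrate the "HA with embedded etcd" option in the K3s row, the first server node can be started from a config file. This is only a sketch: `/etc/rancher/k3s/config.yaml` is the default K3s config path, while the token value and the hostname `k3s.example.internal` are placeholder assumptions.

```yaml
# /etc/rancher/k3s/config.yaml on the first server node (sketch)
cluster-init: true                        # start a new cluster with embedded etcd
token: "replace-with-a-shared-secret"     # placeholder; every server must use the same token
tls-san:
  - k3s.example.internal                  # hypothetical load-balancer name for the API server
# Additional servers would drop `cluster-init` and instead set, for example:
# server: https://k3s.example.internal:6443
```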
---

## Cluster API (CAPI)

### What is Cluster API

Cluster API is a Kubernetes sub-project (SIG Cluster Lifecycle) that provides declarative APIs for provisioning, upgrading, and operating Kubernetes clusters. Instead of Terraform scripts or manual `kubeadm` runs, you define clusters as Kubernetes custom resources: `Cluster`, `Machine`, `MachineDeployment`, and so on.

Core principle: **A Kubernetes cluster that manages Kubernetes clusters.**

### Architecture

```
┌─────────────────────────────────────────┐
│           Management Cluster            │
│                                         │
│  ┌──────────────────────────────────┐   │
│  │         CAPI Controllers         │   │
│  │ ┌──────┐ ┌─────────┐ ┌─────────┐ │   │
│  │ │Infra │ │Bootstrap│ │ Control │ │   │
│  │ │Prov  │ │  Prov   │ │Plane Pr │ │   │
│  │ └──────┘ └─────────┘ └─────────┘ │   │
│  └──────────────────────────────────┘   │
│                                         │
│  CR: Cluster, Machine, MachineDeployment│
│  ...                                    │
└────────────────┬────────────────────────┘
                 │ CAPI controller
                 │ creates / manages
        ┌────────┴────────┐
        ▼                 ▼
┌───────────────┐ ┌───────────────┐
│   Workload    │ │   Workload    │
│ Cluster (dev) │ │ Cluster (prod)│
│ ┌───┐   ┌───┐ │ │ ┌───┐   ┌───┐ │
│ │CP │   │ W │ │ │ │CP │   │ W │ │
│ └───┘   └───┘ │ │ └───┘   └───┘ │
└───────────────┘ └───────────────┘
```

- **Management cluster**: a Kubernetes cluster running the CAPI controllers. It can be a small, dedicated admin cluster.
- **Workload (managed) cluster**: a Kubernetes cluster managed by CAPI; each one is represented by custom resources in the management cluster.
- **Machine**: an abstraction of a compute unit (VM, bare metal) that becomes a K8s node.

### Key CRDs (Custom Resource Definitions)

| CRD | API group | Purpose |
|-----|-----------|---------|
| **Cluster** | `cluster.x-k8s.io` | Cluster representation (infra ref, control plane ref, networking) |
| **Machine** | `cluster.x-k8s.io` | Individual node (VM/BM instance) |
| **MachineDeployment** | `cluster.x-k8s.io` | Declarative scaling and rolling update of workers |
| **MachineSet** | `cluster.x-k8s.io` | Replica set for Machines (lower-level) |
| **MachineHealthCheck** | `cluster.x-k8s.io` | Auto-remediation (replaces unhealthy nodes) |
| **ClusterClass** | `cluster.x-k8s.io` | Reusable cluster template |
| **KubeadmControlPlane** | `controlplane.cluster.x-k8s.io` | Kubeadm-managed control plane (stacked/external etcd) |
| **KubeadmConfig / KubeadmConfigTemplate** | `bootstrap.cluster.x-k8s.io` | Bootstrap configuration (kubeadm init/join) |

### Provider model

CAPI uses a three-layer provider model:

#### 1. Infrastructure Provider
Creates and manages infrastructure (VMs, networks, load balancers, storage).

| Provider | Platform | Status |
|----------|----------|--------|
| **AWS (CAPA)** | AWS EC2, VPC, ELB, EKS | Stable, SIG-sponsored |
| **Azure (CAPZ)** | Azure VM, VNet, LB, AKS | Stable, SIG-sponsored |
| **GCP (CAPG)** | GCP Compute, VPC, GKE | Beta |
| **vSphere (CAPV)** | VMware vSphere | Stable |
| **OpenStack (CAPO)** | OpenStack compute/network | Stable |
| **Metal3** | Bare metal (Ironic) | Stable |
| **Docker (CAPD)** | Docker containers (development) | Development/testing only |
| **Akamai (Linode)** | Linode | Community |
| **Azure Stack HCI** | Azure Stack HCI | Community |
| **cloudscale** | cloudscale.ch | Community |
| **Exoscale** | Exoscale | Community |
| **IBM Cloud** | IBM Cloud | Community |
| **Equinix Metal** | Equinix (formerly Packet) | Community |
| **Hetzner** | Hetzner Cloud | Community |
| **OpenNebula** | OpenNebula | Community |

#### 2. Bootstrap Provider
Handles K8s initialization on a node (kubeadm init/join, TLS certs, tokens).

| Provider | Description |
|----------|-------------|
| **Kubeadm** (built-in) | Standard kubeadm init/join, supports stacked/external etcd |
| **EKS** | Bootstrap for EKS managed control plane (AWS) |
| **K3s** | Lightweight K8s bootstrap (edge, IoT) |
| **RKE2** | Rancher K8s bootstrap, CIS-hardened |
| **Talos** | API-driven bootstrap (Sidero Labs), immutable OS |
| **k0smotron** | K0s-based bootstrap + hosted control plane |
| **MicroK8s** | Canonical MicroK8s bootstrap |
| **Canonical Kubernetes** | Canonical K8s (snap-based) |

#### 3. Control Plane Provider
Manages control plane nodes.

| Provider | Description |
|----------|-------------|
| **KubeadmControlPlane** (built-in) | Kubeadm-managed CP, stacked/external etcd |
| **EKS** | AWS EKS managed control plane |
| **Kamaji** | Hosted control plane (CP runs as a deployment in the management cluster) |
| **K3s** | K3s control plane (edge-optimized) |
| **RKE2** | RKE2 control plane |
| **Talos** | Talos control plane, API-based management |
| **k0smotron** | Hosted control plane (k0s-based) |
| **Nested** | Nested control plane (control plane components run as pods in the management cluster) |

### ClusterClass and Managed Topologies

ClusterClass (stable since CAPI v1beta1, CAPI v1.0+) lets you define a **cluster template**:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: standard-aws-cluster
  namespace: clusters   # same namespace as the Clusters that reference this class
spec:
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: aws-cp-tmpl
    machineInfrastructure:
      ref:
        kind: AWSMachineTemplate
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        name: aws-cp-machine-tmpl
  workers:
    machineDeployments:
      - class: default-worker
        template:
          bootstrap:
            ref:
              apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
              kind: KubeadmConfigTemplate
              name: aws-worker-bootstrap-tmpl
          infrastructure:
            ref:
              apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
              kind: AWSMachineTemplate
              name: aws-worker-machine-tmpl
  variables:
    - name: instanceType
      required: true
      schema:
        openAPIV3Schema:
          type: string
          enum: ["t3.large", "m5.large", "m5.xlarge"]
```

Then create a cluster with variable overrides:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: dev-team-alpha
  namespace: clusters
spec:
  topology:
    class: standard-aws-cluster
    version: v1.30.2
    controlPlane:
      replicas: 1
    workers:
      machineDeployments:
        - class: default-worker
          name: md-0
          replicas: 2
    variables:
      - name: instanceType
        value: "m5.xlarge"
```

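
A variable such as `instanceType` only takes effect once a ClusterClass patch maps it onto a template. A minimal sketch of such a patch, placed under `spec` of the ClusterClass, might look as follows. The patch name `workerInstanceType` is an illustrative assumption, and the JSON path assumes the AWSMachineTemplate exposes `spec.template.spec.instanceType`.

```yaml
# Fragment of ClusterClass.spec (sketch, not a complete manifest)
patches:
  - name: workerInstanceType             # hypothetical patch name
    definitions:
      - selector:
          apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
          kind: AWSMachineTemplate
          matchResources:
            machineDeploymentClass:
              names:
                - default-worker         # only the worker class is patched
        jsonPatches:
          - op: add
            path: /spec/template/spec/instanceType
            valueFrom:
              variable: instanceType     # value comes from Cluster.spec.topology.variables
```

With a patch like this in place, the `m5.xlarge` override in the Cluster manifest would be written into the worker machine template when the topology is reconciled.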
### Cluster lifecycle with CAPI

| Phase | Action | CAPI mechanism |
|-------|--------|----------------|
| **Create** | `kubectl apply -f cluster.yaml` | Controller creates the infrastructure (VMs, network) and runs the kubeadm init/join bootstrap |
| **Scale** | Update `replicas` in the MachineDeployment | Controller creates/removes Machine → VM → node join/drain |
| **Upgrade** | Change `version` in KubeadmControlPlane / MachineDeployment | Rolling update: new CP node is created at the new version, then the old one is drained and deleted. Workers: MachineDeployment rolling update |
| **Health check** | MachineHealthCheck | If a node stays unhealthy longer than the timeout, the controller creates a replacement Machine |
| **Delete** | `kubectl delete cluster <name>` | Controller drains nodes, deletes VMs, and cleans up infrastructure |
| **Template update** | Change AWSMachineTemplate / KubeadmConfigTemplate | New Machines use the new template; existing Machines are only affected through a rolling update |

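
The Scale and Upgrade rows can be sketched with a standalone MachineDeployment (that is, one managed directly rather than through ClusterClass). All object names below are hypothetical and simply reuse the naming style of the other examples in this document.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: prod-us-east-workers            # hypothetical name
  namespace: clusters
spec:
  clusterName: prod-us-east
  replicas: 3                           # Scale: edit this value
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: prod-us-east
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: prod-us-east
    spec:
      clusterName: prod-us-east
      version: v1.30.2                  # Upgrade: edit this value to roll the workers
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: aws-worker-bootstrap-tmpl
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSMachineTemplate
        name: aws-worker-machine-tmpl
```

Keeping `replicas` and `version` in Git and changing them through review is what makes the lifecycle table above GitOps-friendly.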
### Auto-remediation (MachineHealthCheck)

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: prod-mhc
  namespace: clusters
spec:
  clusterName: prod-us-east
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: prod-us-east-workers
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 5m
    - type: Ready
      status: Unknown
      timeout: 5m
  maxUnhealthy: "40%"
  nodeStartupTimeout: 10m
```

### CAPI + GitOps

CAPI integrates naturally with GitOps:

- **ArgoCD**: Cluster and MachineDeployment manifests live in a Git repo; ArgoCD applies them to the management cluster
- **Flux**: `Kustomization` + `OCIRepository` for CAPI objects
- **Crossplane**: can be combined; Crossplane provisions cloud resources (VPC, subnets), CAPI manages K8s clusters on top

Pattern: a dedicated "fleet management" cluster running CAPI + ArgoCD. All workload clusters are defined as YAML in Git.

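
A minimal sketch of the Argo CD side of this pattern, assuming Argo CD runs in the management cluster in the `argocd` namespace. The repository URL, path, and application name are placeholders, not values taken from this document.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: workload-clusters                 # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/fleet.git   # placeholder repository
    targetRevision: main
    path: clusters                        # directory with Cluster / MachineDeployment manifests
  destination:
    server: https://kubernetes.default.svc   # the management cluster itself
    namespace: clusters
  syncPolicy:
    automated:
      prune: false                        # do not delete a cluster just because its manifest left Git
      selfHeal: true
```

Leaving `prune` off is a deliberately cautious choice here: with pruning enabled, removing a manifest from Git would delete the corresponding workload cluster.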
### CAPI for on-prem

| Provider | Use case | Note |
|----------|----------|------|
| **Metal3** (Ironic) | Bare metal provisioning (PXE, IPMI, Redfish) | Automatically provisions BM servers as K8s nodes |
| **CAPV (vSphere)** | VMware VMs as K8s nodes | Most common enterprise on-prem |
| **CAPO (OpenStack)** | OpenStack VMs as K8s nodes | OpenStack-native |
| **Nutanix (CAPX)** | Nutanix AHV/Prism | Community provider |

### CAPI for edge

| Provider | Use case | Note |
|----------|----------|------|
| **K3s bootstrap + control plane** | Lightweight K8s on edge devices | Single binary, SQLite/embedded etcd |
| **RKE2 bootstrap + control plane** | Enterprise edge, air-gapped | CIS-hardened, FIPS |
| **Talos** | Immutable OS, API-driven | Minimal footprint, no SSH |
| **k0smotron** | Hosted control plane for edge clusters | CP runs in the management cluster, workers on the edge |

### CAPI vs alternatives

| Tool | Approach | CAPI advantage | CAPI disadvantage |
|------|----------|----------------|-------------------|
| **Terraform/Pulumi** | Imperative/declarative IaC | CAPI is K8s-native: same tooling for apps and clusters; GitOps-ready | Terraform has broader non-K8s resource support |
| **kubeadm** | Manual or scripted | CAPI automates the full lifecycle, including upgrades and remediation | Higher complexity, requires a management cluster |
| **Rancher** | Web UI + API for K8s cluster management | CAPI is open-source and vendor-neutral | Rancher has a GUI, monitoring, and an app catalog |
| **OpenShift Hive/ACM** | Red Hat Advanced Cluster Management | CAPI is a SIG standard with a wider provider ecosystem | ACM has governance, policy, and compliance features |

### Limitations and maturity

- **Management cluster is a SPOF**: it needs its own HA and backup (etcd snapshots, certificates)
- **CAPI is not a cluster autoscaler**: it handles cluster lifecycle, not automatic node scaling in response to pending pods (run Cluster Autoscaler separately)
- **Provider maturity varies**: AWS, Azure, vSphere, and OpenStack are listed as stable above, GCP is beta, and some community providers are alpha
- **etcd backup is not built-in**: it must be handled externally (etcd snapshots; Velero for API objects and volumes)
- **CAPI does not handle applications**: it covers only the K8s cluster lifecycle (monitoring, logging, and ingress are user-managed)
- **Learning curve**: requires understanding the management cluster, the provider model, and the CRDs
- **CAPI v1.13+ (2026)**: stable release, v1beta1 API is GA, ClusterClass stable, EKS/AKS/GKE managed control plane support

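
To illustrate the autoscaler point: the Cluster Autoscaler's Cluster API provider discovers scalable node groups through annotations on MachineDeployments. The fragment below is a sketch with an assumed MachineDeployment name and arbitrary size limits.

```yaml
# Fragment: only metadata is shown, the MachineDeployment spec is omitted
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: prod-us-east-workers              # hypothetical name
  namespace: clusters
  annotations:
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "2"
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "10"
```

Within these bounds the autoscaler adjusts `spec.replicas`; CAPI then creates or removes Machines exactly as it would for a manual scale operation.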
### Recommended production CAPI stack

| Component | Recommendation |
|-----------|---------------|
| **Management cluster** | K3s (small footprint) or kubeadm (3-node HA) |
| **Infra provider** | CAPA (AWS) / CAPV (vSphere) / CAPO (OpenStack), depending on the platform |
| **Bootstrap/CP provider** | Kubeadm or RKE2 |
| **GitOps** | ArgoCD or Flux |
| **Backup** | Velero with restic file-system backup to S3-compatible storage (e.g. Ceph) |
| **Cluster autoscaler** | Cluster Autoscaler (via CAPI integration) |
| **Network** | Cilium (installed into each workload cluster after bootstrap) |
| **Secrets** | External Secrets Operator / Sealed Secrets |
| **Monitoring** | Prometheus + Grafana (kube-prometheus-stack) |
| **Ingress** | ingress-nginx / Kong / Traefik |

## Sources

Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)

*Last revision: 2026-06-18*

299
KUBERNETES.md
Normal file
@@ -0,0 +1,299 @@
# ☸ Kubernetes: architecture, platforms, Cluster API

## Overview

Kubernetes (K8s) is an open-source container orchestrator and the de facto standard for deploying, scaling, and managing containerized applications. It is built on a model of declarative configuration and control loops (reconciliation).

## Kubernetes deployment methods

| Method | Description | Control plane management | Best for |
|--------|-------------|--------------------------|----------|
| **kubeadm** | Official tool for bootstrapping a K8s cluster | Self-managed (stacked/external etcd) | On-prem, lab, learning |
| **K3s** | Lightweight K8s (Rancher), single binary, embedded etcd/SQLite | Self-managed | Edge, IoT, low-resource, HA with embedded etcd |
| **RKE2** | Rancher Kubernetes Engine 2, CIS-hardened, FIPS-ready | Self-managed | Enterprise on-prem, air-gapped, regulated environments |
| **OpenShift** | Red Hat enterprise K8s + operator lifecycle + SDN + routing | Self-managed (RHCOS) | Enterprise, multicluster, platform engineering |
| **Vanilla K8s (CAPI)** | Cluster API: declarative provisioning and lifecycle management | Self-managed (CAPI-managed) | Fleet management, GitOps, multi-provider |
| **EKS** (AWS) | Managed K8s | AWS-managed | AWS cloud-native, least ops |
| **AKS** (Azure) | Managed K8s | Azure-managed | Azure cloud-native |
| **GKE** (GCP) | Managed K8s, Standard and Autopilot modes | GCP-managed | GCP cloud-native |
| **SKE** (Sangfor) | Managed K8s on Sangfor HCI | Vendor-managed | Sangfor HCI ecosystem |

---

## Cluster API (CAPI)

### What is Cluster API

Cluster API is a Kubernetes sub-project (SIG Cluster Lifecycle) that provides declarative APIs for provisioning, upgrading, and operating Kubernetes clusters. Instead of Terraform scripts or manual `kubeadm` runs, you define a cluster as Kubernetes custom resources: `Cluster`, `Machine`, `MachineDeployment`, and so on.

Principle: **A Kubernetes cluster that manages Kubernetes clusters.**

### Architecture

```
┌─────────────────────────────────────────┐
│           Management Cluster            │
│                                         │
│  ┌──────────────────────────────────┐   │
│  │         CAPI Controllers         │   │
│  │ ┌──────┐ ┌─────────┐ ┌─────────┐ │   │
│  │ │Infra │ │Bootstrap│ │ Control │ │   │
│  │ │Prov  │ │  Prov   │ │Plane Pr │ │   │
│  │ └──────┘ └─────────┘ └─────────┘ │   │
│  └──────────────────────────────────┘   │
│                                         │
│  CR: Cluster, Machine, MachineDeployment│
│  ...                                    │
└────────────────┬────────────────────────┘
                 │ CAPI controller
                 │ creates / manages
        ┌────────┴────────┐
        ▼                 ▼
┌───────────────┐ ┌───────────────┐
│   Workload    │ │   Workload    │
│ Cluster (dev) │ │ Cluster (prod)│
│ ┌───┐   ┌───┐ │ │ ┌───┐   ┌───┐ │
│ │CP │   │ W │ │ │ │CP │   │ W │ │
│ └───┘   └───┘ │ │ └───┘   └───┘ │
└───────────────┘ └───────────────┘
```

- **Management cluster**: the Kubernetes cluster where the CAPI controllers run. It can be a dedicated "admin" cluster (often very small).
- **Workload (managed) cluster**: a Kubernetes cluster that CAPI manages. Each one is represented by custom resources in the management cluster.
- **Machine**: an abstraction of a compute unit (VM, bare metal) that becomes a K8s node.

### Key CRDs (Custom Resource Definitions)

| CRD | API group | Purpose |
|-----|-----------|---------|
| **Cluster** | `cluster.x-k8s.io` | Cluster representation (infra reference, control plane ref, networking) |
| **Machine** | `cluster.x-k8s.io` | Individual node (VM/BM instance) |
| **MachineDeployment** | `cluster.x-k8s.io` | Declarative scaling and rolling update of workers |
| **MachineSet** | `cluster.x-k8s.io` | Replica set for Machines (lower-level) |
| **MachineHealthCheck** | `cluster.x-k8s.io` | Auto-remediation (automatic replacement of an unhealthy node) |
| **ClusterClass** | `cluster.x-k8s.io` | Template for creating clusters |
| **KubeadmControlPlane** | `controlplane.cluster.x-k8s.io` | Kubeadm-managed control plane (stacked/external etcd) |
| **KubeadmConfig / KubeadmConfigTemplate** | `bootstrap.cluster.x-k8s.io` | Bootstrap configuration (kubeadm init/join) |

### Provider model

CAPI uses a three-layer provider model:

#### 1. Infrastructure Provider
Creates and manages infrastructure (VMs, networks, load balancers, storage).

| Provider | Platform | Status |
|----------|----------|--------|
| **AWS (CAPA)** | AWS EC2, VPC, ELB, EKS | Stable, SIG-sponsored |
| **Azure (CAPZ)** | Azure VM, VNet, LB, AKS | Stable, SIG-sponsored |
| **GCP (CAPG)** | GCP Compute, VPC, GKE | Beta |
| **vSphere (CAPV)** | VMware vSphere | Stable |
| **OpenStack (CAPO)** | OpenStack compute/network | Stable |
| **Metal3** | Bare metal (Ironic) | Stable |
| **Docker (CAPD)** | Docker containers (development) | Development/testing only |
| **Akamai (Linode)** | Linode | Community |
| **Azure Stack HCI** | Azure Stack HCI | Community |
| **cloudscale** | cloudscale.ch | Community |
| **Exoscale** | Exoscale | Community |
| **IBM Cloud** | IBM Cloud | Community |
| **Equinix Metal** | Equinix (formerly Packet) | Community |
| **Hetzner** | Hetzner Cloud | Community |
| **OpenNebula** | OpenNebula | Community |

#### 2. Bootstrap Provider
Handles K8s initialization on a node (kubeadm init/join, TLS certs, tokens).

| Provider | Description |
|----------|-------------|
| **Kubeadm** (built-in) | Standard kubeadm init/join, supports stacked/external etcd |
| **EKS** | Bootstrap for EKS managed control plane (AWS) |
| **K3s** | Lightweight K8s bootstrap (edge, IoT) |
| **RKE2** | Rancher K8s bootstrap, CIS-hardened |
| **Talos** | API-driven bootstrap (Sidero Labs), immutable OS |
| **k0smotron** | K0s-based bootstrap + hosted control plane |
| **MicroK8s** | Canonical MicroK8s bootstrap |
| **Canonical Kubernetes** | Canonical K8s (snap-based) |

#### 3. Control Plane Provider
Manages control plane nodes.

| Provider | Description |
|----------|-------------|
| **KubeadmControlPlane** (built-in) | Kubeadm-managed CP, stacked/external etcd |
| **EKS** | AWS EKS managed control plane |
| **Kamaji** | Hosted control plane (CP runs as a deployment in the management cluster) |
| **K3s** | K3s control plane (edge-optimized) |
| **RKE2** | RKE2 control plane |
| **Talos** | Talos control plane, API-based management |
| **k0smotron** | Hosted control plane (k0s-based) |
| **Nested** | Nested control plane (control plane components run as pods in the management cluster) |

### ClusterClass and Managed Topologies

ClusterClass (stable since CAPI v1beta1, CAPI v1.0+) lets you define a **cluster template**:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: standard-aws-cluster
  namespace: clusters   # same namespace as the Clusters that reference this class
spec:
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: aws-cp-tmpl
    machineInfrastructure:
      ref:
        kind: AWSMachineTemplate
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        name: aws-cp-machine-tmpl
  workers:
    machineDeployments:
      - class: default-worker
        template:
          bootstrap:
            ref:
              apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
              kind: KubeadmConfigTemplate
              name: aws-worker-bootstrap-tmpl
          infrastructure:
            ref:
              apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
              kind: AWSMachineTemplate
              name: aws-worker-machine-tmpl
  variables:
    - name: instanceType
      required: true
      schema:
        openAPIV3Schema:
          type: string
          enum: ["t3.large", "m5.large", "m5.xlarge"]
```

Pak lze vytvořit cluster s přetížením proměnných:
|
||||
|
||||
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: dev-team-alpha
  namespace: clusters
spec:
  topology:
    class: standard-aws-cluster
    version: v1.30.2
    controlPlane:
      replicas: 1
    workers:
      machineDeployments:
        - class: default-worker
          name: md-0
          replicas: 2
    variables:
      - name: instanceType
        value: "m5.xlarge"
```

### Cluster lifecycle with CAPI

| Phase | Action | CAPI mechanism |
|-------|--------|----------------|
| **Create** | `kubectl apply -f cluster.yaml` | Controllers create the infrastructure (VMs, network) and bootstrap the nodes with kubeadm init/join |
| **Scale** | Edit `replicas` on the MachineDeployment | Controller creates or removes Machine → VM → node join/drain |
| **Upgrade** | Change `version` on the KubeadmControlPlane / MachineDeployment | Rolling update: new CP node → upgrade → old node is drained and deleted. Workers: MachineDeployment rolling update |
| **Health check** | MachineHealthCheck | If a node stays unhealthy longer than the timeout, the controller creates a replacement Machine |
| **Delete** | `kubectl delete cluster` | Controller drains the nodes, deletes the VMs, and cleans up the infrastructure |
| **Template update** | Change the AWSMachineTemplate / KubeadmConfigTemplate | New Machines are created from the new template; existing Machines are only affected through a rolling update |

### Auto-remediation (MachineHealthCheck)

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: prod-mhc
  namespace: clusters
spec:
  clusterName: prod-us-east
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: prod-us-east-workers
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 5m
    - type: Ready
      status: Unknown
      timeout: 5m
  maxUnhealthy: "40%"
  nodeStartupTimeout: 10m
```

### CAPI + GitOps

CAPI integrates naturally with GitOps:

- **ArgoCD**: Cluster and MachineDeployment manifests live in a Git repository, and ArgoCD applies them to the management cluster
- **Flux**: `Kustomization` + `OCIRepository` for CAPI objects
- **Crossplane**: can be combined, with Crossplane provisioning the cloud resources (VPC, subnets) and CAPI building the K8s cluster on top of them

Pattern: a dedicated "fleet management" cluster runs CAPI + ArgoCD. All workload clusters are defined as YAML in Git.
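
The fleet-management pattern can be sketched as a single Argo CD `Application` that syncs a directory of Cluster manifests into the management cluster. This is an illustrative sketch only: the repository URL, the `clusters` path, and the application name are placeholders, not values from this document.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: workload-clusters                 # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/fleet.git   # placeholder repository
    targetRevision: main
    path: clusters                        # assumed directory holding the Cluster manifests
  destination:
    server: https://kubernetes.default.svc   # the management cluster Argo CD runs in
    namespace: clusters
  syncPolicy:
    automated:
      prune: false      # a manifest vanishing from Git does not delete a live cluster
      selfHeal: true    # manual drift on CAPI objects is reverted
```

Leaving `prune` off is a deliberately conservative choice here, since pruning a `Cluster` object triggers teardown of the whole workload cluster.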

### CAPI for on-prem

| Provider | Use case | Note |
|----------|----------|------|
| **Metal3** (Ironic) | Bare metal provisioning (PXE, IPMI, Redfish) | Automated provisioning of bare-metal servers as K8s nodes |
| **CAPV (vSphere)** | VMware VMs as K8s nodes | Most enterprise on-prem environments |
| **CAPO (OpenStack)** | OpenStack VMs as K8s nodes | OpenStack-native |
| **Nutanix (CAPNX)** | Nutanix AHV/Prism | Community provider |

### CAPI for edge

| Provider | Use case | Note |
|----------|----------|------|
| **K3s bootstrap + control plane** | Lightweight K8s on edge devices | Single binary, SQLite/embedded etcd |
| **RKE2 bootstrap + control plane** | Enterprise edge, air-gapped | CIS-hardened, FIPS |
| **Talos** | Immutable OS, API-driven | Minimal footprint, no SSH |
| **k0smotron** | Hosted control plane for edge clusters | Control plane runs in the management cluster, workers at the edge |

### CAPI vs alternatives

| Tool | Approach | CAPI advantage | CAPI disadvantage |
|------|----------|----------------|-------------------|
| **Terraform/Pulumi** | Imperative/declarative IaC | CAPI is K8s-native, so the same tooling covers apps and clusters; GitOps ready | Terraform has broader support for non-K8s resources |
| **kubeadm** | Manual or scripted | CAPI automates the whole lifecycle, including upgrades and remediation | Higher complexity, requires a management cluster |
| **Rancher** | Web UI + API for managing K8s clusters | CAPI is open-source and vendor-neutral | Rancher has a GUI, monitoring, and an app catalog |
| **OpenShift Hive/ACM** | Red Hat Advanced Cluster Management | CAPI is a standard (SIG) with a wider provider ecosystem | ACM has governance, policy, and compliance features |

### Limitations and maturity

- **The management cluster is a SPOF**: it needs its own HA and backups (etcd backups, certificates)
- **CAPI is not a cluster autoscaler**: it manages cluster lifecycle and does not scale node counts with workload demand (Cluster Autoscaler is deployed separately)
- **Provider maturity varies**: AWS/Azure/vSphere are stable, GCP/OpenStack are beta, some community providers are alpha
- **etcd backup is not built in**: it has to be handled externally (Velero, etcd snapshots)
- **CAPI does not manage applications**: it covers only the lifecycle of K8s clusters (monitoring, logging, and ingress are up to the user)
- **Learning curve**: requires a management cluster and an understanding of the provider model and CRDs
- **CAPI v1.13+ (2026)**: stable release, v1beta1 API is GA, ClusterClass stable, EKS/AKS/GKE managed control plane support
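
Because node autoscaling is delegated to Cluster Autoscaler, the scaling bounds have to be declared on the CAPI side. A minimal sketch, assuming the Cluster Autoscaler `clusterapi` provider and a managed topology like the `dev-team-alpha` example; the bounds `2` and `6` are illustrative values.

```yaml
# Fragment of a Cluster using a managed topology.
spec:
  topology:
    workers:
      machineDeployments:
        - class: default-worker
          name: md-0
          # replicas is intentionally omitted so that the autoscaler owns the node count
          metadata:
            annotations:
              cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "2"
              cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "6"
```

Setting `replicas` in the topology at the same time would make the topology controller and the autoscaler fight over the same field, which is why the sketch leaves it out.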

### Recommended stack for CAPI in production

| Component | Recommendation |
|-----------|----------------|
| **Management cluster** | K3s (small footprint) or kubeadm (3-node HA) |
| **Infra provider** | CAPA (AWS) / CAPV (vSphere) / CAPO (OpenStack), depending on the platform |
| **Bootstrap/CP provider** | Kubeadm or RKE2 |
| **GitOps** | ArgoCD or Flux |
| **Backup** | Velero + restic/Ceph |
| **Cluster autoscaler** | Cluster Autoscaler (via the CAPI integration) |
| **Network** | Cilium (supported on CAPI-provisioned clusters) |
| **Secrets** | External Secrets Operator / Sealed Secrets |
| **Monitoring** | Prometheus + Grafana (kube-prometheus-stack) |
| **Ingress** | ingress-nginx / Kong / Traefik |

## Sources

Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)

*Last revision: 2026-06-18*

@@ -270,6 +270,6 @@ See [DATACENTERS.en.md](DATACENTERS.en.md) — section "Impact of individual tec

## Sources

Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)

*Last revision: 2026-06-12*

@@ -111,6 +111,6 @@ MongoDB changed its license in 2018 from GNU AGPL v3 to **SSPL** (Server Side Pu

## Sources

References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)

*Last revision: 2026-06-03*

@@ -497,6 +497,6 @@ OpenStack provides several services for telemetry and monitoring:

## Sources

Links, books and standards: [sources/monitoring/sources.md](sources/monitoring/sources.md)
Links, books and standards: [sources/monitoring/sources.en.md](sources/monitoring/sources.en.md)

*Last revision: 2026-06-03*

@@ -131,7 +131,7 @@ ProxySQL is an advanced proxy for MySQL with sophisticated routing:

## Sources

References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)

### Recommended reading

@@ -302,7 +302,7 @@ Anycast detail:

## Cloud Networking Resilience (2026)

See also: [CLOUD.md](CLOUD.md) — cloud architecture, multi-AZ, hybrid cloud connectivity.
See also: [CLOUD.en.md](CLOUD.en.md) — cloud architecture, multi-AZ, hybrid cloud connectivity.

### Cell-based Architectures

@@ -577,7 +577,7 @@ In a private DC, Zero Trust is deployed via:

## Resources

Links, books and standards: [sources/networking/sources.md](sources/networking/sources.md)
Links, books and standards: [sources/networking/sources.en.md](sources/networking/sources.en.md)
- **MTU alignment** — consistent MTU across the entire path, check ICMP blocking for PMTUD
- **IP planning** — RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), avoid overlaps for peering

@@ -195,7 +195,7 @@ Tip: For RAC, consider smaller CPUs (e.g., 64C instead of 96C) — license cost

## Sources

References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)

### Recommended reading

337
OS.en.md
Normal file
@@ -0,0 +1,337 @@
# Operating Systems

> Overview of Linux distributions and Microsoft Windows for server, container, and AI/GPU workloads, including support lifecycle, EOL dates, and comparison.

---

## Distribution overview

| Distribution | Family | Package manager | Init | Security | Reference platform |
|-------------|--------|----------------|------|----------|-------------------|
| **Ubuntu LTS** | Debian | apt (deb) | systemd | AppArmor | NVIDIA DGX, widest AI/GPU support |
| **Debian** | Debian | apt (deb) | systemd | AppArmor | General-purpose server, stability |
| **RHEL** | Red Hat | dnf (rpm) | systemd | SELinux | Enterprise standard, SAP, Oracle DB |
| **Rocky Linux** | Red Hat | dnf (rpm) | systemd | SELinux | RHEL binary compatible (free) |
| **AlmaLinux** | Red Hat | dnf (rpm) | systemd | SELinux | RHEL binary compatible (free) |
| **SLES** | SUSE | zypper (rpm) | systemd | AppArmor | HPC, SAP, mainframe |
| **openSUSE Leap** | SUSE | zypper (rpm) | systemd | AppArmor | Desktop, development |
| **openSUSE Tumbleweed** | SUSE | zypper (rpm) | systemd | AppArmor | Rolling release, bleeding edge |
| **Fedora** | Red Hat | dnf (rpm) | systemd | SELinux | Desktop, technology preview |
| **Arch Linux** | Independent | pacman | systemd | — | Rolling, power users |
| **Alpine Linux** | Independent | apk | OpenRC | — | Container image, embedded |
| **Flatcar Container Linux** | Independent | — (image-based) | systemd | — | K8s worker node, minimal footprint |
| **Bottlerocket** | Independent | — (image-based) | systemd | — | AWS K8s, minimal footprint |

---

## Support lifecycle and EOL dates

> **Standard:** base support (bug fixes, security). **LTS/ELS:** extended support (security only).
> ESM = Ubuntu Extended Security Maintenance, EUS = RHEL Extended Update Support, LTSS = SUSE Long Term Service Pack Support.

### Ubuntu LTS

| Version | Release | Standard support | ESM / Ubuntu Pro | Note |
|---------|---------|-----------------|------------------|------|
| **20.04 LTS** (Focal) | 2020-04 | End 2025-04 | End 2030-04 | Last release with Python 2 |
| **22.04 LTS** (Jammy) | 2022-04 | End 2027-04 | End 2032-04 | NVIDIA DGX standard |
| **24.04 LTS** (Noble) | 2024-04 | End 2029-04 | End 2034-04 | Latest GPU/CUDA support |
| **26.04 LTS** (planned) | 2026-04 | End 2031-04 | End 2036-04 | — |

### RHEL

| Version | Release | Full support | Maintenance support | Extended life cycle |
|---------|---------|-------------|-------------------|-------------------|
| **7** | 2014-06 | End 2019-08 | End 2024-06 | End 2028-06 (ELS) |
| **8** | 2019-05 | End 2024-05 | End 2029-05 | End 2034-06 (ELS) |
| **9** | 2022-05 | End 2027-05 | End 2032-05 | End 2037-06 (ELS) |
| **10** (planned) | 2025 | End 2029 | End 2034 | — |

### Rocky Linux / AlmaLinux

| Version | Release | Support until | RHEL compatible | Note |
|---------|---------|-------------|-----------------|------|
| **8** | 2021-06 | 2029-05 | Yes (since RHEL 8.4) | Alma/Rocky |
| **9** | 2022-07 | 2032-05 | Yes (since RHEL 9.0) | Alma/Rocky |

### Debian

| Version | Release | Full support | LTS support | ELTS (paid) |
|---------|---------|-------------|-------------|-------------|
| **11** (Bullseye) | 2021-08 | 2024-08 | End 2026-08 | End 2028-08 |
| **12** (Bookworm) | 2023-06 | 2026-06 | End 2028-06 | End 2030-06 |
| **13** (Trixie) | 2025 (expected) | ~3 years post-release | ~5 years post-release | — |

### SLES

| Version | Release | General support | LTSS | Note |
|---------|---------|---------------|------|------|
| **15 SP3** | 2021-06 | End 2024-12 | End 2027-12 | — |
| **15 SP4** | 2022-06 | End 2025-12 | End 2028-12 | — |
| **15 SP5** | 2023-06 | End 2026-12 | End 2029-12 | Current SP |
| **15 SP6** | 2024-10 | End 2027-12 | End 2030-12 | — |

### Fedora

| Version | Release | EOL | Note |
|---------|---------|-----|------|
| **38** | 2023-04 | 2024-05 | — |
| **39** | 2023-11 | 2024-12 | — |
| **40** | 2024-04 | 2025-05 | — |
| **41** | 2024-11 | 2025-12 | — |

Fedora ships a new release roughly every 6 months, and each release reaches EOL about 13 months after it ships. It serves as the upstream for RHEL.

### Alpine Linux

| Version | Release | EOL |
|---------|---------|-----|
| **3.18** | 2023-05 | 2025-05 |
| **3.19** | 2023-12 | 2025-12 |
| **3.20** | 2024-05 | 2026-05 |
| **3.21** | 2024-12 | 2026-12 |

---

## Kernel version per distribution

| Distribution | Kernel (default) | Kernel (HWE/enhanced) | Note |
|------------|-----------------|----------------------|------|
| Ubuntu 22.04 LTS | 5.15 (GA) | 6.5+ (HWE) | HWE from 22.04.2 |
| Ubuntu 24.04 LTS | 6.8 | — | — |
| RHEL 8 | 4.18 | — | Backported features |
| RHEL 9 | 5.14 | — | Backported features |
| RHEL 10 | 6.11+ (expected) | — | — |
| Rocky/Alma 8 | 4.18 | — | Same as RHEL 8 |
| Rocky/Alma 9 | 5.14 | — | Same as RHEL 9 |
| Debian 11 | 5.10 | 6.1 (backports) | — |
| Debian 12 | 6.1 | — | — |
| SLES 15 SP5 | 5.14 | — | — |
| SLES 15 SP6 | 6.4 | — | — |
| Fedora 40 | 6.8+ | — | Rolling upstream |
| Alpine 3.20 | 6.6 | — | — |

---

## Use case comparison

| Use case | Recommended distribution | Rationale |
|----------|------------------------|-----------|
| **AI/GPU cluster (DGX)** | Ubuntu 22.04 LTS / DGX OS | NVIDIA standard, CUDA, MLNX_OFED |
| **Enterprise K8s (OpenShift)** | RHEL 9 / RHCOS | Red Hat support, GPU Operator |
| **Vanilla K8s (on-prem)** | Ubuntu 22.04 LTS + Flatcar (workers) | Community support, minimal worker image |
| **HPC cluster (Slurm)** | Rocky Linux 9 / Ubuntu 22.04 | EL ecosystem + Lustre, or Ubuntu |
| **Traditional enterprise DB (Oracle, SAP)** | RHEL 9 / SLES 15 | Vendor certification |
| **Container host** | Ubuntu 22.04 / Alpine | Broad image compatibility / minimal size |
| **Development / desktop** | Fedora / Ubuntu 24.04 / openSUSE Tumbleweed | Latest packages, HW support |
| **Embedded / IoT** | Debian / Alpine / Yocto | Minimal footprint, stability |
| **Edge inference** | Ubuntu (ARM) / NVIDIA JetPack | Jetson, GPU support |
| **Mainframe (IBM z/Arch)** | SLES 15 / RHEL 9 | IBM certification |

---

## Package management comparison

| Feature | apt (Debian/Ubuntu) | dnf (RHEL/Rocky/Alma/Fedora) | zypper (SUSE) | pacman (Arch) | apk (Alpine) |
|---------|--------------------|------------------------------|---------------|---------------|-------------|
| **Package format** | .deb | .rpm | .rpm | .pkg.tar.zst | .apk |
| **Repo management** | /etc/apt/sources.list | /etc/yum.repos.d/ | /etc/zypp/repos.d/ | /etc/pacman.conf | /etc/apk/repositories |
| **Lock file** | — (apt-mark hold) | — (exclude) | — (lock) | — (IgnorePkg) | — |
| **Transactional update** | No | Yes (dnf history) | Yes (zypper history) | No | No |
| **Rollback** | No (manual) | Yes (dnf history rollback) | Yes (snapper + zypper) | No | No |
| **Delta updates** | No (third-party debdelta only) | Yes (deltarpm) | Yes (delta RPMs) | No | No |
| **Version (as of 2025)** | apt 2.7+ | dnf 4.18+ | zypper 1.14+ | pacman 6.1+ | apk 2.14+ |

---

## Security model comparison

| Feature | SELinux (RHEL derivatives) | AppArmor (Ubuntu/Debian/SUSE) |
|---------|--------------------------|-------------------------------|
| **Type** | Mandatory Access Control (MAC) | Mandatory Access Control (MAC) |
| **Labeling** | Context-based (user:role:type) | Path-based (profile per executable) |
| **Configuration** | Policy (modules, booleans) | Profiles (text, in /etc/apparmor.d/) |
| **Modes** | Enforcing / Permissive / Disabled | Enforce / Complain / Disabled |
| **Learning curve** | Steep (complex policies) | Moderate (simpler profiles) |
| **Default in** | RHEL, Rocky, Alma, Fedora | Ubuntu, Debian, SLES, openSUSE |
| **Use case** | Enterprise multi-tenant, regulated | General-purpose server, app containment |
| **Container integration** | SELinux labels on container | AppArmor profile on container |

Additional layers:
- **seccomp** — syscall filtering (default in containerd, Docker)
- **Capabilities** — Linux capabilities (drop all except required)
- **cgroups v2** — resource isolation (CPU, memory, IO, PID)
- **User namespaces** — rootless containers (Podman, Docker rootless)
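
On Kubernetes, most of these layers are switched on per workload through the Pod `securityContext`. The manifest below is a minimal sketch of that mapping; the Pod name, image reference, UID, and resource limits are placeholders chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-example                  # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001                      # arbitrary unprivileged UID
    seccompProfile:
      type: RuntimeDefault                # seccomp: the runtime's default syscall filter
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]                   # capabilities: drop everything, add back only what is needed
      resources:
        limits:                           # enforced through cgroups on the node
          cpu: "500m"
          memory: 256Mi
```

SELinux or AppArmor confinement still comes from the node's distribution defaults, so the same manifest behaves slightly differently on RHEL-family and Ubuntu-family hosts.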

---

## Recommended migration path for EOL distributions

| From | To | Recommended approach |
|------|-----|---------------------|
| Ubuntu 20.04 (EOL 2025) | Ubuntu 22.04 or 24.04 | `do-release-upgrade` or fresh install |
| RHEL 7 (EOL 2024) | RHEL 8 or 9 | `leapp` upgrade, or fresh install |
| Rocky/Alma 8 | Rocky/Alma 9 | ELevate (`leapp`-based, from the AlmaLinux project) or fresh install; a plain `dnf upgrade --releasever=9` is not a supported path |
| Debian 11 (EOL LTS 2026) | Debian 12 | `apt full-upgrade` + new sources.list |
| SLES 15 SP4 (EOL 2025) | SLES 15 SP6 | `zypper migration` |
| Fedora 40 (EOL 2025) | Fedora 42+ | `dnf system-upgrade` |

---

## Microsoft Windows

### Windows Server — editions

| Edition | Price (approx) | Core limits | VM rights | Use case |
|---------|---------------|-------------|-----------|----------|
| **Datacenter** | ~$6,155 (2025) | Unlimited | Unlimited Windows VMs per host | Virtualization, SDDC, S2D, HCI |
| **Standard** | ~$1,069 (2025) | 2 CPU, unlimited cores | 2 Windows VMs + Hyper-V host | General server, AD, file server |
| **Essentials** | ~$501 (2025) | 1 CPU, max 10 cores | — | Small business (≤25 users) |
| **Azure Edition** | Pay-as-you-go | Per Azure VM | Per Azure | Azure-only, hotpatching |

Licensing: Windows Server Standard and Datacenter are licensed **per core** (minimum 16 core licenses per server, or 8 per VM when licensing by virtual machine).

### Windows Server — support lifecycle

> **Mainstream:** regular updates (bug fixes, security, features). **Extended:** security updates only (free).
> **ESU:** Extended Security Updates (paid tier, ~$45–300/core/year).

| Version | Release | Mainstream support | Extended support | ESU | Note |
|---------|---------|------------------|-----------------|-----|------|
| **2012 R2** | 2013-11 | 2018-10 | 2023-10 | End 2026-10 (year 3) | ESU paid, final year |
| **2016** | 2016-10 | 2022-01 | 2027-01 | — | First with Windows containers and Nano Server |
| **2019** | 2018-11 | 2024-01 | 2029-01 | — | Last with Nano Server (1803 only) |
| **2022** | 2021-09 | 2026-10 | 2031-10 | — | Current, TPM 2.0, Credential Guard |
| **2025** | 2024-11 | 2029-10 | 2034-10 | — | Hotpatching, PowerShell 7, SMB over QUIC |

### Windows Server — version vs edition feature grid

| Version | Hyper-V | Storage Spaces Direct | Software-defined networking | Containers | GPU DDA / vGPU | WSL2 |
|---------|---------|---------------------|---------------------------|------------|---------------|------|
| 2016 Standard | Yes | No (DC only) | No (DC only) | Windows | Yes | No |
| 2016 Datacenter | Yes | Yes | Yes | Windows | Yes | No |
| 2019 Standard | Yes | No | No | Windows | Yes | No |
| 2019 Datacenter | Yes | Yes | Yes | Windows | Yes | No |
| 2022 Standard | Yes | No | No | Windows + Linux | Yes | No |
| 2022 Datacenter | Yes | Yes | Yes | Windows + Linux (2022.2+) | Yes | No |
| 2025 Datacenter | Yes | Yes | Yes | Windows + Linux | Yes | Yes |

### Windows Desktop — support lifecycle

> **E = Enterprise, Pro = Professional, Home = Consumer**
> LTSC = Long Term Servicing Channel (stable, no feature updates)

| Version | Release | EOL (Home/Pro) | EOL (Enterprise) | LTSC EOL | Note |
|---------|---------|---------------|-----------------|----------|------|
| **10 21H2** | 2021-11 | — | 2024-06 | — | — |
| **10 22H2** | 2022-10 | 2025-10 | 2025-10 | — | Final Windows 10 |
| **10 LTSC 2021** | 2021-11 | — | — | 2032-01 | IoT Enterprise LTSC |
| **11 22H2** | 2022-09 | 2024-10 | 2025-10 | — | — |
| **11 23H2** | 2023-10 | 2025-11 | 2026-11 | — | — |
| **11 24H2** | 2024-10 | 2026-10 | 2027-10 | — | First with Recall, Copilot+ |
| **11 LTSC 2024** | 2024-10 | — | — | 2029-10 | Enterprise LTSC |

Windows 10 support **ended on 2025-10-14** (22H2 was the final release).

### Windows vs Linux — comparison

| Feature | Windows Server | RHEL / Ubuntu |
|---------|---------------|---------------|
| **License (server)** | $500–6,000 (per core) + CAL | $0–800 (per node subscription) |
| **License (desktop)** | $100–200 (OEM/retail) | Free |
| **Support cost** | Included in license (SA/ESU) | $200–1,300/node/year (RHEL) |
| **Package management** | MSI, AppX, winget, NuGet | APT, DNF, Zypper |
| **Package count** | ~10,000 (chocolatey) | ~60,000+ (Ubuntu repo) |
| **Desktop GUI** | Windows Shell (mandatory) | Optional (GNOME, KDE, XFCE…) |
| **Server GUI** | Windows Shell (core-only since 2022) | CLI-only (standard) |
| **Kernel** | NT hybrid kernel (kernel-mode Win32) | Monolithic Linux kernel |
| **Device support** | OEM driver model (WHQL) | Open source + vendor drivers |
| **Container types** | Windows + Linux (WSL2) | Linux (Docker, Podman, containerd) |
| **Container registry** | Docker Hub, ACR, Nexus | Docker Hub, Quay, GHCR, Nexus… |
| **Container image size** | ~4–8 GB (Windows Server Core) | ~100 MB – 1 GB (Alpine/Ubuntu) |
| **GPU passthrough** | DDA (Discrete Device Assignment) | GPU Direct, VFIO, SR-IOV |
| **AI/ML support** | WSL2 (CUDA), Azure ML | Native CUDA, ROCm, oneAPI |
| **CUDA support** | Yes (via WSL2 or Docker) | Native (nvidia-container-toolkit) |
| **Orchestration** | AD / GPO / SCCM / WAC | Ansible, Puppet, Salt, Foreman |
| **RBAC/AAA** | Active Directory (+ Kerberos) | LDAP, FreeIPA, SSSD, AD |
| **Remote management** | RDP, WinRM, PowerShell Remoting | SSH, Cockpit, Webmin |
| **Filesystem** | NTFS, ReFS, CSVFS | ext4, XFS, Btrfs, ZFS |
| **Max file system size** | 256 TB (NTFS), 1.2 YB (ReFS) | 1 EB (XFS), 16 EB (ZFS) |
| **Hypervisor** | Hyper-V (Type 1) | KVM (Type 2-like), Xen |
| **Dynamic memory** | Hyper-V Dynamic Memory | KSM, virtio-balloon (KVM) |
| **Live migration** | Hyper-V Live Migration | KVM Live Migration, vMotion |

### Windows-specific features

| Feature | Description | Linux alternative |
|---------|------------|-------------------|
| **Active Directory** | Identity, auth, GPO, DNS, DHCP | FreeIPA, Samba AD DC, 389-ds, SSSD |
| **Group Policy** | Central desktop/server configuration | Ansible, Puppet, Salt (agent-based) |
| **Hyper-V + S2D** | Hyper-converged storage and virtualization (HCI) | Proxmox Ceph / oVirt + Gluster |
| **Failover Clustering** | Cluster-aware apps (SQL, File Server) | Pacemaker + Corosync + DRBD |
| **IIS** | Web server, ASP.NET host | Nginx, Apache (.NET host possible) |
| **PowerShell** | Scripting, Desired State Configuration | Bash, Python, Ansible |
| **Windows Admin Center** | GUI management | Cockpit, Webmin |
| **BitLocker** | Full disk encryption | LUKS + cryptsetup |
| **Windows Defender** | Antivirus + EDR | ClamAV, Wazuh, Osquery |
| **SQL Server** | Relational database | PostgreSQL, MySQL, MariaDB |

### Recommended OS per use case (including Windows)

| Use case | OS | Rationale |
|----------|-----|-------|
| **Active Directory / GPO / hybrid ID** | Windows Server 2022/2025 | AD is Windows-only |
| **SQL Server (failover cluster)** | Windows Server Datacenter + SQL EE | Always On FCI, ReFS |
| **Exchange / SharePoint** | Windows Server 2022 | Windows-only |
| **Enterprise desktop management** | Windows 11 Enterprise + Intune/SCCM | GPO, AD, enterprise MDM |
| **.NET / ASP.NET apps** | Windows Server / Linux (.NET Core) | .NET 6+ runs on Linux |
| **HCI (Microsoft stack)** | Windows Server Datacenter + S2D + Hyper-V | Azure Stack HCI |
| **Virtualization (mixed workload)** | Windows Server Datacenter (Hyper-V) | Linux + Windows VMs under one hypervisor |
| **AI/GPU inference** | Linux (Ubuntu) + CUDA | NVIDIA optimal; WSL2 alternative |
| **Container orchestration (Windows nodes)** | Windows Server 2022/2025 + containerd | Windows Pods in AKS on-prem |
| **Tier 2 apps / web / API** | Ubuntu or RHEL (Linux) | Lower TCO, smaller footprint |

### Windows Server migration paths

| From | To | Recommended approach |
|------|-----|---------------------|
| Windows Server 2012 R2 (EOL 2023) | Windows Server 2022/2025 | In-place upgrade or fresh + migration |
| Windows Server 2016 (EOL 2027) | Windows Server 2022/2025 | In-place upgrade or fresh |
| Windows Server 2019 | Windows Server 2022/2025 | In-place upgrade (`Setup.exe /auto upgrade`) |
| Windows Server 2022 | Windows Server 2025 | In-place upgrade or fresh |
| Windows Server → Cloud | Azure VM / Azure Stack HCI | Azure Migrate, Storage Migration Service |
| Windows Server → Linux | Ubuntu / RHEL (re-platform) | Migrate app to .NET Core or alternative |

### Windows — API and operational limits

| Limit | Windows Server | Windows Desktop |
|-------|---------------|----------------|
| **Max RAM** | 24 TB (2025 Datacenter) | 2 TB (Pro/Enterprise), 128 GB (Home) |
| **Max CPU sockets** | 64 (Datacenter), 2 (Standard) | 2 |
| **Max CPU cores** | Unlimited | 128 (Pro), 64 (Home) |
| **Max file size (NTFS)** | 256 TB | 256 TB |
| **Max file size (ReFS)** | 18.4 EB (2025) | — |
| **Max volume size (NTFS)** | 256 TB | 256 TB |
| **Max volume size (ReFS)** | 1.2 YB (theoretical) | — |
| **Max dedup volume** | 64 TB (Data Deduplication) | — |
| **Max cluster nodes** | 64 (Failover Cluster) | — |
| **Max VM per host** | Unlimited (Datacenter) | — |
| **VM memory per VM** | 12 TB (2022+) | — |
| **VM vCPU per VM** | 240 (2022+) | — |
| **Concurrent RDP** | 2 (admin), 200+ (RDS CAL) | 1 (Home), more (RDP host) |
| **PowerShell Remoting** | Unlimited (WinRM) | Yes (WinRM) |

---

## Related

- [AI-INFRASTRUCTURE.en.md](AI-INFRASTRUCTURE.en.md) — OS for AI workloads, GPU drivers, kernel parameters
- [KUBERNETES.en.md](KUBERNETES.en.md) — container runtime, orchestration
- [HYPERVISORS.en.md](HYPERVISORS.en.md) — hypervisors, VM host OS
- [DATACENTERS.en.md](DATACENTERS.en.md) — DC layout, HW platforms

## Sources

Links, books, and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)

*Last revision: 2026-06-18*

333
OS.md
Normal file
@@ -0,0 +1,333 @@
# Operating Systems

> Overview of Linux distributions and Microsoft Windows for server, container, and AI/GPU workloads, including support lifecycle, EOL dates, and comparison.

---

## Distribution overview

| Distribution | Family | Package manager | Init | Security | Reference platform |
|-------------|--------|----------------|------|----------|-------------------|
| **Ubuntu LTS** | Debian | apt (deb) | systemd | AppArmor | NVIDIA DGX, widest AI/GPU support |
| **Debian** | Debian | apt (deb) | systemd | AppArmor | General-purpose server, stability |
| **RHEL** | Red Hat | dnf (rpm) | systemd | SELinux | Enterprise standard, SAP, Oracle DB |
| **Rocky Linux** | Red Hat | dnf (rpm) | systemd | SELinux | RHEL binary compatible (free) |
| **AlmaLinux** | Red Hat | dnf (rpm) | systemd | SELinux | RHEL binary compatible (free) |
| **SLES** | SUSE | zypper (rpm) | systemd | AppArmor | HPC, SAP, mainframe |
| **openSUSE Leap** | SUSE | zypper (rpm) | systemd | AppArmor | Desktop, development |
| **openSUSE Tumbleweed** | SUSE | zypper (rpm) | systemd | AppArmor | Rolling release, bleeding edge |
| **Fedora** | Red Hat | dnf (rpm) | systemd | SELinux | Desktop, technology preview |
| **Arch Linux** | Independent | pacman | systemd | — | Rolling, power users |
| **Alpine Linux** | Independent | apk | OpenRC | — | Container image, embedded |
| **Flatcar Container Linux** | Independent | — (image-based) | systemd | — | K8s worker node, minimal footprint |
| **Bottlerocket** | Independent | — (image-based) | systemd | — | AWS K8s, minimal footprint |

---

## Support lifecycle and EOL dates

> **Standard:** base support (bug fixes, security). **LTS/ELS:** extended support (security only).
> ESM = Ubuntu Extended Security Maintenance, EUS = RHEL Extended Update Support, LTSS = SUSE Long Term Service Pack Support.

### Ubuntu LTS

| Version | Release | Standard support | ESM / Ubuntu Pro | Note |
|---------|---------|-----------------|------------------|------|
| **20.04 LTS** (Focal) | 2020-04 | End 2025-04 | End 2030-04 | Last release with Python 2 |
| **22.04 LTS** (Jammy) | 2022-04 | End 2027-04 | End 2032-04 | NVIDIA DGX standard |
| **24.04 LTS** (Noble) | 2024-04 | End 2029-04 | End 2034-04 | Latest GPU/CUDA support |
| **26.04 LTS** (planned) | 2026-04 | End 2031-04 | End 2036-04 | — |

### RHEL

| Version | Release | Full support | Maintenance support | Extended life cycle |
|---------|---------|--------------|---------------------|---------------------|
| **7** | 2014-06 | Until 2019-08 | Until 2024-06 | Until 2028-06 (ELS) |
| **8** | 2019-05 | Until 2024-05 | Until 2029-05 | Until 2034-06 (ELS) |
| **9** | 2022-05 | Until 2027-05 | Until 2032-05 | Until 2037-06 (ELS) |
| **10** (planned) | 2025 | Until 2029 | Until 2034 | — |

### Rocky Linux / AlmaLinux

| Version | Release | Supported until | RHEL-compatible | Note |
|---------|---------|-----------------|-----------------|------|
| **8** | 2021-06 | 2029-05 | Yes (from RHEL 8.4) | Alma/Rocky |
| **9** | 2022-07 | 2032-05 | Yes (from RHEL 9.0) | Alma/Rocky |

### Debian

| Version | Release | Full support | LTS support | ELTS (paid) |
|---------|---------|--------------|-------------|-------------|
| **11** (Bullseye) | 2021-08 | 2024-08 | Until 2026-08 | Until 2028-08 |
| **12** (Bookworm) | 2023-06 | 2026-06 | Until 2028-06 | Until 2030-06 |
| **13** (Trixie) | 2025 (expected) | ~3 years after release | ~5 years after release | — |

### SLES

| Version | Release | General support | LTSS | Note |
|---------|---------|-----------------|------|------|
| **15 SP3** | 2021-06 | Until 2024-12 | Until 2027-12 | — |
| **15 SP4** | 2022-06 | Until 2025-12 | Until 2028-12 | — |
| **15 SP5** | 2023-06 | Until 2026-12 | Until 2029-12 | Current SP |
| **15 SP6** | 2024-10 | Until 2027-12 | Until 2030-12 | — |

### Fedora

| Version | Release | EOL | Note |
|---------|---------|-----|------|
| **38** | 2023-04 | 2024-05 | — |
| **39** | 2023-11 | 2024-12 | — |
| **40** | 2024-04 | 2025-05 | — |
| **41** | 2024-11 | 2025-12 | — |

Fedora ships a new release roughly every 6 months, and each release reaches EOL about 13 months after it ships. Fedora serves as the upstream for RHEL.

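The 13-month rule of thumb can be checked against the table with a little month arithmetic. This is a small Python sketch under that assumption only; `add_months` and `estimated_fedora_eol` are hypothetical helper names, and real EOL dates are set by the Fedora project, not by this formula.

```python
def add_months(year: int, month: int, months: int) -> tuple:
    """Add a number of months to a (year, month) pair."""
    index = year * 12 + (month - 1) + months  # months counted from year 0
    return index // 12, index % 12 + 1


def estimated_fedora_eol(release_year: int, release_month: int,
                         lifetime_months: int = 13) -> tuple:
    """Estimate the EOL month as release month + ~13 months."""
    return add_months(release_year, release_month, lifetime_months)
```

For Fedora 38 (released 2023-04) the estimate is `(2024, 5)`, which matches the EOL column above.
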
### Alpine Linux

| Version | Release | EOL |
|---------|---------|-----|
| **3.18** | 2023-05 | 2025-05 |
| **3.19** | 2023-12 | 2025-12 |
| **3.20** | 2024-05 | 2026-05 |
| **3.21** | 2024-12 | 2026-12 |

---

## Kernel versions per distribution

| Distribution | Kernel (default) | Kernel (HWE/enhanced) | Note |
|--------------|------------------|-----------------------|------|
| Ubuntu 22.04 LTS | 5.15 (GA) | 6.5+ (HWE) | HWE from 22.04.2 |
| Ubuntu 24.04 LTS | 6.8 | — | — |
| RHEL 8 | 4.18 | — | Backported features |
| RHEL 9 | 5.14 | — | Backported features |
| RHEL 10 | 6.11+ (expected) | — | — |
| Rocky/Alma 8 | 4.18 | — | Same as RHEL 8 |
| Rocky/Alma 9 | 5.14 | — | Same as RHEL 9 |
| Debian 11 | 5.10 | 6.1 (backports) | — |
| Debian 12 | 6.1 | — | — |
| SLES 15 SP5 | 5.14 | — | — |
| SLES 15 SP6 | 6.4 | — | — |
| Fedora 40 | 6.8+ | — | Rolling upstream |
| Alpine 3.20 | 6.6 | — | — |

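When a driver or feature needs a minimum kernel, the running version can be compared against the table. The sketch below is illustrative Python: `parse_kernel_release` and `meets_minimum` are hypothetical helpers, and the sample release strings are typical formats rather than values taken from this document.

```python
import re


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release: str, minimum: tuple) -> bool:
    """True if the kernel's (major, minor) is at least the required minimum."""
    return parse_kernel_release(release) >= minimum
```

On a live host the input would come from `platform.release()`. Because RHEL-family kernels carry backported features, a plain version comparison can understate what a 4.18 or 5.14 kernel actually supports.
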
---

## Comparison by use case

| Use case | Recommended distribution | Rationale |
|----------|--------------------------|-----------|
| **AI/GPU cluster (DGX)** | Ubuntu 22.04 LTS / DGX OS | NVIDIA standard, CUDA, MLNX_OFED |
| **Enterprise K8s (OpenShift)** | RHEL 9 / RHCOS | Red Hat support, GPU Operator |
| **Vanilla K8s (on-prem)** | Ubuntu 22.04 LTS + Flatcar (workers) | Community support, minimal worker image |
| **HPC cluster (Slurm)** | Rocky Linux 9 / Ubuntu 22.04 | EL ecosystem + Lustre, or Ubuntu |
| **Traditional enterprise DB (Oracle, SAP)** | RHEL 9 / SLES 15 | Vendor certification |
| **Container host** | Ubuntu 22.04 / Alpine | Broad image compatibility / minimal size |
| **Development / desktop** | Fedora / Ubuntu 24.04 / OpenSUSE Tumbleweed | Current packages, HW support |
| **Embedded / IoT** | Debian / Alpine / Yocto | Minimal footprint, stability |
| **Edge inference** | Ubuntu (ARM) / NVIDIA JetPack | Jetson, GPU support |
| **Mainframe (IBM z/Arch)** | SLES 15 / RHEL 9 | IBM certification |

---

## Package management comparison

| Feature | apt (Debian/Ubuntu) | dnf (RHEL/Rocky/Alma/Fedora) | zypper (SUSE) | pacman (Arch) | apk (Alpine) |
|---------|---------------------|------------------------------|---------------|---------------|--------------|
| **Package format** | .deb | .rpm | .rpm | .pkg.tar.zst | .apk |
| **Repo management** | /etc/apt/sources.list | /etc/yum.repos.d/ | /etc/zypp/repos.d/ | /etc/pacman.conf | /etc/apk/repositories |
| **Lock file** | — (apt-mark hold) | — (exclude) | — (lock) | — (IgnorePkg) | — |
| **Transactional update** | No | Yes (dnf history) | Yes (zypper history) | No | No |
| **Rollback** | No (manual) | Yes (dnf history rollback) | Yes (snapper + zypper) | No | No |
| **Delta updates** | No (debdelta is a separate tool) | Yes (deltarpm) | Yes (delta RPM) | No | No |
| **Version (as of 2025)** | apt 2.7+ | dnf 4.18+ | zypper 1.14+ | pacman 6.1+ | apk 2.14+ |

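Automation that spans several of these distributions usually has to pick the package manager at run time. One common approach, sketched here in Python, is to read the `ID` and `ID_LIKE` fields of `/etc/os-release`. The `PACKAGE_MANAGERS` mapping and both function names are hypothetical, and the mapping covers only the families from the table above.

```python
from typing import Dict, Optional

# Assumption: os-release ID values mapped to the managers in the table above.
PACKAGE_MANAGERS = {
    "debian": "apt", "ubuntu": "apt",
    "rhel": "dnf", "rocky": "dnf", "almalinux": "dnf", "fedora": "dnf",
    "sles": "zypper", "suse": "zypper",
    "opensuse-leap": "zypper", "opensuse-tumbleweed": "zypper",
    "arch": "pacman",
    "alpine": "apk",
}


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines from the contents of /etc/os-release."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def package_manager_for(fields: Dict[str, str]) -> Optional[str]:
    """Try ID first, then each ID_LIKE entry; None if nothing matches."""
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for candidate in candidates:
        if candidate in PACKAGE_MANAGERS:
            return PACKAGE_MANAGERS[candidate]
    return None
```

Falling back to `ID_LIKE` lets derivatives resolve to their parent family without listing every distribution by name.
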
---

## Security model comparison

| Feature | SELinux (RHEL derivatives) | AppArmor (Ubuntu/Debian/SUSE) |
|---------|----------------------------|-------------------------------|
| **Type** | Mandatory Access Control (MAC) | Mandatory Access Control (MAC) |
| **Labeling** | Context-based (user:role:type) | Path-based (profile per executable) |
| **Configuration** | Policy (modules, booleans) | Profiles (plain text, in /etc/apparmor.d/) |
| **Modes** | Enforcing / Permissive / Disabled | Enforce / Complain / Disabled |
| **Learning curve** | Steep (complex policies) | Gentle (simpler profiles) |
| **Default in** | RHEL, Rocky, Alma, Fedora | Ubuntu, Debian, SLES, OpenSUSE |
| **Use case** | Multi-tenant enterprise, regulated environments | General-purpose server, application containment |
| **Container integration** | SELinux labels per container | AppArmor profile per container |

Additional layers:
- **seccomp**: syscall filtering (default in containerd, Docker)
- **Capabilities**: Linux capabilities (drop everything except what is needed)
- **cgroups v2**: resource isolation (CPU, memory, IO, PID)
- **User namespaces**: rootless containers (Podman, rootless Docker)

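To see which MAC system a host actually runs, the kernel's list of active Linux Security Modules can be inspected. The Python sketch below only parses that comma-separated list; `active_mac_system` is a hypothetical name, and reading the list from `/sys/kernel/security/lsm` assumes securityfs is mounted on the host.

```python
from typing import Optional


def active_mac_system(lsm_list: str) -> Optional[str]:
    """Return 'selinux' or 'apparmor' if present in a comma-separated LSM list."""
    active = {name.strip() for name in lsm_list.split(",")}
    for candidate in ("selinux", "apparmor"):
        if candidate in active:
            return candidate
    return None
```

On a live system the input would typically be `Path("/sys/kernel/security/lsm").read_text()`. A result of `None` means neither module is loaded, which does not say anything about seccomp or capabilities.
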
---

## Recommended migration paths for EOL distributions

| From | To | Recommended procedure |
|------|-----|----------------------|
| Ubuntu 20.04 (EOL 2025) | Ubuntu 22.04 or 24.04 | `do-release-upgrade` or fresh install |
| RHEL 7 (EOL 2024) | RHEL 8 or 9 | `leapp` upgrade or fresh install |
| Rocky/Alma 8 | Rocky/Alma 9 | AlmaLinux ELevate (leapp-based) or fresh install; `dnf --releasever` upgrades across major versions are unsupported |
| Debian 11 (LTS EOL 2026) | Debian 12 | `apt full-upgrade` after updating sources.list |
| SLES 15 SP4 (EOL 2025) | SLES 15 SP6 | `zypper migration` |
| Fedora 40 (EOL 2025) | Fedora 42+ | `dnf system-upgrade` |

---

## Microsoft Windows

### Windows Server editions

| Edition | Price (approx.) | Core limits | VM rights | Use case |
|---------|-----------------|-------------|-----------|----------|
| **Datacenter** | ~$6,155 (2025) | Unlimited | Unlimited Windows VMs per host | Virtualization, SDDC, S2D, HCI |
| **Standard** | ~$1,069 (2025) | 2 CPUs, unlimited cores | 2 Windows VMs + Hyper-V host | General server, AD, file server |
| **Essentials** | ~$501 (2025) | 1 CPU, max 10 cores | — | Small businesses (up to 25 users) |
| **Azure Edition** | Pay-as-you-go | Per Azure VM | Per Azure | Azure-only, hotpatching |

Licensing: Windows Server Standard and Datacenter are licensed **per core** (minimum 16 cores per server and 8 cores per VM).

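The per-core minimums translate into simple arithmetic. The following Python sketch uses the two minimums stated above; the 2-core pack size and all function names are assumptions made for illustration, and actual licensing terms should be confirmed with Microsoft or a reseller.

```python
import math

MIN_CORES_PER_SERVER = 16  # from the licensing note above
MIN_CORES_PER_VM = 8       # from the licensing note above
PACK_SIZE = 2              # assumption: core licences are sold in 2-core packs


def licensed_cores(actual_cores: int, minimum: int) -> int:
    """Cores that must be licensed: the actual count, but never below the minimum."""
    if actual_cores < 1:
        raise ValueError("core count must be positive")
    return max(actual_cores, minimum)


def packs_needed(cores: int, pack_size: int = PACK_SIZE) -> int:
    """Round the licensed core count up to whole packs."""
    return math.ceil(cores / pack_size)


def host_licence_packs(physical_cores: int) -> int:
    """Packs for licensing a physical server by its cores."""
    return packs_needed(licensed_cores(physical_cores, MIN_CORES_PER_SERVER))


def vm_licence_packs(vcpus: int) -> int:
    """Packs for licensing a single VM by its virtual cores."""
    return packs_needed(licensed_cores(vcpus, MIN_CORES_PER_VM))
```

A 12-core server still needs 16 licensed cores (8 packs), while a 48-core server needs 24 packs; a 4-vCPU VM is rounded up to the 8-core minimum.
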
### Windows Server support lifecycle

> **Mainstream:** regular updates (bug fixes, security, features). **Extended:** security updates only (free of charge).
> **ESU:** Extended Security Updates (an additional paid tier, roughly $45–300/core/year).

| Version | Release | Mainstream support | Extended support | ESU | Note |
|---------|---------|--------------------|------------------|-----|------|
| **2012 R2** | 2013-11 | 2018-10 | 2023-10 | Until 2026-10 (year 3) | Paid ESU, final year |
| **2016** | 2016-10 | 2022-01 | 2027-01 | — | Last release with Nano Server as a host install option |
| **2019** | 2019-01 | 2024-01 | 2029-01 | — | Nano Server only as a container base image |
| **2022** | 2021-09 | 2026-10 | 2031-10 | — | Current, TPM 2.0, Credential Guard |
| **2025** | 2024-11 | 2029-10 | 2034-10 | — | Hotpatching, PowerShell 7, SMB over QUIC |

### Windows Server version vs edition grid

| Version / edition | Hyper-V | Storage Spaces Direct | Software-defined networking | Containers | GPU DDA / vGPU | WSL2 |
|-------------------|---------|-----------------------|-----------------------------|------------|----------------|------|
| 2016 Standard | Yes | No (Datacenter only) | No (Datacenter only) | Windows only | Yes | No |
| 2016 Datacenter | Yes | Yes | Yes | Windows | Yes | No |
| 2019 Standard | Yes | No | No | Windows | Yes | No |
| 2019 Datacenter | Yes | Yes | Yes | Windows | Yes | No |
| 2022 Standard | Yes | No | No | Windows + Linux | Yes | No |
| 2022 Datacenter | Yes | Yes | Yes | Windows + Linux (2022.2+) | Yes | No |
| 2025 Datacenter | Yes | Yes | Yes | Windows + Linux | Yes | Yes |

### Windows Desktop support lifecycle

> **E = Enterprise, Pro = Professional, Home = Consumer**
> LTSC = Long Term Servicing Channel (stable, no feature updates)

| Version | Release | EOL (Home/Pro) | EOL (Enterprise) | LTSC EOL | Note |
|---------|---------|----------------|------------------|----------|------|
| **10 21H2** | 2021-11 | — | 2024-06 | — | — |
| **10 22H2** | 2022-10 | 2025-10 | 2025-10 | — | Last Windows 10 release |
| **10 LTSC 2021** | 2021-11 | — | — | 2032-01 | IoT Enterprise LTSC |
| **11 22H2** | 2022-09 | 2024-10 | 2025-10 | — | — |
| **11 23H2** | 2023-10 | 2025-11 | 2026-11 | — | — |
| **11 24H2** | 2024-10 | 2026-10 | 2027-10 | — | First with Recall, Copilot+ |
| **11 LTSC 2024** | 2024-10 | — | — | 2029-10 | Enterprise LTSC |

Windows 10 support **ended on 2025-10-14**; 22H2 was the final Windows 10 release.

### Windows vs Linux comparison

| Feature | Windows Server | RHEL / Ubuntu |
|---------|----------------|---------------|
| **License (server)** | $500–6,000 (per core) + CAL | $0–800 (per-node subscription) |
| **License (desktop)** | $100–200 (OEM/retail) | Free |
| **Support cost** | Included in the license (SA/ESU) | $200–1,300/node/year (RHEL) |
| **Package management** | MSI, AppX, winget, NuGet | APT, DNF, Zypper |
| **Package count** | ~10,000 (Chocolatey) | ~60,000+ (Ubuntu repo) |
| **Desktop GUI** | Windows Shell (mandatory) | Optional (GNOME, KDE, XFCE…) |
| **Server GUI** | Optional Desktop Experience (Server Core is the default install) | CLI-only (standard) |
| **Kernel** | NT hybrid kernel (kernel-mode Win32) | Monolithic Linux kernel |
| **Device support** | OEM driver model (WHQL) | Open source + vendor drivers |
| **Container types** | Windows + Linux (WSL2) | Linux (Docker, Podman, containerd) |
| **Container registry** | Docker Hub, ACR, Nexus | Docker Hub, Quay, GHCR, Nexus… |
| **Container image size** | ~4–8 GB (Windows Server Core) | ~100 MB – 1 GB (Alpine/Ubuntu) |
| **GPU passthrough** | DDA (Discrete Device Assignment) | GPU Direct, VFIO, SR-IOV |
| **AI/ML support** | WSL2 (CUDA), Azure ML | Native CUDA, ROCm, oneAPI |
| **CUDA support** | Yes (via WSL2 or Docker) | Native (nvidia-container-toolkit) |
| **Orchestration** | AD / GPO / SCCM / WAC | Ansible, Puppet, Salt, Foreman |
| **RBAC/AAA** | Active Directory (+ Kerberos) | LDAP, FreeIPA, SSSD, AD |
| **Remote management** | RDP, WinRM, PowerShell Remoting | SSH, Cockpit, Webmin |
| **Filesystem** | NTFS, ReFS, CSVFS | ext4, XFS, Btrfs, ZFS |
| **Max file system size** | 256 TB (NTFS), 1.2 YB (ReFS) | 1 EB (XFS), 16 EB (ZFS) |
| **Hypervisor** | Hyper-V (Type 1) | KVM (kernel-integrated, usually classed as Type 1), Xen |
| **Dynamic memory** | Hyper-V Dynamic Memory | KSM, virtio-balloon (KVM) |
| **Live migration** | Hyper-V Live Migration | KVM Live Migration, vMotion |

### Windows-specific features

| Feature | Description | Replaceable on Linux? |
|---------|-------------|-----------------------|
| **Active Directory** | Identity, auth, GPO, DNS, DHCP | FreeIPA, Samba AD DC, 389-ds, SSSD |
| **Group Policy** | Central configuration of desktops and servers | Ansible (agentless), Puppet, Salt (agent-based) |
| **Hyper-V + S2D** | Hyper-converged storage and virtualization (HCI) | Proxmox + Ceph / oVirt + Gluster |
| **Failover Clustering** | Cluster-aware applications (SQL, File Server) | Pacemaker + Corosync + DRBD |
| **IIS** | Web server, ASP.NET host | Nginx, Apache (without ASP.NET, or fronting a .NET host) |
| **PowerShell** | Scripting, Desired State Configuration | Bash, Python, Ansible |
| **Windows Admin Center** | GUI management | Cockpit, Webmin |
| **BitLocker** | Full disk encryption | LUKS + cryptsetup |
| **Windows Defender** | Antivirus + EDR | ClamAV, Wazuh, osquery |
| **SQL Server** | Relational DB | PostgreSQL, MySQL, MariaDB |

### Recommended OS by use case (including Windows)

| Use case | OS | Rationale |
|----------|-----|-----------|
| **Active Directory / GPO / hybrid ID** | Windows Server 2022/2025 | Native AD runs only on Windows |
| **SQL Server (failover cluster)** | Windows Server Datacenter + SQL EE | Always On FCI, ReFS |
| **Exchange / SharePoint** | Windows Server 2022 | Windows only |
| **Enterprise desktop management** | Windows 11 Enterprise + Intune/SCCM | GPO, AD, enterprise MDM |
| **.NET / ASP.NET applications** | Windows Server / Linux (.NET Core) | .NET 6+ runs on Linux |
| **HCI (Microsoft stack)** | Windows Server Datacenter + S2D + Hyper-V | Azure Stack HCI |
| **Virtualization (mixed workload)** | Windows Server Datacenter (Hyper-V) | Linux and Windows VMs under one hypervisor |
| **AI/GPU inference** | Linux (Ubuntu) + CUDA | Best NVIDIA support; WSL2 as an alternative |
| **Container orchestration (Windows nodes)** | Windows Server 2022/2025 + containerd | Windows Pods in AKS on-prem |
| **Tier 2 applications / web / API** | Ubuntu or RHEL (Linux) | Lower TCO, smaller footprint |

### Windows Server migration paths

| From | To | Recommended procedure |
|------|-----|----------------------|
| Windows Server 2012 R2 (EOL 2023) | Windows Server 2022/2025 | In-place upgrade or fresh install + migration |
| Windows Server 2016 (EOL 2027) | Windows Server 2022/2025 | In-place upgrade or fresh install |
| Windows Server 2019 | Windows Server 2022/2025 | In-place upgrade (`Setup.exe /auto upgrade`) |
| Windows Server 2022 | Windows Server 2025 | In-place upgrade or fresh install |
| Windows Server → Cloud | Azure VM / Azure Stack HCI | Azure Migrate, Storage Migration Service |
| Windows Server → Linux | Ubuntu / RHEL (re-platform) | Migrate the application to .NET Core or an alternative |

### Windows API and operational limits

| Limit | Windows Server | Windows Desktop |
|-------|----------------|-----------------|
| **Max RAM** | 24 TB (2025 Datacenter) | 2 TB (Pro/Enterprise), 128 GB (Home) |
| **Max CPU sockets** | 64 (Datacenter), 2 (Standard) | 2 |
| **Max CPU cores** | Unlimited | 128 (Pro), 64 (Home) |
| **Max file size (NTFS)** | 256 TB | 256 TB |
| **Max file size (ReFS)** | 18.4 EB (2025) | — |
| **Max volume size (NTFS)** | 256 TB | 256 TB |
| **Max volume size (ReFS)** | 1.2 YB (theoretical) | — |
| **Max dedup volume** | 64 TB (Data Deduplication) | — |
| **Max cluster nodes** | 64 (Failover Cluster) | — |
| **Max VMs per host** | Unlimited (Datacenter) | — |
| **Memory per VM** | 12 TB (2022+) | — |
| **vCPUs per VM** | 240 (2022+) | — |
| **Concurrent RDP** | 2 (admin), 200+ (RDS CAL) | 1 (Home), more (RDP host) |
| **PowerShell Remoting** | Unlimited (WinRM) | Yes (WinRM) |

- [AI-INFRASTRUCTURE.md](AI-INFRASTRUCTURE.md): OS for AI workloads, GPU drivers, kernel parameters
- [KUBERNETES.md](KUBERNETES.md): container runtime, orchestration
- [HYPERVISORS.md](HYPERVISORS.md): hypervisors, VM host OS
- [DATACENTERS.md](DATACENTERS.md): DC layout, HW platforms

## Sources

Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)

*Last revision: 2026-06-18*

@@ -166,7 +166,7 @@ LIMIT 10;

## Sources

References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)

### Recommended reading

@@ -167,7 +167,7 @@ resource "vsphere_virtual_machine" "web" {
}
```

More in [CICD.md](CICD.md#infrastructure-as-code-iac).
More in [CICD.en.md](CICD.en.md#infrastructure-as-code-iac).

## Firmware management

@@ -188,7 +188,7 @@ More in [CICD.md](CICD.md#infrastructure-as-code-iac).
| **Chef** | Ruby DSL | Pull (agent) | Compliance, infrastructure automation |
| **SaltStack** | YAML/Python | Both (salt-minion) | High-speed config, event-driven |

More in [CICD.md](CICD.md).
More in [CICD.en.md](CICD.en.md).

## OpenStack Provisioning

@@ -223,6 +223,6 @@ OpenStack offers several methods for provisioning infrastructure:

## Sources

Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)

*Last revision: 2026-06-03*

80
README.en.md
@@ -35,11 +35,11 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
│ PCIe,BM) │ │(BIOS, │ │ AMD) │ │ Terraform) │
|
||||
└──────────┘ │ NUMA) │ └────────┘ └──────────────┘
|
||||
└──────────┘
|
||||
┌──────────┐ ┌──────────┐ ┌────────┐
|
||||
│HYPERVISOR│ │ MONITOR │ │ CICD │
|
||||
│(VMware, │ │(Prom, │ │(GitOps, │
|
||||
│ KVM, ...)│ │ Grafana) │ │ IaC) │
|
||||
└──────────┘ └──────────┘ └────────┘
|
||||
┌──────────┐ ┌──────────┐ ┌────────┐ ┌────────────┐
|
||||
│HYPERVISOR│ │ MONITOR │ │ CICD │ │ ☸ K8s │
|
||||
│(VMware, │ │(Prom, │ │(GitOps, │ │(CAPI, K3s, │
|
||||
│ KVM, ...)│ │ Grafana) │ │ IaC) │ │ RKE2...) │
|
||||
└──────────┘ └──────────┘ └────────┘ └────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
@@ -52,16 +52,22 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
| 🌐 Network architecture | [NETWORKING.md](NETWORKING.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
|
||||
| 📊 Monitoring & observability | [MONITORING.md](MONITORING.md) | Prometheus, Grafana, OTel, logging, alerting | — |
|
||||
| 🔄 CI/CD & DevOps | [CICD.md](CICD.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
|
||||
| 💻 Operační systémy | [OS.md](OS.md) | Linux distribuce, Windows Server, lifecycle, EOL, kernel | KUBERNETES, HYPERVISORS, AI-INFRASTRUCTURE |
|
||||
| 🔄 Disaster Recovery | [DR.md](DR.md) | RTO, RPO, scenarios, prevention, uptime calculation | CLOUD, DATACENTERS, MONITORING |
|
||||
| 🗄️ Database architecture | [DATABASES.md](DATABASES.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VEKTOROVE-DB, DATABAZOVE-ENGINY |
|
||||
| 🗄️ Big Data | [BIG-DATA.md](BIG-DATA.md) | HDFS, Spark, Flink, Trino, Iceberg, Delta Lake, Lakehouse | DATABASES, CLOUD, MESSAGING, KUBERNETES |
|
||||
| 🖥️ Hypervisors | [HYPERVISORS.md](HYPERVISORS.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
|
||||
| 🏭 Data centers | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC services, secondary DC topologies | MONITORING |
|
||||
| 🏭 Data centers | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC services, secondary DC topologies | MONITORING, MESSAGING |
|
||||
| 💾 Storage | [STORAGE.md](STORAGE.md) | SAN/NAS/object, RAID, SDS, Ceph, OpenStack Cinder/Swift/Manila | — |
|
||||
| 🔌 Server connectivity | [CONNECTIVITY.md](CONNECTIVITY.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
|
||||
| 🔧 Server hardware | [SERVER-HW.md](SERVER-HW.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
|
||||
| 🎮 GPU | [GPU.md](GPU.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
|
||||
| ⚙️ Server config | [SERVER-CONFIG.md](SERVER-CONFIG.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
|
||||
| 📦 Provisioning | [PROVISIONING.md](PROVISIONING.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
|
||||
| ☸ Kubernetes | [KUBERNETES.md](KUBERNETES.md) | K8s architektura, deployment, Cluster API (CAPI) | CICD, CLOUD, NETWORKING |
|
||||
| 📨 Messaging & streaming | [MESSAGING.md](MESSAGING.md) | Kafka, RabbitMQ, Pulsar, NATS, managed queue/pubsub | DATACENTERS, CLOUD |
|
||||
| 🏗️ Migrace DC | [DC-MIGRATION.md](DC-MIGRATION.md) | Strategie, fáze, network, DB, rollback | DATACENTERS, CLOUD, DR, NETWORKING, STORAGE |
|
||||
| 🧠 AI infrastruktura | [AI-INFRASTRUCTURE.md](AI-INFRASTRUCTURE.md) | GPU, AI networking, storage, cluster, cooling, training/inference | GPU, NETWORKING, STORAGE, DATACENTERS, CLOUD |
|
||||
| 📋 Legacy index | [HARDWARE.md](HARDWARE.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
|
||||
| 📋 Legacy infra | [INFRASTRUCTURE.md](INFRASTRUCTURE.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
|
||||
| 📋 Review workflow | [REVIEW.md](REVIEW.md) | Review and content control process | — |
|
||||
@@ -71,12 +77,12 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
|
||||
| File | Description |
|
||||
|------|-------------|
|
||||
| [POSTGRESQL.md](POSTGRESQL.md) | PostgreSQL — architecture, replication, tuning |
|
||||
| [MYSQL.md](MYSQL.md) | MySQL & MariaDB |
|
||||
| [ORACLE.md](ORACLE.md) | Oracle Database — RAC, Data Guard, tuning |
|
||||
| [MONGODB.md](MONGODB.md) | MongoDB — document DB, sharding, replica sets |
|
||||
| [REDIS.md](REDIS.md) | Redis — cache, session store, streams |
|
||||
| [CASSANDRA.md](CASSANDRA.md) | Cassandra & ScyllaDB — wide-column, nosql |
|
||||
| [POSTGRESQL.en.md](POSTGRESQL.en.md) | PostgreSQL — architecture, replication, tuning |
|
||||
| [MYSQL.en.md](MYSQL.en.md) | MySQL & MariaDB |
|
||||
| [ORACLE.en.md](ORACLE.en.md) | Oracle Database — RAC, Data Guard, tuning |
|
||||
| [MONGODB.en.md](MONGODB.en.md) | MongoDB — document DB, sharding, replica sets |
|
||||
| [REDIS.en.md](REDIS.en.md) | Redis — cache, session store, streams |
|
||||
| [CASSANDRA.en.md](CASSANDRA.en.md) | Cassandra & ScyllaDB — wide-column, nosql |
|
||||
| [VEKTOROVE-DB.md](VEKTOROVE-DB.md) | Vector databases — Pinecone, Qdrant, Milvus, pgvector |
|
||||
| [DATABAZOVE-ENGINY.md](DATABAZOVE-ENGINY.md) | Common DB concepts — transactions, indexes, locking |
|
||||
|
||||
@@ -90,16 +96,22 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
| 🌐 Network architecture | [NETWORKING.en.md](NETWORKING.en.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
|
||||
| 📊 Monitoring & observability | [MONITORING.en.md](MONITORING.en.md) | Prometheus, Grafana, OTel, logging, alerting | — |
|
||||
| 🔄 CI/CD & DevOps | [CICD.en.md](CICD.en.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
|
||||
| 💻 Operating systems | [OS.en.md](OS.en.md) | Linux distributions, Windows Server, lifecycle, EOL, kernel | KUBERNETES, HYPERVISORS, AI-INFRASTRUCTURE |
|
||||
| 🔄 Disaster Recovery | [DR.en.md](DR.en.md) | RTO, RPO, scenarios, prevention, uptime calculation | CLOUD, DATACENTERS, MONITORING |
|
||||
| 🗄️ Database architecture | [DATABASES.en.md](DATABASES.en.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VECTOR-DBS, DATABASE-ENGINES |
|
||||
| 🗄️ Big Data | [BIG-DATA.en.md](BIG-DATA.en.md) | HDFS, Spark, Flink, Trino, Iceberg, Delta Lake, Lakehouse | DATABASES, CLOUD, MESSAGING, KUBERNETES |
|
||||
| 🖥️ Hypervisors | [HYPERVISORS.en.md](HYPERVISORS.en.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
|
||||
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services, secondary DC topologies | MONITORING |
|
||||
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services, secondary DC topologies | MONITORING, MESSAGING |
|
||||
| 💾 Storage | [STORAGE.en.md](STORAGE.en.md) | SAN/NAS/object, RAID, SDS, Ceph | — |
|
||||
| 🔌 Server connectivity | [CONNECTIVITY.en.md](CONNECTIVITY.en.md) | Ethernet, FC SAN, iSCSI, NVMe-oF, SAS | — |
|
||||
| 🔧 Server hardware | [SERVER-HW.en.md](SERVER-HW.en.md) | CPU, RAM, PCIe, NUMA, BMC | CONNECTIVITY |
|
||||
| 🎮 GPU | [GPU.en.md](GPU.en.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
|
||||
| ⚙️ Server config | [SERVER-CONFIG.en.md](SERVER-CONFIG.en.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
|
||||
| 📦 Provisioning | [PROVISIONING.en.md](PROVISIONING.en.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
|
||||
| ☸ Kubernetes | [KUBERNETES.en.md](KUBERNETES.en.md) | K8s architecture, deployment, Cluster API (CAPI) | CICD, CLOUD, NETWORKING |
|
||||
| 📨 Messaging & streaming | [MESSAGING.en.md](MESSAGING.en.md) | Kafka, RabbitMQ, Pulsar, NATS, managed queue/pubsub | DATACENTERS, CLOUD |
|
||||
| 🏗️ DC Migration | [DC-MIGRATION.en.md](DC-MIGRATION.en.md) | Strategies, phases, network, DB, rollback | DATACENTERS, CLOUD, DR, NETWORKING, STORAGE |
|
||||
| 🧠 AI Infrastructure | [AI-INFRASTRUCTURE.en.md](AI-INFRASTRUCTURE.en.md) | GPU, AI networking, storage, cluster, cooling, training/inference | GPU, NETWORKING, STORAGE, DATACENTERS, CLOUD |
|
||||
| 📋 Legacy index | [HARDWARE.en.md](HARDWARE.en.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
|
||||
| 📋 Legacy infra | [INFRASTRUCTURE.en.md](INFRASTRUCTURE.en.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
|
||||
| 📋 Review workflow | [REVIEW.en.md](REVIEW.en.md) | Review and content control process | — |
|
||||
@@ -124,7 +136,7 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
|
||||
| File | Description |
|
||||
|------|-------------|
|
||||
| [case-studies/proxmox-demo/README.md](case-studies/proxmox-demo/README.md) | Proxmox VE demo cluster — design (CZ) |
|
||||
| [case-studies/proxmox-demo/README.md](case-studies/proxmox-demo/README.md) | Proxmox VE demo cluster — návrh (CZ) |
|
||||
| [case-studies/proxmox-demo/README.en.md](case-studies/proxmox-demo/README.en.md) | Proxmox VE demo cluster — design (EN) |
|
||||
|
||||
---
|
||||
@@ -133,22 +145,28 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
|
||||
| File | References |
|
||||
|------|------------|
|
||||
| `CLOUD.md` / `CLOUD.en.md` | [`GPU.md`](GPU.md), [`NETWORKING.md`](NETWORKING.md), [`sources/cloud/sources.md`](sources/cloud/sources.md) |
|
||||
| `NETWORKING.md` / `NETWORKING.en.md` | [`CLOUD.md`](CLOUD.md), [`sources/networking/sources.md`](sources/networking/sources.md) |
|
||||
| `DATACENTERS.md` / `DATACENTERS.en.md` | [`MONITORING.md`](MONITORING.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `MONITORING.md` / `MONITORING.en.md` | [`sources/monitoring/sources.md`](sources/monitoring/sources.md) |
|
||||
| `CICD.md` / `CICD.en.md` | [`sources/cicd/sources.md`](sources/cicd/sources.md) |
|
||||
| `DR.md` / `DR.en.md` | [`CLOUD.md`](CLOUD.md), [`DATACENTERS.md`](DATACENTERS.md), [`MONITORING.md`](MONITORING.md), [`CICD.md`](CICD.md), [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `PROVISIONING.md` / `PROVISIONING.en.md` | [`CICD.md`](CICD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `STORAGE.md` / `STORAGE.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `GPU.md` / `GPU.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `SERVER-HW.md` / `SERVER-HW.en.md` | [`CONNECTIVITY.md`](CONNECTIVITY.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `SERVER-CONFIG.md` / `SERVER-CONFIG.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `CONNECTIVITY.md` / `CONNECTIVITY.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `HYPERVISORS.md` / `HYPERVISORS.en.md` | [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `DATABASES.md` / `DATABASES.en.md` | [`POSTGRESQL.md`](POSTGRESQL.md), [`MYSQL.md`](MYSQL.md), [`ORACLE.md`](ORACLE.md), [`MONGODB.md`](MONGODB.md), [`REDIS.md`](REDIS.md), [`CASSANDRA.md`](CASSANDRA.md), [`VEKTOROVE-DB.md`](VEKTOROVE-DB.md), [`DATABAZOVE-ENGINY.md`](DATABAZOVE-ENGINY.md), [`sources/databases/sources.md`](sources/databases/sources.md) |
|
||||
| `HARDWARE.md` / `HARDWARE.en.md` | [`SERVER-HW.md`](SERVER-HW.md), [`GPU.md`](GPU.md), [`SERVER-CONFIG.md`](SERVER-CONFIG.md), [`PROVISIONING.md`](PROVISIONING.md) |
|
||||
| `INFRASTRUCTURE.md` / `INFRASTRUCTURE.en.md` | [`HYPERVISORS.md`](HYPERVISORS.md), [`DATACENTERS.md`](DATACENTERS.md), [`STORAGE.md`](STORAGE.md), [`HARDWARE.md`](HARDWARE.md) |
|
||||
| `CLOUD.md` / `CLOUD.en.md` | [`GPU.en.md`](GPU.en.md), [`NETWORKING.en.md`](NETWORKING.en.md), [`sources/cloud/sources.en.md`](sources/cloud/sources.en.md) |
|
||||
| `NETWORKING.md` / `NETWORKING.en.md` | [`CLOUD.en.md`](CLOUD.en.md), [`sources/networking/sources.en.md`](sources/networking/sources.en.md) |
|
||||
| `DATACENTERS.md` / `DATACENTERS.en.md` | [`MONITORING.en.md`](MONITORING.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `MONITORING.md` / `MONITORING.en.md` | [`sources/monitoring/sources.en.md`](sources/monitoring/sources.en.md) |
|
||||
| `CICD.md` / `CICD.en.md` | [`sources/cicd/sources.en.md`](sources/cicd/sources.en.md) |
|
||||
| `DR.md` / `DR.en.md` | [`CLOUD.en.md`](CLOUD.en.md), [`DATACENTERS.en.md`](DATACENTERS.en.md), [`MONITORING.en.md`](MONITORING.en.md), [`CICD.en.md`](CICD.en.md), [`STORAGE.en.md`](STORAGE.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `MESSAGING.md` / `MESSAGING.en.md` | [`DATACENTERS.en.md`](DATACENTERS.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `DC-MIGRATION.md` / `DC-MIGRATION.en.md` | [`DATACENTERS.en.md`](DATACENTERS.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`DR.en.md`](DR.en.md), [`NETWORKING.en.md`](NETWORKING.en.md), [`STORAGE.en.md`](STORAGE.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `AI-INFRASTRUCTURE.md` / `AI-INFRASTRUCTURE.en.md` | [`GPU.en.md`](GPU.en.md), [`NETWORKING.en.md`](NETWORKING.en.md), [`STORAGE.en.md`](STORAGE.en.md), [`DATACENTERS.en.md`](DATACENTERS.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `PROVISIONING.md` / `PROVISIONING.en.md` | [`CICD.en.md`](CICD.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `STORAGE.md` / `STORAGE.en.md` | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `GPU.md` / `GPU.en.md` | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `SERVER-HW.md` / `SERVER-HW.en.md` | [`CONNECTIVITY.en.md`](CONNECTIVITY.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `SERVER-CONFIG.md` / `SERVER-CONFIG.en.md` | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `CONNECTIVITY.md` / `CONNECTIVITY.en.md` | [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `HYPERVISORS.md` / `HYPERVISORS.en.md` | [`STORAGE.en.md`](STORAGE.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `DATABASES.md` / `DATABASES.en.md` | [`POSTGRESQL.en.md`](POSTGRESQL.en.md), [`MYSQL.en.md`](MYSQL.en.md), [`ORACLE.en.md`](ORACLE.en.md), [`MONGODB.en.md`](MONGODB.en.md), [`REDIS.en.md`](REDIS.en.md), [`CASSANDRA.en.md`](CASSANDRA.en.md), [`VEKTOROVE-DB.md`](VEKTOROVE-DB.md), [`DATABAZOVE-ENGINY.md`](DATABAZOVE-ENGINY.md), [`sources/databases/sources.en.md`](sources/databases/sources.en.md) |
|
||||
| `HARDWARE.md` / `HARDWARE.en.md` | [`SERVER-HW.en.md`](SERVER-HW.en.md), [`GPU.en.md`](GPU.en.md), [`SERVER-CONFIG.en.md`](SERVER-CONFIG.en.md), [`PROVISIONING.en.md`](PROVISIONING.en.md) |
|
||||
| `OS.md` / `OS.en.md` | [`AI-INFRASTRUCTURE.en.md`](AI-INFRASTRUCTURE.en.md), [`KUBERNETES.en.md`](KUBERNETES.en.md), [`HYPERVISORS.en.md`](HYPERVISORS.en.md), [`DATACENTERS.en.md`](DATACENTERS.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `KUBERNETES.md` / `KUBERNETES.en.md` | [`CICD.en.md`](CICD.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`NETWORKING.en.md`](NETWORKING.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `BIG-DATA.md` / `BIG-DATA.en.md` | [`DATABASES.en.md`](DATABASES.en.md), [`CLOUD.en.md`](CLOUD.en.md), [`MESSAGING.en.md`](MESSAGING.en.md), [`KUBERNETES.en.md`](KUBERNETES.en.md), [`sources/infrastructure/sources.en.md`](sources/infrastructure/sources.en.md) |
|
||||
| `INFRASTRUCTURE.md` / `INFRASTRUCTURE.en.md` | [`HYPERVISORS.en.md`](HYPERVISORS.en.md), [`DATACENTERS.en.md`](DATACENTERS.en.md), [`STORAGE.en.md`](STORAGE.en.md), [`HARDWARE.en.md`](HARDWARE.en.md) |
|
||||
|
||||
---
|
||||
|
||||
@@ -190,4 +208,4 @@ Raw reference data (documentation, books, standards) by area:
|
||||
|
||||
---
|
||||
|
||||
*This index is automatically maintained by the `kb-index` agent. Last updated: 2026-06-11.*
|
||||
*This index is automatically maintained by the `kb-index` agent. Last updated: 2026-06-18.*
|
||||
24
README.md
@@ -35,11 +35,11 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
│ PCIe,BM) │ │(BIOS, │ │ AMD) │ │ Terraform) │
|
||||
└──────────┘ │ NUMA) │ └────────┘ └──────────────┘
|
||||
└──────────┘
|
||||
┌──────────┐ ┌──────────┐ ┌────────┐
|
||||
│HYPERVISOR│ │ MONITOR │ │ CICD │
|
||||
│(VMware, │ │(Prom, │ │(GitOps, │
|
||||
│ KVM, ...)│ │ Grafana) │ │ IaC) │
|
||||
└──────────┘ └──────────┘ └────────┘
|
||||
┌──────────┐ ┌──────────┐ ┌────────┐ ┌────────────┐
|
||||
│HYPERVISOR│ │ MONITOR │ │ CICD │ │ ☸ K8s │
|
||||
│(VMware, │ │(Prom, │ │(GitOps, │ │(CAPI, K3s, │
|
||||
│ KVM, ...)│ │ Grafana) │ │ IaC) │ │ RKE2...) │
|
||||
└──────────┘ └──────────┘ └────────┘ └────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
@@ -52,8 +52,10 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
| 🌐 Síťová architektura | [NETWORKING.md](NETWORKING.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
|
||||
| 📊 Monitoring a observabilita | [MONITORING.md](MONITORING.md) | Prometheus, Grafana, OTel, logging, alerting, SLO | — |
|
||||
| 🔄 CI/CD a DevOps | [CICD.md](CICD.md) | Pipelines, GitOps, IaC (Terraform), deployment strategie | — |
|
||||
| 💻 Operační systémy | [OS.md](OS.md) | Linux distribuce, Windows Server, lifecycle, EOL, kernel | KUBERNETES, HYPERVISORS, AI-INFRASTRUCTURE |
|
||||
| 🔄 Disaster Recovery | [DR.md](DR.md) | RTO, RPO, scénáře, prevence, výpočet uptimu | CLOUD, DATACENTERS, MONITORING |
|
||||
| 🗄️ Databázová architektura | [DATABASES.md](DATABASES.md) | Klasifikace, sharding, replikace, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VEKTOROVE-DB, DATABAZOVE-ENGINY |
|
||||
| 🗄️ Big Data | [BIG-DATA.md](BIG-DATA.md) | HDFS, Spark, Flink, Trino, Iceberg, Delta Lake, Lakehouse | DATABASES, CLOUD, MESSAGING, KUBERNETES |
|
||||
| 🖥️ Hypervisory | [HYPERVISORS.md](HYPERVISORS.md) | VMware, Hyper-V, KVM, Proxmox, migrace | STORAGE, SERVER-HW |
|
||||
| 🏭 Datová centra | [DATACENTERS.md](DATACENTERS.md) | Tier, power, cooling, layout, DC služby, sekundární DC topologie | MONITORING, MESSAGING |
|
||||
| 💾 Storage | [STORAGE.md](STORAGE.md) | SAN/NAS/object, RAID, SDS, Ceph, OpenStack Cinder/Swift/Manila | — |
|
||||
@@ -62,8 +64,10 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
| 🎮 GPU | [GPU.md](GPU.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
|
||||
| ⚙️ Server config | [SERVER-CONFIG.md](SERVER-CONFIG.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
|
||||
| 📦 Provisioning | [PROVISIONING.md](PROVISIONING.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
|
||||
| ☸ Kubernetes | [KUBERNETES.md](KUBERNETES.md) | K8s architektura, deployment, Cluster API (CAPI) | CICD, CLOUD, NETWORKING |
|
||||
| 📨 Messaging & streaming | [MESSAGING.md](MESSAGING.md) | Kafka, RabbitMQ, Pulsar, NATS, managed queue/pubsub | DATACENTERS, CLOUD |
|
||||
| 🏗️ Migrace DC | [DC-MIGRATION.md](DC-MIGRATION.md) | Strategie, fáze, network, DB, rollback | DATACENTERS, CLOUD, DR, NETWORKING, STORAGE |
|
||||
| 🧠 AI infrastruktura | [AI-INFRASTRUCTURE.md](AI-INFRASTRUCTURE.md) | GPU, AI networking, storage, cluster, cooling, training/inference | GPU, NETWORKING, STORAGE, DATACENTERS, CLOUD |
|
||||
| 📋 Původní rozcestník | [HARDWARE.md](HARDWARE.md) | Legacy index → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
|
||||
| 📋 Původní infrastruktura | [INFRASTRUCTURE.md](INFRASTRUCTURE.md) | Legacy index → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
|
||||
| 📋 Review workflow | [REVIEW.md](REVIEW.md) | Proces oponentury a kontroly obsahu | — |
|
||||
@@ -92,8 +96,10 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
| 🌐 Network architecture | [NETWORKING.en.md](NETWORKING.en.md) | DNS, BGP, VPC, Zero Trust, EVPN VXLAN, TLS | CLOUD |
|
||||
| 📊 Monitoring & observability | [MONITORING.en.md](MONITORING.en.md) | Prometheus, Grafana, OTel, logging, alerting | — |
|
||||
| 🔄 CI/CD & DevOps | [CICD.en.md](CICD.en.md) | Pipelines, GitOps, IaC (Terraform), deployment | — |
|
||||
| 💻 Operating systems | [OS.en.md](OS.en.md) | Linux distributions, Windows Server, lifecycle, EOL, kernel | KUBERNETES, HYPERVISORS, AI-INFRASTRUCTURE |
|
||||
| 🔄 Disaster Recovery | [DR.en.md](DR.en.md) | RTO, RPO, scenarios, prevention, uptime calculation | CLOUD, DATACENTERS, MONITORING |
|
||||
| 🗄️ Database architecture | [DATABASES.en.md](DATABASES.en.md) | Classification, sharding, replication, caching | POSTGRESQL, MYSQL, ORACLE, MONGODB, REDIS, CASSANDRA, VECTOR-DBS, DATABASE-ENGINES |
|
||||
| 🗄️ Big Data | [BIG-DATA.en.md](BIG-DATA.en.md) | HDFS, Spark, Flink, Trino, Iceberg, Delta Lake, Lakehouse | DATABASES, CLOUD, MESSAGING, KUBERNETES |
|
||||
| 🖥️ Hypervisors | [HYPERVISORS.en.md](HYPERVISORS.en.md) | VMware, Hyper-V, KVM, Proxmox, migration | STORAGE, SERVER-HW |
|
||||
| 🏭 Data centers | [DATACENTERS.en.md](DATACENTERS.en.md) | Tier, power, cooling, layout, DC services, secondary DC topologies | MONITORING, MESSAGING |
|
||||
| 💾 Storage | [STORAGE.en.md](STORAGE.en.md) | SAN/NAS/object, RAID, SDS, Ceph | — |
|
||||
@@ -102,8 +108,10 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
| 🎮 GPU | [GPU.en.md](GPU.en.md) | NVIDIA/AMD, NVLink, MIG/vGPU, AI, Cyborg | — |
|
||||
| ⚙️ Server config | [SERVER-CONFIG.en.md](SERVER-CONFIG.en.md) | BIOS tuning, DB/hypervisor/K8s/storage best practices | — |
|
||||
| 📦 Provisioning | [PROVISIONING.en.md](PROVISIONING.en.md) | PXE, Redfish, Terraform, Ironic, OpenStack deploy | CICD |
|
||||
| ☸ Kubernetes | [KUBERNETES.en.md](KUBERNETES.en.md) | K8s architecture, deployment, Cluster API (CAPI) | CICD, CLOUD, NETWORKING |
|
||||
| 📨 Messaging & streaming | [MESSAGING.en.md](MESSAGING.en.md) | Kafka, RabbitMQ, Pulsar, NATS, managed queue/pubsub | DATACENTERS, CLOUD |
|
||||
| 🏗️ DC Migration | [DC-MIGRATION.en.md](DC-MIGRATION.en.md) | Strategies, phases, network, DB, rollback | DATACENTERS, CLOUD, DR, NETWORKING, STORAGE |
|
||||
| 🧠 AI Infrastructure | [AI-INFRASTRUCTURE.en.md](AI-INFRASTRUCTURE.en.md) | GPU, AI networking, storage, cluster, cooling, training/inference | GPU, NETWORKING, STORAGE, DATACENTERS, CLOUD |
|
||||
| 📋 Legacy index | [HARDWARE.en.md](HARDWARE.en.md) | → SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING | SERVER-HW, GPU, SERVER-CONFIG, PROVISIONING |
|
||||
| 📋 Legacy infra | [INFRASTRUCTURE.en.md](INFRASTRUCTURE.en.md) | → HYPERVISORS, DATACENTERS, STORAGE, HARDWARE | HYPERVISORS, DATACENTERS, STORAGE, HARDWARE |
|
||||
| 📋 Review workflow | [REVIEW.en.md](REVIEW.en.md) | Review and content control process | — |
|
||||
@@ -145,6 +153,8 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
| `DR.md` / `DR.en.md` | [`CLOUD.md`](CLOUD.md), [`DATACENTERS.md`](DATACENTERS.md), [`MONITORING.md`](MONITORING.md), [`CICD.md`](CICD.md), [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `MESSAGING.md` / `MESSAGING.en.md` | [`DATACENTERS.md`](DATACENTERS.md), [`CLOUD.md`](CLOUD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `DC-MIGRATION.md` / `DC-MIGRATION.en.md` | [`DATACENTERS.md`](DATACENTERS.md), [`CLOUD.md`](CLOUD.md), [`DR.md`](DR.md), [`NETWORKING.md`](NETWORKING.md), [`STORAGE.md`](STORAGE.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `OS.md` / `OS.en.md` | [`AI-INFRASTRUCTURE.md`](AI-INFRASTRUCTURE.md), [`KUBERNETES.md`](KUBERNETES.md), [`HYPERVISORS.md`](HYPERVISORS.md), [`DATACENTERS.md`](DATACENTERS.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `AI-INFRASTRUCTURE.md` / `AI-INFRASTRUCTURE.en.md` | [`GPU.md`](GPU.md), [`NETWORKING.md`](NETWORKING.md), [`STORAGE.md`](STORAGE.md), [`DATACENTERS.md`](DATACENTERS.md), [`CLOUD.md`](CLOUD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `PROVISIONING.md` / `PROVISIONING.en.md` | [`CICD.md`](CICD.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `STORAGE.md` / `STORAGE.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `GPU.md` / `GPU.en.md` | [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
@@ -155,6 +165,8 @@ Bilingual: Czech (`.md`) and English (`.en.md`).
|
||||
| `DATABASES.md` / `DATABASES.en.md` | [`POSTGRESQL.md`](POSTGRESQL.md), [`MYSQL.md`](MYSQL.md), [`ORACLE.md`](ORACLE.md), [`MONGODB.md`](MONGODB.md), [`REDIS.md`](REDIS.md), [`CASSANDRA.md`](CASSANDRA.md), [`VEKTOROVE-DB.md`](VEKTOROVE-DB.md), [`DATABAZOVE-ENGINY.md`](DATABAZOVE-ENGINY.md), [`sources/databases/sources.md`](sources/databases/sources.md) |
|
||||
| `HARDWARE.md` / `HARDWARE.en.md` | [`SERVER-HW.md`](SERVER-HW.md), [`GPU.md`](GPU.md), [`SERVER-CONFIG.md`](SERVER-CONFIG.md), [`PROVISIONING.md`](PROVISIONING.md) |
|
||||
| `INFRASTRUCTURE.md` / `INFRASTRUCTURE.en.md` | [`HYPERVISORS.md`](HYPERVISORS.md), [`DATACENTERS.md`](DATACENTERS.md), [`STORAGE.md`](STORAGE.md), [`HARDWARE.md`](HARDWARE.md) |
|
||||
| `KUBERNETES.md` / `KUBERNETES.en.md` | [`CICD.md`](CICD.md), [`CLOUD.md`](CLOUD.md), [`NETWORKING.md`](NETWORKING.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
| `BIG-DATA.md` / `BIG-DATA.en.md` | [`DATABASES.md`](DATABASES.md), [`CLOUD.md`](CLOUD.md), [`MESSAGING.md`](MESSAGING.md), [`KUBERNETES.md`](KUBERNETES.md), [`sources/infrastructure/sources.md`](sources/infrastructure/sources.md) |
|
||||
|
||||
---
|
||||
|
||||
@@ -196,4 +208,4 @@ Raw referenční data (dokumentace, knihy, standardy) podle oblastí:
|
||||
|
||||
---
|
||||
|
||||
*Rozcestník je automaticky udržován agentem `kb-index`. Poslední aktualizace: 2026-06-12.*
|
||||
*Rozcestník je automaticky udržován agentem `kb-index`. Poslední aktualizace: 2026-06-18.*
|
||||
|
||||
@@ -114,6 +114,6 @@ Redis underwent a major license change in 2024:
|
||||
|
||||
## Sources
|
||||
|
||||
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
|
||||
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
|
||||
|
||||
*Last revision: 2026-06-03*
|
||||
|
||||
@@ -752,6 +752,6 @@ flowchart TD
|
||||
|
||||
## Sources
|
||||
|
||||
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
*Last revision: 2026-06-03*
|
||||
|
||||
@@ -230,6 +230,10 @@ Conclusion: 8 DIMMs per CPU (1DPC) = highest performance
|
||||
| AI training (CPU preprocessing) | 2-4 GB/core | 128-512 GB | 8× 32-64 GB RDIMM, 1DPC |
|
||||
| HPC | 1-2 GB/core | 64-128 GB | 8× 16 GB RDIMM, 1DPC, high-speed |
|
||||
| In-memory DB (SAP HANA) | 8-32 GB/core | 1-6 TB+ | 16× 128-256 GB LRDIMM/3DS |
|
||||
| Big Data — Spark worker | 4-8 GB/core | 128-512 GB | 8-16× 32-64 GB RDIMM, 1DPC, NVMe scratch |
|
||||
| Big Data — Flink worker | 8-16 GB/core (incl. managed state) | 128-512 GB | 8-16× 32-64 GB RDIMM, 1DPC, RocksDB on NVMe |
|
||||
| Big Data — Trino worker | 4-8 GB/core | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC |
|
||||
| Big Data — HDFS DataNode | 1-2 GB/core (metadata cache) | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC, max storage density |
|
||||
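The GB/core guidance in the rows above turns into a DIMM choice with simple arithmetic. The following is a minimal sketch, assuming 8 populated DIMM slots per CPU (the 1DPC layout the table recommends) and an illustrative list of common RDIMM capacities; the function name `size_memory` and the size list are hypothetical, not taken from any vendor tool.

```python
# Common RDIMM capacities in GB (illustrative assumption, not a vendor list).
STANDARD_DIMM_GB = (16, 32, 64, 96, 128, 256)


def size_memory(cores, gb_per_core, dimm_slots=8):
    """Return (dimm_gb, total_gb) for the smallest DIMM that meets the target.

    cores        -- CPU cores the workload will use
    gb_per_core  -- sizing rule from the table (for example 4 for a Spark worker)
    dimm_slots   -- populated slots per CPU; 8 matches the 1DPC rows above
    """
    required_gb = cores * gb_per_core
    for dimm_gb in STANDARD_DIMM_GB:
        total_gb = dimm_gb * dimm_slots
        if total_gb >= required_gb:
            return dimm_gb, total_gb
    raise ValueError("requirement exceeds the largest DIMM size in the list")


if __name__ == "__main__":
    # Spark worker at the low end of the range: 64 cores x 4 GB/core.
    print(size_memory(64, 4))
    # Flink worker at the high end of the range: 32 cores x 16 GB/core.
    print(size_memory(32, 16))
```

The sketch always fills every slot with identical DIMMs, which keeps all memory channels populated evenly; mixed capacities are deliberately not modelled.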
|
||||
## PCIe
|
||||
|
||||
@@ -324,7 +328,7 @@ Socket 0 (NUMA node 0) Socket 1 (NUMA node 1)
|
||||
|
||||
## Server connectivity
|
||||
|
||||
Detailed chapter on network and storage connectivity: [CONNECTIVITY.md](CONNECTIVITY.md)
|
||||
Detailed chapter on network and storage connectivity: [CONNECTIVITY.en.md](CONNECTIVITY.en.md)
|
||||
|
||||
## Storage controllers
|
||||
|
||||
@@ -346,8 +350,51 @@ Detailed chapter on network and storage connectivity: [CONNECTIVITY.md](CONNECTI
|
||||
| **Use case** | SDS (Ceph, MinIO), ZFS | VMware VMFS, Windows, legacy |
|
||||
| **Battery/Backup** | Not needed | Write-back cache requires BBU |
|
||||
|
||||
## Pricing (2026)
|
||||
|
||||
### CPU pricing (2026)
|
||||
| CPU | Cores | TDP | 1ku price | $/core |
|
||||
|-----|-------|-----|----------|--------|
|
||||
| AMD EPYC 9965 (Turin) | 192 | 500 W | ~$11,988 | $62 |
|
||||
| AMD EPYC 9655 (Turin) | 96 | 400 W | ~$6,500 | $68 |
|
||||
| AMD EPYC 9475F (Turin) | 48 | 360 W | ~$5,000 | $104 |
|
||||
| Intel Xeon 6980P (Granite Rapids) | 128 | 500 W | ~$12,460 | $97 |
|
||||
| Intel Xeon 6980P (Granite Rapids-AP) | 128 | 500 W | $13,955 | $109 |
|
||||
| Intel Xeon 6767P (Granite Rapids) | 64 | 350 W | ~$7,000 | $109 |
|
||||
|
||||
Sources: AMD 1ku pricing, Intel RCP, Newegg verified.
|
||||
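The `$/core` column is the 1ku price divided by the core count, rounded to the nearest dollar. A short sketch that recomputes it from the figures in the table (the dictionary layout and the function name `dollars_per_core` are illustrative assumptions):

```python
# 1ku prices (USD) and core counts copied from the CPU pricing table above.
CPUS = {
    "AMD EPYC 9965 (Turin)": (11_988, 192),
    "AMD EPYC 9655 (Turin)": (6_500, 96),
    "AMD EPYC 9475F (Turin)": (5_000, 48),
    "Intel Xeon 6980P (Granite Rapids)": (12_460, 128),
    "Intel Xeon 6980P (Granite Rapids-AP)": (13_955, 128),
    "Intel Xeon 6767P (Granite Rapids)": (7_000, 64),
}


def dollars_per_core(price_usd, cores):
    """Price per core in whole dollars."""
    return round(price_usd / cores)


if __name__ == "__main__":
    for name, (price, cores) in CPUS.items():
        print(f"{name}: ${dollars_per_core(price, cores)}/core")
```

Price per core ignores clock speed, cache and memory bandwidth, so it is only a first-pass comparison between SKUs.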
|
||||
### DDR5 RDIMM pricing (2026 — AI-driven price surge)
|
||||
| Capacity | Speed | Price 2025 | Price Q2 2026 | Change |
|
||||
|----------|---------|-----------|-------------|-------|
|
||||
| 32 GB (2R×8) | DDR5-5600 | ~$95 | ~$400–550 | +400–500 % |
|
||||
| 64 GB (2R×4) | DDR5-4800 | ~$180 | ~$700–900 | +400 % |
|
||||
| 96 GB (2R×4) | DDR5-6400 | ~$300 | ~$1,200–1,600 | +400 % |
|
||||
| 128 GB (2R×4) | DDR5-6400 | ~$450 | ~$1,800–2,500 | +450 % |
|
||||
| 256 GB (LRDIMM) | DDR5-6400 | ~$900 | ~$4,000–5,000 | +450 % |
|
||||
|
||||
Trend: DDR5 prices have risen ~400–500 % since mid-2025 due to AI-driven demand. Further increases expected in H2 2026. Source: Counterpoint, TrendForce.
|
||||
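When reading price tables like this one, it helps to keep the percentage increase and the price multiple apart: a price that is 4× the old one has risen by +300 %, not +400 %. A minimal sketch using the 32 GB row as input (the function names are hypothetical):

```python
def pct_change(old, new):
    """Percentage increase from old to new (a doubling is +100 %)."""
    return (new - old) / old * 100


def multiple(old, new):
    """How many times the old price the new price is (a doubling is 2.0)."""
    return new / old


if __name__ == "__main__":
    # 32 GB DDR5-5600 row: ~$95 in 2025, ~$400-550 in Q2 2026.
    old, low, high = 95, 400, 550
    print(f"increase: +{pct_change(old, low):.0f} % to +{pct_change(old, high):.0f} %")
    print(f"multiple: {multiple(old, low):.1f}x to {multiple(old, high):.1f}x")
```

Because the 2026 prices are quoted as ranges, any single figure for the change is necessarily an approximation of the spread this computes.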
|
||||
### NVMe SSD pricing (enterprise, 2026)
|
||||
| Capacity | Type | Price 2024 | Price Q2 2026 | Change |
|
||||
|----------|-----|-----------|-------------|-------|
|
||||
| 1.92 TB | NVMe U.3 (read-intensive) | ~$200 | ~$500–600 | +150 % |
|
||||
| 3.84 TB | NVMe U.3 (mixed-use) | ~$400 | ~$1,000–1,200 | +150 % |
|
||||
| 7.68 TB | NVMe U.3 (mixed-use) | ~$800 | ~$2,000–2,500 | +150 % |
|
||||
| 15.36 TB | NVMe U.3 (mixed-use) | ~$1,500 | ~$4,000–5,000 | +170 % |
|
||||
|
||||
Trend: NAND flash prices have risen ~100–200 % since 2025; the average enterprise SSD now costs 2–3× as much. Source: TrendForce, Xinnor.
|
||||
|
||||
### Total server cost (example configurations)
|
||||
| Configuration | CPU | RAM | Storage | Estimated Price |
|
||||
|-------------|-----|-----|------|-----------|
|
||||
| DB server (OLTP) | 2× EPYC 9655 (96C) | 1 TB DDR5 | 6× 1.92 TB NVMe | ~$45,000–60,000 |
|
||||
| GPU server (AI) | 2× Xeon 6980P | 2 TB DDR5 | 4× 3.84 TB NVMe | ~$80,000–120,000 (w/o GPU) |
|
||||
| Hypervisor host | 2× EPYC 9475F (48C) | 512 GB DDR5 | 2× 1.92 TB NVMe + 4× 16 TB HDD | ~$25,000–35,000 |
|
||||
| Storage server (Ceph) | 1× EPYC 9655 (96C) | 256 GB DDR5 | 24× 15.36 TB NVMe | ~$60,000–80,000 |
|
||||
|
||||
## Sources
|
||||
|
||||
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
*Last revision: 2026-06-03*
|
||||
|
||||
47
SERVER-HW.md
@@ -230,6 +230,10 @@ Závěr: 8 DIMMů na CPU (1DPC) = nejvyšší výkon
|
||||
| AI training (CPU preprocessing) | 2-4 GB/core | 128-512 GB | 8× 32-64 GB RDIMM, 1DPC |
|
||||
| HPC | 1-2 GB/core | 64-128 GB | 8× 16 GB RDIMM, 1DPC, high-speed |
|
||||
| In-memory DB (SAP HANA) | 8-32 GB/core | 1-6 TB+ | 16× 128-256 GB LRDIMM/3DS |
|
||||
| Big Data — Spark worker | 4-8 GB/core | 128-512 GB | 8-16× 32-64 GB RDIMM, 1DPC, NVMe scratch |
|
||||
| Big Data — Flink worker | 8-16 GB/core (vč. managed state) | 128-512 GB | 8-16× 32-64 GB RDIMM, 1DPC, RocksDB na NVMe |
|
||||
| Big Data — Trino worker | 4-8 GB/core | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC |
|
||||
| Big Data — HDFS DataNode | 1-2 GB/core (metadata cache) | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC, max storage density |
|
||||
|
||||
## PCIe
|
||||
|
||||
@@ -346,6 +350,49 @@ Detailní kapitola o síťové a storage konektivitě: [CONNECTIVITY.md](CONNECT
|
||||
| **Use case** | SDS (Ceph, MinIO), ZFS | VMware VMFS, Windows, legacy |
|
||||
| **Battery/Backup** | Není potřeba | Write-back cache vyžaduje BBU |
|
||||
|
||||
## Ceny (2026)
|
||||
|
||||
### CPU ceny (2026)
|
||||
| CPU | Cores | TDP | 1ku cena | $/core |
|
||||
|-----|-------|-----|----------|--------|
|
||||
| AMD EPYC 9965 (Turin) | 192 | 500 W | ~$11 988 | $62 |
|
||||
| AMD EPYC 9655 (Turin) | 96 | 400 W | ~$6 500 | $68 |
|
||||
| AMD EPYC 9475F (Turin) | 48 | 360 W | ~$5 000 | $104 |
|
||||
| Intel Xeon 6980P (Granite Rapids) | 128 | 500 W | ~$12 460 | $97 |
|
||||
| Intel Xeon 6980P (Granite Rapids-AP) | 128 | 500 W | $13 955 | $109 |
|
||||
| Intel Xeon 6767P (Granite Rapids) | 64 | 350 W | ~$7 000 | $109 |
|
||||
|
||||
Sources: AMD 1ku pricing, Intel RCP, Newegg verified.
|
||||
|
||||
### DDR5 RDIMM ceny (2026 — AI-driven price surge)
|
||||
| Kapacita | Rychlost | Cena 2025 | Cena Q2 2026 | Změna |
|
||||
|----------|---------|-----------|-------------|-------|
|
||||
| 32 GB (2R×8) | DDR5-5600 | ~$95 | ~$400–550 | +400–500 % |
|
||||
| 64 GB (2R×4) | DDR5-4800 | ~$180 | ~$700–900 | +400 % |
|
||||
| 96 GB (2R×4) | DDR5-6400 | ~$300 | ~$1 200–1 600 | +400 % |
|
||||
| 128 GB (2R×4) | DDR5-6400 | ~$450 | ~$1 800–2 500 | +450 % |
|
||||
| 256 GB (LRDIMM) | DDR5-6400 | ~$900 | ~$4 000–5 000 | +450 % |
|
||||
|
||||
Trend: DDR5 ceny vzrostly ~400–500 % od mid-2025 kvůli AI-driven poptávce. Očekává se další růst v H2 2026. Zdroj: Counterpoint, TrendForce.
|
||||
|
||||
### NVMe SSD ceny (enterprise, 2026)
|
||||
| Kapacita | Typ | Cena 2024 | Cena Q2 2026 | Změna |
|
||||
|----------|-----|-----------|-------------|-------|
|
||||
| 1.92 TB | NVMe U.3 (read-intensive) | ~$200 | ~$500–600 | +150 % |
|
||||
| 3.84 TB | NVMe U.3 (mixed-use) | ~$400 | ~$1 000–1 200 | +150 % |
|
||||
| 7.68 TB | NVMe U.3 (mixed-use) | ~$800 | ~$2 000–2 500 | +150 % |
|
||||
| 15.36 TB | NVMe U.3 (mixed-use) | ~$1 500 | ~$4 000–5 000 | +170 % |
|
||||
|
||||
Trend: NAND flash ceny vzrostly ~100–200 % od 2025, průměrný enterprise SSD stojí 2–3× více. Zdroj: TrendForce, Xinnor.
|
||||
|
||||
### Celková cena serveru (příkladové konfigurace)
|
||||
| Konfigurace | CPU | RAM | Disk | Odhad ceny |
|
||||
|-------------|-----|-----|------|-----------|
|
||||
| DB server (OLTP) | 2× EPYC 9655 (96C) | 1 TB DDR5 | 6× 1.92 TB NVMe | ~$45 000–60 000 |
|
||||
| GPU server (AI) | 2× Xeon 6980P | 2 TB DDR5 | 4× 3.84 TB NVMe | ~$80 000–120 000 (bez GPU) |
|
||||
| Hypervisor host | 2× EPYC 9475F (48C) | 512 GB DDR5 | 2× 1.92 TB NVMe + 4× 16 TB HDD | ~$25 000–35 000 |
|
||||
| Storage server (Ceph) | 1× EPYC 9655 (96C) | 256 GB DDR5 | 24× 15.36 TB NVMe | ~$60 000–80 000 |
|
||||
|
||||
## Zdroje
|
||||
|
||||
Odkazy, knihy a standardy: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
|
||||
@@ -270,9 +270,60 @@ OpenStack offers three main storage services:
|
||||
|
||||
Ceph is the most common storage backend for OpenStack: Cinder (RBD), Swift (RGW), Manila (CephFS), Glance (RBD images).
|
||||
|
||||
## Big Data storage
|
||||
|
||||
### HDFS cluster
|
||||
|
||||
HDFS is the primary storage for the Hadoop ecosystem (on-prem). Typical configuration:
|
||||
|
||||
| Parameter | Value | Note |
|
||||
|-----------|-------|------|
|
||||
| **Disk per DataNode** | 8–24 × HDD (14–22 TB) + 2× NVMe (metadata, cache) | Balance capacity / performance |
|
||||
| **Replication factor** | 3× | Rack-aware |
|
||||
| **Network** | 2× 25/100 GbE (data) + 1× 1 GbE (management) | Data + replication traffic |
|
||||
| **RAM** | 64–256 GB (OS cache + metadata) | HDFS cache + OS buffer cache |
|
||||
| **CPU** | 16–32 cores | HDFS overhead is low |
|
||||
| **NameNode HA** | Active + Standby + JN (JournalNode) | Quorum-based HA |
|
||||
| **Use case** | Sequential read/write, large files, Spark on YARN | |
|
||||
|
||||
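**Model cluster (≈0.72 PB usable, ≈2.2 PB raw):**
|
||||
|
||||
- 10× DataNode (12× 18 TB HDD, 2× 1.9 TB NVMe)
|
||||
- 2× NameNode (HA, 256 GB RAM)
|
||||
- 3× JournalNode (small VMs)
|
||||
- Raw capacity: 10 × 12 × 18 TB ≈ 2.2 PB; with 3× replication, usable ≈ 0.72 PB (1 PB usable would need ≈ 3 PB raw, about 14 such DataNodes)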
|
||||
- Network: 25 GbE for data, 100 GbE for shuffle-heavy Spark
|
||||
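The capacity arithmetic for a replicated HDFS cluster like the one above can be sketched as follows. This is a simplified model under stated assumptions: every block is stored `replication` times, and no headroom is reserved for temporary data or non-DFS use. The function names are hypothetical.

```python
import math


def raw_tb(nodes, disks_per_node, disk_tb):
    """Total raw HDD capacity of the cluster in TB."""
    return nodes * disks_per_node * disk_tb


def usable_tb(raw, replication=3):
    """Usable capacity when every block is stored `replication` times."""
    return raw / replication


def nodes_for_usable(target_usable_tb, disks_per_node, disk_tb, replication=3):
    """Smallest DataNode count that reaches the target usable capacity."""
    per_node_tb = disks_per_node * disk_tb
    return math.ceil(target_usable_tb * replication / per_node_tb)


if __name__ == "__main__":
    raw = raw_tb(nodes=10, disks_per_node=12, disk_tb=18)
    print(f"raw: {raw} TB, usable at 3x: {usable_tb(raw):.0f} TB")
    print(f"DataNodes for 1 PB usable: {nodes_for_usable(1000, 12, 18)}")
```

For the listed hardware (10 nodes, 12× 18 TB each) this gives 2,160 TB raw and 720 TB usable; reaching 1 PB usable at 3× replication takes 14 such nodes. Real deployments usually keep extra free space on top of this, so treat the result as a lower bound.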
|
||||
### Object storage as Data Lake (S3/GCS/MinIO)
|
||||
|
||||
For new projects (Spark on K8s, Iceberg/Delta, lakehouse), object storage is preferred over HDFS:
|
||||
|
||||
| Platform | Advantages | Limits |
|
||||
|----------|-----------|--------|
|
||||
| **MinIO** (on-prem) | S3 API, erasure coding, NVMe direct, high throughput | Single tenant (per cluster) |
|
||||
| **Pure //C** (on-prem) | QLC NVMe, dedupe, S3 + NFS | Higher $/TB |
|
||||
| **AWS S3** (cloud) | Unlimited capacity, Iceberg/Delta support | Egress fees |
|
||||
| **Azure ADLS** (cloud) | Hierarchical namespace (HNS), POSIX-like ACLs | Vendor lock-in |
|
||||
| **GCP GCS** (cloud) | Uniform + fine-grained ACLs, object versioning | Region restrictions |
|
||||
|
||||
### Comparison: HDFS vs Object Storage for Big Data
|
||||
|
||||
| Criteria | HDFS | Object Storage (S3/MinIO) |
|
||||
|----------|------|-------------------------|
|
||||
| **Architecture** | Master/worker (NameNode is a SPOF unless HA is configured) | Distributed, no SPOF (erasure coding) |
|
||||
| **Consistency** | Strong (single writer per file) | Strong (AWS S3 since December 2020; MinIO) |
|
||||
| **Throughput** | High (rack-aware, locality) | High (network-bound) |
|
||||
| **Scaling** | Horizontal (DataNode) | Horizontal (stateless) |
|
||||
| **Cost** | Low (HDD) | Medium (S3 API) |
|
||||
| **Metadata** | NameNode (1M blocks ~ 1 GB RAM) | Object-level (flat namespace) |
|
||||
| **Spark integration** | Native (locality-optimized) | S3A connector (Hadoop-compatible file system) |
|
||||
| **2026 trend** | Legacy, declining | Standard for new projects |
|
||||
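The NameNode rule of thumb in the Metadata row (about 1 GB of RAM per 1 million blocks) can be turned into a rough heap estimate. A minimal sketch, assuming a 128 MiB block size and files that fill their blocks; many small files produce far more blocks per TB, so the result is a floor, not a sizing guarantee. The function names are hypothetical.

```python
import math


def estimate_blocks(data_tb, block_mib=128):
    """Number of HDFS blocks for `data_tb` TB of logical (unreplicated) data."""
    data_bytes = data_tb * 10**12
    block_bytes = block_mib * 2**20
    return math.ceil(data_bytes / block_bytes)


def namenode_heap_gb(data_tb, block_mib=128, gb_per_million_blocks=1.0):
    """Rough NameNode heap using the ~1 GB per 1M blocks rule of thumb."""
    blocks = estimate_blocks(data_tb, block_mib)
    return blocks / 1_000_000 * gb_per_million_blocks


if __name__ == "__main__":
    for tb in (100, 720, 1000):
        print(f"{tb} TB -> {estimate_blocks(tb):,} blocks, "
              f"~{namenode_heap_gb(tb):.1f} GB heap")
```

The estimate counts logical blocks only; replicas add block locations rather than new block entries, which is why replication is left out of this sketch.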
|
||||
For more information on Big Data, see [BIG-DATA.en.md](BIG-DATA.en.md).
|
||||
|
||||
## Sources
|
||||
|
||||
Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
Links, books and standards: [sources/infrastructure/sources.en.md](sources/infrastructure/sources.en.md)
|
||||
|
||||
### Recommended reading
|
||||
|
||||
|
||||
51
STORAGE.md
@@ -270,6 +270,57 @@ OpenStack nabízí tři hlavní storage služby:
|
||||
|
||||
Ceph je nejčastější storage backend pro OpenStack: Cinder (RBD), Swift (RGW), Manila (CephFS), Glance (RBD images).
|
||||
|
||||
## Big Data storage
|
||||
|
||||
### HDFS cluster
|
||||
|
||||
HDFS je primární storage pro Hadoop ekosystém (on-prem). Typická konfigurace:
|
||||
|
||||
| Parametr | Hodnota | Poznámka |
|
||||
|----------|---------|----------|
|
||||
| **Disk per DataNode** | 8–24 × HDD (14–22 TB) + 2× NVMe (metadata, cache) | Balance capacity / performance |
|
||||
| **Replication factor** | 3× | Rack-aware |
|
||||
| **Network** | 2× 25/100 GbE (data) + 1× 1 GbE (management) | Data + replication traffic |
|
||||
| **RAM** | 64–256 GB (OS cache + metadata) | HDFS cache + OS buffer cache |
|
||||
| **CPU** | 16–32 cores | HDFS overhead je nízký |
|
||||
| **NameNode HA** | Active + Standby + JN (JournalNode) | Quorum-based HA |
|
||||
| **Use case** | Sekvenční čtení/zápis, velké soubory, Spark on YARN | |
|
||||
|
||||
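**Modelový cluster (≈0.72 PB usable, ≈2.2 PB raw):**
|
||||
|
||||
- 10× DataNode (12× 18 TB HDD, 2× 1.9 TB NVMe)
|
||||
- 2× NameNode (HA, 256 GB RAM)
|
||||
- 3× JournalNode (malé VM)
|
||||
- Raw kapacita: 10 × 12 × 18 TB ≈ 2.2 PB; při replikaci 3× je usable ≈ 0.72 PB (1 PB usable by vyžadoval ≈ 3 PB raw, tedy asi 14 takových DataNodů)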
|
||||
- Network: 25 GbE pro data, 100 GbE pro shuffle-heavy Spark
|
||||
|
||||
### Object storage jako Data Lake (S3/GCS/MinIO)
|
||||
|
||||
Pro nové projekty (Spark on K8s, Iceberg/Delta, lakehouse) se preferuje object storage před HDFS:
|
||||
|
||||
| Platforma | Výhody | Limity |
|
||||
|-----------|--------|--------|
|
||||
| **MinIO** (on-prem) | S3 API, erasure coding, NVMe direct, high throughput | Single tenant (per cluster) |
|
||||
| **Pure //C** (on-prem) | QLC NVMe, dedupe, S3 + NFS | Vyšší cena/TB |
|
||||
| **AWS S3** (cloud) | Neomezená kapacita, Iceberg/Delta support | Egress fees |
|
||||
| **Azure ADLS** (cloud) | Hierarchical namespace (HNS), POSIX-like ACLs | Vendor lock-in |
|
||||
| **GCP GCS** (cloud) | Uniform + fine-grained ACLs, object versioning | Region restrictions |
|
||||
|
||||
### Srovnání: HDFS vs Object Storage pro Big Data
|
||||
|
||||
| Kritérium | HDFS | Object Storage (S3/MinIO) |
|
||||
|-----------|------|-------------------------|
|
||||
| **Architektura** | Master/worker (NameNode je SPOF bez HA) | Distributed, no SPOF (erasure coding) |
|
||||
| **Konzistence** | Strong (jediný writer per file) | Strong (AWS S3 od prosince 2020; MinIO) |
|
||||
| **Propustnost** | Vysoká (rack-aware, locality) | Vysoká (network-bound) |
|
||||
| **Škálování** | Horizontální (DataNode) | Horizontální (stateless) |
|
||||
| **Cena** | Nízká (HDD) | Střední (S3 API) |
|
||||
| **Metadata** | NameNode (1 mil. bloků ~ 1 GB RAM) | Object-level (flat namespace) |
|
||||
| **Spark integration** | Native (locality optimalizace) | S3A connector, Hadoop Compatible |
|
||||
| **2026 trend** | Legacy, klesající | Standard pro nové projekty |
|
||||
|
||||
Podrobnější informace o Big Data viz [BIG-DATA.md](BIG-DATA.md).
|
||||
|
||||
## Zdroje
|
||||
|
||||
Odkazy, knihy a standardy: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
|
||||
|
||||
@@ -94,7 +94,7 @@ Variants:
|
||||
|
||||
## Sources
|
||||
|
||||
References, books, and standards: [sources/databases/sources.md](sources/databases/sources.md)
|
||||
References, books, and standards: [sources/databases/sources.en.md](sources/databases/sources.en.md)
|
||||
|
||||
### Recommended reading
|
||||
|
||||
|
||||
@@ -1,10 +1,10 @@
|
||||
# Infrastructure — Sources
|
||||
|
||||
Split into separate files:
|
||||
- [HYPERVISORS.md](../../HYPERVISORS.md) — hypervisors and virtualization
|
||||
- [DATACENTERS.md](../../DATACENTERS.md) — data centers
|
||||
- [STORAGE.md](../../STORAGE.md) — storage
|
||||
- [HARDWARE.md](../../HARDWARE.md) — hardware and servers
|
||||
- [HYPERVISORS.en.md](../../HYPERVISORS.en.md) — hypervisors and virtualization
|
||||
- [DATACENTERS.en.md](../../DATACENTERS.en.md) — data centers
|
||||
- [STORAGE.en.md](../../STORAGE.en.md) — storage
|
||||
- [HARDWARE.en.md](../../HARDWARE.en.md) — hardware and servers
|
||||
|
||||
## Official documentation
|
||||
|
||||
@@ -112,7 +112,65 @@ Split into separate files:
|
||||
| Complete guide to modern vSphere alternatives — Spectro Cloud | https://www.spectrocloud.com/blog/vsphere-alternatives | `[done]` |
|
||||
| Broadcom VMware Acquisition: What's Next — Sayers | https://www.sayers.com/blog/after-the-deal-whats-next-for-vmware-customers | `[done]` |
|
||||
| Stanford University migration from VMware to Proxmox | https://itcommunity.stanford.edu/news/enterprise-technology-completes-successful-virtual-infrastructure-migration-vmware-proxmox | `[done]` |
|
||||
|
||||
| | **Sangfor** | |
|
||||
| Sangfor HCI — product page | https://www.sangfor.com/cloud-and-infrastructure/products/hci-hyper-converged-infrastructure | `[done]` |
|
||||
| Sangfor aSV — hypervisor | https://www.sangfor.com/cloud-and-infrastructure/products/asv-hypervisor-server-virtualization | `[done]` |
|
||||
| Sangfor vs VMware — feature comparison | https://www.sangfor.com/blog/cloud-and-infrastructure/sangfor-hci-vs-vmware-feature-comparison | `[done]` |
|
||||
| | **AI infrastructure** | |
|
||||
| NVIDIA DGX — documentation | https://www.nvidia.com/en-us/data-center/dgx-platform/ | `[done]` |
|
||||
| InfiniBand — Mellanox/NVIDIA | https://www.nvidia.com/en-us/networking/products/infiniband/ | `[done]` |
|
||||
| Lustre parallel filesystem | https://www.lustre.org/ | `[done]` |
|
||||
| WekaFS — AI storage | https://www.weka.io/ | `[done]` |
|
||||
| vLLM — inference server | https://github.com/vllm-project/vllm | `[done]` |
|
||||
| Megatron-LM — distributed training | https://github.com/NVIDIA/Megatron-LM | `[done]` |
|
||||
| | **Kubernetes / Cluster API** | |
|
||||
| Cluster API (CAPI) — official documentation (The CAPI Book) | https://cluster-api.sigs.k8s.io/ | `[done]` |
|
||||
| Cluster API — GitHub (kubernetes-sigs/cluster-api) | https://github.com/kubernetes-sigs/cluster-api | `[done]` |
|
||||
| Cluster API — provider list | https://cluster-api.sigs.k8s.io/reference/providers.html | `[done]` |
|
||||
| Kubernetes — official documentation | https://kubernetes.io/docs/ | `[done]` |
|
||||
| K3s — lightweight Kubernetes | https://k3s.io/ | `[done]` |
|
||||
| RKE2 — Rancher Kubernetes Engine 2 | https://docs.rke2.io/ | `[done]` |
|
||||
| Talos — API-driven Kubernetes OS | https://www.talos.dev/ | `[done]` |
|
||||
| Kamaji — hosted control plane provider | https://kamaji.clastix.io/ | `[done]` |
|
||||
| Metal3 — bare metal provider for CAPI | https://metal3.io/ | `[done]` |
|
||||
| Cluster API — ClusterClass and topologies | https://kubernetes.io/blog/2021/10/08/capi-clusterclass-and-managed-topologies/ | `[done]` |
|
||||
| | **Big Data** | |
|
||||
| Apache Spark — official documentation | https://spark.apache.org/docs/latest/ | `[done]` |
|
||||
| Apache Flink — official documentation | https://flink.apache.org/ | `[done]` |
|
||||
| Trino — distributed SQL engine | https://trino.io/docs/current/ | `[done]` |
|
||||
| Apache Iceberg — table format | https://iceberg.apache.org/ | `[done]` |
|
||||
| Delta Lake — documentation | https://docs.delta.io/ | `[done]` |
|
||||
| Apache Hudi | https://hudi.apache.org/ | `[done]` |
|
||||
| Apache Paimon | https://paimon.apache.org/ | `[done]` |
|
||||
| Apache Hadoop — documentation | https://hadoop.apache.org/docs/stable/ | `[done]` |
|
||||
| Apache Airflow — documentation | https://airflow.apache.org/docs/ | `[done]` |
|
||||
| Dagster — documentation | https://docs.dagster.io/ | `[done]` |
|
||||
| Prefect — documentation | https://docs.prefect.io/ | `[done]` |
|
||||
| HDFS architecture (Apache) | https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html | `[done]` |
|
||||
| | **Operating Systems** | |
|
||||
| Ubuntu lifecycle — Ubuntu Pro + ESM | https://ubuntu.com/about/release-cycle | `[done]` |
|
||||
| RHEL lifecycle — Red Hat Enterprise Linux | https://access.redhat.com/support/policy/updates/errata | `[done]` |
|
||||
| Rocky Linux lifecycle | https://rockylinux.org/download/ | `[done]` |
|
||||
| AlmaLinux lifecycle | https://almalinux.org/ | `[done]` |
|
||||
| Debian releases / LTS | https://wiki.debian.org/LTS | `[done]` |
|
||||
| SLES lifecycle — SUSE | https://www.suse.com/lifecycle/ | `[done]` |
|
||||
| Alpine Linux releases | https://alpinelinux.org/releases/ | `[done]` |
|
||||
| Fedora lifecycle | https://docs.fedoraproject.org/en-US/releases/lifecycle/ | `[done]` |
|
||||
| SELinux — Red Hat docs | https://www.redhat.com/en/topics/linux/what-is-selinux | `[done]` |
|
||||
| AppArmor — Ubuntu wiki | https://wiki.ubuntu.com/AppArmor | `[done]` |
|
||||
| | **Windows** | |
|
||||
| Windows Server lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2022/ | `[done]` |
|
||||
| Windows Server 2025 lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2025/ | `[done]` |
|
||||
| Windows 11 lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-11-enterprise/ | `[done]` |
|
||||
| Windows 10 EOL | https://learn.microsoft.com/en-us/lifecycle/products/windows-10-enterprise/ | `[done]` |
|
||||
| Windows Server licensing (per core) | https://learn.microsoft.com/en-us/windows-server/get-started/editions-and-support | `[done]` |
|
||||
| | **GPU pricing** | |
|
||||
| NVIDIA AI GPU pricing guide (2026) | https://intuitionlabs.ai/articles/nvidia-ai-gpu-pricing-guide | `[done]` |
|
||||
| GPU cloud pricing comparison (2026) | https://www.spheron.network/blog/gpu-cloud-pricing-comparison-2026/ | `[done]` |
|
||||
| GPU pricing trends 2026 — CompuX | https://compux.net/docs/guides/gpu-pricing-trends-2026 | `[done]` |
|
||||
| AMD MI300X pricing (2026) | https://www.thundercompute.com/blog/amd-mi300x-pricing | `[done]` |
|
||||
| GPU price/performance frontier — Silicon Analysts | https://siliconanalysts.com/tools/frontier | `[done]` |
|
||||
|
||||
## Hardware manufacturers
|
||||
|
||||
| Manufacturer | Server series | Management |
|
||||
|
||||
@@ -127,7 +127,65 @@ Split into separate files:
|
||||
| VMware Site Recovery Manager — documentation | https://docs.vmware.com/en/Site-Recovery-Manager/ | `[done]` |
|
||||
| Zerto — Disaster Recovery & Migration | https://www.zerto.com/resources/ | `[done]` |
|
||||
| The Phoenix Project — IT Ops & Migration patterns | https://itrevolution.com/product/the-phoenix-project/ | `[done]` |
|
||||
|
||||
| | **Sangfor** | |
|
||||
| Sangfor HCI — product page | https://www.sangfor.com/cloud-and-infrastructure/products/hci-hyper-converged-infrastructure | `[done]` |
|
||||
| Sangfor aSV — hypervisor | https://www.sangfor.com/cloud-and-infrastructure/products/asv-hypervisor-server-virtualization | `[done]` |
|
||||
| Sangfor vs VMware — feature comparison | https://www.sangfor.com/blog/cloud-and-infrastructure/sangfor-hci-vs-vmware-feature-comparison | `[done]` |
|
||||
| | **AI infrastructure** | |
|
||||
| NVIDIA DGX — documentation | https://www.nvidia.com/en-us/data-center/dgx-platform/ | `[done]` |
|
||||
| InfiniBand — Mellanox/NVIDIA | https://www.nvidia.com/en-us/networking/products/infiniband/ | `[done]` |
|
||||
| Lustre parallel filesystem | https://www.lustre.org/ | `[done]` |
|
||||
| WekaFS — AI storage | https://www.weka.io/ | `[done]` |
|
||||
| vLLM — inference server | https://github.com/vllm-project/vllm | `[done]` |
|
||||
| Megatron-LM — distributed training | https://github.com/NVIDIA/Megatron-LM | `[done]` |
|
||||
| | **Kubernetes / Cluster API** | |
|
||||
| Cluster API (CAPI) — official documentation (The CAPI Book) | https://cluster-api.sigs.k8s.io/ | `[done]` |
|
||||
| Cluster API — GitHub (kubernetes-sigs/cluster-api) | https://github.com/kubernetes-sigs/cluster-api | `[done]` |
|
||||
| Cluster API — provider list | https://cluster-api.sigs.k8s.io/reference/providers.html | `[done]` |
|
||||
| Kubernetes — official documentation | https://kubernetes.io/docs/ | `[done]` |
|
||||
| K3s — lightweight Kubernetes | https://k3s.io/ | `[done]` |
|
||||
| RKE2 — Rancher Kubernetes Engine 2 | https://docs.rke2.io/ | `[done]` |
|
||||
| Talos — API-driven Kubernetes OS | https://www.talos.dev/ | `[done]` |
|
||||
| Kamaji — hosted control plane provider | https://kamaji.clastix.io/ | `[done]` |
|
||||
| Metal3 — bare metal provider for CAPI | https://metal3.io/ | `[done]` |
|
||||
| Cluster API — ClusterClass and topologies | https://kubernetes.io/blog/2021/10/08/capi-clusterclass-and-managed-topologies/ | `[done]` |
|
||||
| | **Big Data** | |
|
||||
| Apache Spark — official documentation | https://spark.apache.org/docs/latest/ | `[done]` |
|
||||
| Apache Flink — official documentation | https://flink.apache.org/ | `[done]` |
|
||||
| Trino — distributed SQL engine | https://trino.io/docs/current/ | `[done]` |
|
||||
| Apache Iceberg — table format | https://iceberg.apache.org/ | `[done]` |
|
||||
| Delta Lake — documentation | https://docs.delta.io/ | `[done]` |
|
||||
| Apache Hudi | https://hudi.apache.org/ | `[done]` |
|
||||
| Apache Paimon | https://paimon.apache.org/ | `[done]` |
|
||||
| Apache Hadoop — documentation | https://hadoop.apache.org/docs/stable/ | `[done]` |
|
||||
| Apache Airflow — documentation | https://airflow.apache.org/docs/ | `[done]` |
|
||||
| Dagster — documentation | https://docs.dagster.io/ | `[done]` |
|
||||
| Prefect — documentation | https://docs.prefect.io/ | `[done]` |
|
||||
| HDFS architecture (Apache) | https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html | `[done]` |
|
||||
| | **Operating Systems** | |
|
||||
| Ubuntu lifecycle — Ubuntu Pro + ESM | https://ubuntu.com/about/release-cycle | `[done]` |
|
||||
| RHEL lifecycle — Red Hat Enterprise Linux | https://access.redhat.com/support/policy/updates/errata | `[done]` |
|
||||
| Rocky Linux lifecycle | https://rockylinux.org/download/ | `[done]` |
|
||||
| AlmaLinux lifecycle | https://almalinux.org/ | `[done]` |
|
||||
| Debian releases / LTS | https://wiki.debian.org/LTS | `[done]` |
|
||||
| SLES lifecycle — SUSE | https://www.suse.com/lifecycle/ | `[done]` |
|
||||
| Alpine Linux releases | https://alpinelinux.org/releases/ | `[done]` |
|
||||
| Fedora lifecycle | https://docs.fedoraproject.org/en-US/releases/lifecycle/ | `[done]` |
|
||||
| SELinux — Red Hat docs | https://www.redhat.com/en/topics/linux/what-is-selinux | `[done]` |
|
||||
| AppArmor — Ubuntu wiki | https://wiki.ubuntu.com/AppArmor | `[done]` |
|
||||
| | **Windows** | |
|
||||
| Windows Server lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2022/ | `[done]` |
|
||||
| Windows Server 2025 lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2025/ | `[done]` |
|
||||
| Windows 11 lifecycle | https://learn.microsoft.com/en-us/lifecycle/products/windows-11-enterprise/ | `[done]` |
|
||||
| Windows 10 EOL | https://learn.microsoft.com/en-us/lifecycle/products/windows-10-enterprise/ | `[done]` |
|
||||
| Windows Server licensing (per core) | https://learn.microsoft.com/en-us/windows-server/get-started/editions-and-support | `[done]` |
|
||||
| | **GPU pricing** | |
|
||||
| NVIDIA AI GPU pricing guide (2026) | https://intuitionlabs.ai/articles/nvidia-ai-gpu-pricing-guide | `[done]` |
|
||||
| GPU cloud pricing comparison (2026) | https://www.spheron.network/blog/gpu-cloud-pricing-comparison-2026/ | `[done]` |
|
||||
| GPU pricing trends 2026 — CompuX | https://compux.net/docs/guides/gpu-pricing-trends-2026 | `[done]` |
|
||||
| AMD MI300X pricing (2026) | https://www.thundercompute.com/blog/amd-mi300x-pricing | `[done]` |
|
||||
| GPU price/performance frontier — Silicon Analysts | https://siliconanalysts.com/tools/frontier | `[done]` |
|
||||
|
||||
## Hardware manufacturers
|
||||
|
||||
| Manufacturer | Server series | Management |
|
||||
|
||||