# 🔌 Server connectivity: network and storage
## Ethernet: network connectivity
### Speeds and form factors
| Speed | Designation | Form factor | Cabling | Standard year | Use case |
|----------|----------|-------------|---------|---------------|----------|
| **1 GbE** | 1000BASE-T | RJ45 (copper) | Cat5e/Cat6 | 1999 | Management, legacy |
| **10 GbE** | 10GBASE-T / SFP+ | RJ45 / SFP+ | Cat6 (55m) / Cat6A (100m) / DAC / SR/LR | 2006 | General-purpose servers, storage |
| **25 GbE** | 25GBASE-R | SFP28 | Cat8 (30m) / DAC (5m) / SR/LR (100m/10km) | 2016 | Server standard (2020+) |
| **40 GbE** | 40GBASE-R | QSFP+ | DAC (7m) / SR (150m) / LR (10km) | 2010 | Legacy, spine |
| **50 GbE** | 50GBASE-R | SFP56 | DAC / SR / LR | 2018 | Emerging server |
| **100 GbE** | 100GBASE-R | QSFP28 | DAC (3m) / SR4 (100m) / LR4 (10km) / PSM4 (500m) | 2015 | Spine, storage, AI |
| **200 GbE** | 200GBASE-R | QSFP56 | DAC / SR4 / DR4 | 2019 | AI/ML, HPC |
| **400 GbE** | 400GBASE-R | QSFP-DD / OSFP | DAC (2.5m) / SR8 (100m) / DR4 (500m) / FR4 (2km) | 2017 | AI training, hyperscale |
| **800 GbE** | 800GBASE-R | QSFP-DD800 / OSFP | DAC (2m) / SR8 (100m) / DR8 (500m) | 2024 | Next-gen AI/ML |
**Server recommendations (2026)**:
- **Standard**: 2× 25 GbE (management + data) or 2× 100 GbE for demanding workloads
- **AI/ML training**: 8× 400 GbE (InfiniBand preferred for GPU-to-GPU communication)
- **Storage**: 2× 25/100 GbE (iSCSI/NFS) or dedicated FC (16/32 Gbps)
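
To confirm which of the speeds above a port actually negotiated, Linux exposes the link rate in sysfs. The following is a minimal sketch, assuming a Linux host; the function name `link_speed_gbps` and the `SYSFS_ROOT` override (there only so the function can be exercised without real hardware) are illustrative and not part of the original notes.

```shell
#!/usr/bin/env bash
# Print the negotiated link speed of an interface in Gb/s.
# Linux reports the value in Mb/s in /sys/class/net/<iface>/speed.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
link_speed_gbps() {
  local iface="$1"
  local file="${SYSFS_ROOT:-/sys}/class/net/${iface}/speed"
  local mbps
  # Reading fails for a missing interface (and on some drivers when the link is down).
  mbps="$(cat "$file" 2>/dev/null)" || { echo "unknown"; return 1; }
  # A link without carrier typically reports -1.
  if [ -z "$mbps" ] || [ "$mbps" -lt 0 ]; then
    echo "down"
    return 1
  fi
  awk -v m="$mbps" 'BEGIN { printf "%g\n", m / 1000 }'
}
```

Calling `link_speed_gbps eth0` on a 25 GbE port that linked correctly prints `25`; a lower value usually points at a cable, transceiver, or switch-port mismatch.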
### NIC form factors
| Form factor | PCIe lanes | Speed | Use case |
|------------|-----------|----------|----------|
| **OCP 3.0** | x8/x16 | 25/100/200 GbE | Modern servers (Dell, HPE), small form factor |
| **PCIe HHHL** | x8 | 25/50 GbE | Standard 1U/2U servers |
| **PCIe FHHL** | x16 | 100/200/400 GbE | GPU servers, high-density |
| **Mezzanine** | x8 | 10/25 GbE | Blade servers (HPE Synergy, Dell MX) |
| **LOM (LAN on Motherboard)** | n/a | 1/10/25 GbE | Integrated, basic connectivity |
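
The lane counts in the table matter because the slot has to carry the aggregate port speed. The sketch below estimates per-direction PCIe bandwidth from the generation and lane count, assuming the standard raw rates (8, 16, and 32 GT/s for Gen3 to Gen5) with 128b/130b encoding and ignoring packet overhead. The function names are illustrative, and PCIe 6.0 is left out because it uses a different encoding.

```shell
#!/usr/bin/env bash
# Usable PCIe bandwidth per direction in Gb/s for a generation and lane count.
# Raw rates per lane: Gen3 = 8 GT/s, Gen4 = 16 GT/s, Gen5 = 32 GT/s,
# all with 128b/130b encoding. TLP/protocol overhead is ignored,
# so real throughput is somewhat lower than the printed value.
pcie_gbps() {
  local gen="$1" lanes="$2" gts
  case "$gen" in
    3) gts=8 ;;
    4) gts=16 ;;
    5) gts=32 ;;
    *) echo "unsupported PCIe generation: $gen" >&2; return 1 ;;
  esac
  awk -v g="$gts" -v l="$lanes" 'BEGIN { printf "%.1f\n", g * l * 128 / 130 }'
}

# Succeeds (exit 0) when the slot can carry the NIC's aggregate speed in Gb/s.
slot_fits_nic() {
  local gen="$1" lanes="$2" nic_gbps="$3" slot
  slot="$(pcie_gbps "$gen" "$lanes")" || return 2
  awk -v s="$slot" -v n="$nic_gbps" 'BEGIN { exit !(s >= n) }'
}
```

For example, `slot_fits_nic 3 16 200` fails: a PCIe 3.0 x16 slot offers roughly 126 Gb/s, so a dual-port 100 GbE adapter in such a slot cannot run both ports at line rate.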
### NIC features
| Feature | Description | Benefit |
|---------|-------|---------|
| **TSO/GRO** | TCP Segmentation Offload / Generic Receive Offload | Lower CPU load for TCP |
| **LRO/LSO** | Large Receive/Send Offload | Legacy counterpart of TSO/GRO |
| **RSS** | Receive Side Scaling | Distributes incoming packets across multiple CPU cores |
| **RPS/RFS** | Receive Packet Steering / Flow Steering | Software RSS, cache affinity |
| **XDP** | eXpress Data Path | BPF-based packet processing (DDoS, load balancer) |
| **RDMA (RoCE v2)** | RDMA over Converged Ethernet | GPU direct communication, storage (NVMe-oF) |
| **iWARP** | RDMA over TCP | RDMA without special switches (higher latency) |
| **DPDK** | Data Plane Development Kit | User-space packet processing (VNF, vSwitch) |
| **VXLAN/NVGRE offload** | HW offload for tunneling | Overlay networking (VMware NSX, OpenStack) |
| **SR-IOV** | Single Root I/O Virtualization | Direct NIC access for VMs (VF), low latency |
| **Flow Bifurcation** | Splits NIC traffic between the kernel and DPDK | Concurrent management and high-speed data path |
| **PTP (IEEE 1588)** | Precision Time Protocol | Financial services, 5G, telco |
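
On Linux, the offload and RSS features from the table are typically inspected and toggled with `ethtool`. The sketch below only prints the commands so they can be reviewed before running them as root. The helper name `nic_tuning_cmds` and the defaults (`eth0`, 8 queues) are placeholders, not recommendations from the original notes.

```shell
#!/usr/bin/env bash
# Print (do not run) the ethtool commands that enable common offloads and
# set the number of RSS queues. Review the output, then run it as root.
nic_tuning_cmds() {
  local iface="${1:-eth0}" queues="${2:-8}"
  # TSO: the NIC segments large TCP sends; GRO: received segments are merged.
  echo "ethtool -K ${iface} tso on gro on"
  # RSS: spread packet processing over <queues> combined RX/TX channels.
  echo "ethtool -L ${iface} combined ${queues}"
  # Show the resulting feature and channel settings.
  echo "ethtool -k ${iface}"
  echo "ethtool -l ${iface}"
}
```

Whether a given feature can be changed depends on the driver; `ethtool -k` marks features the driver does not allow changing as `[fixed]`.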
### NIC selection per workload
| Workload | Recommended NIC | Rationale |
|----------|---------------|------------|
| **Web / API servers** | 2× 25 GbE SFP28, OCP | Low cost, sufficient bandwidth |
| **Virtualization (VMware)** | 2× 25 GbE (SR-IOV, VXLAN offload) | SR-IOV for VMs, VXLAN for NSX |
| **Databases (OLTP)** | 2× 25/100 GbE (RSS, low latency) | Low latency, RSS for CPU scaling |
| **Storage (NFS/iSCSI)** | 2× 25/100 GbE (RoCE v2) | RDMA for NVMe-oF, low latency |
| **Storage (FC SAN)** | 2× 32 Gb FC HBA | SAN for VMware VMFS, block storage |
| **AI/ML training** | 8× 400 GbE + InfiniBand NDR | GPU communication, data ingestion |
| **AI/ML inference** | 4× 100 GbE (RoCE v2) | Model serving, GPU direct |
| **HPC** | InfiniBand NDR 400 Gbps | MPI communication, low latency |
| **Telco / Edge** | 2× 25 GbE (DPDK, PTP) | VNF, 5G UPF, low latency |
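
Several rows above rely on SR-IOV. On Linux, virtual functions are created by writing to the device's sysfs attributes. The following is a sketch under the assumption that SR-IOV is enabled in firmware and supported by the driver; `sriov_set_vfs` and the `SYSFS_ROOT` override (for testing only) are illustrative names.

```shell
#!/usr/bin/env bash
# Create SR-IOV virtual functions (VFs) on a physical NIC via sysfs.
# Needs root on a real system.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
sriov_set_vfs() {
  local dev="${SYSFS_ROOT:-/sys}/class/net/$1/device" want="$2" total
  total="$(cat "$dev/sriov_totalvfs" 2>/dev/null)" || {
    echo "$1: SR-IOV not available" >&2
    return 2
  }
  if [ "$want" -gt "$total" ]; then
    echo "$1: requested ${want} VFs, device supports ${total}" >&2
    return 1
  fi
  # The kernel rejects changing a non-zero VF count directly, so reset to 0 first.
  echo 0 > "$dev/sriov_numvfs"
  echo "$want" > "$dev/sriov_numvfs"
}
```

A call such as `sriov_set_vfs eth0 4` would then expose four VFs that can be passed through to virtual machines.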
---
## Storage connectivity
### Fibre Channel (FC) SAN
| Generation | Speed | Designation | Form factor | Reach (SMF) | Use case |
|----------|----------|----------|-------------|-------------|----------|
| **Gen 5** | 16 Gbps | 16GFC | SFP+ | 10 km | Legacy SAN |
| **Gen 6** | 32 Gbps | 32GFC | SFP28 | 10 km | Current standard |
| **Gen 7** | 64 Gbps | 64GFC | SFP56 | 10 km | Emerging, high-performance |
| **Gen 8** | 128 Gbps | 128GFC | QSFP28 | 10 km | Emerging (first production deployments) |
**HBA (Host Bus Adapter)**:
| Vendor | Model | Speed | PCIe | Ports | Features |
|---------|-------|----------|------|-------|----------|
| **Broadcom / Emulex** | LPe35000 | 32 GFC | PCIe 3.0 x8 | 1-2 | NVMe-FC, T10-PI, SR-IOV |
| **Broadcom / Emulex** | LPe36000 | 64 GFC | PCIe 4.0 x16 | 1-2 | NVMe over FC (FC-NVMe) |
| **Marvell / QLogic** | QLE2770 | 32 GFC | PCIe 3.0 x8 | 1-2 | FC-NVMe, T10-PI |
| **Marvell / QLogic** | QLE2870 | 64 GFC | PCIe 4.0 x8 | 1-2 | NVMe-FC, 64GFC |
**FC SAN topology**:
```
Server ──HBA── FC Switch ──── Storage Array (FC port)
│ │
│ ┌────┴────┐
│ │ Fabric │
│ └─────────┘
──── ISL (Inter-Switch Link) ──── backup fabric (B)
```
**Zoning** (FC):
```
Zone A: Server1_HBA1 + Storage_Port1 (production)
Zone B: Server1_HBA2 + Storage_Port2 (backup fabric)
Zone C: Backup_Server + Storage_Target (backup)
```
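
Zones such as the ones above are usually defined by the WWPN of each HBA port. On a Linux host the WWPNs can be read from sysfs. The sketch below assumes the `fc_host` class is present (an FC HBA driver is loaded); `list_wwpns` and the `SYSFS_ROOT` override (for testing only) are illustrative names.

```shell
#!/usr/bin/env bash
# List the WWPN of every FC HBA port on this host as colon-separated
# byte pairs, the notation commonly used in zoning configuration.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
list_wwpns() {
  local root="${SYSFS_ROOT:-/sys}" f
  for f in "$root"/class/fc_host/host*/port_name; do
    # Skip the unexpanded glob when no FC host exists.
    [ -r "$f" ] || continue
    # port_name looks like 0x10000090fa123456: drop "0x", then insert colons.
    sed -e 's/^0x//' -e 's/../&:/g' -e 's/:$//' "$f"
  done
}
```

The printed values are what would be entered for `Server1_HBA1` and `Server1_HBA2` when creating the zones on each fabric's switch.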
### iSCSI
| Property | iSCSI | Note |
|-----------|-------|----------|
| **Transport** | TCP/IP (port 3260) | Runs over standard Ethernet |
| **Speed** | 1/10/25/100 GbE | Same as the underlying Ethernet |
| **Initiator** | SW (OS) or HW (TOE) | SW initiator is free, ~5-10 % CPU load |
| **Multipathing** | MPIO (Multipath I/O) | Up to 8 paths, active/active or active/passive |
| **CHAP** | Authentication | Mutual CHAP recommended |
| **Jumbo frames** | MTU 9000 recommended | Lower CPU overhead, higher throughput |
| **Use case** | Small and mid-size SANs, backup, DR | Cheaper than FC, lower performance |
**iSCSI configuration**:
```
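# Software initiator (Linux, open-iscsi): discover targets on the portal
iscsiadm -m discovery -t sendtargets -p 10.0.0.100:3260
# Log in to the discovered target (-p pins the portal)
iscsiadm -m node -T iqn.2024-05.storage:array01 -p 10.0.0.100:3260 --login
# Multipath (dm-multipath): create /etc/multipath.conf and start multipathd
mpathconf --enable --with_multipathd y
# Tune /etc/multipath.conf as needed: aliases, failback, rr_min_io
# Verify that all paths are listed
multipath -ll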
```
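
Jumbo frames only help when every hop on the storage path accepts them. A common check is a ping with fragmentation forbidden and a payload sized to the MTU. The sketch below prints the commands instead of running them; the helper names are illustrative, and `ping -M do` assumes the Linux iputils version of `ping`.

```shell
#!/usr/bin/env bash
# Largest ICMP payload that fits in one unfragmented IPv4 packet at a given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
  echo $(( $1 - 20 - 8 ))
}

# Print (do not run) the commands that set the MTU and verify the path.
# -M do forbids fragmentation, so the ping fails if any hop has a smaller MTU.
jumbo_check_cmds() {
  local iface="$1" target="$2" mtu="${3:-9000}"
  echo "ip link set dev ${iface} mtu ${mtu}"
  echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$mtu") ${target}"
}
```

For the portal used above, `jumbo_check_cmds eth1 10.0.0.100` prints a ping with an 8972-byte payload; if that ping fails while a 1472-byte payload succeeds, some device on the path is still at MTU 1500.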
### NVMe-oF (NVMe over Fabrics)
| Transport | Protocol | Latency | CPU overhead | Use case |
|-----------|----------|---------|-------------|----------|
| **NVMe over FC** | FC-NVMe (FC Gen 6/7) | <10 µs | Low | Enterprise SAN, VMware |
| **NVMe over RDMA (RoCE v2)** | RDMA (RoCE) | <5 µs | Very low | AI/ML, HPC, K8s (CSI) |
| **NVMe over TCP** | TCP | ~50 µs | Medium (10-20 % CPU) | Standard Ethernet, no RDMA |
| **NVMe over InfiniBand** | IB RC/UC | <3 µs | Lowest | HPC, AI training |
**NVMe-oF comparison**:
| Property | FC-NVMe | NVMe/RoCE | NVMe/TCP | NVMe/IB |
|-----------|---------|-----------|----------|---------|
| **Latency (target)** | ~8 µs | ~4 µs | ~50 µs | ~3 µs |
| **Bandwidth** | 64 Gbps | 100/200 GbE | 25/100 GbE | NDR 400 Gbps |
| **Requires special HW** | FC HBA + switch | RoCE NIC + DCB switch | Standard NIC | IB HCA + switch |
| **Ecosystem** | Broadcom, Marvell | NVIDIA, Broadcom | OS built-in | NVIDIA Mellanox |
| **Use case** | VMware, enterprise SAN | AI/ML, K8s, HPC | SMB, K8s, cost-effective | HPC, large AI |
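
Of the transports compared above, NVMe/TCP is the one that needs no special hardware, so it is the easiest to try. The sketch below prints the usual `nvme-cli` sequence for discovering and connecting to a target. The helper name is illustrative, and the address and NQN in the usage note are placeholders; port 4420 is the IANA-assigned NVMe-oF port.

```shell
#!/usr/bin/env bash
# Print (do not run) the nvme-cli commands to discover and connect to an
# NVMe/TCP target. Run the printed commands as root after reviewing them.
nvme_tcp_cmds() {
  local addr="$1" nqn="$2" port="${3:-4420}"
  # Load the NVMe/TCP host driver.
  echo "modprobe nvme-tcp"
  # Ask the target which subsystems it exports.
  echo "nvme discover -t tcp -a ${addr} -s ${port}"
  # Connect to one subsystem by its NQN.
  echo "nvme connect -t tcp -a ${addr} -s ${port} -n ${nqn}"
  # The new namespaces appear as /dev/nvmeXnY.
  echo "nvme list"
}
```

A call such as `nvme_tcp_cmds 10.0.0.100 nqn.2024-05.storage:array01` (a hypothetical NQN modeled on the iSCSI example) prints the four commands in order.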
### SAS (Serial Attached SCSI)
| Generation | Speed | Cabling | Reach | Use case |
|----------|----------|---------|-------|----------|
| **SAS 3** | 12 Gbps | SAS cable (SFF-8644) | 6-10 m | Legacy storage, DAS |
| **SAS 4** | 22.5 Gbps | SAS cable (SFF-8644) | 6-10 m | Current standard |
| **SAS 5** | 45 Gbps | SAS cable (SFF-8644) | 6-10 m | Emerging |
**SAS topology**: Server → SAS HBA → SAS expander → SAS disk (point-to-point, not a shared fabric like FC)
---
## Server connectivity — decision matrix
| Workload | Primary | Secondary | Management |
|----------|----------|-----------|------------|
| **Web / API** | 2× 25 GbE (LACP) | — | 1× 1 GbE BMC |
| **Databases** | 2× 25/100 GbE (RSS) | 2× 32 Gb FC (SAN) | 1× 1 GbE BMC |
| **Virtualization** | 4× 25 GbE (SR-IOV) | 2× 32 Gb FC (VMFS) | 1× 1 GbE BMC |
| **Kubernetes** | 2× 25/100 GbE | — | 1× 1 GbE BMC |
| **Storage node** | 2× 100 GbE (RoCE) | 2× 25 GbE (management) | 1× 1 GbE BMC |
| **AI training** | 8× 400 GbE + IB NDR | 4× 100 GbE (storage) | 1× 1 GbE BMC |
| **AI inference** | 4× 100 GbE (RoCE) | 2× 25 GbE (management) | 1× 1 GbE BMC |
| **HPC** | InfiniBand NDR | 2× 100 GbE (storage) | 1× 1 GbE BMC |
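
For provisioning scripts, the primary column of the matrix can be encoded as a simple lookup. This is only a sketch; the function name and the workload keys (`web`, `database`, and so on) are illustrative identifiers chosen here, while the values are copied from the table.

```shell
#!/usr/bin/env bash
# Look up the primary connectivity for a workload, mirroring the matrix above.
# Returns 1 and prints a message on stderr for an unknown workload key.
primary_nic() {
  case "$1" in
    web)            echo "2× 25 GbE (LACP)" ;;
    database)       echo "2× 25/100 GbE (RSS)" ;;
    virtualization) echo "4× 25 GbE (SR-IOV)" ;;
    kubernetes)     echo "2× 25/100 GbE" ;;
    storage)        echo "2× 100 GbE (RoCE)" ;;
    ai-training)    echo "8× 400 GbE + IB NDR" ;;
    ai-inference)   echo "4× 100 GbE (RoCE)" ;;
    hpc)            echo "InfiniBand NDR" ;;
    *) echo "unknown workload: $1" >&2; return 1 ;;
  esac
}
```

Keeping the mapping in one function means a change to the matrix only has to be mirrored in a single place.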
---
## Server NIC placement (PCIe slot optimization)
```
2U Server (GPU/AI):
┌─────────────────────────────────────────────────┐
│ PCIe 0: GPU (x16) — NVLink / InfiniBand (x16) │
│ PCIe 1: GPU (x16) — NIC 100 GbE (x16) │
│ PCIe 2: GPU (x16) │
│ PCIe 3: GPU (x16) │
│ PCIe 4: GPU (x16) │
│ PCIe 5: GPU (x16) — NIC 100 GbE (x16) │
│ PCIe 6: Storage HBA / NIC (x8) │
│ PCIe 7: Management / OCP (x8) │
└─────────────────────────────────────────────────┘
1U Standard:
┌─────────────────────────────────┐
│ OCP: 2× 25 GbE (management) │
│ PCIe 0: NIC 25 GbE (x8) │
│ PCIe 1: Storage HBA / FC (x8) │
│ PCIe 2: GPU (x16, optional) │
│ PCIe 3: NVMe (x4, M.2) │
└─────────────────────────────────┘
```
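
Slot placement is worth verifying after installation, because a x16 card in a slot wired as x8 trains at the narrower width without an error. Linux exposes the negotiated and maximum link width of each PCI device in sysfs. The sketch below compares the two; `pcie_link_check` and the `SYSFS_ROOT` override (for testing only) are illustrative names.

```shell
#!/usr/bin/env bash
# Report whether a NIC's PCIe link trained at the width the device supports.
# Exit codes: 0 = full width, 1 = degraded, 2 = attributes not readable.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
pcie_link_check() {
  local dev="${SYSFS_ROOT:-/sys}/class/net/$1/device" cur max
  cur="$(cat "$dev/current_link_width" 2>/dev/null)" || return 2
  max="$(cat "$dev/max_link_width" 2>/dev/null)" || return 2
  if [ "$cur" -lt "$max" ]; then
    echo "$1: degraded, x${cur} of x${max}"
    return 1
  fi
  echo "$1: ok, x${cur}"
}
```

The same directory also contains `current_link_speed` and `max_link_speed`, which can be compared the same way to catch a card that negotiated an older PCIe generation.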
### NVIDIA Mellanox ConnectX NICs
NVIDIA Mellanox is a leading vendor of NIC adapters for AI/HPC and cloud data centers.
| Model | PCIe | Max speed | Form factor | Key features |
|-------|------|-------------|-------------|------------------|
| **ConnectX-5** | PCIe 3.0 x16 | 100 GbE (dual) | HHHL | RoCE, NVMe-oF target offload, MPI offload |
| **ConnectX-6 Dx** | PCIe 4.0 x16 | 200 GbE (1-port) / 100 GbE (2-port) | HHHL, OCP 3.0 | ASAP² vSwitch offload, IPsec/TLS inline crypto, AES-XTS, 215 Mpps DPDK |
| **ConnectX-6 Lx** | PCIe 4.0 x8 | 25 GbE (dual) | HHHL, OCP 3.0 | RoCE, Secure Boot, low-power |
| **ConnectX-7** | PCIe 5.0 x16 | 400 GbE (1-port) / 200 GbE (2-port) | HHHL | NDR InfiniBand + 400GbE, GPUDirect, SHARP |
| **ConnectX-8** | PCIe 6.0 x16 | 800 GbE (1-port) / 400 GbE (2-port) | HHHL | XDR InfiniBand, sub-500ns latency, in-network computing, multi-host |
**Platforms**: Spectrum-X Ethernet (end-to-end AI networking), Quantum InfiniBand, BlueField DPU.
### Broadcom Emulex FC HBA
| Model | Speed | PCIe | Ports | Features |
|-------|----------|------|-------|----------|
| **LPe35000** (Gen 7) | 32 GFC | PCIe 3.0 x8 | 1-2 | NVMe-FC, T10-PI (DIF), SR-IOV, Silicon Root of Trust |
| **LPe35002** (Gen 7) | 32 GFC | PCIe 3.0 x8 | 2 | NVMe-FC, Secure Boot, digitally signed firmware |
| **LPe36000** (Gen 7) | 64 GFC | PCIe 4.0 x16 | 1-2 | First 64GFC HBA on the market, 10M IOPS, 3× lower latency than Gen 6 |
**Key properties**: NVMe over FC support, T10 DIF (Data Integrity Field), 10M-hour MTBF, NIST SP 800-193 compliant. Gen 7 delivers up to 10M IOPS and 3× lower latency than Gen 6.
### NVMe-oF specification
NVMe over Fabrics (NVMe-oF) extends the NVMe protocol from local PCIe to network transports. The first specification, 1.0, was released in June 2016; NVMe-oF is now part of NVMe 2.3 (August 2025). Supported transports:
| Transport | Specification | Use case |
|-----------|------------|----------|
| **NVMe over PCIe** | NVMe Base | Local NVMe SSDs |
| **NVMe over RDMA** (RoCE, InfiniBand, iWARP) | NVMe Transport | AI/ML, HPC, lowest latency <5 µs |
| **NVMe over TCP** | NVMe Transport | Standard Ethernet, no RDMA, latency ~50 µs |
| **NVMe over FC** (FC-NVMe) | INCITS T11 | Enterprise SAN, FC fabric |
The NVMe 2.x specification family also covers the Computational Programs command set, Subsystem Local Memory (SLM), and Zoned Namespaces (ZNS). NVMe-MI defines the management interface.
### Dell PowerEdge R760 — NIC placement
The Dell R760 server supports:
- **OCP 3.0** adapters (up to 2×): 1/10/25/100 GbE
- **PCIe Gen5** slots: 8 slots (6× FHHL + 2× LP)
- **LOM**: 2× 1 GbE Broadcom 5720 on the motherboard
- Maximum NIC speed: 100 GbE (QSFP56)
- Supported port types: RJ45, SFP+, SFP28, QSFP28, QSFP56
Recommended configurations:
- Standard: OCP 3.0 2× 25 GbE + PCIe storage HBA
- AI/ML: PCIe 100 GbE (riser config 1, slot 1-2) + GPUs in the remaining slots
### HPE Gen11 NIC options
HPE ProLiant Gen11 (DL360/DL380) supports:
- **OCP 3.0** slots (up to 2): 10/25/100/200 GbE (Broadcom, Intel, NVIDIA Mellanox)
- **PCIe Gen5** adapters: 8 slots (DL380) / 3 slots (DL360)
- **iLO 6** dedicated management port (1 GbE)
- Supported NICs: Broadcom BCM57412 (10GbE), BCM57504 (25GbE), NVIDIA ConnectX-6 Dx (100GbE)
## Sources
Links, books, and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)
### Recommended reading
| Book | Authors | ISBN | Description |
|-------|--------|------|-------|
| AI Data Center Network Design and Technologies (1st ed., 2026) | Subramaniam, Styszynski, Tambakuwala | 978-0-13-543628-8 | The first vendor-agnostic guide to network design for AI training and inference. Covers high-radix fabrics, lossless Ethernet/IP, UEC technologies, and cooling and power for AI clusters. The authors are from HPE Juniper Networking. |
*Last revised: 2026-06-03*