knowledge-base/SERVER-HW.en.md
Stanislav Hubacek 3fa11ef0f6 comiiit
2026-06-11 15:27:28 +02:00


🔧 Server hardware — components and architecture

Form factors

| Type | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Rack (1U/2U/4U) | Standard rack mount, 19" width | Wide range of configurations, easy replacement | Limited PCIe slots in 1U |
| Blade | Modular server in a shared chassis (HPE Synergy, Dell MX) | High density, shared power/cooling | Vendor lock-in, higher chassis cost |
| Tower | Standalone cabinet | Quiet, expandable | Takes space, not rack-optimized |
| Edge / Micro | Small, low power, industrial design | Environmental resistance, low consumption | Limited performance, fewer PCIe slots |

Processors (CPU)

Intel Xeon vs AMD EPYC

| Feature | Intel Xeon (6th gen, Granite Rapids) | AMD EPYC (5th gen, Turin) |
| --- | --- | --- |
| Max cores | 128 (P-cores) | 192 (Zen 5c) / 128 (Zen 5) |
| PCIe lanes | 80-96 per socket | 128 per socket |
| Memory channels | 8 (6700P) / 12 (6900P), DDR5 | 12 (DDR5) |
| Max memory | 4 TB | 6 TB+ |
| L3 cache | up to 504 MB | up to 384 MB (Zen 5c) / 512 MB (Zen 5) |
| AVX-512 | Yes (full width) | Yes (native 512-bit datapath; Zen 4 used 256-bit double-pumping) |
| AMX (matrix) | Yes (Intel AMX) | No |
| TDP | 350-500 W | 360-500 W |
| Infrastructure | Intel QuickAssist, DSA, IAA | AMD Infinity Architecture |
| Use case | AI inference, networking, HPC | Virtualization, databases, general purpose |

CPU selection guide

| Workload | Recommended CPU | Rationale |
| --- | --- | --- |
| Database (OLTP) | EPYC (high core count, more memory channels) | More PCIe lanes for NVMe, higher memory bandwidth |
| Database (OLAP/DW) | Xeon (AVX-512, AMX) | Vector instructions for analytical queries |
| Virtualization | EPYC (more cores, lower TCO) | Higher core density, lower price per core |
| HPC / AI training | Xeon + GPU (AMX for preprocessing) | AMX for data preprocessing, GPU for training |
| Web / API servers | EPYC (good perf/core, low TDP variants) | Good performance/W ratio |
| Storage | EPYC (128 PCIe lanes for NVMe) | Maximum NVMe drives |

Memory (RAM)

DIMM types

| Type | Description | Use case | Server support |
| --- | --- | --- | --- |
| RDIMM (Registered) | Address/command lines buffered by one register (RCD) | Standard server memory | All servers |
| LRDIMM (Load-Reduced) | Reduced electrical load: address/command and data lines are both buffered | High-capacity configurations (more DIMMs per channel) | Enterprise, 4R+ |
| NVDIMM (Non-Volatile) | Battery-backed DRAM + flash | Write cache, metadata, persistence | Legacy (Intel Optane PMEM) |
| 3D XPoint / Optane | PCM-based persistence (discontinued by Intel) | Legacy | Intel-only, discontinued |

DDR5 vs DDR4 key differences

| Feature | DDR4 | DDR5 |
| --- | --- | --- |
| Channel architecture | 1× 64-bit channel per DIMM | 2× 32-bit sub-channels per DIMM |
| Bank groups | 4 (single rank) | 8 (single rank) |
| Burst length | 8 (BL8) | 16 (BL16) |
| On-die ECC | No | Yes (corrects bit errors inside the DRAM chip) |
| PMIC | On motherboard | On DIMM (power management IC) |
| VDD | 1.2 V | 1.1 V |
| RCD | 1× RCD per DIMM | 1× RCD per DIMM, driving both sub-channels |
| Max DIMM capacity | 256 GB (3DS LRDIMM) | 256 GB (RDIMM 3DS) |
| Max speed | 3200 MT/s | 6400 MT/s (currently 4800-5600) |

Memory rank — detail

Rank = the set of DRAM chips on a DIMM that are accessed simultaneously (64-bit data + 8-bit ECC).

| Rank | Number of DRAM chips (x8) | DIMM capacity (typ.) | Description |
| --- | --- | --- | --- |
| Single Rank (1R) | 8-9 | 8-32 GB | All DRAM chips in one rank |
| Dual Rank (2R) | 16-18 | 16-128 GB | Two ranks, rank interleaving |
| Quad Rank (4R) | 32-36 | 64-256 GB (3DS) | Four ranks, higher capacity |
| Octa Rank (8R) | 64-72 | 256 GB (3DS) | Highest capacity, enterprise |

Rank interleaving: A dual-rank DIMM can address its two ranks alternately, which raises effective bandwidth (typically 5-15 % over single-rank at the same speed).

DDR5 rank vs DDR4: A DDR5 single-rank DIMM already contains 8 bank groups (equivalent to dual-rank DDR4), so moving from single-rank to dual-rank gains less on DDR5 than on DDR4.

Rule: Prefer dual-rank DIMMs over single-rank for higher density and bandwidth. Quad-rank and octa-rank DIMMs are available only as LRDIMM or 3DS.
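The chip counts in the rank table follow from the bus width. A minimal sketch, assuming the 64-bit data + 8-bit ECC model stated above; the function names are illustrative only:

```python
def chips_per_rank(device_width: int, ecc: bool = True) -> int:
    """DRAM chips needed to fill one rank (64 data bits, plus 8 ECC bits if requested)."""
    if device_width not in (4, 8, 16):
        raise ValueError("device width must be 4, 8 or 16 bits")
    bus_bits = 64 + (8 if ecc else 0)
    if bus_bits % device_width:
        raise ValueError("this device width does not divide the bus evenly")
    return bus_bits // device_width


def chips_per_dimm(ranks: int, device_width: int, ecc: bool = True) -> int:
    """Total DRAM chips on a DIMM with the given number of ranks."""
    return ranks * chips_per_rank(device_width, ecc)


if __name__ == "__main__":
    for ranks in (1, 2, 4, 8):
        print(ranks, "rank(s), x8:", chips_per_dimm(ranks, 8, ecc=False), "to", chips_per_dimm(ranks, 8))
```

With x8 devices this reproduces the 8-9, 16-18, 32-36 and 64-72 ranges from the table; x4 devices need twice as many chips per rank.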

DIMM population — basic rules

1DPC vs 2DPC (DIMMs Per Channel)

| Configuration | DIMMs per channel | Max DDR5 speed | Bandwidth | Capacity |
| --- | --- | --- | --- | --- |
| 1DPC | 1 | 4800-5600 MT/s | 100 % | Lower |
| 2DPC | 2 | 4000-4400 MT/s | ~80 % | Higher |

Important: Populating 2 DIMMs per channel reduces memory speed. E.g. Dell R760:

  • 1DPC: 5600 MT/s (with 5th Gen Xeon)
  • 2DPC: 4400 MT/s (always)

Channel architecture (Intel Xeon 4th/5th Gen — 8 channels per CPU)

CPU 1 — Channel A  [Slot A1 (white)] [Slot A9 (black)]    1DPC: populate white slots
      ─ Channel B  [Slot A7 (white)] [Slot A15 (black)]   2DPC: populate white + black
      ─ Channel C  [Slot A3 (white)] [Slot A11 (black)]
      ─ Channel D  [Slot A5 (white)] [Slot A13 (black)]
      ─ Channel E  [Slot A4 (white)] [Slot A12 (black)]
      ─ Channel F  [Slot A6 (white)] [Slot A14 (black)]
      ─ Channel G  [Slot A2 (white)] [Slot A10 (black)]
      ─ Channel H  [Slot A8 (white)] [Slot A16 (black)]

Channel architecture (AMD EPYC — 12 channels per CPU)

CPU 1 ─ Channel 0-11 (12× single channel, 2 DPC)
       Slot A0 (P0) / Slot A1 (P1) — per specific server model

AMD EPYC has 12 memory channels per CPU (vs 8 on 4th/5th Gen Intel Xeon), giving 50 % higher theoretical memory bandwidth at the same DIMM speed.
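The theoretical figure can be checked with simple arithmetic: each DDR5 channel moves 8 bytes (64 data bits) per transfer. The sketch below is a peak estimate only and ignores real-world efficiency; the function name is illustrative.

```python
def peak_bandwidth_gbs(channels: int, speed_mts: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth per CPU in GB/s (decimal)."""
    return channels * speed_mts * bytes_per_transfer / 1000


if __name__ == "__main__":
    intel = peak_bandwidth_gbs(8, 5600)    # 8 channels at 5600 MT/s
    amd = peak_bandwidth_gbs(12, 5600)     # 12 channels at the same speed
    print(f"8 ch:  {intel:.1f} GB/s")
    print(f"12 ch: {amd:.1f} GB/s ({amd / intel:.2f}x)")
```

At equal DIMM speed the ratio is exactly 12/8 = 1.5; a higher supported speed on either platform shifts the comparison further.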

Population rules by vendor

Dell PowerEdge (R660 / R760)

| Number of DIMMs per CPU | 1DPC (white slots) | 2DPC (white + black) | Speed |
| --- | --- | --- | --- |
| 1 DIMM per CPU | A1 (Channel A) | - | 5600 MT/s |
| 2 DIMMs per CPU | A1, A7 | - | 5600 MT/s |
| 4 DIMMs per CPU | A1, A7, A3, A5 | - | 5600 MT/s |
| 8 DIMMs per CPU | A1-A8 (all white) | - | 5600 MT/s |
| 16 DIMMs per CPU | A1-A8 (white) | A9-A16 (black) | 4400 MT/s |

Key Dell rules:

  1. All DIMMs must be DDR5 (do not mix generations)
  2. Do not mix DIMM capacities (all identical)
  3. Do not mix x4 and x8 DRAM chips
  4. Do not mix 3DS and non-3DS RDIMM
  5. If mixing DIMM speeds, all run at the lowest
  6. Balance capacity across processors
  7. Optimal configuration: 16× identical DIMMs in a dual-socket server (8 per CPU, 1DPC on each channel)
  8. Fault Resilient Memory (FRM): only 8 or 16 DIMMs per processor
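Several of these rules are mechanical and can be checked before ordering parts. The following is a rough sketch covering rules 2 to 5 only; the `Dimm` record and `check_population` are hypothetical names, not part of any Dell tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Dimm:
    capacity_gb: int
    speed_mts: int
    device_width: int  # 4 for x4 DRAM chips, 8 for x8
    is_3ds: bool


def check_population(dimms: list[Dimm]) -> tuple[list[str], int | None]:
    """Return (rule violations, effective speed in MT/s) for a set of DIMMs."""
    if not dimms:
        return ["no DIMMs installed"], None
    problems = []
    if len({d.capacity_gb for d in dimms}) > 1:
        problems.append("mixed DIMM capacities")
    if len({d.device_width for d in dimms}) > 1:
        problems.append("mixed x4 and x8 DRAM chips")
    if len({d.is_3ds for d in dimms}) > 1:
        problems.append("mixed 3DS and non-3DS RDIMM")
    # Rule 5: mixed speeds are allowed, but everything runs at the slowest DIMM.
    return problems, min(d.speed_mts for d in dimms)


if __name__ == "__main__":
    good = [Dimm(64, 5600, 4, False)] * 8
    bad = good[:7] + [Dimm(32, 4800, 8, False)]
    print(check_population(good))
    print(check_population(bad))
```

Slot placement and the platform's 1DPC/2DPC speed limit are not modelled here; the returned speed reflects only the slowest installed DIMM.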

HPE ProLiant (DL360 / DL380 Gen11)

Population order (16 slots per CPU, Intel):

| DIMMs | Population order (slot numbers) |
| --- | --- |
| 1 | 10 |
| 2 | 1, 3 |
| 4 | 1, 3, 7, 10 |
| 6 | 3, 5, 7, 10, 14, 16 |
| 8 | 1, 3, 5, 7, 10, 12, 14, 16 |
| 12 | 1, 2, 3, 5, 6, 7, 10, 11, 12, 14, 15, 16 |
| 16 | 1-16 |

HPE SmartMemory rules:

  1. Most qualified configuration: 1DPC (white slots)
  2. 2DPC (black slots) only after populating all white
  3. HBM + 4th Gen Intel: does not support Hemi (hemisphere) and SGX
  4. Heterogeneous mix: higher rank count into white slots
  5. Do not mix: 3DS with non-3DS, x4 with x8, different ranks in channel, 16 Gb / 24 Gb / 32 Gb DRAM

HPE Gen11/Gen12 with AMD EPYC 9005 (a50012817enw)

AMD EPYC 9005 (Turin) delivers 12 memory channels per CPU and supports DDR5-6400.

| Feature | Detail |
| --- | --- |
| Memory channels | 12 per CPU (vs 8 on Intel) |
| Max DIMM slots | 24 per CPU (2 DPC) |
| Max speed | DDR5-6400 (1 DPC), DDR5-4800 to 5600 (2 DPC) |
| Max capacity | 6 TB (24× 256 GB 3DS RDIMM, 2 DPC) |
| DIMM types | RDIMM (1R/2R/4R/8R), 3DS RDIMM, LRDIMM |
| Population | 1 DPC (white slots): 12 DIMMs, full speed; 2 DPC: 24 DIMMs, reduced speed |
| Optimum | 12× identical DIMMs (1 DPC on each channel) = max bandwidth |

Rules for AMD EPYC 9005:

  1. Populate with equal capacities within a channel
  2. 1 DPC = full speed 6400 MT/s, 2 DPC = lower speed
  3. For optimal bandwidth: 12 DIMMs (1DPC) per CPU — all 12 channels utilized
  4. Maximum capacity: 24 DIMMs (2DPC) — 24× 256 GB = 6 TB per CPU
  5. Do not mix RDIMM and LRDIMM in the same system

Memory population — decision flow

How many DIMMs per CPU?
│
├── 1 DIMM → Channel A (slot 1), losing 87.5 % bandwidth
│
├── 2 DIMMs → Channels A+B, still losing 75 % bandwidth
│
├── 4 DIMMs → Channels A,B,C,D, better but not optimal
│
├── 8 DIMMs → 1DPC on all channels = MAX SPEED (5600 MT/s)
│             ✅ Recommended for performance
│
├── 12 DIMMs → 4× 1DPC + 4× 2DPC = mixed population (4400 MT/s)
│
├── 16 DIMMs → 2DPC on all channels = MAX CAPACITY (4400 MT/s)
│              ✅ For capacity-intensive workloads
│
└── More than 16 → LRDIMM / 3DS only, speed penalty

Conclusion: 8 DIMMs per CPU (1DPC) = highest performance
            16 DIMMs per CPU (2DPC) = highest capacity
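The decision flow can be expressed as a small helper. This is a sketch under the assumptions used in the flow itself: an 8-channel CPU, 5600 MT/s at 1DPC and 4400 MT/s as soon as any channel runs 2DPC. The function name and the returned keys are illustrative.

```python
def classify_population(dimms_per_cpu: int, channels: int = 8,
                        speed_1dpc: int = 5600, speed_2dpc: int = 4400) -> dict:
    """Summarise channel usage and memory speed for a DIMM count per CPU."""
    if not 1 <= dimms_per_cpu <= 2 * channels:
        raise ValueError(f"expected 1 to {2 * channels} DIMMs per CPU")
    populated = min(dimms_per_cpu, channels)       # channels with at least one DIMM
    at_2dpc = max(0, dimms_per_cpu - channels)     # channels carrying a second DIMM
    return {
        "channels_populated": populated,
        "channels_at_2dpc": at_2dpc,
        "speed_mts": speed_2dpc if at_2dpc else speed_1dpc,
        # Bandwidth lost purely because channels are left empty.
        "bandwidth_lost": 1 - populated / channels,
    }


if __name__ == "__main__":
    for count in (1, 2, 4, 8, 12, 16):
        print(count, classify_population(count))
```

For 1 and 2 DIMMs the helper reports 87.5 % and 75 % of the channel bandwidth unused, matching the flow; it does not model rank interleaving or unbalanced-channel effects.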

Impact of configuration on performance

| Configuration | Relative bandwidth | Latency | Use case |
| --- | --- | --- | --- |
| 1DPC, 8 ch, 5600 MT/s (8 DIMM) | 100 % | Lowest | OLTP databases, HPC, real-time |
| 2DPC, 8 ch, 4400 MT/s (16 DIMM) | ~78 % | +10-15 % | Virtualization, VDI, in-memory DB |
| Mixed 1+2DPC (12 DIMM) | ~85 % | Medium | Capacity/performance compromise |
| Unbalanced channels | 50-70 % | High | Avoid |

Vendor recommendations:

  • Dell: 16× identical DIMMs (8 per CPU), 1DPC, 5600 MT/s = optimal performance
  • HPE Intel: Always populate white slots first, 1DPC for max performance, 2DPC for max capacity
  • HPE AMD EPYC 9005: 12 channels per CPU, 1DPC = 12 DIMMs per CPU at 6400 MT/s (max bandwidth); 2DPC = 24 DIMMs per CPU (max capacity 6 TB)
  • Supermicro: Consult specific manual for the given model (DSG, GPU, storage)
  • Lenovo: Follows the rules of the underlying Intel/AMD platform; prefer 1DPC

Memory sizing per workload

| Workload | RAM/core ratio | Typical pool | Recommended configuration |
| --- | --- | --- | --- |
| Database (OLTP) | 8-16 GB/core, DB in RAM | 256 GB - 2 TB | 8× 32-64 GB RDIMM, 1DPC |
| Database (OLAP) | 16-64 GB/core, columnstore | 512 GB - 4 TB+ | 16× 64-128 GB RDIMM, 2DPC |
| Virtualization (VM) | 4-8 GB/core, per VM density | 256 GB - 2 TB | 8-16× 32-64 GB RDIMM |
| Kubernetes (general) | 2-4 GB/core | 64-256 GB | 8× 16-32 GB RDIMM, 1DPC |
| AI training (CPU preprocessing) | 2-4 GB/core | 128-512 GB | 8× 32-64 GB RDIMM, 1DPC |
| HPC | 1-2 GB/core | 64-128 GB | 8× 16 GB RDIMM, 1DPC, high-speed |
| In-memory DB (SAP HANA) | 8-32 GB/core | 1-6 TB+ | 16× 128-256 GB LRDIMM/3DS |
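A sizing pass along these lines can be sketched as follows: multiply cores by the GB/core ratio, then pick the smallest DIMM that reaches the target at 1DPC, falling back to 2DPC only when needed. The list of DIMM sizes and the function name are assumptions for illustration; values are per CPU.

```python
STANDARD_DIMM_GB = (16, 32, 64, 96, 128, 256)  # assumed commonly available RDIMM sizes


def size_memory(cores: int, gb_per_core: float, channels: int = 8) -> dict:
    """Pick a DIMM count and size per CPU for a cores x GB/core target."""
    target_gb = cores * gb_per_core
    for dpc in (1, 2):                      # prefer 1DPC, fall back to 2DPC
        dimms = channels * dpc
        for size in STANDARD_DIMM_GB:
            if dimms * size >= target_gb:
                return {"target_gb": target_gb, "dimms": dimms, "dimm_gb": size,
                        "installed_gb": dimms * size, "dpc": dpc}
    raise ValueError("target exceeds 2DPC with the largest listed DIMM")


if __name__ == "__main__":
    print(size_memory(64, 8))    # OLTP-style ratio
    print(size_memory(64, 64))   # OLAP-style ratio, forces 2DPC
```

For 64 cores at 8 GB/core this lands on 8× 64 GB at 1DPC, in line with the OLTP row; real sizing should also account for hypervisor or OS overhead and growth.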

PCIe

| Generation | Year (spec) | Speed per lane | x16 throughput | x24 (GPU) |
| --- | --- | --- | --- | --- |
| PCIe 3.0 | 2010 | 985 MB/s | 15.8 GB/s | 23.6 GB/s |
| PCIe 4.0 | 2017 | 1.97 GB/s | 31.5 GB/s | 47.3 GB/s |
| PCIe 5.0 | 2019 | 3.94 GB/s | 63 GB/s | 94.5 GB/s |
| PCIe 6.0 | 2022 | 7.88 GB/s | 126 GB/s | 189 GB/s |
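The per-lane figures for PCIe 3.0 to 5.0 follow from the transfer rate and the 128b/130b line encoding. A minimal sketch of that arithmetic (PCIe 6.0 is left out because it uses a different, FLIT-based encoding); the names are illustrative:

```python
TRANSFER_RATE_GT_S = {"3.0": 8, "4.0": 16, "5.0": 32}  # giga-transfers per second per lane


def lane_throughput_gbs(generation: str) -> float:
    """Usable GB/s per lane, one direction, after 128b/130b encoding overhead."""
    return TRANSFER_RATE_GT_S[generation] * (128 / 130) / 8


def link_throughput_gbs(generation: str, lanes: int) -> float:
    """Usable GB/s for a link of the given width, one direction."""
    return lane_throughput_gbs(generation) * lanes


if __name__ == "__main__":
    for gen in TRANSFER_RATE_GT_S:
        print(f"PCIe {gen}: {lane_throughput_gbs(gen):.3f} GB/s per lane, "
              f"{link_throughput_gbs(gen, 16):.1f} GB/s at x16")
```

Protocol overhead (TLP headers, flow control) reduces the achievable payload rate further, so these are upper bounds.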

PCIe lane allocation:

  • GPU (x16): NVIDIA H100, AMD MI300X
  • NVMe U.2 (x4): each NVMe drive
  • NIC 100 GbE (x16): dual-port 100 GbE
  • RAID/HBA (x8): storage controller

CPU PCIe lane count:

  • Intel Xeon Scalable (4th gen): 64-80 lanes per socket
  • AMD EPYC (4th gen Genoa): 128 lanes per socket
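  • Dual-socket EPYC: 128-160 usable lanes in total (not 2× 128), because part of each socket's lanes carries the inter-socket xGMI links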
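A quick lane budget shows whether a planned build fits the CPU. The sketch below uses the per-device widths from the allocation list above; the device keys and function names are hypothetical.

```python
LANES_PER_DEVICE = {"gpu": 16, "nvme": 4, "nic_100gbe": 16, "hba": 8}


def lanes_required(devices: dict[str, int]) -> int:
    """Total PCIe lanes needed for a {device type: count} plan."""
    return sum(LANES_PER_DEVICE[kind] * count for kind, count in devices.items())


def fits(devices: dict[str, int], available_lanes: int) -> bool:
    """True if the plan fits without PCIe switches or reduced link widths."""
    return lanes_required(devices) <= available_lanes


if __name__ == "__main__":
    plan = {"gpu": 4, "nvme": 8, "nic_100gbe": 2, "hba": 1}
    print(lanes_required(plan), "lanes needed; fits in 128:", fits(plan, 128))
```

A plan that does not fit usually means adding a PCIe switch, running some devices at a narrower link, or moving to a dual-socket system.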

NUMA

Topology

Socket 0 (NUMA node 0)              Socket 1 (NUMA node 1)
    ├── Cores 0-31                      ├── Cores 32-63
    ├── Memory 0-256 GB                 ├── Memory 256-512 GB
    ├── PCIe root complex (GPU, NVMe)   ├── PCIe root complex (NIC, NVMe)
    └── I/O hub                        └── I/O hub
               │                           │
               └───────── Infinity Fabric / UPI ──┘
  • Local access — CPU → own memory (low latency, full bandwidth)
  • Remote access — CPU → second socket memory (roughly 1.5-2× the latency, lower bandwidth)
  • NUMA-aware applications: databases, VMs, DPDK, AI training
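To see which cores belong to which NUMA node on a running Linux host, the kernel exposes a `cpulist` file per node under `/sys/devices/system/node`. A minimal sketch that reads it; it assumes Linux sysfs and returns an empty mapping elsewhere, and the function names are illustrative.

```python
from __future__ import annotations

from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a kernel CPU list such as '0-3,8,10-11' into individual CPU numbers."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_numa_nodes(root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map NUMA node number to its CPUs; empty if sysfs is not available."""
    base = Path(root)
    nodes: dict[int, list[int]] = {}
    if not base.is_dir():
        return nodes
    for node_dir in sorted(base.glob("node[0-9]*")):
        nodes[int(node_dir.name[4:])] = parse_cpulist((node_dir / "cpulist").read_text())
    return nodes


if __name__ == "__main__":
    for node, cpus in read_numa_nodes().items():
        print(f"node {node}: {len(cpus)} CPUs, first {cpus[:4]}")
```

The resulting mapping can feed CPU pinning for databases or VMs so that threads stay next to their local memory.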

Cross-NUMA penalty

| CPU | Local latency | Remote latency | Penalty |
| --- | --- | --- | --- |
| AMD EPYC (Genoa) | ~80 ns | ~150 ns | ~1.9× |
| Intel Xeon (Sapphire Rapids) | ~90 ns | ~160 ns | ~1.8× |

TDP and cooling

| CPU | TDP | Core count | Cooling |
| --- | --- | --- | --- |
| Intel Xeon Platinum 8480+ | 350 W | 56 | Air (high-performance) |
| Intel Xeon 6980P (Granite Rapids) | 500 W | 128 | Liquid recommended |
| AMD EPYC 9654 (Genoa) | 360 W | 96 | Air / Liquid |
| AMD EPYC 9965 (Turin) | 500 W | 192 | Liquid recommended |

Cooling requirements per rack density

| Rack density | kW/rack | Cooling |
| --- | --- | --- |
| Low | 1-5 kW | Free air cooling |
| Medium | 5-15 kW | CRAC/CRAH, hot/cold aisle |
| High | 15-40 kW | In-row cooling, rear-door HX |
| Ultra | 40-100+ kW | Direct-to-chip liquid, immersion |
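The table can be turned into a lookup for capacity planning. A small sketch, assuming each band's upper bound is inclusive; the per-server wattage in the example is a made-up planning figure, not a measured value.

```python
def rack_power_kw(servers: int, watts_per_server: float) -> float:
    """Total IT load of a rack in kW."""
    return servers * watts_per_server / 1000


def cooling_for_rack(kw: float) -> str:
    """Map rack power density to the cooling approach from the table above."""
    if kw <= 0:
        raise ValueError("rack power must be positive")
    if kw <= 5:
        return "Free air cooling"
    if kw <= 15:
        return "CRAC/CRAH, hot/cold aisle"
    if kw <= 40:
        return "In-row cooling, rear-door HX"
    return "Direct-to-chip liquid, immersion"


if __name__ == "__main__":
    load = rack_power_kw(servers=20, watts_per_server=1200)  # hypothetical 1.2 kW per server
    print(f"{load:.0f} kW per rack -> {cooling_for_rack(load)}")
```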

BMC and management

| Vendor | BMC | API | Remote console | Features |
| --- | --- | --- | --- | --- |
| Dell | iDRAC (9/10) | Redfish, RACADM | Virtual Console (HTML5) | Lifecycle Controller, SUU |
| HPE | iLO (5/6) | Redfish, iLOREST | Integrated Remote Console | Smart Update Manager (SUM) |
| Supermicro | BMC / IPMI | IPMI, Redfish | IPMIView, HTML5 KVM | SuperDoctor, SSM |
| Lenovo | XClarity Controller | Redfish, IPMI | Remote Console | XClarity Administrator |
| Cisco | CIMC / UCSM | Redfish, XML API | KVM Console | UCS Manager, Intersight |

Standard functions

  • Power: on/off/cycle/reset
  • Boot: one-shot PXE, CD-ROM redirect, BIOS setup
  • Monitoring: sensors (temp, voltage, fan, PSU)
  • Alerting: SNMP traps, email, Redfish events
  • Remote media: ISO mount over network
  • Serial over LAN (SOL)
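Since every vendor in the table exposes Redfish, power control can be scripted the same way across BMCs. The sketch below only builds the standard `ComputerSystem.Reset` request without sending it; the host name and system ID are placeholders (the ID differs per vendor and is listed under `/redfish/v1/Systems`), and authentication and TLS handling are deliberately left out.

```python
import json
from urllib.request import Request

# Common ResetType values from the Redfish schema; a given BMC may allow only a subset.
RESET_TYPES = {"On", "ForceOff", "GracefulShutdown", "GracefulRestart", "ForceRestart", "PowerCycle"}


def build_reset_request(bmc_host: str, system_id: str, reset_type: str) -> Request:
    """Build (but do not send) a Redfish power-control request."""
    if reset_type not in RESET_TYPES:
        raise ValueError(f"unsupported ResetType: {reset_type}")
    url = f"https://{bmc_host}/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    body = json.dumps({"ResetType": reset_type}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


if __name__ == "__main__":
    # "bmc.example.net" and "1" are placeholders for illustration.
    req = build_reset_request("bmc.example.net", "1", "GracefulRestart")
    print(req.get_method(), req.full_url, req.data.decode())
```

Sending the request would additionally need credentials or a session token and a certificate policy appropriate for the management network.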

Vendors and series

| Vendor | Rack series | Blade series | Management |
| --- | --- | --- | --- |
| Dell | PowerEdge R6xx/R7xx (R660, R760) | MX7000, FX2 | iDRAC, OpenManage Enterprise |
| HPE | ProLiant DL (DL360, DL380) | Synergy, BladeSystem | iLO, OneView, OpsRamp |
| Cisco | UCS C-Series (C240, C245) | UCS B-Series, Fabric Interconnect | UCS Manager, Intersight |
| Lenovo | ThinkSystem SR (SR630, SR650) | ThinkSystem SN | XClarity |
| Supermicro | SuperServer (for GPU, storage, cloud) | FatTwin, MicroBlade | IPMI, SuperDoctor |

Server connectivity

Detailed chapter on network and storage connectivity: CONNECTIVITY.md

Storage controllers

| Controller | Type | RAID | Cache | Protocol |
| --- | --- | --- | --- | --- |
| Dell PERC (H755, H965) | HW RAID | 0/1/5/6/10/50/60 | 4-8 GB NV | NVMe, SAS, SATA |
| Broadcom / LSI (9560, 9670) | HW RAID / HBA | 0/1/5/6/10/50/60 | 4 GB NV | NVMe, SAS, SATA |
| Intel VROC | SW RAID (CPU) | 0/1/5/10 | - | NVMe only |
| M.2 HW RAID (BOSS-S1) | HW RAID | 0/1 | - | 2× M.2 NVMe/SATA |
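When comparing the RAID levels listed above, usable capacity is often the deciding number. A rough sketch for the basic levels, assuming identical drives and treating RAID 1 as a mirror of all member drives; nested levels 50/60 are not covered and the function name is illustrative.

```python
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity_tb(level: str, drives: int, drive_tb: float) -> float:
    """Usable capacity in TB for a RAID set built from identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        data_drives = drives            # striping, no redundancy
    elif level == "1":
        data_drives = 1                 # every drive holds the same data
    elif level == "5":
        data_drives = drives - 1        # one drive's worth of parity
    elif level == "6":
        data_drives = drives - 2        # two drives' worth of parity
    else:
        if drives % 2:
            raise ValueError("RAID 10 needs an even number of drives")
        data_drives = drives // 2       # striped mirrors
    return data_drives * drive_tb


if __name__ == "__main__":
    for level, n in (("5", 4), ("6", 6), ("10", 4)):
        print(f"RAID {level}, {n}x 4 TB: {usable_capacity_tb(level, n, 4.0)} TB usable")
```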

IT vs HW RAID mode

| Feature | IT (Initiator Target) / HBA | HW RAID |
| --- | --- | --- |
| OS sees | Each disk individually | RAID virtual disk |
| Caching | OS cache | RAID controller cache (BBU) |
| RAID | Software (mdadm, ZFS, Ceph) | Hardware + SW driver |
| Passthrough | Yes | No |
| Use case | SDS (Ceph, MinIO), ZFS | VMware VMFS, Windows, legacy |
| Battery/Backup | Not needed | Write-back cache requires BBU |

Sources

Links, books and standards: sources/infrastructure/sources.md

Last revision: 2026-06-03