# 🏭 Datacenters

## Tier classification (TIA-942 / Uptime Institute)

| Tier | Availability | Downtime / year | Redundancy |
|------|-------------|-----------------|------------|
| **Tier I** | 99.671 % | 28.8 h | N (no redundancy) |
| **Tier II** | 99.741 % | 22.7 h | N+1 (redundant components) |
| **Tier III** | 99.982 % | 1.6 h | N+1 (concurrently maintainable) |
| **Tier IV** | 99.995 % | 26.3 min | 2N+1 (fault tolerant) |

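The downtime column follows directly from the availability percentage. A minimal sketch of that arithmetic, assuming a 365-day year (the function and constant names are illustrative, not from any standard):

```python
HOURS_PER_YEAR = 8760  # 365 days; leap years ignored (assumption)


def annual_downtime_hours(availability_percent: float) -> float:
    """Expected downtime per year, in hours, for a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


# Availability figures taken from the tier table.
TIER_AVAILABILITY = {
    "Tier I": 99.671,
    "Tier II": 99.741,
    "Tier III": 99.982,
    "Tier IV": 99.995,
}

for tier, availability in TIER_AVAILABILITY.items():
    hours = annual_downtime_hours(availability)
    print(f"{tier}: {hours:.1f} h/year ({hours * 60:.0f} min)")
```
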
## Key subsystems

| System | Description |
|--------|-------------|
| **Power** | UPS, generators (diesel), ATS, PDU, redundant feeds (A/B feed) |
| **Cooling** | CRAC/CRAH, chilled water, free cooling, containment (hot/cold aisle) |
| **Physical security** | CCTV, biometric access, mantrap, rack security locks |
| **Cabling** | Structured cabling (Cat6A/7/8 copper, OM3/OM4 multi-mode and OS2 single-mode fiber), patch panels |
| **Fire suppression** | Alarm, clean agents (Novec 1230, FM-200), inert gases (Inergen), VESDA (Very Early Smoke Detection Apparatus) |
| **Monitoring** | DCIM (Data Center Infrastructure Management), SNMP, BMS (Building Management System) |

## Aisle containment

```
          ┌────────────────────────────────────┐
          │              Rack Row              │
          │ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐      │
 Cold     │ │  │ │  │ │  │ │  │ │  │ │  │      │     Cold
 Aisle <──│ └──┘ └──┘ └──┘ └──┘ └──┘ └──┘      │──> Aisle
          │ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐ ┌──┐      │
 Hot      │ │  │ │  │ │  │ │  │ │  │ │  │      │     Hot
 Aisle ──>│ └──┘ └──┘ └──┘ └──┘ └──┘ └──┘      │<── Aisle
          └────────────────────────────────────┘
```

## Environmental classes (ASHRAE TC 9.9)

ASHRAE Technical Committee 9.9 defines temperature and humidity envelopes for IT equipment in data centers.

| Class | Temperature (recommended) | Temperature (allowable) | Usage |
|-------|--------------------------|-------------------------|-------|
| **A1** | 18-27 °C | 15-32 °C | Enterprise DC, strict control |
| **A2** | 18-27 °C | 10-35 °C | Standard DC |
| **A3** | 18-27 °C | 5-40 °C | Looser environment |
| **A4** | 18-27 °C | 5-45 °C | Maximum cooling savings |
| **H1** | 18-22 °C | 15-25 °C | High-density air-cooled (AI/ML) |

- The 5th edition (2021) added class H1 for high-density equipment and extended the liquid cooling W-classes (W17, W27, W32, W40, W45, W+)
- 2024: new S-classes for Technology Cooling System (TCS) liquid cooling
- Humidity: recommended range from −9 °C dew point up to 70 % RH (low-pollutant environments); at most 50 % RH where corrosivity is high

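A monitoring script can compare a measured inlet temperature against these envelopes. The sketch below covers only the A-class rows of the table; the dictionary layout and function name are illustrative assumptions, not part of the ASHRAE guideline:

```python
# Dry-bulb ranges in °C for the A-classes, as listed in the table.
ASHRAE_A_CLASSES = {
    "A1": {"recommended": (18, 27), "allowable": (15, 32)},
    "A2": {"recommended": (18, 27), "allowable": (10, 35)},
    "A3": {"recommended": (18, 27), "allowable": (5, 40)},
    "A4": {"recommended": (18, 27), "allowable": (5, 45)},
}


def classify_inlet(temp_c: float, ashrae_class: str) -> str:
    """Return 'recommended', 'allowable' or 'out of range' for an inlet temperature."""
    limits = ASHRAE_A_CLASSES[ashrae_class]
    low, high = limits["recommended"]
    if low <= temp_c <= high:
        return "recommended"
    low, high = limits["allowable"]
    if low <= temp_c <= high:
        return "allowable"
    return "out of range"


print(classify_inlet(25, "A1"))
print(classify_inlet(30, "A1"))
print(classify_inlet(38, "A3"))
```

Humidity limits are deliberately left out; a real check would evaluate dew point and relative humidity as well.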
## Power

### Power chain

```
Grid ──> Transformer ──> UPS ──> PDU ──> Rack PDU ──> Server PSU
│
├──> Generator (ATS switches on outage)
└──> STS/ATS (static / automatic transfer switch)
```

A/B feed topology:

```
Grid A ──> UPS A ──> PDU A1 ──> Rack PDU A ──> PSU A (server)
                                                 │
Grid B ──> UPS B ──> PDU B1 ──> Rack PDU B ──> PSU B (server)
```

Each server has two PSUs, each powered from a different branch (A/B). If one branch fails, the server keeps running without interruption.

### UPS types

| IEC 62040-3 class | Topology | Description | Transfer time | Use case |
|--------------|-------------|-------------|-----------|----------|
| **VFD** (Voltage & Frequency Dependent) | Passive standby | UPS in bypass, switches to inverter on failure | 4-10 ms | SOHO, edge |
| **VI** (Voltage Independent) | Line-interactive | Voltage regulation via autotransformer | 2-4 ms | Smaller racks, office |
| **VFI** (Voltage & Frequency Independent) | Double-conversion | AC → DC → AC, full isolation, zero switching time | 0 ms | Enterprise DC, Tier III/IV |

For DC the standard is **VFI (double-conversion)**: an online UPS with zero switching time and full isolation from the grid.

### Battery technologies

| Type | Density (Wh/L) | Lifespan (cycles) | Lifespan (years) | Temperature | Cost/kWh | Note |
|------|---------------|-------------------|------------------|-------------|----------|------|
| **VRLA** (AGM/Gel) | 50-80 | 200-500 | 3-5 | 20-25 °C | ~$150-200 | Cheap, large, heavy, temperature sensitive |
| **Li-ion (LFP)** | 200-350 | 3000-5000 | 10-15 | 0-40 °C | ~$300-500 | Small, light, long life, BMS required |
| **Li-ion (NMC)** | 250-400 | 1000-2000 | 8-12 | 0-40 °C | ~$250-400 | Higher density, thermal runaway risk |
| **NiCd** | 80-150 | 1000-2000 | 10-15 | −20 to 50 °C | ~$400-600 | Extreme temperatures, memory effect |
| **Flow battery** (vanadium, zinc-bromine) | 20-40 | 10,000+ | 20+ | 10-35 °C | ~$500-800 | Very high cycle life, large, long-duration backup |

Li-ion (LFP) is becoming the standard for new DCs because of its longer life, smaller footprint, and better behavior at high temperatures.

### Generator sizing

| Variant | Size | Fuel | Start time | Run time | Use case |
|---------|------|------|------------|----------|----------|
| **Diesel** | 500-2500 kVA | Diesel | 10-30 s | 24-72 h (depending on tank) | Standard for enterprise DC |
| **Nat. gas** | 200-1500 kVA | Natural gas | 10-30 s | Unlimited (pipeline) | Less common, lower emissions |
| **CHP** (cogeneration) | 500-2000 kVA | Natural gas | 5-15 min | Unlimited | Combined power + cooling (absorption chiller) |

Sizing: the generator should cover 100 % of the IT load plus 100 % of the cooling load (incl. chillers), typically 1.3-1.8× the IT load. The diesel tank should last at least 24 h of operation, commonly 48-72 h. Fuel consumption is roughly 0.3-0.4 L per kWh generated.

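The sizing rule translates into a quick estimate of generator capacity and tank volume. This is a rough sketch: the default factor of 1.5, the 48 h runtime and 0.35 L/kWh are mid-range picks from the figures quoted in this section, and power factor (kW vs kVA) is ignored:

```python
def generator_sizing(
    it_load_kw: float,
    load_factor: float = 1.5,      # assumed, within the 1.3-1.8x range
    runtime_h: float = 48.0,       # assumed, within the 24-72 h range
    fuel_l_per_kwh: float = 0.35,  # assumed, within 0.3-0.4 L/kWh
) -> tuple[float, float]:
    """Return (generator load in kW, diesel tank volume in litres)."""
    generator_kw = it_load_kw * load_factor
    tank_litres = generator_kw * runtime_h * fuel_l_per_kwh
    return generator_kw, tank_litres


generator_kw, tank_litres = generator_sizing(1000)
print(f"Generator: {generator_kw:.0f} kW, tank: {tank_litres:.0f} L")
```

For a 1 MW IT load this gives a 1500 kW generator load and about 25,200 L of diesel for 48 h.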
### ATS vs STS

| Feature | ATS (Automatic Transfer Switch) | STS (Static Transfer Switch) |
|---------|--------------------------------|-----------------------------|
| **Switching** | 4-10 ms (mechanical relay) | < 4 ms (thyristor) |
| **Lifespan** | ~10,000 switching operations | No mechanical wear (solid-state) |
| **Cost** | Low | High (~3-5× ATS) |
| **Use case** | Generator → UPS feed | Between two UPS outputs |

### PDU types

| Type | Description | Use case |
|------|-------------|----------|
| **Basic** | Passive splitter (no monitoring) | Edge, office |
| **Metered** | Current measurement at PDU level | Standard DC |
| **Monitored** | Measurement per outlet, SNMP, web GUI | Enterprise DC |
| **Switched** | On/off per outlet, remote reboot | Enterprise DC, colo |
| **High-density** | 3-phase, 60-100 A, C19 outlets | GPU/HPC/AI racks |

### Power calculation

```
Total Power = Σ(P_server + P_storage + P_network) + P_cooling + P_losses
            = P_IT × PUE

P_server   = P_idle + (P_max - P_idle) × Utilization%
P_overhead = P_cooling + P_losses = P_IT × (PUE - 1)

Example:
100 servers × 500 W (avg) = 50 kW IT load
PUE = 1.5 → total 75 kW (25 kW cooling and losses)
UPS + generator → sized for 75 kW × 1.2 (safety factor) = 90 kW
```

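The worked example can be reproduced in a few lines. A minimal sketch, assuming overhead follows from the PUE definition (total = IT load × PUE); function names and the dictionary keys are illustrative:

```python
def server_power_w(p_idle_w: float, p_max_w: float, utilization: float) -> float:
    """Linear server power model: idle power plus a utilization-scaled share."""
    return p_idle_w + (p_max_w - p_idle_w) * utilization


def facility_sizing(it_load_kw: float, pue: float, safety_factor: float = 1.2) -> dict:
    """Derive facility power, overhead and design capacity from the IT load."""
    facility_kw = it_load_kw * pue
    return {
        "facility_kw": facility_kw,
        "overhead_kw": facility_kw - it_load_kw,  # cooling + losses
        "design_kw": facility_kw * safety_factor,  # UPS / generator sizing
    }


# 100 servers at 500 W average (e.g. 200 W idle, 800 W max, 50 % utilization).
it_load_kw = 100 * server_power_w(200, 800, 0.5) / 1000
print(it_load_kw)
print(facility_sizing(it_load_kw, pue=1.5))
```

The 200 W / 800 W split is a hypothetical combination that happens to average 500 W; real idle and peak values come from vendor data or measurement.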
### PUE (Power Usage Effectiveness)

```
PUE = Total Facility Energy / IT Equipment Energy
```

| PUE | Efficiency | Type |
|-----|-----------|------|
| 1.0-1.1 | Excellent | Hyperscale (Google, Meta) |
| 1.1-1.3 | Very good | Modern DC |
| 1.3-1.6 | Good / average | Enterprise DC |
| 1.6-2.0 | Below average | Older DC |
| >2.0 | Poor | Legacy |

PUE is measured at the level of the whole DC, not per rack. It includes UPS losses, cooling, lighting, and distribution losses; it excludes well-to-tank fuel production and embodied carbon. Target for a modern DC: PUE < 1.2.

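As an illustration, the PUE formula and the rating bands can be combined into a small helper. The bands in the table share their boundary values, so this sketch assumes the upper bound of each band is inclusive:

```python
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """PUE = total facility energy / IT equipment energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return total_facility_kwh / it_kwh


def pue_rating(value: float) -> str:
    """Map a PUE value to the efficiency label used in the table."""
    if value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    if value <= 1.1:
        return "Excellent"
    if value <= 1.3:
        return "Very good"
    if value <= 1.6:
        return "Good / average"
    if value <= 2.0:
        return "Below average"
    return "Poor"


value = pue(total_facility_kwh=1400, it_kwh=1000)
print(value, pue_rating(value))
```
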
### WUE and CUE

| Metric | Description | Formula | Target |
|--------|-------------|---------|--------|
| **WUE** (Water Usage Effectiveness) | Water consumption per IT energy | WUE = Annual Water Usage / IT Energy (L/kWh) | < 0.5 L/kWh |
| **CUE** (Carbon Usage Effectiveness) | CO₂ emissions per IT energy | CUE = Total CO₂ / IT Energy (kg CO₂/kWh) | < 0.2 kg CO₂/kWh |

WUE is critical in dry regions (southwest US, Australia, Middle East). Adiabatic cooling consumes significantly more water than closed-loop cooling.

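Both metrics are simple ratios over annual IT energy. A short sketch with hypothetical annual figures for a 1 MW IT load (the water and CO₂ totals are made-up inputs for illustration only):

```python
def wue(annual_water_l: float, it_energy_kwh: float) -> float:
    """Water Usage Effectiveness in L/kWh."""
    return annual_water_l / it_energy_kwh


def cue(total_co2_kg: float, it_energy_kwh: float) -> float:
    """Carbon Usage Effectiveness in kg CO2/kWh."""
    return total_co2_kg / it_energy_kwh


it_energy_kwh = 1000 * 8760  # 1 MW IT load running for one year
site_wue = wue(annual_water_l=3_504_000, it_energy_kwh=it_energy_kwh)
site_cue = cue(total_co2_kg=1_314_000, it_energy_kwh=it_energy_kwh)

# Targets from the table: WUE < 0.5 L/kWh, CUE < 0.2 kg CO2/kWh.
print(f"WUE {site_wue:.2f} L/kWh, target met: {site_wue < 0.5}")
print(f"CUE {site_cue:.2f} kg CO2/kWh, target met: {site_cue < 0.2}")
```
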
### 3-phase vs Single-phase

| Feature | Single-phase (230 V) | 3-phase (400 V) |
|---------|---------------------|-----------------|
| **Voltage** | 230 V (L-N) | 230/400 V (L-N/L-L) |
| **Power per feed** | ~7.4 kW (32 A) | ~22 kW (32 A, 3-ph) |
| **Efficiency** | Lower (higher current for the same power) | Higher (lower current) |
| **Use case** | Smaller racks, office | Standard in DC, high-density |
| **PDU** | Single-phase (C13/C19) | 3-phase (C13/C19, 3-ph monitoring) |
| **Balancing** | Not required | Phase balancing required (L1/L2/L3) |

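The "power per feed" row can be checked with the standard formulas P = V·I for single-phase and P = √3·V_LL·I for three-phase. The sketch assumes a power factor of 1 and no breaker derating, which real designs usually apply:

```python
import math


def single_phase_kw(v_ln: float, amps: float, power_factor: float = 1.0) -> float:
    """Single-phase feed capacity in kW (line-to-neutral voltage)."""
    return v_ln * amps * power_factor / 1000


def three_phase_kw(v_ll: float, amps: float, power_factor: float = 1.0) -> float:
    """Three-phase feed capacity in kW (line-to-line voltage)."""
    return math.sqrt(3) * v_ll * amps * power_factor / 1000


print(f"1-ph 230 V / 32 A: {single_phase_kw(230, 32):.2f} kW")
print(f"3-ph 400 V / 32 A: {three_phase_kw(400, 32):.2f} kW")
```
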
### Rack power density

| Cat. | Type | kW/rack | Power | Cooling |
|------|------|---------|-------|---------|
| Low | Office, storage | 1-3 kW | 1-ph, 16 A | Air (free cooling) |
| Medium | Standard compute | 5-10 kW | 3-ph, 32 A | Air (CRAC/CRAH) |
| High | GPU, HPC | 15-30 kW | 3-ph, 60 A | Air + liquid assist |
| Ultra | AI/ML clusters | 40-100+ kW | 3-ph, 100+ A | Direct-to-chip / immersion |

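For capacity planning it can help to map a planned rack load to one of these categories. The table leaves gaps between bands (for example 3-5 kW), so this sketch assumes each band extends up to its upper bound and anything above falls into the next category; that cut-off rule is an assumption, not part of the table:

```python
# (upper bound in kW, category, power feed, cooling), taken from the table.
DENSITY_BANDS = [
    (3, "Low", "1-ph, 16 A", "Air (free cooling)"),
    (10, "Medium", "3-ph, 32 A", "Air (CRAC/CRAH)"),
    (30, "High", "3-ph, 60 A", "Air + liquid assist"),
    (float("inf"), "Ultra", "3-ph, 100+ A", "Direct-to-chip / immersion"),
]


def classify_rack(kw_per_rack: float) -> dict:
    """Return category, power feed and cooling approach for a rack load."""
    if kw_per_rack <= 0:
        raise ValueError("Rack load must be positive")
    for upper_kw, category, power, cooling in DENSITY_BANDS:
        if kw_per_rack <= upper_kw:
            return {"category": category, "power": power, "cooling": cooling}
    raise AssertionError("unreachable: last band is unbounded")


for load in (2, 8, 20, 60):
    print(load, classify_rack(load))
```
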
### Rack PDU connectors

| Connector | Max current | Device type |
|-----------|-------------|-------------|
| **C13** | 10 A (250 V) | Servers, switches, 1U |
| **C19** | 16 A (250 V) | Higher power servers, UPS |
| **IEC 60309** (3-ph) | 16-125 A | Rack PDU inputs |
| **NEMA L6-30** | 30 A (250 V) | US spec |

## Cooling

### Cooling: technology overview

| Technology | Type | Capacity (kW/rack) | Typical PUE | CAPEX | Use case |
|-----------|------|-----------------|-------------|-------|----------|
| **Free air cooling** | Air | < 5 | 1.05-1.15 | Low | Climatically suitable locations |
| **CRAC (DX)** | Air | 5-10 | 1.4-1.8 | Medium | Smaller DC, retrofit |
| **CRAH (CW)** | Air | 5-15 | 1.2-1.5 | High | Enterprise DC |
| **In-row cooling** | Air | 10-25 | 1.2-1.4 | High | High-density racks |
| **Rear-door HX** | Hybrid | 15-30 | 1.1-1.3 | Medium | Retrofits, GPU |
| **Direct-to-chip** | Liquid | 40-100+ | 1.05-1.15 | High | AI/ML, HPC |
| **Immersion (single-phase)** | Liquid | 50-100+ | 1.03-1.10 | High | Bitcoin mining, hyperscale |
| **Immersion (two-phase)** | Liquid | 100-200+ | 1.03-1.08 | Very high | Extreme density |

### Chilled water vs Direct Expansion (DX)

| Feature | Chilled water (CW) | Direct Expansion (DX) |
|---------|-------------------|----------------------|
| **Medium** | Water + glycol | Refrigerant (R134a, R410A, R454B) |
| **Unit type** | CRAH (air handler with chilled-water coil) | CRAC (air conditioner with refrigerant compressor) |
| **Efficiency** | Higher (COP 5-7) | Lower (COP 2-4) |
| **Supply temperature** | 7-12 °C (standard), 18-22 °C (high-temp) chilled water | −5 to 10 °C (evaporator) |
| **Complexity** | Higher (chillers, pumps, pipes, cooling tower) | Simpler |
| **Maintenance** | Higher (water treatment, legionella prevention) | Lower |
| **Use case** | Large DC > 500 kW, enterprise | Smaller DC, edge, retrofit |

### Containment types

| Type | Description | Efficiency | Implementation |
|------|-------------|------------|----------------|
| **Cold aisle containment (CAC)** | Enclosed cold aisle, warm air returns to room | High | Doors at aisle ends, ceiling panels |
| **Hot aisle containment (HAC)** | Enclosed hot aisle, warm air goes directly to return | Higher | Doors + ceiling panels, return to CRAH |
| **Chimney / rear duct** | Each rack has its own exhaust chimney to ceiling | Highest | Individual ducts per rack, expensive |
| **Open aisle** | No containment, cold and warm air mix | Low | Legacy, cheap |

Recommendation: use CAC or HAC at densities above 5 kW/rack. HAC is 5-10 % more efficient than CAC because warm air is extracted directly and does not mix with room air.

### CFD modeling

Computational Fluid Dynamics (CFD) simulates airflow in the DC before physical implementation:

- Identification of hot spots (warm air recirculating into the cold aisle)
- Optimization of perforated tile positions
- Detection and reduction of bypass airflow (unsealed cable openings, empty positions without blanking panels)
- Simulation of CRAH unit failure (what-if scenarios)
- Tools: Future Facilities (6Sigma DC), Ansys Fluent, OpenFOAM

### Free cooling

- **Air-side**: intake of outside air at suitable temperature (filtration, humidification)
- **Water-side**: chilled water is cooled by outdoor dry coolers or cooling towers (e.g. strainer cycle) without running the compressor
- **Climate zone**: free cooling is usable ~2000-8000 hours/year depending on location
  - Scandinavia: 7000-8000 h/year
  - Central Europe: 4000-6000 h/year
  - Southern Europe: 2000-4000 h/year
- **Hybrid**: combination of free cooling + mechanical cooling (most common)
- **Economizer types**: Class A1 (dry cooler), Class A2 (evaporative), Class B (air-side)

### Liquid cooling detail

| Type | Inlet temperature | Capacity (kW/rack) | Medium | Installation |
|------|-----------------|-------------------|--------|-------------|
| **Cold plate (D2C)** | 20-45 °C | 40-100+ | Water, propylene glycol | CDU per rack or per row |
| **Rear-door HX** | 18-27 °C | 15-30 | Water | Passive, no server modification |
| **Immersion (1-ph)** | 35-50 °C | 50-100+ | Dielectric oil | Tank, CDU, heat exchanger |
| **Immersion (2-ph)** | 25-35 °C | 100-200+ | Dielectric (boiling) | Tank + condenser |

**CDU (Coolant Distribution Unit)**:

- Regulates coolant temperature, flow, and pressure delivered to the racks
- Primary loop (facility water) + secondary loop (rack coolant)
- Sizing: 1 CDU per 4-8 racks (40-100 kW per CDU)
- Redundancy: N+1 CDU, dual coolant loops

**Water quality requirements**:

- Conductivity: < 1 µS/cm (demineralized water)
- pH: 6.5-8.0
- Particulates: < 50 µm (filtration)
- Corrosion prevention: inhibitors, glycol (10-30 %)
- Biological growth prevention: UV, biocides

### Adiabatic cooling

Using water evaporation to cool air:

- **Direct adiabatic**: air passes through a wetted media pad, where it is cooled and humidified
- **Indirect adiabatic**: air is cooled via a heat exchanger without direct contact with water
- **Water consumption**: 3-5 L/kWh (direct), 1-2 L/kWh (indirect)
- Efficiency depends on air humidity; it is more effective in dry climates

## Cabling and structured cabling

### TIA-942 cabling hierarchy

```
Entrance Room (ER)
│
├── Backbone cabling (fiber single-mode / multi-mode)
│   │
│   ├── Main Distribution Area (MDA)
│   │   │
│   │   ├── Horizontal Distribution Area (HDA)
│   │   │   │
│   │   │   └── Equipment Distribution Area (EDA) → rack
│   │   │
│   │   └── Intermediate Distribution Area (IDA), optional
│   │
│   └── Telecommunication Room (TR), for office
│
└── Backbone cabling (fiber / copper)
```

### Copper cabling categories

| Category | Frequency | Speed | Length | Connector | Use case |
|----------|-----------|-------|--------|-----------|----------|
| **Cat5e** | 100 MHz | 1 GbE | 100 m | RJ45 | Legacy, voice |
| **Cat6** | 250 MHz | 1 GbE (10 GbE up to 55 m) | 100 m (10 GbE: 55 m) | RJ45 | Standard DC, enterprise |
| **Cat6A** | 500 MHz | 10 GbE | 100 m | RJ45 | Standard for new DC |
| **Cat7** (GG45) | 600 MHz | 10 GbE | 100 m | GG45/TERA | Niche, replaced by Cat6A/8 |
| **Cat8.1** | 2000 MHz | 25/40 GbE | 30 m | RJ45 | Top-of-rack, storage |
| **Cat8.2** | 2000 MHz | 25/40 GbE | 30 m | GG45/TERA | Top-of-rack, storage |

In DC, **Cat6A** (10 GbE up to 100 m) is standard for horizontal cabling. Cat8 is limited to short links of up to 30 m, such as server to top-of-rack or end-of-row switch.

### Fiber optic types

| Type | Core | Modal BW | Speed | Max length | Use case |
|------|------|----------|-------|-----------|----------|
| **OS1** (SM) | 9 µm | n/a | 100 GbE - 800 GbE | 10-80 km | Backbone, campus, WAN |
| **OS2** (SM) | 9 µm | n/a | 100 GbE - 800 GbE | 2-80 km (CWDM/DWDM) | Backbone, DWDM |
| **OM1** (MM) | 62.5 µm | 200 MHz·km | 1 GbE | 275 m | Legacy |
| **OM2** (MM) | 50 µm | 500 MHz·km | 10 GbE | 82 m | Legacy |
| **OM3** (MM) | 50 µm | 2000 MHz·km | 10 GbE up to 300 m, 100 GbE up to 100 m | 300 m (10G) | Standard DC, VCSEL |
| **OM4** (MM) | 50 µm | 4700 MHz·km | 100 GbE up to 150 m, 400 GbE up to 100 m | 400 m (10G, per IEEE 802.3) | High-performance DC standard |
| **OM5** (MM) | 50 µm | 4700+ MHz·km | 200/400 GbE SWDM | 150 m (100G) | Emerging, SWDM |

For new DC: **OM4** as the standard for multi-mode, **OS2** for single-mode backbone (LR, DWDM). OM5 is not widely deployed; OM4 with parallel optics (SR4) is more common.

### Connector types

| Connector | Type | Insertion loss | Fiber count | Use case |
|-----------|------|---------------|-------------|----------|
| **LC** | Duplex | < 0.15 dB | 2 | Standard for SFP/SFP+/QSFP |
| **SC** | Duplex | < 0.2 dB | 2 | Older installations, patch panels |
| **MPO/MTP** (12-f) | Multi-fiber | < 0.35 dB | 12 | 40/100/400 GbE parallel |
| **MPO/MTP** (24-f) | Multi-fiber | < 0.5 dB | 24 | 400 GbE (SR4.2, DR4) |
| **SN** | Duplex (mini) | < 0.15 dB | 2 | High-density (QSFP-DD, OSFP) |
| **CS** | Duplex (mini) | < 0.15 dB | 2 | High-density (QSFP-DD, OSFP) |

### MPO/MTP polarity

| Method | Description | Use case |
|--------|-------------|----------|
| **Type A** (Straight) | Fiber 1→1, 2→2, ... | Duplex applications, with the cross-over made in the patch cords |
| **Type B** (Crossed) | Fiber 1→12, 2→11, ... | Parallel optics (SR4, SR8), the standard choice |
| **Type C** (Pairs crossed) | Pairs 1-2→2-1, 3-4→4-3 | Duplex applications (cross-over built into the trunk); not suited to parallel optics |

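The three polarity methods are just different position mappings between the two ends of a trunk. A small sketch that generates them for a 12-fiber MPO (the function name and the dictionary return format are illustrative choices):

```python
def mpo_mapping(method: str, fiber_count: int = 12) -> dict[int, int]:
    """Map each fiber position at the near end to its position at the far end."""
    if fiber_count % 2:
        raise ValueError("fiber_count must be even")
    positions = range(1, fiber_count + 1)
    if method == "A":  # straight-through: 1→1, 2→2, ...
        return {p: p for p in positions}
    if method == "B":  # reversed: 1→12, 2→11, ...
        return {p: fiber_count + 1 - p for p in positions}
    if method == "C":  # pair-wise flip: 1→2, 2→1, 3→4, 4→3, ...
        return {p: p + 1 if p % 2 else p - 1 for p in positions}
    raise ValueError(f"unknown polarity method: {method!r}")


for method in "ABC":
    print(method, mpo_mapping(method))
```
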
### Breakout cassettes

```
MPO (12-f) ──> Breakout cassette ──> 6× LC duplex (12 fibers = 6× duplex)
MPO (24-f) ──> Breakout cassette ──> 12× LC duplex (24 fibers = 12× duplex)
```

Use case: connecting MPO ports (switch) with LC ports (servers, storage). Cassettes are passive components mounted in the patch panel; they add insertion loss but contain no active electronics.

### Copper vs fiber decision

| Criterion | Copper (Cat6A/8) | Fiber (OM4/OS2) |
|-----------|-----------------|-----------------|
| **Reach** | 30-100 m | 100 m - 80 km |
| **Speed** | 1-40 GbE | 1-800 GbE |
| **Transceiver cost** | Lower (RJ45) | Higher (SFP+/QSFP) |
| **Cable cost** | Lower | Higher (patch cord) |
| **Port power** | 2-5 W (25 GbE) | 1-3 W (25 GbE SR) |
| **EMI immunity** | Susceptible | Immune |
| **Weight (100 m)** | ~3-4 kg | ~0.5-1 kg |
| **Recommendation** | Up to 30 m, server→ToR switch | Backbone, storage, >30 m |

### Cabling best practices

- **Horizontal cabling**: max 90 m permanent link + 10 m patch cords (TIA-942)
- **Fiber management**: slack spools, cable managers, minimum bend radius 10× cable diameter
- **Color coding**: OS1/OS2 (yellow), OM3 (aqua), OM4 (magenta/purple), OM5 (lime green)
- **Labeling**: both ends, patch panels, faceplates, following ANSI/TIA-606-B
- **Overhead vs underfloor**: overhead (ladder rack) is preferred in DC (better airflow, easier changes)
- **MPO cassettes**: plan 15-20 % fiber reserve for future needs

## Physical security

### Multi-layer security model (defense in depth)

```
Layer 1: Perimeter (fence, gate, guards)
Layer 2: Building (walls, locks, CCTV, card readers)
Layer 3: DC hall (biometrics, mantrap, CCTV, motion detection)
Layer 4: Rack / Cage (electronic locks, sensors)
Layer 5: Data (encryption, HSM, access control)
```

### Access control

| Method | Factor | Level | Note |
|--------|--------|-------|------|
| **RFID / proximity card** | Something you have | Standard | Basic access, cheap |
| **Smart card (PKI)** | Something you have + PIN | Medium | Certificate on card, anti-passback |
| **Biometric (fingerprint)** | Something you are | High | Fast, hygienic (touchless readers) |
| **Biometric (palm/finger vein)** | Something you are | Very high | Hard to forge, contactless |
| **Biometric (iris/retina)** | Something you are | Highest | Very accurate, slow, expensive |
| **Multi-factor** | 2+ factors | Highest | Card + biometrics + PIN, typical for Tier IV DC |

### Mantrap design

```
Outer door ──> Mantrap (vestibule) ──> Inner door
                   │
                   ├── Weight sensor (anti-tailgating)
                   ├── CCTV (both doors)
                   ├── Intercom (emergency exit)
                   └── Motion detector (in mantrap)
```

- Only one door opens at a time
- Anti-tailgating: weight sensor detects multiple persons
- Exit via breakout button + motion detection
- Emergency exit: panic bar + alarm

### CCTV

| Element | Recommendation |
|---------|----------------|
| **Resolution** | Min. 1080p, ideally 4K (~8 MP) |
| **FPS** | 15-30 FPS (recording), 30+ FPS (realtime monitoring) |
| **Retention** | Min. 30 days (90 days for audit) |
| **Storage** | NVR (on-prem), cloud (AWS KVS, Azure Video Indexer) |
| **AI analytics** | Face detection, ANPR (license plate), object detection |
| **Field of view** | Every door, every aisle, no blind spots |

### Asset tracking

| Technology | Accuracy | Cost | Use case |
|-----------|----------|------|----------|
| **Barcode** | Rack-level | Very low | Manual inventory |
| **RFID (passive)** | Rack-level (door sweep) | Low | Automatic rack open detection |
| **RFID (active, UWB)** | 10-30 cm | Medium | Real-time tracking |
| **Bluetooth BLE** | 1-3 m | Low | Approximate position |
| **GPS** | 1-10 m | Medium | Outdoor tracking |

## DC layout and design

### Raised floor vs Slab

| Feature | Raised floor | Slab (solid floor) |
|---------|-------------|-------------------|
| **Airflow** | Underfloor air distribution (raised floor as plenum) | Overhead air, in-row cooling |
| **Flexibility** | Easy addition of perforated tiles | Limited (overhead cooling required) |
| **Weight** | Limit 500-1000 kg/m² (depends on height) | Limited only by the slab rating |
| **Cost** | Higher (~$200-400/m²) | Lower (~$100-200/m²) |
| **Height** | 600-900 mm (standard), 900-1200 mm (high-density) | n/a |
| **Trend** | Declining (shift to in-row/overhead cooling) | Growing (new DC, high-density) |

Modern high-density DCs (AI/ML, GPU) are moving away from raised floors to slab with overhead or in-row cooling: racks are heavier (1000-2000 kg), and a floor plenum cannot deliver enough airflow.

### Rack layout and dimensions

| Parameter | Standard | High-density | Note |
|-----------|----------|-------------|------|
| **Rack width** | 600 mm (19") | 600-750 mm | 750 mm for GPU (cabling, cooling) |
| **Rack depth** | 1000-1200 mm | 1200-1500 mm | GPU servers, longer cables |
| **Rack height** | 42U | 48U / 52U | Taller rack = more equipment and power per floor area |
| **Aisle width (cold)** | 1200-1500 mm | 1500-1800 mm | Service access, airflow |
| **Aisle width (hot)** | 900-1200 mm | 1200-1500 mm | Narrower than cold |
| **Max rack load** | 500-800 kg | 1000-2000 kg | Floor reinforcement required |

### Space planning

```
For Tier III DC (example):

IT space: 1000 m²
  └── 20 rows × 10 racks = 200 racks at 42U
  └── 200 racks × 5 kW avg = 1 MW IT load
  └── PUE 1.4 → 1.4 MW facility

Support spaces:
  └── UPS + batteries: 200 m²
  └── Generators: 100 m² (outdoor)
  └── Cooling (chillers, cooling tower): 300 m²
  └── Offices, storage, loading dock: 400 m²

Total: ~2000 m² (50% IT, 50% support)
```

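The same example can be parameterized to try other hall sizes or densities. A minimal sketch; the function name, argument names and result keys are hypothetical, and the input values are those of the example:

```python
def plan_hall(
    rows: int,
    racks_per_row: int,
    kw_per_rack: float,
    pue: float,
    it_area_m2: float,
    support_areas_m2: dict[str, float],
) -> dict:
    """Derive rack count, loads and floor-area split for a DC hall."""
    racks = rows * racks_per_row
    it_load_kw = racks * kw_per_rack
    total_area_m2 = it_area_m2 + sum(support_areas_m2.values())
    return {
        "racks": racks,
        "it_load_kw": it_load_kw,
        "facility_kw": it_load_kw * pue,
        "total_area_m2": total_area_m2,
        "it_area_share": it_area_m2 / total_area_m2,
    }


plan = plan_hall(
    rows=20,
    racks_per_row=10,
    kw_per_rack=5,
    pue=1.4,
    it_area_m2=1000,
    support_areas_m2={
        "ups_batteries": 200,
        "generators": 100,
        "cooling": 300,
        "offices_storage_dock": 400,
    },
)
print(plan)
```
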
### Zone approach (TIA-942)

| Zone | Description | Access | Security |
|------|-------------|--------|----------|
| **Z1** (Public) | Reception, offices | Free | Minimal |
| **Z2** (Office) | Administration, NOC | Employees + guests | RFID |
| **Z3** (DC support) | UPS, generators, cooling | DC operators | RFID + biometrics |
| **Z4** (DC hall) | Servers, storage, networking | DC operators + approved | RFID + biometrics + mantrap |
| **Z5** (Rack/cage) | Specific rack or cage | Only authorized personnel | Electronic lock |

## Fire suppression

### Detection

| System | Type | Detection time | False alarms | Use case |
|--------|------|----------------|--------------|----------|
| **VESDA** (Very Early Smoke Detection Apparatus) | Aspiration, laser sensor | < 30 s (4 alarm levels) | Very low | Standard for DC |
| **Spot detection** | Ionization / optical smoke detector | 2-5 min | Medium | Legacy, smaller DC |
| **Heat detection** | Thermal detector (temperature / rate of rise) | 5-10 min | Very low | Backup for VESDA |
| **Line-type (LHD)** | Linear heat detection cable | 2-5 min | Low | Cable trays, above ceiling |

VESDA is the standard: active aspiration draws air from the DC, and a laser sensor detects smoke particles at 4 alarm levels (Alert → Action → Fire 1 → Fire 2). This enables intervention before smoke becomes visible.

### Suppression systems

| System | Medium | Advantages | Disadvantages | Typical DC |
|--------|--------|------------|---------------|-----------|
| **Novec 1230** (FK-5-1-12) | Clean agent (gas) | Safe for people, zero ODP, short atmospheric lifetime (5 days) | Higher cost | Enterprise DC |
| **FM-200** (HFC-227ea) | Clean agent (gas) | Fast (10 s), effective, zero ODP | High GWP (3220) | Legacy DC |
| **Inergen** (IG-541) | Inert gas (52% N₂, 40% Ar, 8% CO₂) | Safe for people, naturally occurring gases | Large volume, high pressure | Enterprise DC |
| **Argonite** (IG-55) | Inert gas (50% Ar, 50% N₂) | Safe, naturally occurring gases | Large volume, higher pressure | Enterprise DC |
| **Water mist** | Water (fine mist) | Cooling, smoke suppression, low cost | Water in DC (risk), local application only | Retrofits |
| **Pre-action sprinkler** | Water | Dual activation (detection + sprinkler) | Water risk, drainage required | Tier I-II |

**Concentration**: Novec (4-6 % by volume), FM-200 (7-9 %), Inergen (35-50 %). At design concentrations Novec and Inergen remain breathable, which leaves at least 5-7 min for evacuation.

### Detection zones

```
DC hall ──> zones of ~200 m² (max)
│
├── VESDA (each zone its own aspirator)
├── Smoke detectors (ceiling + floor)
└── Heat detection (backup)
```

## DCIM (Data Center Infrastructure Management)

### What DCIM covers

| Area | Metrics | Output |
|------|---------|--------|
| **Power** | Per PDU, per outlet, per rack, total | Capacity planning, PUE, kW/rack |
| **Cooling** | Temperature, humidity, airflow (sensors per rack) | Hot spot maps, airflow optimization |
| **Asset** | What is in which rack, U position, serial, warranty | Asset inventory, lease management |
| **Network** | Port utilization, patch panel connections | Patch management, port tracking |
| **Space** | Free U in rack, free racks | Capacity planning, "what-if" simulations |

### Tools

| Tool | Type | Platform | Cost | Note |
|------|------|----------|------|------|
| **Nlyte (Carrier)** | Enterprise DCIM | On-prem / Cloud | $$$ | Market leader, complex |
| **Sunbird DCIM** | Enterprise DCIM | Cloud | $$$ | Power monitoring, asset tracking |
| **Device42** | DCIM + IPAM | On-prem / Cloud | $$ | Integrated IPAM, CMDB |
| **NetBox** | Open source DCIM | On-prem | Free | IPAM, DCIM, asset tracking |
| **OpenDCIM** | Open source | On-prem | Free | Basic DCIM, asset management |
| **RackTables** | Open source | On-prem | Free | Simple, asset + networking |
| **Dell OME, HPE OneView** | Vendor-specific | On-prem | Part of HW | Covers that vendor's hardware only |

## Site selection

### Criteria for DC site selection

| Category | Criterion | Weight |
|----------|-----------|--------|
| **Power** | Electricity availability (grid capacity), cost/kWh, possibility of two independent feeds | High |
| **Connectivity** | Fiber backbone availability, number of connectivity providers, latency to major POP | High |
| **Natural risks** | Earthquakes, floods, hurricanes, tornadoes, wildfires (historical data + predictions) | High |
| **Climate** | Average temperature, humidity (free cooling potential) | Medium |
| **Workforce** | Availability of technicians, DC operators, network/admin engineers | Medium |
| **Taxes and regulation** | Tax incentives, environmental regulations, building permits | Medium |
| **Security** | Crime, political stability, terrorist risk | High |
| **Transport accessibility** | Proximity to airport, highway (for HW deliveries, personnel) | Low |

### Natural risks: mapping

| Risk | Areas | Mitigation |
|------|-------|------------|
| **Earthquakes** | Pacific Ring of Fire (CA, Japan, Chile) | Base isolation, seismic bracing, flexible connections |
| **Hurricanes** | Caribbean, southeastern US, southeast Asia | Reinforced construction, generators above flood level |
| **Floods** | River valleys, coastal areas | Location outside flood zone, barriers |
| **Wildfires** | California, Australia, Mediterranean | Defensive zones, air filtration, monitoring |

### Power availability by region

| Region | Grid reliability | Cost/kWh (industrial) | Note |
|--------|-----------------|------------------------|------|
| **Northern Europe** (SE, NO, FI) | High (99.99 %) | $0.04-0.08 | Cheap green energy, cool climate |
| **Central Europe** (DE, NL, CZ) | High (99.99 %) | $0.10-0.20 | Stable, growing renewables |
| **Eastern US** (VA, NC) | High | $0.05-0.08 | Largest DC hub (Ashburn, VA) |
| **Western US** (CA, OR) | Medium (PG&E issues) | $0.10-0.15 | CAISO grid, blackout risk |
| **Singapore** | High | $0.15-0.20 | Moratorium on new DC (2019-2022), capacity now tightly allocated; water constraints |
| **Dubai / UAE** | High | $0.06-0.10 | Cheap energy, high temperature (cooling) |

## Compliance and certification

| Standard / Certification | Area | Description |
|-------------------------|------|-------------|
| **TIA-942** (Rated 1-4) | DC design | Classification of redundancy, cabling, security (analogous to Uptime Tier) |
| **Uptime Institute** (Tier I-IV) | DC design | Operational certification, construction documentation |
| **ISO 27001** | ISMS | Information security, risk management |
| **ISO 27701** | Privacy | Privacy extension of ISO 27001, supports GDPR compliance |
| **SOC 2** (Type I/II) | Service org | Trust Services Criteria: Security, Availability, Processing Integrity, Confidentiality, Privacy |
| **PCI DSS** | Payment cards | Physical security, access to cardholder data |
| **HIPAA** | Healthcare | USA, health data protection |
| **FedRAMP** | US government | Cloud service authorization, DC security |
| **GDPR** | EU | Personal data protection, data residency |
| **NIST SP 800-53** | DC security | Security control catalog for US federal systems |
| **ISO 14001** | EMS | Environmental management, sustainability |

## Sustainability

### Carbon footprint of DC

```
Total emissions = Scope 1 (direct) + Scope 2 (energy) + Scope 3 (supply chain)

Scope 1: Generators (diesel), refrigerant leaks
Scope 2: Purchased electricity (grid mix)
Scope 3: HW manufacturing, transport, EOL recycling (~60-80 % of total emissions)
```

### Emission reduction

| Measure | Impact on PUE | Emission reduction | Payback |
|---------|--------------|-------------------|---------|
| **Temperature increase** (22→27 °C) | −0.1 to −0.2 | 10-20 % of cooling energy | Immediate |
| **Free cooling** | −0.1 to −0.3 | 20-40 % of cooling energy | 1-2 years |
| **Liquid cooling** | −0.2 to −0.4 | 30-50 % of cooling energy | 2-4 years |
| **LED lighting + sensors** | −0.01 to −0.02 | < 1 % | 1 year |
| **PPA (Power Purchase Agreement)** | n/a | Up to 100 % of Scope 2 (market-based) | Variable |
| **Renewable sources** (rooftop solar) | n/a | 5-15 % of consumption | 5-10 years |
| **Green generator fuel** (HVO renewable diesel) | n/a | Up to 90 % CO₂ reduction | +30 % fuel cost |

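Because PUE is total facility energy divided by IT energy, a PUE change from the table translates directly into energy. The sketch below shows that arithmetic under simplifying assumptions: a constant IT load all year, and hypothetical function names and input values (1 MW IT load, PUE 1.5, a reduction of 0.2).

```python
HOURS_PER_YEAR = 8760


def facility_energy_mwh(it_load_kw: float, pue: float) -> float:
    """Annual facility energy (MWh) for a constant IT load and a given PUE."""
    return it_load_kw * pue * HOURS_PER_YEAR / 1000


def annual_saving_mwh(it_load_kw: float, pue_before: float, pue_delta: float) -> float:
    """Energy saved per year when a measure lowers PUE by pue_delta."""
    before = facility_energy_mwh(it_load_kw, pue_before)
    after = facility_energy_mwh(it_load_kw, pue_before - pue_delta)
    return before - after


# Hypothetical 1 MW IT load at PUE 1.5; free cooling assumed to cut PUE by 0.2.
saving = annual_saving_mwh(it_load_kw=1000, pue_before=1.5, pue_delta=0.2)
print(f"saving: {saving:.0f} MWh/year")
```

For these inputs the saving is about 1752 MWh per year (1000 kW × 0.2 × 8760 h). Multiplying by the local tariff gives a rough figure to compare against the payback column.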
### Sustainability certifications

| Certification | Description |
|--------------|-------------|
| **LEED** (BD+C: Data Centers) | U.S. Green Building Council, design and construction rating |
| **BREEAM** | UK-origin (BRE) sustainability assessment, widely used in Europe |
| **Climate Neutral Data Centre Pact** (EU) | Self-regulatory; PUE target 1.3 (cool climates) or 1.4 (warm climates), new DC by 2025, existing by 2030 |
| **ISO 50001** | Energy management system |
| **Energy Star** | EPA, energy efficiency (US only) |

## Decision diagram — DC topology design

```mermaid
flowchart TD
    Start(["DC design"]) --> TIER{"Required Tier?"}
    TIER -->|"Tier I / II"| T1["N / N+1 redundancy<br/>Simple power, single path<br/>CRAC/CRAH, free cooling<br/>PUE 1.4-1.6, cost 1×"]
    TIER -->|"Tier III"| T3["N+1, concurrently maintainable<br/>Dual path (A/B feed)<br/>Hot aisle containment<br/>PUE 1.2-1.4, cost 2×"]
    TIER -->|"Tier IV"| T4["2N+1, fault tolerant<br/>Dual redundant + STS<br/>Hot + cold containment<br/>PUE 1.1-1.3, cost 3×"]

    Start --> POWER{"Power chain"}
    POWER -->|"UPS"| UPS{"UPS type"}
    UPS -->|"Enterprise DC"| UPS1["VFI double-conversion<br/>Li-ion (LFP), 10-15 years<br/>N+1 or 2N modular"]
    UPS -->|"Edge / office"| UPS2["VI line-interactive<br/>VRLA, 3-5 years"]
    POWER -->|"Generator"| GEN["Diesel 500-2500 kVA<br/>Tank for 24-72 h<br/>ATS transfer in ~10 s, bridged by UPS"]
    POWER -->|"PDU"| PDU["3-phase 400 V<br/>Monitored/Switched<br/>A/B feed to racks"]

    Start --> DENS{"Power density"}
    DENS -->|"< 10 kW/rack"| COOL1["Air cooling<br/>CRAC/CRAH, raised floor<br/>Hot aisle containment<br/>ASHRAE A1-A2"]
    DENS -->|"10-25 kW/rack"| COOL2["Hybrid<br/>In-row cooling<br/>Rear door HX<br/>ASHRAE A1-H1"]
    DENS -->|"> 25 kW/rack"| COOL3["Liquid cooling<br/>CDU, direct-to-chip<br/>Immersion single/two-phase<br/>ASHRAE W-classes"]

    Start --> CLIM{"Climate zone"}
    CLIM -->|"Moderate (CZ, DE)"| FC1["Free cooling 4000-6000 h/year<br/>Chiller + economizer<br/>PUE saving 0.2-0.3"]
    CLIM -->|"Warm (ES, US South)"| FC2["Chiller year-round<br/>Adiabatic cooling<br/>PUE 1.3-1.6"]
    CLIM -->|"Cold (SE, NO)"| FC3["Free cooling 7000+ h/year<br/>Air-side economizer<br/>PUE < 1.2"]
```

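The power-density branch of the diagram is essentially a threshold function. A minimal sketch of it follows; the function name `select_cooling` and the short return labels are hypothetical, and the thresholds are copied from the diagram rather than from any standard.

```python
def select_cooling(kw_per_rack: float) -> str:
    """Map rack power density (kW/rack) to the cooling branch of the diagram."""
    if kw_per_rack < 0:
        raise ValueError("power density must be non-negative")
    if kw_per_rack < 10:
        return "air"      # CRAC/CRAH, raised floor, hot aisle containment
    if kw_per_rack <= 25:
        return "hybrid"   # in-row cooling, rear door heat exchanger
    return "liquid"       # CDU, direct-to-chip, immersion


for density in (6, 18, 40):
    print(f"{density} kW/rack -> {select_cooling(density)}")
```

In practice the boundaries are soft: a real design would also weigh the climate zone and the required tier, which the diagram treats as separate branches.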
## Disk monitoring — S.M.A.R.T.

Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.) provides predictive health monitoring of HDDs and SSDs.

| Key attribute | ID | Description |
|--------------|----|-------------|
| Reallocated Sectors Count | 5 | Number of remapped sectors (a rising count signals approaching end of disk life) |
| Power-On Hours | 9 | Total operating time in hours |
| Reported Uncorrectable Errors | 187 | Errors that ECC could not correct (red flag) |
| CRC Error Count | 199 | Errors on the SATA link (cable/controller) |
| SSD Life Left | 231 | % remaining SSD life (vendor-specific) |
| Media Wearout Indicator | 233 | Normalized NAND wear, counts down from 100 as the drive wears (vendor-specific) |

Tools: `smartmontools` (`smartctl`, `smartd`), Prometheus exporters (`smartctl_exporter`, or `node_exporter` via its textfile collector), OTel Collector.

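As an illustration of how the attributes above might be checked automatically, the sketch below scans a parsed `smartctl` JSON report for watched attributes with a non-zero raw value. The field names (`ata_smart_attributes`, `table`, `id`, `raw.value`) reflect the JSON layout as understood here and should be verified against real output; the sample report is hand-written, and `find_warnings` is a hypothetical helper.

```python
import json

# Attribute IDs from the table above whose raw value should stay at zero.
WATCHED_IDS = {
    5: "Reallocated Sectors Count",
    187: "Reported Uncorrectable Errors",
    199: "CRC Error Count",
}


def find_warnings(report: dict) -> list[str]:
    """Return one warning per watched attribute with a non-zero raw value."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    warnings = []
    for attr in table:
        name = WATCHED_IDS.get(attr.get("id"))
        raw = attr.get("raw", {}).get("value", 0)
        if name is not None and raw > 0:
            warnings.append(f"{name} (ID {attr['id']}): raw value {raw}")
    return warnings


# Hand-written sample, not real drive output.
sample = json.loads("""
{"ata_smart_attributes": {"table": [
  {"id": 5,   "name": "Reallocated_Sector_Ct", "raw": {"value": 8}},
  {"id": 9,   "name": "Power_On_Hours",        "raw": {"value": 31024}},
  {"id": 199, "name": "UDMA_CRC_Error_Count",  "raw": {"value": 0}}
]}}
""")
for line in find_warnings(sample):
    print(line)
```

On a live host the report would come from `smartctl --json -A` for the device in question (JSON output requires smartmontools 7.0 or later). NVMe drives report health in a different structure and are not covered by this sketch.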
## Sources

Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)

### Recommended literature

| Book | Authors | ISBN | Description |
|------|---------|------|-------------|
| The Data Center as a Computer (4th ed., 2025) | Barroso, Hölzle, Ranganathan | 978-3-031-99488-3 | Comprehensive account of how warehouse-scale computers (WSC) evolved, written by Google architects. Covers hardware, software, power, cooling, networking and 25 years of WSC experience. Key publication for datacenter architecture. |
| Electronics Cooling: From the Chip to the Datacenter (Vol. 62) | Abraham et al. | 978-0-443-47084-4 | Practical guide to thermal management from transistor level to datacenter. Covers conduction, convection, liquid immersion and phase change cooling. Essential resource for DC cooling design. |

## Datacenter backbone services

When building a new DC, basic infrastructure services must be deployed first; without them, higher layers cannot operate:

### DNS

| Role | Service | Description |
|------|---------|-------------|
| **Authoritative** | BIND, PowerDNS, NSD | Serves the authoritative zones for internal domains |
| **Recursive** | Unbound, BIND (caching), CoreDNS | Resolver for internal + external queries |
| **Anycast** | DNS anycast (BGP) | Redundancy, lower latency |
| **Integration** | Infoblox, BlueCat, dnsmasq | Combined DNS + DHCP (Infoblox and BlueCat add IPAM) |

Best practices: separate authoritative and recursive servers, DNSSEC, split-horizon (internal vs external view), TSIG for zone transfers, monitoring (DNS query latency, NXDOMAIN rate).

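The two monitoring signals named above, query latency and NXDOMAIN rate, can be derived from per-query samples. The sketch below assumes the samples are already available as `(latency_ms, rcode)` pairs, for example from a probe or from resolver logs; the function name `summarize_dns` and the sample values are hypothetical.

```python
def summarize_dns(samples: list[tuple[float, str]]) -> dict[str, float]:
    """Summarize (latency_ms, rcode) samples into p95 latency and NXDOMAIN rate."""
    if not samples:
        raise ValueError("no samples")
    latencies = sorted(latency for latency, _ in samples)
    # Nearest-rank percentile: ceil(0.95 * n), computed with integer arithmetic.
    rank = (95 * len(latencies) + 99) // 100
    nxdomain = sum(1 for _, rcode in samples if rcode == "NXDOMAIN")
    return {
        "p95_latency_ms": latencies[rank - 1],
        "nxdomain_rate": nxdomain / len(samples),
    }


# Hypothetical samples for illustration only.
samples = [(1.2, "NOERROR"), (0.9, "NOERROR"), (14.0, "NXDOMAIN"), (2.1, "NOERROR")]
stats = summarize_dns(samples)
print(stats)
```

A sudden rise in the NXDOMAIN rate often points to a misconfigured search domain or a broken service name, which is why it is worth alerting on alongside latency.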
### NTP (time synchronization)

- **Primary**: GPS-disciplined Stratum 1 NTP servers (Microchip SyncServer S600, Meinberg)
- **Secondary**: Stratum 2 NTP servers (ntpd, chrony, NTPsec) synchronized to the primaries
- **All nodes**: chrony (modern replacement for ntpd), optionally with a local NTP server on each rack switch
- **Precision**: PTP (IEEE 1588) for telco/fintech, sub-microsecond accuracy
- **DC topology**: GPS antenna → Grandmaster (PTP) → Boundary clock (rack switch) → Ordinary clock (server)

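To make the synchronization step concrete, the sketch below applies the standard NTP on-wire calculation (as specified in RFC 5905) to the four timestamps of one request/response exchange. The function name and the example timestamps are made up for illustration.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """
    Compute clock offset and round-trip delay from one NTP exchange.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # how far the client clock is behind the server
    delay = (t4 - t1) - (t3 - t2)          # network round trip, excluding server processing
    return offset, delay


# Hypothetical exchange in seconds: client clock 0.5 s behind the server,
# 10 ms network latency each way, 1 ms server processing time.
offset, delay = ntp_offset_and_delay(t1=100.000, t2=100.510, t3=100.511, t4=100.021)
print(f"offset: {offset:.3f} s, delay: {delay:.3f} s")
```

The offset estimate is exact only when the two network directions are symmetric, which is one reason to keep time sources close to the servers (rack switch, same DC) rather than across a WAN.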
### DHCP + IPAM

| Tool | Description |
|------|-------------|
| **ISC DHCP** | Legacy (end-of-life since 2022), still widely deployed |
| **Kea** | Modern replacement for ISC DHCP, developed by ISC |
| **Infoblox / BlueCat** | Enterprise IPAM + DHCP + DNS |
| **NetBox / phpIPAM** | Open-source IPAM (NetBox also covers DCIM) |

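Whatever IPAM tool is chosen, the underlying task is carving an address pool into non-overlapping subnets. A minimal sketch with Python's standard `ipaddress` module is shown below; the pool `10.20.0.0/16`, the rack names and the function `allocate_rack_subnets` are assumptions for illustration, not a recommended addressing plan.

```python
import ipaddress


def allocate_rack_subnets(pool: str, prefix_len: int, racks: list[str]) -> dict:
    """Assign one subnet of the given prefix length to each rack, in order."""
    network = ipaddress.ip_network(pool)
    subnets = network.subnets(new_prefix=prefix_len)  # lazy generator of subnets
    allocation = {}
    for rack in racks:
        try:
            allocation[rack] = next(subnets)
        except StopIteration:
            raise ValueError(f"pool {pool} exhausted at rack {rack}") from None
    return allocation


# Hypothetical plan: one /24 per rack out of a /16 pool.
plan = allocate_rack_subnets("10.20.0.0/16", 24, ["rack-a01", "rack-a02", "rack-a03"])
for rack, subnet in plan.items():
    # Subtract network and broadcast addresses to get usable hosts.
    print(rack, subnet, f"{subnet.num_addresses - 2} usable hosts")
```

A real IPAM adds what this sketch lacks: persistence, reservations, and a record of who owns each allocation.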
### LDAP / Identity Management

| Tool | Description |
|------|-------------|
| **FreeIPA** | Integrated IdM for Linux (LDAP + Kerberos + DNS + CA) |
| **Active Directory** | Microsoft, LDAP + Kerberos + Group Policy |
| **389 Directory Server** | Open-source LDAP (Red Hat) |
| **OpenLDAP** | Classic open-source LDAP |
| **Keycloak / Authentik** | Modern identity providers (OIDC/SAML) with LDAP integration |

### PKI and certificates

- **Enterprise CA**: EJBCA, Smallstep (`step-ca`), HashiCorp Vault (PKI engine)
- **ACME**: cert-manager (Kubernetes), certbot (Let's Encrypt)
- **mTLS**: Vault PKI, SPIRE (SPIFFE), Cilium
- **Best practices**: root CA offline, intermediate CA per environment, short-lived certificates (max 90 days), revocation (CRL/OCSP)

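The short-lived certificate practice above can be enforced with a simple policy check on each certificate's validity window. The sketch below is one possible shape for it: the 90-day limit comes from the list, while the rule of renewing after two thirds of the lifetime and the function name `check_certificate` are assumptions.

```python
from datetime import datetime, timedelta, timezone

MAX_LIFETIME = timedelta(days=90)  # from the best practice above


def check_certificate(not_before: datetime, not_after: datetime, now: datetime) -> list[str]:
    """Return policy findings for one certificate's validity window."""
    findings = []
    lifetime = not_after - not_before
    if lifetime > MAX_LIFETIME:
        findings.append(f"lifetime {lifetime.days} d exceeds {MAX_LIFETIME.days} d")
    if now >= not_after:
        findings.append("expired")
    elif now >= not_before + lifetime * 2 / 3:
        # Assumed renewal rule: renew once two thirds of the lifetime has elapsed.
        findings.append("due for renewal")
    return findings


# Hypothetical certificates issued on the same day.
issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
short_lived = check_certificate(issued, issued + timedelta(days=90), issued + timedelta(days=70))
long_lived = check_certificate(issued, issued + timedelta(days=365), issued + timedelta(days=10))
print(short_lived)
print(long_lived)
```

The validity dates themselves would be read from the certificate (for example by the CA or by a monitoring exporter); this sketch covers only the decision logic.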
### Monitoring and observability

See [MONITORING.md](MONITORING.md). Before the first workloads run, the DC must have:
- Metric collection (Prometheus, Zabbix)
- Centralized logs (Loki, ELK)
- Alerting (Alertmanager, PagerDuty)
- Uptime monitoring (heartbeat checks)

### Deployment logistics — step order

```
1. DNS (at least recursive + local resolver)
2. NTP (time synchronization)
3. DHCP + IPAM (first servers get IPs)
4. LDAP / IAM (users, groups, access rights)
5. PKI (certificates for encryption)
6. Configuration management (Ansible, Puppet)
7. Monitoring + logging (see what's happening)
8. Container registry / Package repo (docker registry, apt/yum mirror)
9. Load balancer (for services)
10. Storage backend (Ceph, NFS, SAN)
11. Orchestration (Kubernetes, OpenStack)
```

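The step order above is a dependency ordering, so it can be derived rather than memorized. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+); the dependency edges in `DEPENDENCIES` are assumed for illustration and are not stated in the list itself.

```python
from graphlib import TopologicalSorter

# service -> services it depends on (edges assumed for illustration)
DEPENDENCIES = {
    "dns": set(),
    "ntp": {"dns"},
    "dhcp_ipam": {"dns", "ntp"},
    "ldap_iam": {"dns", "ntp"},
    "pki": {"ntp", "ldap_iam"},
    "config_management": {"dns", "pki"},
    "monitoring": {"dns", "ntp", "config_management"},
    "registry": {"pki", "config_management"},
    "load_balancer": {"pki", "monitoring"},
    "storage": {"config_management", "monitoring"},
    "orchestration": {"registry", "load_balancer", "storage"},
}


def deployment_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return an order in which every service comes after its dependencies."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(dependencies).static_order())


order = deployment_order(DEPENDENCIES)
for step, service in enumerate(order, start=1):
    print(f"{step}. {service}")
```

Several valid orders exist for the same graph, so the output may differ from the hand-written list while still respecting every dependency. A cycle (for example PKI needing a service that itself needs certificates) surfaces as an error, which is a useful early warning during planning.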
## OpenStack in the datacenter

OpenStack adds a software abstraction layer to the DC that enables multi-tenancy and self-service:

### Control plane architecture

- **Controller nodes**: management services (Keystone, Nova API, Neutron API, Horizon, RabbitMQ, DB)
- **Compute nodes**: hypervisor (KVM), Nova Compute, Neutron agent
- **Storage nodes**: Ceph OSD, Cinder volumes, Swift object storage
- **Network nodes**: Neutron L3 agent (routers), DHCP agent; with DVR, routing is distributed to compute nodes

### Requirements for DC infrastructure

| Component | Requirement |
|-----------|-------------|
| **Controller** | 3 or 5 node HA cluster (odd count for quorum), 16+ vCPU, 32+ GB RAM, SSD |
| **Compute** | High compute density per rack (GPU, high core count), NUMA-aware design |
| **Storage (Ceph)** | 10-25 GbE networking, NVMe/SSD OSD, 3+ replicas |
| **Network** | 25/100 GbE spine-leaf, L3 BGP underlay, VXLAN overlay |
| **Rack power** | 10-30 kW/rack for GPU compute |

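Two of the requirements above reduce to simple sizing arithmetic: usable Ceph capacity under replication, and how many servers fit a rack's power budget. The sketch below shows both; the function names, the 80 % target fill level and the example hardware figures are assumptions for illustration.

```python
def ceph_usable_tb(osd_count: int, osd_size_tb: float, replicas: int = 3,
                   target_fill: float = 0.8) -> float:
    """Usable capacity of a replicated pool, keeping headroom for recovery."""
    if replicas < 1 or not 0 < target_fill <= 1:
        raise ValueError("invalid replicas or target_fill")
    raw_tb = osd_count * osd_size_tb
    # Each object is stored `replicas` times; stay below the target fill level.
    return raw_tb / replicas * target_fill


def servers_per_rack(rack_kw: float, server_kw: float) -> int:
    """Number of servers that fit into a rack's power budget."""
    return int(rack_kw // server_kw)


# Hypothetical cluster: 12 storage nodes with 10 NVMe OSDs of 7.68 TB each.
usable = ceph_usable_tb(osd_count=120, osd_size_tb=7.68)
print(f"usable: {usable:.1f} TB")
# Hypothetical GPU server drawing 6.5 kW in a 30 kW rack.
print(servers_per_rack(30, 6.5), "servers per rack")
```

With three replicas only about a quarter of the raw capacity ends up usable at this fill level, which is worth knowing before ordering drives.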
### Use cases

- Private cloud for enterprise (multi-tenant, self-service via Horizon)
- NFVI for telco (DPDK, SR-IOV, low latency)
- Academic / HPC clusters (Ironic, Cyborg, Manila)
- Government / regulated environments (on-prem, audit trail)

*Last revision: 2026-06-03*