# Case study: Proxmox VE demo cluster (3× node, Ceph, HA)

## 1. Requirements and parameters

| Parameter | Value |
|-----------|-------|
| Number of hosts | 3 |
| Purpose | demo, learning, development |
| Hypervisor | Proxmox VE (free) |
| Budget | low-cost (~$10,000–$15,000) |
| Storage | Ceph (HCI) |
| HA | yes |
| Location | 1 rack, standard office room |

---

## 2. Server configuration

The design combines the **Mini variant** (2–3 hosts, single-socket) with the **pure Ceph variant** from SERVER-CONFIG.md. All 3 nodes are identical.

### 2.1 Single node configuration

| Component | Specification | Rationale |
|-----------|---------------|-----------|
| **CPU** | 1× AMD EPYC 9224 (24C/48T, 200 W TDP) or Intel Xeon Gold 5418Y (24C/48T) | SERVER-CONFIG.md: "Pure Ceph variant: CPU 1× EPYC 9224–9334 (12–24C)". Ceph needs 1–2 cores per OSD; with 3 OSDs plus Proxmox and VMs, 12+ cores is the minimum. |
| **RAM** | 128 GB DDR5-4800 (4× 32 GB RDIMM, 1DPC) | SERVER-CONFIG.md: "RAM 128–256 GB" for the Ceph variant. 128 GB is sufficient for a demo: 4–8 GB per OSD plus OS and lightweight VMs. |
| **OS disk** | 2× 240 GB SATA SSD, RAID 1 (HW RAID controller or software mirror, e.g. mdadm) | "OS: 2× SATA SSD RAID 1" per the Ceph variant. |
| **Ceph OSD** | 3× 960 GB SATA SSD (HBA/IT mode, no HW RAID) | "Ceph OSD: 4–8× NVMe/SATA SSD (RAW, HBA mode)". For the demo we reduce this to 3 OSDs per node, 9 OSDs in the cluster. |
| **NIC** | 2× dual-port 10 GbE SFP+ (4× 10 GbE in total) | "Network: 2× 25 GbE public + 2× 25 GbE cluster". To keep costs low we choose 10 GbE (SFP+); the concept stays the same. |
| **BMC** | 1× 1 GbE (iDRAC / iLO / IPMI) | Standard management port, per CONNECTIVITY.md. |
| **Form factor** | 1U rack server (Dell R660, HPE DL360 Gen11, or Supermicro) | Fits a standard 19" rack. |

### 2.2 CPU choice rationale

The KB lists "1× EPYC 4124 (4C) or Xeon E-2400" for the Mini variant. However, 4 cores are not enough to run Ceph OSDs, Proxmox, and VMs on the same host. We therefore choose the EPYC 9224 (24C) or Xeon Gold 5418Y (24C), which matches the Ceph variant in SERVER-CONFIG.md. The price is higher, but the cluster can be used for realistic testing.

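The sizing argument can be checked with a quick per-node budget. The sketch below is illustrative only: the per-OSD figures use the upper ends of the ranges quoted in section 2.1 (2 cores and 8 GB per OSD), while the host reserve (`host_cores`, `host_ram_gb`) and all names are assumptions that do not come from the KB. A negative `vm_cores` means the OSDs and the host alone already exceed the CPU, which is what happens with the 4-core option.

```python
from dataclasses import dataclass


@dataclass
class NodeBudget:
    vm_cores: int    # cores left for VMs (negative = oversubscribed)
    vm_ram_gb: int   # RAM left for VMs in GB


def node_budget(total_cores, total_ram_gb, osd_count=3,
                cores_per_osd=2, ram_gb_per_osd=8,
                host_cores=2, host_ram_gb=8):
    """Return what is left for VMs after Ceph OSDs and the host OS.

    host_cores and host_ram_gb are assumed reserves for Proxmox itself.
    """
    vm_cores = total_cores - osd_count * cores_per_osd - host_cores
    vm_ram_gb = total_ram_gb - osd_count * ram_gb_per_osd - host_ram_gb
    return NodeBudget(vm_cores, vm_ram_gb)


epyc_9224 = node_budget(total_cores=24, total_ram_gb=128)
epyc_4124 = node_budget(total_cores=4, total_ram_gb=128)

print("EPYC 9224:", epyc_9224)  # 16 cores and 96 GB left for VMs
print("EPYC 4124:", epyc_4124)  # -4 cores: not enough for 3 OSDs
```
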
---
## 3. Storage variant — Ceph

### 3.1 Topology

```
3× Proxmox node ─── each 3× OSD (SATA SSD)
              │
         Ceph cluster
              │
    ┌─────────┼─────────┐
  3× MON    3× MGR    9× OSD
```

### 3.2 Ceph configuration

| Parameter | Value | Note |
|-----------|-------|------|
| Replication | 3 (size = 3, min_size = 2) | Standard per STORAGE.md |
| Failure domain | host | CRUSH: replication across nodes |
| Raw capacity | 9 × 960 GB ≈ 8.6 TB | |
| Usable capacity | ~2.9 TB (8.6 / 3) | Sufficient for demo |
| OSD backend | BlueStore | Default in Ceph, recommended |
| MON quorum | 3 (1 per node) | Minimum for HA |
| Cache | RAM (BlueStore cache) | 1–2 GB per OSD |
| Network public | 2× 10 GbE LACP | VM traffic + Ceph frontend |
| Network cluster | 2× 10 GbE LACP | Ceph backend replication |
| MTU | 9000 (jumbo frames) | Recommended per NETWORKING.md |

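The capacity rows can be reproduced with a few lines of arithmetic. This is a minimal sketch, not a sizing tool: the function name and the `fill_ratio` parameter are assumptions added for illustration. The value 0.85 mirrors Ceph's default nearfull warning threshold and is used here only as a planning margin, so the "practical" figure is lower than the usable capacity in the table.

```python
def ceph_capacity_tb(nodes, osds_per_node, osd_size_gb, replica_size,
                     fill_ratio=0.85):
    """Return (raw, usable, practical) capacity in TB for a replicated pool.

    fill_ratio is an assumed planning margin, not a value from the KB.
    """
    raw_tb = nodes * osds_per_node * osd_size_gb / 1000
    usable_tb = raw_tb / replica_size          # every object is stored replica_size times
    practical_tb = usable_tb * fill_ratio      # stay below the nearfull warning
    return raw_tb, usable_tb, practical_tb


raw, usable, practical = ceph_capacity_tb(
    nodes=3, osds_per_node=3, osd_size_gb=960, replica_size=3)

print(f"raw={raw:.2f} TB usable={usable:.2f} TB practical={practical:.2f} TB")
# prints: raw=8.64 TB usable=2.88 TB practical=2.45 TB
```
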
### 3.3 Storage layout on disk

```
/dev/sda   240 GB   OS (RAID 1, mirror with /dev/sdb)
/dev/sdc   960 GB   OSD.0 (RAW, BlueStore)
/dev/sdd   960 GB   OSD.1 (RAW, BlueStore)
/dev/sde   960 GB   OSD.2 (RAW, BlueStore)
```

### 3.4 Ceph pool design

| Pool | PG count | Replication | Purpose |
|------|----------|-------------|---------|
| vms | 128 | 3× | VM disks (RBD) |
| data | 64 | 3× | Data volume |
| backups | 32 | 3× | Backups (low priority) |

The PG counts are approximate for the demo (9 OSDs). Production formula: total PGs ≈ (OSD_total × 100) / replication_size, rounded to a power of two and divided among the pools.

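Applied to this cluster, the formula gives a cluster-wide target that the three pools should stay within. The sketch below assumes rounding to the nearest power of two (ties round down); the helper names are hypothetical and the pool sizes are taken from the table above.

```python
def nearest_power_of_two(value):
    """Round a positive number to the nearest power of two (ties round down)."""
    if value < 1:
        return 1
    lower = 1 << (int(value).bit_length() - 1)
    upper = lower * 2
    return lower if value - lower <= upper - value else upper


def total_pg_target(osd_total, replication_size, pgs_per_osd=100):
    """Cluster-wide PG target: (OSD_total x 100) / replication_size, rounded."""
    return nearest_power_of_two(osd_total * pgs_per_osd / replication_size)


pools = {"vms": 128, "data": 64, "backups": 32}

target = total_pg_target(osd_total=9, replication_size=3)  # 300 -> 256
planned = sum(pools.values())                              # 224

print(f"target={target} planned={planned} within_target={planned <= target}")
```
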
---

## 4. Network

### 4.1 Topology

```
             ┌─────────────────┐
             │  10 GbE Switch  │
             │ (24-port SFP+)  │
             └──┬─────┬─────┬──┘
      ┌─────────┘     │     └─────────┐
┌─────┴─────┐   ┌─────┴─────┐   ┌─────┴─────┐
│  Node 1   │   │  Node 2   │   │  Node 3   │
│  4×10GbE  │   │  4×10GbE  │   │  4×10GbE  │
│ ┌───────┐ │   │ ┌───────┐ │   │ ┌───────┐ │
│ │ 1GbE  │ │   │ │ 1GbE  │ │   │ │ 1GbE  │ │
│ │ BMC   │ │   │ │ BMC   │ │   │ │ BMC   │ │
│ └───────┘ │   │ └───────┘ │   │ └───────┘ │
└───────────┘   └───────────┘   └───────────┘
```

### 4.2 VLAN and traffic segmentation

| VLAN | Purpose | Ports | MTU |
|------|---------|-------|-----|
| VLAN 10 | Management (BMC out-of-band; Proxmox web UI and SSH tagged on the public bond) | 1× 1 GbE BMC; host management via the public bond | 1500 |
| VLAN 20 | VM traffic + Ceph public | 2× 10 GbE (bond) | 9000 |
| VLAN 30 | Ceph cluster (backend) | 2× 10 GbE (bond) | 9000 |

### 4.3 Switch

| Parameter | Value |
|-----------|-------|
| Model | MikroTik CRS326-24S+2Q+RM or similar L2+ switch |
| Ports | 24× SFP+ 10 GbE |
| Management | VLAN 10, IP 10.0.0.254/24 |
| Features | VLAN, LACP (LAG), jumbo frames (MTU 9000), SNMP |

### 4.4 Cabling

| Type | Length | Quantity | Purpose |
|------|--------|----------|---------|
| SFP+ DAC (passive) | 3 m | 12 | 10 GbE links, server ↔ switch |
| Cat6A UTP | 3 m | 3 | Management (1 GbE BMC) |
| Cat6A UTP | 1 m | 1 | Internet uplink (patch panel) |

DAC cables are cheaper than SFP+ optics plus fiber patch cords, which makes them a good fit for a single rack.

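The DAC quantity follows directly from the NIC layout: 3 nodes with 4 SFP+ ports each. A small sketch (the function name is hypothetical) shows the port usage on the 24-port switch and on a 12-port alternative, which would carry the server links but leave no spare SFP+ port.

```python
def sfp_port_plan(nodes, ports_per_node, switch_ports):
    """Return (dac_cables, free_switch_ports) for a single-switch design."""
    dac_cables = nodes * ports_per_node
    if dac_cables > switch_ports:
        raise ValueError(
            f"{dac_cables} server links do not fit on a {switch_ports}-port switch")
    return dac_cables, switch_ports - dac_cables


dac_24, free_24 = sfp_port_plan(nodes=3, ports_per_node=4, switch_ports=24)
dac_12, free_12 = sfp_port_plan(nodes=3, ports_per_node=4, switch_ports=12)

# Each LACP bond aggregates 2 links; a single flow still uses one 10 GbE link.
bond_gbps = 2 * 10

print(f"24-port: {dac_24} DACs, {free_24} ports free")
print(f"12-port: {dac_12} DACs, {free_12} ports free")
print(f"aggregate per bond: {bond_gbps} Gb/s")
```
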
---

## 5. Rack layout

### 5.1 Dimensions and positions

| U | Device | Power (W) |
|---|--------|-----------|
| U1 | Switch 10 GbE (1U) | ~60 |
| U2–U3 | UPS (2U) | - |
| U4 | Server Node 1 (1U) | ~250 |
| U5 | Server Node 2 (1U) | ~250 |
| U6 | Server Node 3 (1U) | ~250 |
| U7–U15 | Empty (optional storage, patch panel) | - |

| Parameter | Value |
|-----------|-------|
| Rack type | 15U wall-mount, 19", 600×600 mm |
| Total IT load | ~810 W |
| PUE estimate | ~1.5 (office room, no precision cooling) |
| Cooling | Standard office AC (ASHRAE A2: 10–35 °C). Sufficient for <1 kW. |

**Note:** The KB (DATACENTERS.md) lists free-air cooling as adequate for low density (<5 kW/rack). In an office, standard ventilation and AC are sufficient.

### 5.2 UPS

| Parameter | Value |
|-----------|-------|
| Type | VI (line-interactive), per DATACENTERS.md for smaller racks |
| Capacity | 2000 VA / 1200 W |
| Backup time | ~15–20 min at 810 W load (estimate; verify against the vendor runtime chart) |
| Output | 8× C13 (for servers + switch) |
| Battery | VRLA (cheaper) or Li-ion LFP |
| Management | USB / SNMP card (automatic Proxmox shutdown) |

Optionally, the UPS can be upgraded to a VFI (double-conversion) model for cleaner output, but VI is sufficient for a demo.

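The power figures from sections 5.1 and 5.2 can be cross-checked with simple arithmetic. The sketch below is illustrative: the function and key names are made up for this example, the per-device loads come from the rack table, and the heat figure uses the standard conversion of 1 W ≈ 3.412 BTU/h. It does not attempt to estimate battery runtime, which depends on the specific UPS model.

```python
WATT_TO_BTU_H = 3.412  # 1 W of IT load dissipates about 3.412 BTU/h


def power_summary(loads_w, ups_capacity_w, pue):
    """Summarise IT load, UPS utilisation, facility power and heat output."""
    it_load_w = sum(loads_w.values())
    return {
        "it_load_w": it_load_w,
        "ups_load_pct": 100 * it_load_w / ups_capacity_w,
        "facility_w": it_load_w * pue,            # IT load plus cooling overhead
        "heat_btu_h": it_load_w * WATT_TO_BTU_H,  # what the office AC must remove
    }


summary = power_summary(
    loads_w={"switch": 60, "node1": 250, "node2": 250, "node3": 250},
    ups_capacity_w=1200,
    pue=1.5,
)

for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```
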
### 5.3 PDU

1× basic 1U PDU (8× C13), 230 V / 10 A, for power distribution to the servers.

---
## 6. Hypervisor — Proxmox VE

### 6.1 Installation and configuration

| Component | Version / Configuration |
|-----------|-------------------------|
| Hypervisor | Proxmox VE 8.x (Debian 12 + KVM + LXC) |
| Storage backend | Ceph Reef (18.x) or Squid (19.x), integrated in Proxmox |
| Cluster | 3-node cluster, Corosync + pmxcfs |
| HA | Proxmox HA, tolerates 1 node failure (the remaining 2 nodes take over its VMs) |
| Fencing | watchdog (softdog) + Proxmox HA manager |

### 6.2 License

| Item | Price | Note |
|------|-------|------|
| Proxmox VE | $0 | Open source, full functionality without a license |
| Proxmox community support | $0 | Forum, wiki |
| Proxmox enterprise support (optional) | ~€500/host/year | Can be purchased later; subscriptions are priced per CPU socket |

HYPERVISORS.md describes Proxmox VE as "open source (free)"; no license is required.

### 6.3 HA setup

- HA group: all 3 nodes. A node with active HA resources that loses quorum is reset by its watchdog, and its resources are recovered on the quorate nodes.
- Max VM restart: 2 attempts
- Migration: live migration via Ceph RBD (shared storage)

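The "1 node failure tolerance" follows from majority quorum, which both Corosync and the Ceph MONs rely on, and from the pool settings size = 3 / min_size = 2. A minimal sketch of that arithmetic (the function names are hypothetical, and one vote per node is assumed):

```python
def quorum(votes):
    """Smallest strict majority of the given number of votes."""
    return votes // 2 + 1


def tolerated_failures(votes):
    """How many voters can fail while the rest still form a quorum."""
    return votes - quorum(votes)


def pool_io_available(size, min_size, failed_hosts):
    """With failure domain = host, each failed host removes at most one replica."""
    surviving_replicas = max(size - failed_hosts, 0)
    return surviving_replicas >= min_size


for nodes in (2, 3, 5):
    print(f"{nodes} nodes: quorum={quorum(nodes)}, "
          f"tolerated failures={tolerated_failures(nodes)}")

print("I/O with 1 host down:", pool_io_available(size=3, min_size=2, failed_hosts=1))
print("I/O with 2 hosts down:", pool_io_available(size=3, min_size=2, failed_hosts=2))
```
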
---

## 7. Budget estimate

**Disclaimer:** The KB does not contain specific component prices. The amounts below are approximate market estimates (Q2 2026, USD).

### 7.1 Servers (3×)

| Item | Qty | Price/unit | Total |
|------|-----|------------|-------|
| 1U rack server (basic config, without CPU/RAM/disk) | 3 | ~$1,200 | $3,600 |
| AMD EPYC 9224 (24C) / Intel Xeon Gold 5418Y (24C), per KB | 3 | ~$900 | $2,700 |
| RAM 128 GB (4× 32 GB DDR5-4800 RDIMM) | 3 | ~$600 | $1,800 |
| 240 GB SATA SSD (OS) | 6 | ~$50 | $300 |
| 960 GB SATA SSD (Ceph OSD) | 9 | ~$150 | $1,350 |
| Dual-port 10 GbE SFP+ NIC (e.g. Intel X710-DA2) | 6 | ~$120 | $720 |
| **Servers total** | | | **~$10,470** |

### 7.2 Network

| Item | Qty | Price/unit | Total |
|------|-----|------------|-------|
| MikroTik CRS326-24S+2Q+RM (24× 10GbE SFP+) | 1 | ~$600 | $600 |
| SFP+ DAC cable 3 m (passive) | 12 | ~$15 | $180 |
| **Network total** | | | **~$780** |

### 7.3 Rack and power

| Item | Qty | Price/unit | Total |
|------|-----|------------|-------|
| 15U wall-mount rack 19" | 1 | ~$300 | $300 |
| UPS 2000 VA (line-interactive, VRLA) | 1 | ~$450 | $450 |
| 1U PDU basic (8× C13) | 1 | ~$60 | $60 |
| **Rack + power total** | | | **~$810** |

### 7.4 Other

| Item | Price |
|------|-------|
| Cat6A patch cables, management | ~$50 |
| Mounting material, velcro | ~$30 |
| Shipping and installation | ~$200 |
| **Other total** | **~$280** |

### 7.5 Total calculation

| Category | Amount |
|----------|--------|
| Servers (3× node) | ~$10,470 |
| Network (switch + cables) | ~$780 |
| Rack + power | ~$810 |
| Other | ~$280 |
| **Total** | **~$12,340** |
| Reserve (10–15%) | ~$1,200–$1,800 |
| **Total with reserve** | **~$13,500–$14,100** |

A budget of **$10,000–$15,000** is achievable. With cheaper CPUs (EPYC 4124P / Xeon E-2488) the cluster can be built for ~$8,000–$9,000, but with limited Ceph performance.

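The totals can be recomputed from the line items in sections 7.1 to 7.4. The sketch below is only a cross-check of the arithmetic; the dictionary layout and function names are made up for this example, and the prices are the same approximate estimates as in the tables. An exact 10–15% reserve lands slightly above the rounded range shown in the table.

```python
# (quantity, unit price in USD) per line item, copied from sections 7.1-7.4
BUDGET = {
    "servers": [(3, 1200), (3, 900), (3, 600), (6, 50), (9, 150), (6, 120)],
    "network": [(1, 600), (12, 15)],
    "rack_power": [(1, 300), (1, 450), (1, 60)],
    "other": [(1, 50), (1, 30), (1, 200)],
}


def category_totals(budget):
    """Sum quantity x unit price for every category."""
    return {name: sum(qty * price for qty, price in items)
            for name, items in budget.items()}


def with_reserve(total, reserve_pct):
    """Add a percentage reserve using integer arithmetic (no float rounding)."""
    return total * (100 + reserve_pct) // 100


totals = category_totals(BUDGET)
grand_total = sum(totals.values())
low, high = with_reserve(grand_total, 10), with_reserve(grand_total, 15)

print(totals)
print(f"total={grand_total} with 10% reserve={low} with 15% reserve={high}")
# prints: total=12340 with 10% reserve=13574 with 15% reserve=14191
```
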
**Possible savings:**

- CPU: 2× EPYC 4124P (4C) + 1× more powerful node → ~$800 savings (but an asymmetric cluster)
- OSD: 2× instead of 3× SSD per node → ~$500 savings (less capacity)
- Switch: 12-port instead of 24-port → ~$300 savings

---

## 8. Topology diagram

```mermaid
flowchart TB
    subgraph Rack["15U Rack (office)"]
        U1["U1: 10GbE Switch (MikroTik)"]
        U2["U2: UPS 2000 VA"]
        U4["U4: Node 1 (Proxmox + Ceph OSD)"]
        U5["U5: Node 2 (Proxmox + Ceph OSD)"]
        U6["U6: Node 3 (Proxmox + Ceph OSD)"]
    end

    subgraph Node1["Node 1 (detail)"]
        N1_CPU["CPU: EPYC 9224 (24C)"]
        N1_RAM["RAM: 128 GB DDR5"]
        N1_OS["OS: 2× 240 GB SSD (RAID 1)"]
        N1_OSD1["OSD.0: 960 GB SSD"]
        N1_OSD2["OSD.1: 960 GB SSD"]
        N1_OSD3["OSD.2: 960 GB SSD"]
        N1_NIC["NIC: 4× 10GbE SFP+"]
        N1_BMC["BMC: 1× 1GbE"]
    end

    U1 ---|"4× 10GbE, 2× LACP bond<br/>(public + cluster)"| U4
    U1 ---|"4× 10GbE, 2× LACP bond"| U5
    U1 ---|"4× 10GbE, 2× LACP bond"| U6

    U4 --- N1_CPU
    U4 --- N1_RAM
    U4 --- N1_OS
    U4 --- N1_OSD1
    U4 --- N1_OSD2
    U4 --- N1_OSD3
    U4 --- N1_NIC
    U4 --- N1_BMC

    subgraph Ceph["Ceph Cluster"]
        CEPH_MON["3× MON (1 per node)"]
        CEPH_MGR["3× MGR (1 per node)"]
        CEPH_OSD["9× OSD (3 per node)"]
    end

    U4 --- CEPH_MON
    U5 --- CEPH_MON
    U6 --- CEPH_MON
    U4 --- CEPH_MGR
    U5 --- CEPH_MGR
    U6 --- CEPH_MGR
    U4 --- CEPH_OSD
    U5 --- CEPH_OSD
    U6 --- CEPH_OSD

    subgraph Proxmox["Proxmox VE Cluster"]
        PMX_HA["HA Group (3 nodes)"]
    end

    PMX_HA --- U4
    PMX_HA --- U5
    PMX_HA --- U6

    subgraph Uplink["Internet / LAN"]
        UPLINK_SW["Office LAN<br/>(1 GbE)"]
    end

    U1 ---|"1× Cat6A, 1 GbE<br/>(Internet via ISP router)"| UPLINK_SW
```

---

## 9. Summary and key decisions

| Decision | Variant | Rationale |
|----------|---------|-----------|
| Hypervisor | Proxmox VE | HYPERVISORS.md: "For SME / low budget — open source, built-in Ceph, no license costs". Ideal for demo. |
| Storage | Ceph (3× replication) | STORAGE.md + SERVER-CONFIG.md: Ceph is the recommended SDS for Proxmox; 3 nodes are the minimum for quorum. |
| CPU | Single-socket EPYC 9224 / Xeon Gold 5418Y | Compromise between price (Mini variant: 1 socket) and Ceph performance (Ceph variant: 12+ cores). |
| Network | 10 GbE SFP+ (instead of 25 GbE) | The KB recommends 25 GbE, but 10 GbE is sufficient for a low-cost demo. The concept (separate public and cluster networks) stays the same. |
| Rack | 15U wall-mount | Suitable for an office: no raised floor, no precision cooling. |
| UPS | 2000 VA line-interactive | DATACENTERS.md: VI type for smaller racks. Sufficient for demo. |
| License | Proxmox VE (free) | No license costs; support can be purchased later. |

### Compromises compared to production deployment

- **25 GbE → 10 GbE**: lower Ceph cluster network throughput (not an issue in a demo environment)
- **HDD → SSD**: the Ceph OSDs use SSDs instead of HDDs (higher price per TB, better performance; the demo focuses on functionality, not capacity)
- **2× 10 GbE public + 2× 10 GbE cluster → combined on LACP**: the two networks can be merged onto one bond when ports are scarce, but keeping them separate is better
- **Cooling**: office AC, not DC-grade precision cooling (PUE ~1.5–1.8)

### What the KB does not address (supplemented from practice)

The KB does not contain specific component prices, so the budget is an approximate market estimate. It also does not name a concrete switch model with L2+ features (VLAN, LACP, jumbo frames); here we follow common practice for the SOHO/SME segment.

---

## 10. References from KB

- **DATACENTERS.md**: rack layout, power chain, UPS types, cooling classes (ASHRAE), cabling standards
- **HYPERVISORS.md**: Proxmox VE as open source variant, platform comparison, Mini variant (2–3 hosts), Ceph connectivity
- **SERVER-CONFIG.md**: Pure Ceph variant (3–6 hosts), HW specification, network design, BIOS settings
- **STORAGE.md**: Ceph architecture (MON/MGR/OSD, CRUSH map, BlueStore, replication), SDS overview
- **CONNECTIVITY.md**: Ethernet speeds (10/25 GbE), SFP+ form factor, NIC placement, management port
- **NETWORKING.md**: VLAN segmentation, MTU and jumbo frames, best practices
- **SERVER-HW.md**: CPU selection (EPYC vs Xeon), RAM population (1DPC/2DPC), NUMA, form factors

---

*Last revision: 2026-06-04*