# ⚙️ Server configuration: best practices by workload

## General BIOS/UEFI settings

| Setting | Recommendation | Rationale |
|-----------|-----------|------------|
| **Boot mode** | UEFI | Secure Boot, GPT, larger disks |
| **Power profile** | Performance / OS Control | Max performance, C-States disabled |
| **Hyper-Threading** | Enabled | Up to ~30 % more throughput on multi-threaded workloads (workload-dependent) |
| **Virtualization** | Enabled (VT-x/AMD-V) | Required for hypervisors and VM-isolated containers |
| **SR-IOV** | Enabled | GPU, NIC passthrough |
| **NUMA** | Enabled | NUMA-aware scheduling |
| **ACPI** | Enabled | Power management, OS-level |
| **Secure Boot** | Enabled | Secure boot chain |
| **TPM** | Enabled | Measured boot, key storage |

---

## 1. Database servers

### CPU selection

| DB type | CPU preference | Rationale |
|--------|---------------|------------|
| **OLTP** (PostgreSQL, MySQL) | High clock, moderate cores | Low latency per transaction, limited parallelism |
| **OLAP** (ClickHouse, Snowflake) | Many cores, AVX-512 | Columnstore, high parallelism |
| **In-memory** (Redis, Memcached) | High clock, low cache latency | Single-threaded (Redis), RAM bandwidth |
| **Document** (MongoDB) | Balance (clock × cores) | Mixed workload |
| **Distributed** (Cassandra, Scylla) | Many cores, high cache | Shard-per-core (Scylla), compaction |
| **Oracle OLTP** | High clock, moderate cores, core-factor aware | CPU license cost (core factor 0.5 for AMD EPYC and Intel Xeon) |
| **Oracle OLAP / DW** | Many cores, large SGA, in-memory option | Parallel query, Exadata Smart Scan, compression |

### Oracle CPU licensing: core factor

Oracle licenses per core, multiplied by a core factor that depends on the processor. A factor of 0.5 means 2 cores = 1 Oracle license. Check Oracle's current Processor Core Factor Table before sizing, since the factors are revised from time to time.

| Processor | Core factor | 64 physical cores → Oracle licenses |
|----------|-------------|--------------------------------------|
| AMD EPYC (all series) | 0.5 | 32 |
| Intel Xeon (Scalable) | 0.5 | 32 |
| IBM POWER | 1.0 | 64 |
| ARM (Ampere Altra) | 0.25 | 16 |

**Impact on CPU selection**: EPYC and Xeon share the same 0.5 factor, so license cost scales with core count, not with the vendor. For a fixed license budget, choose the CPU with the highest per-core performance (for example a frequency-optimized F-series EPYC) over the one with the most cores.

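The license arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not Oracle's official calculator: `oracle_licenses` and `CORE_FACTORS` are hypothetical names, the factors are copied from the x86 and POWER rows of the table, and fractional results are assumed to round up to a whole license.

```python
import math

# Core factors copied from the table above (assumption: verify against
# Oracle's current Processor Core Factor Table before relying on them).
CORE_FACTORS = {
    "AMD EPYC": 0.5,
    "Intel Xeon": 0.5,
    "IBM POWER": 1.0,
}


def oracle_licenses(physical_cores: int, core_factor: float) -> int:
    """Processor licenses = physical cores x core factor, rounded up."""
    return math.ceil(physical_cores * core_factor)


for cpu, factor in CORE_FACTORS.items():
    print(f"{cpu}: 64 cores -> {oracle_licenses(64, factor)} licenses")
```

Comparing two candidate servers this way makes the trade-off visible: halving the core count halves the license count, so the faster-per-core part usually wins on cost per transaction.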
### Configuration by company size and storage type

#### Variant A: Small company, local NVMe RAID

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1× EPYC 9124/9224 or Intel Xeon 4410Y (12-24C) | 1 socket, high clock |
| **RAM** | 64-256 GB (8-16 GB/core) | DDR5-4800, 1DPC |
| **OS disk** | 2× SATA/SAS SSD, RAID 1 (240-480 GB) | For OS + binaries |
| **Data disk** | 4-6× NVMe (U.2/E3.S), RAID 10 | Local data, no sharing |
| **WAL disk** | 2× NVMe RAID 1 (400-800 GB) | PostgreSQL only |
| **Network** | 2× 25 GbE (LACP) | Application traffic + management |
| **Form factor** | 1U or 2U | Single node, no cluster |
| **Storage backend** | Local RAID controller (PERC/Broadcom) | HW RAID 10 or SW RAID (mdadm) |
| **HA** | Application manages failover (Patroni, repmgr, Orchestrator) | Standby node on failure |

**Use case**: Startup, branch office, dev/test, < 500 users, single database server, low availability requirements.

#### Variant B: Medium company, local NVMe + asynchronous replication

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2× EPYC 9334/9374F or Intel Xeon 5418Y (24-32C) | 1-2 socket, balanced |
| **RAM** | 128-512 GB (8-16 GB/core) | DDR5-4800/5600, 1DPC |
| **OS disk** | 2× NVMe RAID 1 (2× 480 GB) | OS + binaries |
| **Data disk** | 6-8× NVMe, RAID 10 | Local NVMe, 3-6 TB usable |
| **WAL disk** | 2× NVMe RAID 1 (2× 800 GB) | Separate from data |
| **Network** | 2× 25 GbE (app) + 2× 25 GbE (replication) | Application and replication networks separated |
| **Form factor** | 2U | Primary + replica node |
| **Storage backend** | SW RAID (mdadm) or HW RAID (PERC H965) | Write-back cache with BBU |
| **HA** | Patroni / repmgr / MySQL InnoDB Cluster | Asynchronous replication to 1-2 standby |

**Use case**: E-commerce, medium SaaS, 500-5000 users, RPO < 1 min, RTO < 5 min.

#### Variant C: Large company, FC SAN (enterprise)

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 or Xeon 8592+/6980P (64-192C per socket) | 2 socket, max cores, large cache |
| **RAM** | 512 GB - 2 TB (8-16 GB/core) | DDR5, 2DPC (speed penalty), 12 channels (EPYC) |
| **OS disk** | 2× SATA SSD RAID 1 (2× 480 GB) | OS only, data on SAN |
| **Data + WAL** | LUNs from FC SAN | Hitachi VSP / Dell PowerMax / Pure FlashArray//X |
| **HBA** | 2× dual-port FC HBA (32/64 Gb) | Multipath (active-active), FC-NVMe |
| **Network** | 2× 25/100 GbE (app) + 2× 32/64 Gb FC (storage) | App and storage networks separated |
| **Form factor** | 2U | 2-8 node cluster (RAC, Always On AG) |
| **Storage backend** | FC SAN, LUN per database | Thin provisioning, RAID on SAN, snapshots |
| **HA** | Oracle RAC / SQL Server Always On AG / PostgreSQL Patroni | Synchronous replication, FC multipath |

**SAN advantages**: Centralized management, snapshots, cloning, disaster recovery (SRDF/Metro), separate storage network, higher availability.

**Disadvantages**: Higher latency compared to local NVMe (~50-200 µs over SAN vs ~10 µs local NVMe), higher CAPEX, vendor lock-in.

#### Variant D: Large company, Ceph / SDS backend

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9334/9654 (32-96C) | Fewer cores than SAN variant; part of the CPU goes to the Ceph client |
| **RAM** | 256-512 GB | Less RAM; Ceph client cache is not as effective as a local buffer |
| **OS disk** | 2× SATA SSD RAID 1 (2× 480 GB) | OS |
| **Network** | 2× 25/100 GbE (app) + 2× 25/100 GbE (Ceph public) | App and Ceph traffic over Ethernet |
| **HBA** | Storage HBA in IT/HBA mode (no RAID) | For Ceph OSD node, not DB node |
| **Form factor** | 2U | DB node + separate Ceph OSD node |
| **Storage backend** | RBD (RADOS Block Device) over Ceph | 3× replication or erasure coding |
| **HA** | Application + Ceph inherent HA | Ceph self-healing, auto-rebalance |

**Ceph advantages**: No vendor lock-in, horizontal scaling, unified platform for block/file/object, lower CAPEX.

**Disadvantages**: Higher latency and CPU overhead (Ceph client → network → OSD), variable performance, more complex troubleshooting.

#### Variant E: Cloud, RDS / Cloud SQL / Azure SQL

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **Compute** | AWS RDS (db.r7g/r8g), Azure SQL (GP/BC/Hyperscale) | Managed service, no OS access |
| **Storage** | EBS gp3 / io2, Azure Premium SSD v2, Cloud SQL SSD | Automatic scaling, PITR, multi-AZ |
| **Network** | Security Group, Private Link, VPC peering | No HBA, no SAN: everything over Ethernet |
| **HA** | Multi-AZ (synchronous), read replicas | Managed failover, RTO < 60 s |
| **Backup** | Automated, PITR (7-35 days) | No management required |

**Use case**: No on-prem hardware, elastic scaling, pay-per-use, lower operational overhead.

**Disadvantages**: Higher long-term costs, data residency, network latency, limited customization.

### Variant comparison

| Aspect | Local NVMe (small) | Local NVMe (medium) | FC SAN | Ceph | Cloud |
|--------|---------------------|----------------------|--------|------|-------|
| **Latency** | ~10 µs | ~10 µs | ~50-200 µs | ~100-500 µs | ~100-1000 µs |
| **Scaling** | Vertical | Vertical | Horizontal | Horizontal | Elastic |
| **CAPEX** | Low | Medium | High | Medium | None (OPEX) |
| **Operational overhead** | Low | Low | High (SAN admin) | Medium | None |
| **HA** | Application | Patroni/Cluster | RAC/AOAG | Ceph HA | Managed |
| **RPO** | 1-5 min | < 1 min | < 10 s | < 30 s | < 60 s |
| **RTO** | 5-15 min | < 5 min | < 2 min | < 5 min | < 60 s |
| **Number of servers** | 1-2 | 2-4 | 4-16 | 6-20+ | 0 (managed) |
| **Company** | Startup/SME | SME/Enterprise | Enterprise | Enterprise | Any |

### PostgreSQL parameter matrix by storage type

| Parameter | Local NVMe | FC SAN | Ceph RBD |
|----------|-----------|--------|----------|
| `random_page_cost` | 1.1 | 1.5-2.0 | 2.0-3.0 |
| `effective_io_concurrency` | 300 | 100-200 | 50-100 |
| `synchronous_commit` | on (set `off` only if losing the most recent commits after a crash is acceptable) | on | on |
| `full_page_writes` | on | on | on (even over Ceph) |

### Storage layout by backend type

**Local NVMe (small/medium):**
```
Mount point   FS     RAID         Disk          Purpose
/             ext4   1 (mirror)   2× SATA SSD   OS
/data         xfs    10           4-8× NVMe     Data
/wal          xfs    1 (mirror)   2× NVMe       WAL (PG)
```

**FC SAN (enterprise):**
```
Mount point   FS     Device                   Purpose
/             ext4   local RAID 1 (2× SSD)    OS
/dev/sdb      xfs    FC LUN 1 (500 GB)        WAL (PG)
/dev/sdc      xfs    FC LUN 2 (2 TB)          Data
/dev/sdd      xfs    FC LUN 3 (2 TB)          Indexes (separate)
```

**Ceph RBD:**
```
Mount point   FS     Ceph device              Purpose
/             ext4   local RAID 1 (2× SSD)    OS
/dev/rbd0     xfs    rbd datastore-01         Data + WAL (Ceph RBD)
```

### Kernel tuning by variant

**Local NVMe:**
```
vm.dirty_ratio = 30
vm.dirty_background_ratio = 5
```

**FC SAN:**
```
# SAN storage: higher latency, so start writeback earlier and keep dirty bursts small
vm.dirty_ratio = 20
vm.dirty_background_ratio = 3
# 3000 centisecs = 30 s (kernel default); dirty pages older than this are flushed
vm.dirty_expire_centisecs = 3000
```

**Ceph RBD:**
```
# Ceph RBD: network storage, keep the dirty page backlog small
vm.dirty_ratio = 15
vm.dirty_background_ratio = 2
# RBD cache settings (ceph.conf, librbd clients only; the kernel RBD client uses the page cache)
# rbd cache = true
# rbd cache size = 256-512 MB
```

### Database-specific tuning

| Parameter | PostgreSQL | MySQL | Oracle | MongoDB |
|----------|-----------|-------|--------|---------|
| **Cache** | `shared_buffers` 25 % RAM | `innodb_buffer_pool_size` 70-80 % RAM | `SGA_TARGET` 60-80 % RAM | WiredTiger cache (`storage.wiredTiger.engineConfig.cacheSizeGB`) 50-80 % RAM |
| **OS cache** | `effective_cache_size` 75 % RAM (planner hint) | OS cache + InnoDB | OS cache (double buffering risk with large SGA) | OS cache |
| **Write buffer** | `wal_buffers` 64-256 MB | `innodb_log_file_size` 1-4 GB | Redo log (2-4 groups, 200 MB-4 GB) | WiredTiger log |
| **Connections** | `max_connections` 50-500 | `max_connections` 100-500 | `processes` 200-2000 | `net.maxIncomingConnections` |
| **I/O** | `effective_io_concurrency` 200 | `innodb_io_capacity` 2000 | `db_file_multiblock_read_count` 128 | WiredTiger eviction |
| **Huge pages** | `huge_pages = try` | `large-pages = ON` | `use_large_pages = only` (mandatory) | `transparent_hugepage=never` (kernel parameter) |
| **Parallel query** | `max_parallel_workers` 4-8 | `innodb_parallel_read_threads` 4 | `parallel_degree_policy = auto`, up to 64 | n/a |

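As a rough illustration of the cache percentages in the table, the sketch below turns a RAM size into target values. The function name `cache_targets_gb` and the dictionary keys are hypothetical, and the percentages are simply the ones quoted above; treat the output as a starting point for testing, not a tuned configuration.

```python
def cache_targets_gb(ram_gb: int) -> dict:
    """Return cache targets in whole GB for a host with ram_gb of RAM."""

    def pct(percent: int) -> int:
        # Integer arithmetic keeps the results in whole gigabytes.
        return ram_gb * percent // 100

    return {
        "postgresql shared_buffers": pct(25),
        "postgresql effective_cache_size": pct(75),
        "mysql innodb_buffer_pool_size (min, max)": (pct(70), pct(80)),
        "oracle SGA_TARGET (min, max)": (pct(60), pct(80)),
        "mongodb WiredTiger cache (min, max)": (pct(50), pct(80)),
    }


for name, value in cache_targets_gb(256).items():
    print(f"{name}: {value} GB")
```

On a host shared with other services, pass the RAM actually available to the database rather than the installed total.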
### Connectivity by variant

| Variant | App network | Storage network | Replication | Management |
|----------|---------|-------------|-----------|------------|
| **Local (small)** | 2× 25 GbE LACP | n/a | 2× 25 GbE (same) | iDRAC/iLO |
| **Local (medium)** | 2× 25 GbE LACP | n/a | 2× 25 GbE dedicated | iDRAC/iLO |
| **FC SAN** | 2× 25/100 GbE | 2× 32/64 Gb FC (multipath) | FC replication | iDRAC/iLO + SAN mgmt |
| **Ceph** | 2× 25/100 GbE | 2× 25/100 GbE (public net) | 2× 25/100 GbE (cluster net) | iDRAC/iLO + Ceph mgmt |
| **Cloud** | Elastic IP / Private Link | n/a | n/a | AWS Console / API |
| **Oracle Standalone** | 2× 25 GbE LACP | ASM (2× 25 GbE or FC 32G) | Data Guard 2× 25 GbE | iLO + ASM mgmt |
| **Oracle RAC** | 2-4× 25/100 GbE | 2× 64 Gb FC (multipath) | Cache Fusion interconnect | iLO + SAN mgmt |
| **Oracle Exadata** | 4-8× 100 GbE RoCE | NVMe over Fabric | RDMA interconnect | Exadata CLI + OEDA |

### Oracle-specific configuration

#### Oracle ASM: diskgroup layout

Oracle ASM (Automatic Storage Management) replaces the traditional filesystem + volume manager:

| Diskgroup | Redundancy | Disks | Purpose |
|-----------|-----------|-------|-------|
| **DATA** | Normal (2× mirror) | 4-12× FC LUN/NVMe | Data files, temp files, control files |
| **FRA** (Fast Recovery Area) | Normal (2× mirror) | 2-6× FC LUN/NVMe | Archive logs, backup, flashback logs |
| **REDO** | High (3× mirror) | 2-4× FC LUN/NVMe | Online redo log groups (I/O critical) |
| **SPFILE** | Normal | 2× small LUN | Server parameter file |

**ASM striping**: Coarse (1 MB) for regular data, Fine (128 KB) for redo logs (lower write latency).

#### Variant O1: Standalone Oracle (small/medium, single instance)

| Parameter | Small (< 500 users) | Medium (500-2000 users) |
|----------|---------------------|------------------------|
| **CPU** | 1-2× EPYC 9124-9224 / Xeon 4410Y (12-24C) | 2× EPYC 9334-9374F / Xeon 5418Y (24-32C) |
| **RAM (SGA + PGA)** | 64-128 GB (SGA 70 %, PGA 30 %) | 128-512 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (`vm.nr_hugepages`), mandatory for SGA | Yes |
| **OS disk** | 2× SATA SSD RAID 1 (240 GB) | 2× NVMe RAID 1 (480 GB) |
| **DATA + FRA** | 4-6× NVMe, ASM normal redundancy | 6-8× NVMe or FC LUN, ASM normal |
| **REDO** | 2-4× NVMe (separate from DATA), ASM high | 4× FC LUN (separate), ASM high |
| **Archive log** | Local FRA | FC LUN (FRA diskgroup) |
| **Network (app)** | 2× 25 GbE LACP | 2-4× 25/100 GbE LACP |
| **Network (storage)** | n/a (local NVMe) | 2× FC 32G multipath |
| **Network (Data Guard)** | n/a | 2× 25 GbE dedicated |
| **DB edition** | Oracle SE2 (max 16 threads) | Oracle EE (unlimited) |

**Use case**: Dev/test, small production DBs, branch offices. SE2 license = max 16 CPU threads, limited parallel execution.

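Sizing `vm.nr_hugepages` for the SGA is simple arithmetic, sketched below. The helper name `nr_hugepages` is hypothetical, and the 2 MiB page size is an assumption (the x86-64 default; check `Hugepagesize` in `/proc/meminfo` on the target host).

```python
import math

HUGEPAGE_MIB = 2  # assumption: default 2 MiB huge pages on x86-64


def nr_hugepages(sga_gib: float, extra_pages: int = 0) -> int:
    """Number of huge pages needed to back an SGA of sga_gib GiB."""
    pages = math.ceil(sga_gib * 1024 / HUGEPAGE_MIB)
    return pages + extra_pages


# Example: 128 GB host with 70 % of RAM given to the SGA.
sga_gib = 128 * 0.70
print(f"vm.nr_hugepages = {nr_hugepages(sga_gib)}")
```

A small `extra_pages` margin can be added if other processes on the host also request huge pages.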
#### Variant O2: Oracle Data Guard (medium/large, HA + DR)

Primary + standby in active-passive mode; Active Data Guard can open the standby for reporting.

| Parameter | Recommendation |
|----------|-----------|
| **CPU** | 2× EPYC 9654-9965 / Xeon 8592+ (64-192C) |
| **RAM** | 256-1024 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (50-80 % RAM allocated for SGA) |
| **OS disk** | 2× NVMe RAID 1 (480 GB) |
| **Storage** | FC SAN LUN (DATA + FRA + REDO separate) or NVMe + ASM |
| **HBA** | 2× dual-port FC 32/64 Gb (multipath active-active) |
| **App network** | 2-4× 25/100 GbE LACP |
| **Storage network** | 2× FC 32/64 Gb multipath |
| **Data Guard network** | 2× 25/100 GbE dedicated (sync or async) |
| **Data Guard mode** | Maximum Availability (sync, fallback to async); RPO = 0 while the standby is in sync |
| **Topology** | 1 primary + 1-2 standby (physical), far sync for geo-DR |
| **Active Data Guard** | Standby open for read (reporting, backup); requires ADG license |

**Data Guard latency**:
```text
Synchronous (Maximum Availability):
  Primary COMMIT → LGWR flush REDO → sync over network → standby RFS writes standby redo log → ACK → ~1-5 ms
  RPO = 0, impact on write latency

Asynchronous (Maximum Performance):
  Primary COMMIT → LGWR flush REDO → async to standby buffer → ~0.1-1 ms
  RPO = a few seconds, negligible write impact
```

**Network requirements for Data Guard sync**:
- RTT < 2 ms for synchronous mode (recommended < 1 ms)
- Min. 10 GbE, recommended 25 GbE (throughput = REDO rate × 2)
- REDO rate: OLTP ~50-500 MB/s, batch ~500-2000 MB/s
- At REDO rate 500 MB/s (4 Gbit/s) and 25 GbE → ~16 % link utilization, ~32 % with the 2× headroom

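The link utilization figure for redo transport can be checked with a short calculation. This is a sketch under stated assumptions: `link_utilization` is a hypothetical helper, 1 MB/s is taken as 8 Mbit/s (decimal units), and protocol overhead is ignored.

```python
def link_utilization(redo_mb_per_s: float, link_gbit: float,
                     headroom: float = 1.0) -> float:
    """Fraction of the link consumed by redo transport.

    headroom multiplies the redo rate, e.g. 2.0 for the
    'throughput = REDO rate x 2' rule of thumb.
    """
    required_gbit = redo_mb_per_s * 8 / 1000 * headroom
    return required_gbit / link_gbit


for link in (10, 25):
    raw = link_utilization(500, link)
    with_headroom = link_utilization(500, link, headroom=2.0)
    print(f"{link} GbE: {raw:.0%} raw, {with_headroom:.0%} with 2x headroom")
```

Values approaching 100 % with headroom applied suggest moving to the next link speed or switching batch windows to asynchronous transport.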
#### Variant O3: Oracle RAC (large, enterprise)

Multi-instance cluster with shared storage and Cache Fusion.

| Parameter | Recommendation |
|----------|-----------|
| **Number of nodes** | 2-4 (typical), max 64 (RAC cluster) |
| **CPU per node** | 2× EPYC 9654-9965 / Xeon 8592+ (64-192C) |
| **RAM per node** | 512-2048 GB (SGA 60-80 %, PGA 20-40 %) |
| **Huge pages** | Yes (1 GB pages if RAM > 512 GB) |
| **Storage** | FC SAN, shared LUNs (ASM normal/high redundancy) |
| **HBA** | 2× dual-port FC 64 Gb (multipath, active-active) |
| **App network** | 2-4× 25/100 GbE LACP (VIP, SCAN listener) |
| **Storage network** | 2-4× FC 64 Gb (multipath per node) |
| **Cache Fusion interconnect** | 2× 100 GbE (RoCE v2 or InfiniBand), dedicated |
| **RAC interconnect latency** | < 5 µs (recommended), max < 10 µs |
| **ASM** | Normal redundancy (2-way mirror) |
| **Oracle Clusterware** | Voting disk (3× 1 GB LUN), OCR (3× 500 MB LUN) |
| **Service** | OLTP_service, REPORT_service, BATCH_service |

**Cache Fusion, the critical interconnect**:
```
Node A (DB instance) ←──→ Node B (DB instance)
         │                         │
         └────────── ASM ──────────┘
                      │
           FC SAN (shared storage)

Cache Fusion traffic: dirty block transfer between instances
  → Latency < 5 µs, otherwise RAC scaling degrades
  → Capacity: 2× 100 GbE, dedicated switch or InfiniBand HDR100
  → Recommended MTU: 9000 (jumbo frames)
```

**RAC sizing by transaction count**:

| TPS | Nodes | CPU per node | RAM per node | Interconnect |
|-----|------|-------------|-------------|-------------|
| < 10 000 | 2 | 16-24C | 256 GB | 2× 25 GbE |
| 10 000 - 50 000 | 2-4 | 32-48C | 512 GB | 2× 100 GbE RoCE |
| 50 000 - 200 000 | 4-8 | 48-64C | 1024 GB | 2× 100 GbE RoCE / InfiniBand |
| > 200 000 | 8+ | 64-128C | 2048 GB | InfiniBand HDR100/HDR200 |

**RAC sizing: license cost calculation**:

```text
Example: 4-node RAC, each node 2× EPYC 9654 (96C) = 192 cores per node
Core factor 0.5 → 96 Oracle licenses per node
4 × 96 = 384 Oracle EE licenses
At ~$47.5k/license → ~$18.2M (EE licenses only; the RAC option and 22 % annual support are extra)
```

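To see how annual support changes the picture over several years, the example can be extended as below. This is a sketch only: `rac_license_cost` is a hypothetical helper, the list price and 22 % support rate are the figures quoted above, and discounts and option licenses are left out.

```python
import math


def rac_license_cost(nodes: int, cores_per_node: int, core_factor: float,
                     price_per_license: float, support_rate: float,
                     years: int) -> tuple:
    """Return (license count, license cost, cumulative support cost)."""
    licenses = math.ceil(nodes * cores_per_node * core_factor)
    license_cost = licenses * price_per_license
    support_cost = license_cost * support_rate * years
    return licenses, license_cost, support_cost


licenses, capex, support = rac_license_cost(
    nodes=4, cores_per_node=192, core_factor=0.5,
    price_per_license=47_500, support_rate=0.22, years=3,
)
print(f"{licenses} licenses, ${capex / 1e6:.2f}M licenses, "
      f"${support / 1e6:.2f}M support over 3 years")
```

Rerunning it with a lower-core-count CPU shows how quickly the license line dominates the hardware cost.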
#### Variant O4: Oracle Exadata (hyperscale)

Engineered system, optimal for hybrid workloads (OLTP + DW).

| Parameter | X9M / X10M | Use case |
|----------|-----------|----------|
| **Database servers** | 2-8× (Xeon, 1.5-6 TB RAM, NVMe) | Compute |
| **Storage servers** | 3-18× (NVMe + HDD, Smart Scan) | Predicate offloading |
| **Smart Scan** | Filtering at storage layer | Less data over network, higher throughput |
| **RoCE interconnect** | 100 GbE (RDMA) | Low latency, high bandwidth |
| **In-Memory Column Store** | Optional license | Real-time analytics without ETL |
| **HCC (Hybrid Columnar Compression)** | Compression in storage servers | Up to 10-15× compression for DW |
| **Rack power** | ~15-30 kW (full rack) | Higher density |

**When to choose Exadata over standalone RAC**:
- OLTP > 50 000 TPS
- Consolidation needed (multiple DBs on one cluster)
- Smart Scan significantly accelerates reporting on production data
- HCC for storage savings on DW workloads

---

## 2. Hypervisor host (ESXi / KVM / Hyper-V)

### Configuration by size and storage type

#### Variant A: Small company, local storage (2-3 hosts)

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1× EPYC 9224/9254 or Xeon 4410Y/5418Y (12-24C) | 1 socket, enough cores for VM density |
| **RAM** | 128-256 GB (4-8 GB/core) | DDR5, 1DPC |
| **OS disk** | 2× SATA SSD RAID 1 (2× 240-480 GB) | ESXi / Proxmox / Hyper-V boot |
| **VM storage** | 4-6× SATA/SAS SSD, RAID 5/6 or 10 | Local RAID, 4-12 TB usable |
| **Network** | 2-4× 10/25 GbE (LACP) | Shared for everything (management + VM + storage) |
| **Hypervisor** | VMware vSphere Standard / Proxmox VE / Hyper-V | Basic license, no enterprise features |
| **Storage backend** | Local RAID controller (PERC H755, Broadcom 9560) | HW RAID with cache, write-back |
| **HA** | VMware HA / Proxmox HA | Restart VM on another host on failure (needs shared or replicated VM disks) |
| **Backup** | Veeam B&R Free / PBS (Proxmox Backup Server) | Local or USB disk |

**Use case**: Small office, branch office, dev/test, < 10 VMs, low budget, simple management.

**Limitations**: No vMotion without shared storage. HA is a restart, not a seamless failover, and with local-only disks a failed host's VMs can only be restarted elsewhere from a replica or a backup.

#### Variant B: Medium company, vSAN / Ceph (3-6 hosts)

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2× EPYC 9334/9654 or Xeon 5418Y/8592+ (24-96C) | 1-2 socket |
| **RAM** | 256-512 GB (4-8 GB/core) | DDR5, 2DPC (minimal penalty) |
| **OS disk** | 2× SATA SSD RAID 1 or 2× M.2 on a BOSS card (BOSS-S1 is SATA, BOSS-N1 is NVMe) | Separate from VM storage |
| **Cache tier** | 1-2× NVMe (vSAN OSA caching / Ceph WAL+DB) | For write performance |
| **Capacity tier** | 4-8× SATA/SAS SSD or HDD (vSAN capacity / Ceph OSD) | HDD for capacity, SSD for performance |
| **Network** | 4× 25/100 GbE: 2× VM + mgmt, 2× storage (vSAN/Ceph) | Separate storage network, RDMA (RoCE v2) |
| **Hypervisor** | VMware vSAN / Proxmox Ceph / StarWind HCI | HCI license (vSAN ~$2.5k/Core) |
| **Storage backend** | vSAN OSA/ESA or Ceph (RADOS) | Distributed storage, auto-rebalance |
| **HA** | vSphere HA + vSAN / Proxmox HA + Ceph | vMotion, DRS, automated failover |
| **Failover** | N+1 (one host as reserve) | vSAN needs min. 3 hosts; 4 are recommended so data can rebuild after a host failure |

**Pure Ceph variant (Proxmox / OpenStack)**:
```
Proxmox node (3-6×):
├── CPU: 1× EPYC 9224-9334 (24-32C)
├── RAM: 128-256 GB
├── OS: 2× SATA SSD RAID 1
├── Ceph OSD: 4-8× NVMe/SATA SSD (RAW, HBA mode)
├── Network: 2× 25 GbE (public) + 2× 25 GbE (cluster)
└── Storage: Ceph 3× replication, CRUSH host failure domain
```

**VMware vSAN variant (4-6 hosts)**:
```
vSAN node (4-6×):
├── CPU: 1-2× EPYC/Xeon (16-32C)
├── RAM: 256-512 GB
├── OS: 2× M.2 boot device (BOSS) or SD card (deprecated)
├── vSAN cache: 1-2× NVMe (write buffer, OSA only)
├── vSAN capacity: 4-8× NVMe (vSAN ESA) or SATA/SAS SSD / HDD (vSAN OSA)
├── Network: 2× 25/100 GbE (VM) + 2× 25 GbE (vSAN)
└── Storage: vSAN ESA (all-NVMe, single tier) or OSA (cache + capacity tiers)
```

**Use case**: SME, enterprise division, 10-100 VMs, need for vMotion, DRS, HA, simple storage management.

#### Variant C: Large company, FC SAN (6+ hosts)

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 or Xeon 8592+/6980P (64-192C) | 2 socket, max VM density |
| **RAM** | 512 GB - 2 TB (4-8 GB/core) | DDR5, 2DPC |
| **OS disk** | 2× SATA SSD RAID 1 or M.2 boot device (SD cards are deprecated for ESXi) | Boot, image storage |
| **VM storage** | LUNs from FC SAN, VMFS / NFS datastores | Hitachi, Dell, Pure, HPE storage |
| **HBA** | 2× dual-port FC HBA 32/64 Gb | Multipath, FC-NVMe |
| **Network** | 4-8× 25/100 GbE, split by traffic type | Management, VM, vMotion, FT separated |
| **Hypervisor** | VMware vSphere Enterprise+ / Hyper-V DC | Enterprise license, DRS, HA, FT |
| **Storage backend** | FC SAN: VMFS 6 datastores, VVols | Thin provisioning, storage DRS, array snapshots |
| **HA** | vSphere HA + DRS + vCenter | vMotion, DRS, FT, SRM for DR |
| **Failover** | N+1 or admission control (CPU/RAM reserve) | Reserved capacity for HA failover |

**Use case**: Enterprise, 100+ VMs, mix of DB and applications, centralized storage management, enterprise SLA.

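The N+1 reserve translates directly into how much RAM (or CPU) may be committed to VMs. A minimal sketch, assuming identical hosts; `ha_usable_capacity` is a hypothetical helper, not a vSphere or Proxmox API.

```python
def ha_usable_capacity(hosts: int, per_host: float,
                       spare_hosts: int = 1) -> float:
    """Capacity that can be committed while tolerating spare_hosts failures."""
    if hosts <= spare_hosts:
        raise ValueError("need more hosts than the number of spares")
    return (hosts - spare_hosts) * per_host


# Example: 6 hosts with 1024 GB RAM each, N+1 reserve.
usable = ha_usable_capacity(6, 1024)
print(f"Commit at most {usable:.0f} GB of {6 * 1024} GB to VMs")
```

With admission control the same reserve is expressed as a percentage, here one host out of six.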
#### Variant D: Hyperscale, Ceph / SDS (20+ hosts)

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 2× EPYC 9654/9965 (96-192C) | 2 socket, compute optimal |
| **RAM** | 512 GB - 1 TB (2-4 GB/core) | Low overcommit ratio for consistency |
| **OS disk** | 2× M.2 NVMe RAID 1 (BOSS) | Boot |
| **Network** | 4-8× 100 GbE (compute + storage) | Separate OVN/OVS for SDN, VXLAN tunneling |
| **Hypervisor** | OpenStack (Nova) / OpenShift (KubeVirt) | Open source, API-driven, multi-tenant |
| **Storage backend** | Ceph (RADOS, RBD, RGW, CephFS) | Unified storage, erasure coding (8+3) |
| **Orchestration** | OpenStack / Kubernetes | Infrastructure-as-Code, autoscaling |
| **HA** | OpenStack HA / Kubernetes HA | Self-healing, auto-rebalance |

**Use case**: Cloud provider, hyperscale, 500+ VMs, multi-tenant, maximum automation.

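The choice between 3× replication and 8+3 erasure coding mostly comes down to usable capacity. The sketch below compares the two; the function names are hypothetical, and it deliberately ignores the free-space headroom a real cluster needs for rebalancing and recovery.

```python
def usable_replicated(raw_tb: float, replicas: int = 3) -> float:
    """Usable capacity with N-way replication."""
    return raw_tb / replicas


def usable_erasure_coded(raw_tb: float, k: int, m: int) -> float:
    """Usable capacity with k data chunks and m coding chunks."""
    return raw_tb * k / (k + m)


raw_tb = 1100
print(f"3x replication: {usable_replicated(raw_tb):.0f} TB usable")
print(f"EC 8+3:         {usable_erasure_coded(raw_tb, 8, 3):.0f} TB usable")
```

Erasure coding roughly doubles usable capacity here, at the cost of more CPU and higher latency on small writes, which is why block workloads often stay on replication.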
### Hypervisor variant comparison

| Aspect | Local (small) | vSAN/Ceph (medium) | FC SAN (large) | Ceph hyperscale |
|--------|---------------|---------------------|----------------|-----------------|
| **Storage** | Local RAID | vSAN / Ceph (HCI) | FC SAN (centralized) | Ceph (distributed) |
| **Number of hosts** | 2-3 | 3-6 | 6-50+ | 20+ |
| **VM latency** | ~10 µs (local) | ~100-500 µs | ~200 µs (SAN) | ~500-2000 µs |
| **CAPEX/host** | Low | Medium | High | Medium |
| **CAPEX storage** | Low | None (part of hosts) | High (SAN array) | None (part of hosts) |
| **Management** | Simple (per host) | vCenter / Proxmox | vCenter + SAN mgmt | OpenStack / K8s |
| **vMotion** | No (no shared storage) | Yes (vSAN / Ceph RBD) | Yes (FC LUN) | Yes (Ceph RBD) |
| **DRS** | No | Yes (vSphere) | Yes (vSphere) | OpenStack scheduler |
| **Scaling** | Vertical | Horizontal (add host) | Horizontal (host + SAN) | Horizontal |

### Network design by variant

#### Small (local storage)

| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated port (iLO/iDRAC) |
| VM + Storage | All | 2-4× 10/25 GbE | LACP | Shared, VLAN tagging |

```
┌──────────────────────────────────────────┐
│                   Host                   │
│ ┌───────┐ ┌────────────────────────────┐ │
│ │ iLO   │ │ NIC1          NIC2         │ │
│ │ 1 GbE │ │ [LACP] 25 GbE              │ │
│ └───────┘ └─────────────┬──────────────┘ │
└─────────────────────────┼────────────────┘
                          │
                    ┌─────┴─────┐
                    │  Switch   │
                    └───────────┘
```

#### Medium (vSAN / Ceph)

| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated iLO/iDRAC |
| VM | VM | 2× 25/100 GbE | LACP | VM traffic, migration |
| Storage | vSAN/Ceph | 2× 25/100 GbE | LACP or RDMA | Separate, Jumbo frames (MTU 9000) |

```
┌─────────────────────────────────────────────┐
│                    Host                     │
│ ┌───────┐ ┌────────────┐ ┌────────────────┐ │
│ │ iLO   │ │ NIC1  NIC2 │ │ NIC3      NIC4 │ │
│ │ 1 GbE │ │ VM traffic │ │ Storage (vSAN) │ │
│ └───────┘ └────────────┘ └────────────────┘ │
└─────────────────────────────────────────────┘
```

#### Large (FC SAN)

| Traffic | VLAN | Speed | Teaming | Note |
|---------|------|----------|---------|----------|
| Management | Mgmt | 1 GbE | Active/Passive | Dedicated |
| VM | VM | 2-4× 25/100 GbE | LACP | VM traffic |
| vMotion | vMotion | 2× 25 GbE | Dedicated | Multi-NIC vMotion |
| FT | FT | 2× 10/25 GbE | Dedicated | Low latency |
| Storage | n/a | 2× 32/64 Gb FC | Multipath | FC SAN |

```
┌───────────────────────────────────────────────┐
│                     Host                      │
│ ┌───────┐ ┌───────────────┐ ┌──────┐ ┌──────┐ │
│ │ iLO   │ │ NIC1-4        │ │ HBA1 │ │ HBA2 │ │
│ │ 1 GbE │ │ VM+vMotion+FT │ │ 32Gb │ │ 32Gb │ │
│ └───────┘ └───────┬───────┘ └──┬───┘ └──┬───┘ │
└───────────────────┼────────────┼────────┼─────┘
                    │            │        │
              ┌─────┴─────┐   ┌──┴────────┴───┐
              │ Ethernet  │   │ FC switch     │
              │ switch    │   │ (Brocade /    │
              │           │   │  Cisco)       │
              └───────────┘   └───────────────┘
```

### BIOS for hypervisor: all variants

| Setting | Value | Rationale |
|-----------|---------|------------|
| Hyper-Threading | Enabled | Higher VM density |
| Virtualization Technology | Enabled | VT-x/AMD-V |
| VT-d / IOMMU | Enabled | Passthrough, SR-IOV |
| Power Management | Performance / OS | Minimize VM exit latency |
| C-States | Disabled | Lower VM exit latency (important for real-time VMs) |
| NUMA | Enabled | NUMA-aware VM placement |
| SR-IOV | Enabled | NIC/GPU virtualization |
| Adjacent Sector Prefetch | Enabled (Intel) | Better sequential reads |
| DCU Streamer / IP Prefetcher | Enabled | HW prefetch for VM workload |
| Patrol Scrub | Disabled (vSAN/Ceph) | Can cause latency spikes with SDS |

### Hypervisor selection by variant

| Criterion | VMware vSphere | Proxmox VE | Hyper-V | OpenStack |
|-----------|---------------|------------|---------|-----------|
| **Size** | SME - Enterprise | SME | SME - Enterprise | Hyperscale |
| **Storage** | vSAN, SAN, NFS | Ceph, ZFS, NFS | Storage Spaces, SAN | Ceph, Manila |
| **License** | ~$1-5k/core | Free (support ~$500/host) | Part of Windows Server | Open source |
| **Familiarity** | Highest | Medium | Windows admin | Low |
| **Automation** | Terraform, Ansible, PowerCLI | Ansible, Terraform, PBS | PowerShell, SCVMM | Terraform, Heat, Ansible |
| **Ecosystem** | Broadest (Veeam, Zerto, SRM) | Growing (PBS, remote migration) | Windows ecosystem | Open source (Kolla, TripleO) |

---

## 3. Kubernetes node

### Node profiles

| Role | CPU | RAM | Storage | Use case |
|------|-----|-----|---------|----------|
| **General purpose** | 16-32 cores | 64-128 GB | 1× NVMe OS + 1× NVMe local | Web, API, microservices |
| **Memory optimized** | 32-64 cores | 256-512 GB | 1× NVMe OS + 2× NVMe local | In-memory cache, DB |
| **Compute optimized** | 64-128 cores | 128-256 GB | 1× NVMe OS | Batch, CI/CD |
| **GPU node** | 32-64 cores | 512-1024 GB | 1× NVMe OS + 4-8× NVMe local | AI/ML training, inference |
| **Storage node** | 16-32 cores | 64-128 GB | 4-12× NVMe/SATA (Ceph/Longhorn) | SDS, persistent volumes |

### Kernel tuning

```
# /etc/sysctl.d/99-kubernetes.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.ipv4.conf.all.forwarding = 1

# Connection tracking (for NodePort, Service)
net.netfilter.nf_conntrack_max = 2097152
net.netfilter.nf_conntrack_tcp_timeout_established = 86400

# File watchers (for kubelet, containerd)
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288

# Memory management
vm.swappiness = 0
# Allow overcommit (CRI-O, containerd); comments must sit on their own line in sysctl.d files
vm.overcommit_memory = 1
vm.panic_on_oom = 0
kernel.panic = 10
kernel.panic_on_oops = 1
```

### Container storage

| Type | Recommendation | Note |
|-----|-----------|----------|
| **OS disk** | RAID 1 (2× NVMe) | Ext4/XFS, 100-200 GB |
| **Container runtime image** | RAID 1 (2× NVMe) | /var/lib/containerd, 200-500 GB |
| **Local PV** | Single NVMe | Raw device, no RAID |
| **Rook/Ceph OSD** | Raw NVMe/SATA | HBA/IT mode, no RAID |
| **Longhorn** | Dedicated NVMe/SATA disk | Formatted Ext4/XFS and mounted; volumes are stored as files on it |

---

## 4. Storage server (Ceph / MinIO / NAS)
|
||
|
||

### Ceph OSD node

| Component | Recommendation | Note |
|-----------|-----------|----------|
| **CPU** | 1-2 cores per OSD | Up to 12 OSD per node (24 cores) |
| **RAM** | 4-8 GB per OSD + OS | BlueStore cache, 16-64 GB min |
| **Network** | 2× 25/100 GbE | Public + Cluster network |
| **Storage** | 10-12× NVMe/SATA SSD OSD | HBA/IT mode, no RAID |
| **OS disk** | 2× SATA SSD RAID 1 | OS, Ceph MON/MGR |
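
The per-OSD rules in the table translate into a simple range calculation. The sketch below is illustrative only: `ceph_osd_node_sizing` is a hypothetical helper, and the 16 GB default for `os_ram_gb` is an assumption, since the table says only "+ OS" without a figure.

```python
from typing import Dict, Tuple


def ceph_osd_node_sizing(osd_count: int, os_ram_gb: int = 16) -> Dict[str, Tuple[int, int]]:
    """Return (low, high) ranges for CPU cores and RAM on one OSD node.

    Uses the table's rules of thumb: 1-2 cores and 4-8 GB RAM per OSD.
    os_ram_gb is an assumed allowance for the operating system.
    """
    if osd_count < 1:
        raise ValueError("osd_count must be at least 1")
    cores = (1 * osd_count, 2 * osd_count)
    ram_gb = (4 * osd_count + os_ram_gb, 8 * osd_count + os_ram_gb)
    return {"cores": cores, "ram_gb": ram_gb}


# A fully populated 12-OSD node, the upper bound named in the table.
print(ceph_osd_node_sizing(12))
# -> {'cores': (12, 24), 'ram_gb': (64, 112)}
```

The upper core figure matches the table's "12 OSD per node (24 cores)". NVMe OSDs tend to sit at the high end of both ranges.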

**BIOS for Ceph:**
- SATA/NVMe: AHCI/NVMe mode (not RAID)
- C-States: Disabled (lower OSD latency)
- NUMA: Enabled
- Power: Performance

### MinIO node

| Component | Recommendation |
|-----------|-----------|
| **CPU** | 8-16 cores (32+ for erasure coding) |
| **RAM** | 32-64 GB + 1 GB per 1 TB storage |
| **Storage** | 4-16× NVMe (direct, no RAID) |
| **Network** | 2× 25/100 GbE |
| **OS** | Ubuntu / RHEL, XFS (for data) |
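
As a worked example of the RAM row, the sketch below adds 1 GB per TB to the 32-64 GB base. `minio_ram_gb` is a hypothetical helper, and it assumes the "per 1 TB storage" figure refers to raw drive capacity, which the table does not specify.

```python
import math
from typing import Tuple


def minio_ram_gb(raw_capacity_tb: float) -> Tuple[int, int]:
    """Return the (low, high) RAM range in GB: 32-64 GB base plus 1 GB per TB."""
    if raw_capacity_tb < 0:
        raise ValueError("raw_capacity_tb must not be negative")
    per_tb = math.ceil(raw_capacity_tb)  # round partial terabytes up
    return (32 + per_tb, 64 + per_tb)


# Example: 8 drives of 4 TB each (drive size chosen for illustration).
print(minio_ram_gb(8 * 4))
# -> (64, 96)
```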

### NAS (TrueNAS / FreeNAS)

- **ZFS**: RAID-Z1/Z2/Z3, compression (lz4, zstd), dedup
- **ARC cache**: 1 GB per 1 TB storage (max 64 GB)
- **L2ARC**: NVMe cache (optional, read-heavy)
- **SLOG**: NVDIMM / Optane (sync write, ZIL)
- **Network**: 2-4× 10/25 GbE LACP
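
The ARC rule above (1 GB per TB, capped at 64 GB) can be sketched as follows. Both helper names are hypothetical. The byte conversion assumes GB is read as GiB and is intended for OpenZFS on Linux, where the `zfs_arc_max` module parameter takes a value in bytes; TrueNAS exposes the same limit through its own tunables.

```python
import math


def arc_target_gb(pool_tb: float, cap_gb: int = 64) -> int:
    """ARC size in GB: 1 GB per TB of pool capacity, capped at cap_gb."""
    if pool_tb < 0:
        raise ValueError("pool_tb must not be negative")
    return min(math.ceil(pool_tb), cap_gb)


def arc_max_bytes(pool_tb: float, cap_gb: int = 64) -> int:
    """The same target in bytes (assuming 1 GB = 1024**3 bytes)."""
    return arc_target_gb(pool_tb, cap_gb) * 1024**3


print(arc_target_gb(40))   # -> 40
print(arc_target_gb(200))  # -> 64 (the cap applies)
```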

---

## 5. Web / API servers

| Parameter | Recommendation |
|----------|-----------|
| **CPU** | High clock, 8-32 cores |
| **RAM** | 32-128 GB |
| **Storage** | 2× NVMe RAID 1 (OS + app) |
| **OS** | Ubuntu / RHEL, optimized kernel |
| **Network** | 2× 10/25 GbE (bonding) |

**Kernel tuning:**
```
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.core.netdev_max_backlog = 65535
```
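
Raising `net.core.somaxconn` only lifts the ceiling: the kernel caps the backlog that an application passes to `listen()` at this value, so the server process must also request a large backlog. The sketch below illustrates the interaction on Linux. `effective_backlog` and `listening_socket` are hypothetical helper names, and the loopback address and ephemeral port are chosen only for the example.

```python
import socket
from pathlib import Path

SOMAXCONN = Path("/proc/sys/net/core/somaxconn")


def effective_backlog(requested: int, limit_file: Path = SOMAXCONN) -> int:
    """Return the backlog the kernel will honour: min(requested, somaxconn)."""
    try:
        limit = int(limit_file.read_text())
    except (OSError, ValueError):
        # Limit not readable (for example, not on Linux): keep the request.
        return requested
    return min(requested, limit)


def listening_socket(port: int = 0, requested_backlog: int = 65535) -> socket.socket:
    """Open a TCP listener on loopback that asks for a large accept queue."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", port))  # port 0 lets the OS pick a free port
    sock.listen(effective_backlog(requested_backlog))
    return sock
```

On a host still at the kernel default, `effective_backlog(65535)` returns that lower default, which shows that the sysctl change has not yet been applied.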

---

## Quick decision tree — server selection by workload, size and storage

```mermaid
flowchart TD
    W["What workload?"] --> DB["Database"]
    W --> HV["Virtualization"]
    W --> K8s["Kubernetes"]
    W --> AI["AI/ML"]
    W --> ST["Storage server"]
    W --> WEB["Web / API"]

    DB --> DBS{"Company size"}
    DBS -->|"< 500"| DB1["1× EPYC 8-16C, 64-256 GB<br/>NVMe RAID10, 2× 25GbE"]
    DBS -->|"500-5000"| DB2{"Storage"}
    DB2 -->|"Local"| DB2L["1-2× EPYC 16-24C, 128-512 GB<br/>NVMe RAID10, 4× 25GbE"]
    DB2 -->|"Ceph"| DB2C["2× EPYC 16-32C, 256-512 GB<br/>RBD, 4× 25/100GbE"]
    DBS -->|"Enterprise"| DB3{"Storage"}
    DB3 -->|"FC SAN"| DB3F["2× EPYC 48-128C, 512-2048 GB<br/>SAN LUN + 2× FC 32/64G"]
    DB3 -->|"Ceph"| DB3C["2× EPYC 32-64C, 256-512 GB<br/>RBD, 4× 100GbE"]
    DBS -->|"Cloud"| DBC["RDS/Azure SQL/CloudSQL<br/>Managed, Multi-AZ"]

    DB --> ORACLE{"Oracle architecture?"}
    ORACLE -->|"Standalone"| ORA1["1-2× EPYC 8-24C<br/>64-512 GB, ASM local/FC<br/>2× 25GbE + FC 32G"]
    ORACLE -->|"Data Guard"| ORA2["2× EPYC 32-64C<br/>256-1024 GB, FC SAN<br/>2× 25/100GbE + 2× FC 64G<br/>2× 25GbE (DG sync)"]
    ORACLE -->|"RAC 2-4 nodes"| ORA3["Per node: 2× EPYC 32-64C<br/>512-2048 GB, FC SAN<br/>2× 100GbE (app)<br/>2× FC 64G (storage)<br/>2× 100GbE RoCE (interconnect)"]
    ORACLE -->|"Exadata"| ORA4["Engineered system<br/>2-8 DB servers + 3-18 storage<br/>RoCE 100GbE, Smart Scan<br/>15-30 kW/rack"]

    HV --> HVS{"Number of hosts"}
    HVS -->|"2-3"| HV1["1× EPYC 12-24C, 128-256 GB<br/>RAID5/6 SSD, 2-4× 10/25GbE"]
    HVS -->|"3-6"| HV2{"HCI"}
    HV2 -->|"vSAN"| HV2V["1-2× EPYC 16-32C, 256-512 GB<br/>NVMe cache + SSD, 4× 25GbE"]
    HV2 -->|"Ceph"| HV2C["1× EPYC 12-24C, 128-256 GB<br/>4-8× HBA NVMe/SSD, 4× 25GbE"]
    HVS -->|"6+"| HV3["2× EPYC 32-64C, 512-2048 GB<br/>FC SAN 32/64G, 4-8× 25/100GbE"]
    HVS -->|"20+"| HV4["2× EPYC 64-128C, 512-1024 GB<br/>OpenStack + Ceph, 4-8× 100GbE"]

    K8s --> K8T{"Node type"}
    K8T -->|"General"| K8G["16-32C, 64-128 GB<br/>2× NVMe, 2× 25GbE"]
    K8T -->|"Memory"| K8M["32-64C, 256-512 GB<br/>3× NVMe, 2× 25GbE"]
    K8T -->|"GPU"| K8U["32-64C, 512-1024 GB<br/>6-10× NVMe, H100/B200, 4× 100GbE"]
    K8T -->|"Storage"| K8S["16-32C, 64-128 GB<br/>6-14× HBA NVMe, 4× 25GbE"]

    AI --> AIT{"Purpose"}
    AIT -->|"Training"| AITR["GPU H100/B200, NVLink<br/>InfiniBand 400Gb/s, liquid cooling"]
    AIT -->|"Inference"| AIIR["A100/H200, MIG<br/>PCIe 5.0, 2× 100GbE"]

    ST --> STT{"Type"}
    STT -->|"Ceph OSD"| STC["EPYC (PCIe lanes)<br/>4-8 GB/OSD, HBA, 2× 25/100GbE"]
    STT -->|"MinIO"| STM["EPYC 8-16C, 32-64 GB<br/>4-16× NVMe direct, 2× 25/100GbE"]
    STT -->|"NAS (ZFS)"| STN["EPYC 16-32C, 64-128 GB<br/>RAID-Z, SLOG NVMe, 2-4× 10/25GbE"]

    WEB --> WEBE["EPYC high clock, 8-32C<br/>32-128 GB, 2× NVMe RAID1, 2× 10/25GbE"]
```

### Connectivity summary by platform

| Platform | App / VM network | Storage network | Replication / Cluster | Management |
|-----------|-------------|-------------|---------------------|------------|
| **DB local (small)** | 2× 25 GbE LACP | — | 2× 25 GbE (shared) | 1× 1 GbE (iLO) |
| **DB local (medium)** | 2× 25/100 GbE LACP | — | 2× 25 GbE dedicated | 1× 1 GbE (iLO) |
| **DB FC SAN** | 2× 25/100 GbE LACP | 2× 32/64 Gb FC multipath | FC replication | 1× 1 GbE (iLO) + SAN mgmt |
| **DB Ceph** | 2× 25/100 GbE | 2× 25/100 GbE (Ceph public) | 2× 25/100 GbE (Ceph cluster) | 1× 1 GbE (iLO) |
| **Hypervisor local** | 2-4× 10/25 GbE LACP | — (local) | — | 1× 1 GbE (iLO) |
| **Hypervisor vSAN** | 2× 25/100 GbE LACP | 2× 25/100 GbE (vSAN) | vSAN traffic | 1× 1 GbE (iLO) |
| **Hypervisor FC SAN** | 2-4× 25/100 GbE LACP | 2× 32/64 Gb FC multipath | 2× 25 GbE (vMotion) | 1× 1 GbE (iLO) |
| **Hypervisor Ceph** | 2× 25/100 GbE LACP | 2× 25/100 GbE (Ceph) | 2× 25 GbE (migration) | 1× 1 GbE (iLO) |
| **Kubernetes** | 2× 25/100 GbE | 2× 25/100 GbE (Ceph/Longhorn) | 2× 25/100 GbE (K8s cluster) | 1× 1 GbE (BMC) |
| **Web/API** | 2× 10/25 GbE LACP | — | — | 1× 1 GbE (BMC) |
| **Oracle Standalone** | 2× 25 GbE LACP | 2× FC 32G or NVMe local | Data Guard 2× 25 GbE | 1× 1 GbE (iLO) + ASM mgmt |
| **Oracle Data Guard** | 2× 25/100 GbE LACP | 2× FC 64G multipath | 2× 25 GbE (DG sync) | 1× 1 GbE (iLO) + SAN mgmt |
| **Oracle RAC** | 2× 100 GbE LACP (VIP/SCAN) | 2× FC 64G multipath | 2× 100 GbE RoCE (Cache Fusion) | 1× 1 GbE (iLO) + Clusterware |
| **Oracle Exadata** | 4-8× 100 GbE RoCE | NVMe over Fabric | RDMA interconnect | Exadata CLI + OEDA |

## Sources

Links, books and standards: [sources/infrastructure/sources.md](sources/infrastructure/sources.md)

*Last revision: 2026-06-03*