💾 Storage infrastructure
Storage types
| Type | Description | Latency | Use case |
|---|---|---|---|
| DAS (Direct Attached) | Disks directly in server | <0.1 ms | OS, cache, local data |
| SAN (Storage Area Network) | Block devices over network | <1 ms | Databases, VM datastores |
| NAS (Network Attached Storage) | File access (NFS, SMB) | 1-3 ms | Shared files, home dirs |
| Object storage | REST API, flat namespace | 10-100 ms | Backups, media, big data |
Protocols
| Protocol | Type | Speed | Note |
|---|---|---|---|
| Fibre Channel | SAN | 8/16/32/64 Gbps | Low latency, dedicated network |
| iSCSI | SAN (IP) | 1/10/25 GbE | Cheaper, over ethernet |
| NVMe-oF | SAN (NVMe) | 25/50/100 GbE | Lowest latency, emerging |
| NFS | NAS | 1/10/25 GbE | Universal, simple |
| SMB/CIFS | NAS | 1/10/25 GbE | Windows native |
| S3 API | Object | — | Standard for object storage |
RAID
| RAID | Min. disks | Capacity | Protection | Read speed | Write speed | Use case |
|---|---|---|---|---|---|---|
| 0 | 2 | 100 % | None | N × (striping) | N × | Temp data, cache (risky) |
| 1 | 2 | 50 % | 1 disk | N × (mirror) | 1 × | OS disk, critical data |
| 5 | 3 | 67-94 % | 1 disk | N-1 × | N-1 × (parity write penalty) | Universal file/VM storage |
| 6 | 4 | 50-88 % | 2 disks | N-2 × | N-2 × (double parity) | Large capacities, important data |
| 10 | 4 | 50 % | 1 disk per mirror pair | N × | N/2 × | Databases, VM, high-performance |
| 50 | 6 | 67-94 % | 1 disk per RAID 5 group | N-1 × | N-1 × | Large capacity + performance |
| 60 | 8 | 50-88 % | 2 disks per RAID 6 group | N-2 × | N-2 × | Enterprise |
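The capacity column can be reproduced with a little arithmetic. The following is a minimal sketch, assuming equal-sized disks; the function name and the `groups` parameter (number of RAID 5/6 sub-arrays inside RAID 50/60) are hypothetical names chosen for illustration.

```python
def raid_usable_fraction(level: int, disks: int, groups: int = 2) -> float:
    """Return the usable share of raw capacity for a RAID level.

    Assumes equal-sized disks. `groups` only matters for RAID 50/60,
    where it is the number of RAID 5/6 sub-arrays being striped.
    """
    minimum = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4, 50: 6, 60: 8}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")

    if level == 0:
        return 1.0              # pure striping, no redundancy
    if level == 1:
        return 1.0 / disks      # every disk holds a full copy
    if level == 10:
        return 0.5              # striped two-way mirrors

    parity = 1 if level in (5, 50) else 2           # parity disks per sub-array
    sub_arrays = groups if level in (50, 60) else 1
    if disks % sub_arrays != 0:
        raise ValueError("disks must divide evenly across sub-arrays")
    return (disks - parity * sub_arrays) / disks


print(raid_usable_fraction(5, 3))    # about 0.667
print(raid_usable_fraction(6, 16))   # 0.875
print(raid_usable_fraction(60, 8))   # 0.5
```

The ranges in the table correspond to the smallest and the largest practical array sizes: the more disks share the same parity, the higher the usable share.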
Stripe size
- Small stripe (16-64 KB) — better IOPS, worse throughput (databases, OLTP)
- Large stripe (128-1024 KB) — better throughput, worse IOPS (video, media, backup)
- Write hole on RAID 5/6: if power is lost between the data write and the parity update, data and parity in the affected stripe no longer match (prevention: non-volatile cache, battery-backed RAID controller)
Software-Defined Storage (SDS)
| Tool | Type | Use case |
|---|---|---|
| Ceph | Object/Block/File (RADOS) | Universal SDS, OpenStack, Kubernetes |
| MinIO | Object (S3 API) | High-performance S3, AI/ML data lake |
| GlusterFS | Distributed File | Shared filesystem, POSIX |
| Longhorn | Block (Kubernetes) | K8s PVC, microservices |
| Linstor | Block (DRBD + LVM) | Linux SDS, Kubernetes |
| VMware vSAN | Block (HCI) | VMware ecosystem |
| StarWind | Block (HCI) | Hyper-V / VMware |
Ceph
Architecture:
RADOS (Reliable Autonomic Distributed Object Store)
├── Monitors (MON) — cluster map, quorum (3/5)
├── Managers (MGR) — dashboard, balancer, orchestrator
├── OSDs (Object Storage Daemons) — data + replication
└── MDS (Metadata Server) — CephFS only
CRUSH map (Controlled Replication Under Scalable Hashing):
- Algorithm for calculating data placement (no central index)
- Layers: Root → Datacenter → Rack → Host → OSD
- Failure domain: replication across racks / hosts
ceph osd crush rule create-replicated replicated_rule default host
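To illustrate what "calculated placement without a central index" means, here is a deliberately simplified sketch. It is not the CRUSH algorithm itself: it uses rendezvous (highest-random-weight) hashing as a stand-in and supports only a host-level failure domain. The topology, object name, and function names are hypothetical.

```python
import hashlib


def _score(*parts: str) -> int:
    """Deterministic pseudo-random score derived from the given strings."""
    digest = hashlib.sha256("/".join(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(obj: str, topology: dict[str, list[str]], replicas: int = 3) -> list[str]:
    """Pick one OSD on each of `replicas` distinct hosts for an object.

    Any client that knows the topology computes the same answer,
    so no lookup table or central index is required.
    """
    if replicas > len(topology):
        raise ValueError("not enough hosts for the requested failure domain")
    # Rank hosts by their score for this object and keep the top ones.
    hosts = sorted(topology, key=lambda h: _score(obj, h), reverse=True)[:replicas]
    # Within each chosen host, pick the OSD with the highest score.
    return [max(topology[h], key=lambda osd: _score(obj, osd)) for h in hosts]


# Hypothetical four-host cluster with two OSDs per host.
topology = {
    "host-a": ["osd.0", "osd.1"],
    "host-b": ["osd.2", "osd.3"],
    "host-c": ["osd.4", "osd.5"],
    "host-d": ["osd.6", "osd.7"],
}
print(place("rbd_data.1234", topology))
```

Real CRUSH additionally walks the full hierarchy (root, datacenter, rack, host), honors device weights, and maps objects to placement groups first; the sketch only shows the core idea that placement is a pure function of the object name and the cluster map.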
Access interfaces:
| Interface | Type | Use case |
|---|---|---|
| RBD (RADOS Block Device) | Block | VM images, Kubernetes PVC (csi-rbd) |
| RGW (RADOS Gateway) | Object (S3/Swift API) | S3-compatible storage, backup |
| CephFS | File (POSIX) | Shared filesystem, home dirs |
| NFS-Ganesha | File (NFS) | NFS export over CephFS |
Erasure coding:
- K+M (data + parity chunks), e.g. 8+3 (8 data, 3 parity)
- More space-efficient than 3× replication (1.375× vs 3×)
- Higher CPU overhead, lower IOPS
- Recommended for cold data (RGW) instead of replication
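The space-efficiency figure follows directly from the K+M definition. A minimal sketch of the arithmetic (function names are illustrative only):

```python
def ec_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data for a K+M erasure-coded pool."""
    return (k + m) / k


def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data for N-way replication."""
    return float(copies)


print(ec_overhead(8, 3))            # 1.375
print(ec_overhead(4, 2))            # 1.5
print(replication_overhead(3))      # 3.0
```

A K+M pool tolerates the loss of M chunks, so 8+3 survives three simultaneous failures while storing less than half the raw data of 3× replication, which tolerates two.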
Enterprise storage vendors
Hitachi VSP (Virtual Storage Platform)
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|---|---|---|---|---|---|
| VSP 5200/5600 | Active-active, scale-up/out, 2–12 controllers | 69.3 PB raw, 287 PBe | 33M IOPS, 39 µs | FC-NVMe 32Gb, FC 16/32Gb, FICON 16Gb, iSCSI 10Gb | Mission-critical, mainframe, enterprise consolidation |
| VSP E590/E790/E1090 | Symmetric active-active, up to 65 nodes/130 controllers | 10.62 PB raw (E1090) | 8.4M IOPS, <41 µs | FC 32Gb, iSCSI 25Gb, FC-NVMe 32Gb | Midrange enterprise, hybrid workloads |
Key features: SVOS common across entire portfolio, AI-driven data reduction 4:1 guarantee, Global-Active Device metro clustering, 8 nines availability (HW), 100% data availability guarantee.
Huawei OceanStor Dorado
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|---|---|---|---|---|---|
| Dorado 8000/18000 V6 | SmartMatrix full-mesh, up to 32 controllers | 32 TB cache, 6400 SSD | 40M IOPS, 0.05 ms | FC 32/64Gb, FC-NVMe, iSCSI, NFS, SMB, NVMe/RoCE, S3 | Mission-critical, finance, govt, carrier |
| Dorado 8000/18000 V7 (2025) | SmartMatrix 4.0, up to 64/128 controllers | 500 PB+ | >100M IOPS, 0.03 ms | FC, RoCE, NVMe/TCP, NFS, SMB, S3 | AI workloads, converged block/file/object |
Key features: SmartMatrix tolerates failure of 7 out of 8 controllers, FlashEver (3-gen online HW upgrade in 10 years), RAID-TP (tolerates triple SSD failure), DPU-based SmartNIC, ML-based I/O prefetch, 100% ransomware detection (Tolly), #1 SPC-1 benchmark.
Dell PowerStore & PowerMax
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|---|---|---|---|---|---|
| PowerStore 1500/5500/9500 (Gen 3) | Active-active dual-node, PCIe Gen5, DDR5, RDMA 200GbE | 1.2 PB raw, 5.8 PBe | 3× IOPS vs Gen2 | FC 32/64Gb, iSCSI, NVMe/FC, NVMe/TCP, NFSv4, SMB3 | Midrange-to-high-end, VMware, containerized |
| PowerMax 2500/8500 | Scale-out NVMe, Dynamic Fabric, up to 16 nodes | 8.8 PBe (2500), 18 PBe (8500) | 6 nines availability | FC 64Gb, FICON, NVMe/FC, NVMe/TCP, iSCSI, NFS, SMB | Mission-critical, mainframe, OLTP, cyber vault |
Key features: PowerStore 6:1 DRR guarantee, unified block/file/vVols out of box, Cyber Detect AI anomaly; PowerMax 5:1 DRR, Secure Snapshots 65M, SRDF/Metro, Flexible RAID up to 92% efficient, FIPS 140-3.
HPE Alletra
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|---|---|---|---|---|---|
| Alletra 5000 | Active-active hybrid flash, dual controller | 1.2 PB raw | 99.9999% guarantee | FC, iSCSI | Mixed primary + secondary, cost-efficient hybrid |
| Alletra 6000 | Active-active all-NVMe, dual controller | ~368 TB usable | <100 µs | FC, iSCSI | Business-critical DB, VDI, VMware |
| Alletra 9000 | Active-active all-NVMe, multi-node scale-out | 2–4 PB+ usable | ~2–3M IOPS, <150 µs | FC, iSCSI, NVMe/FC | Mission-critical ERP, AI, consolidation |
| Alletra Storage MP | Disaggregated modular, block + file + object | 5.8 PB block, 11.8 PB object | 100% availability guarantee | FC, iSCSI, NVMe/FC, NFS, SMB, S3 | Multi-protocol consolidation, AI/analytics |
Key features: Triple Parity RAID (5000), InfoSight AI Ops, HPE GreenLake as-a-service, non-disruptive controller upgrades (MP), 100% data availability guarantee.
Infinidat
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|---|---|---|---|---|---|
| InfiniBox SSA G4 | Triple-active controller, AMD EPYC PCIe 5.0, DDR5 | 1.97 PB usable / 5.9 PBe | 2.24M IOPS, 35 µs | FC 32Gb, 25/100GbE, NVMe-oF/TCP, iSCSI, NFS, SMB, S3 | Mission-critical Oracle/SQL, multi-site DR |
| InfiniBox G4 Hybrid | Triple-active hybrid (HDD + flash cache) | 10.9 PB raw / 32.8 PBe | 2.24M IOPS, 64 GB/s | FC, Ethernet, NVMe-oF, iSCSI, NFS, SMB, S3 | Backup, massive unstructured data |
Key features: Only 3-way active on the market, Neural Cache (ML-driven), InfiniRAID, Immutable snapshots, 100% availability + 1-min snapshot recovery guarantee, everything included in base price (no extra licensing).
Pure Storage FlashArray
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|---|---|---|---|---|---|
| FlashArray//X (X20–X90 R5) | Active-active, NVMe DirectFlash | 1.2 PB raw / 4.4 PBe | 250 µs, 5:1 DRR | FC, NVMe/FC, NVMe/RoCE, NVMe/TCP, iSCSI, NFS, SMB | Mission-critical DB, VMware, enterprise |
| FlashArray//C (C50–C90 R5) | Active-active, QLC DirectFlash | 4.2 PB raw / 16.3 PBe | 5:1 DRR | FC, NVMe-oF, iSCSI, NFS, SMB | Capacity-optimized, backup, file |
| FlashArray//XL (XL190) | Active-active, 40 DirectFlash modules | 1.9 PB raw / 9.4 PBe | >4M IOPS, <100 µs, 45 GB/s | FC 64Gb, 100GbE RoCE, NVMe/FC, NVMe/TCP, NFS, SMB | Largest DB consolidation, OLTP |
Key features: DirectFlash (no FTL layer), 99.9999% availability, Evergreen (never forklift upgrade), Purity OS unified across entire portfolio, ActiveCluster/ActiveDR, Pure1 AIOps.
Lenovo ThinkSystem
| Model | Architecture | Max capacity | IOPS / Latency | Protocols | Use case |
|---|---|---|---|---|---|
| DM Series (DM3200F/5200F/7200F) | Active-active, all-NVMe, NetApp ONTAP | 1.8 PB raw / 6.8 PBe | Up to 120 NVMe SSD | FC 64Gb, iSCSI, NVMe/FC, NFS, SMB, S3 | Unified block/file, AI/ML, VMware |
| DG Series (DG5200/7200) | Active-active, all-QLC, ONTAP | 7.4 PB raw / 27 PBe | QLC economics | FC, NVMe/FC, NVMe/TCP, iSCSI, NFS, SMB, S3 | Capacity-optimized, backup, archive |
| DE Series (DE4000F–DE6600F) | Active-active, SAS/NVMe hybrid | 1.84 PB raw | 2M IOPS, <100 µs, 44 GB/s | FC 32Gb, iSCSI 25Gb, NVMe/FC, SAS, NVMe/RoCE | HPC, analytics, video surveillance |
Key features: DM/DG use ONTAP (SnapMirror, SnapVault, FabricPool, RAID-DP/RAID-TEC); cluster scale-out up to 12 HA pairs; DE series best price/performance in portfolio.
Synology
| Model | Architecture | Max capacity | Protocols | Use case |
|---|---|---|---|---|
| UC3200/UC3400 | Active-active dual-controller, SAS backend | 576 TB raw | iSCSI, FC 16Gb, 10/25GbE | SMB/midmarket SAN, VMware, HA |
| DS/RS Series (RS3626xs+, RS6426xs+) | Single-controller / HA pair, Btrfs | 864 TB raw, 1 PB volume | SMB, NFS, iSCSI, FC (HBA) | SME all-in-one NAS/SAN, backup, surveillance |
Key features: DSM UC for SAN, Synology HA, Snapshot Replication (16K snapshots), VMware VAAI/ODX/ALUA, Surveillance Station, low TCO.
Vendor comparison — overview
| Vendor | Flagship | Max IOPS | Max capacity | Latency | Availability guarantee | Main differentiator |
|---|---|---|---|---|---|---|
| Hitachi | VSP 5600 | 33M | 287 PBe | 39 µs | 8 nines (HW) | Mainframe + open; 65-node cluster |
| Huawei | Dorado 18000 V7 | >100M | 500 PB+ | 0.03 ms | 99.99999% | SmartMatrix; #1 SPC-1 |
| Dell | PowerMax 8500 | — | 18 PBe | — | 6 nines | SRDF/Metro; mainframe |
| HPE | Alletra 9000/MP | ~3M | 11.8 PBe | <150 µs | 100% data guarantee | InfoSight AIOps; GreenLake |
| Infinidat | InfiniBox SSA G4 | 2.24M | 32.8 PBe | 35 µs | 100% availability | 3-way active; Neural Cache |
| Pure | FlashArray//XL | >4M | 16.3 PBe | <100 µs | 99.9999% | DirectFlash; Evergreen |
| Lenovo | DM7200F | — | 27 PBe | — | — | ONTAP ecosystem; broad portfolio |
| Synology | UC3400 | 690K | 576 TB | — | — | Lowest price for active-active SAN |
Storage selection by use case
| Use case | Recommendation | Rationale |
|---|---|---|
| Mainframe + open hybrid | Hitachi VSP / Dell PowerMax | Only ones with FICON + FC simultaneously |
| AI/ML training | Huawei Dorado V7 / Pure //XL | Highest IOPS, lowest latency |
| Enterprise DB (Oracle, SQL Server) | Infinidat / Pure //X | Low latency, consistent performance |
| Virtualization (VMware, Hyper-V) | Dell PowerStore / HPE Alletra 6000 | VAAI, vVols, InfoSight |
| SMB / SME | Synology / Lenovo DE | Low TCO, simple management |
| Object storage / backup | Pure //C / Lenovo DG / Infinidat Hybrid | QLC economics, high capacity |
| Multi-protocol consolidation | HPE Alletra MP / Huawei Dorado | Block + file + object in one platform |
Decision diagram — storage platform selection
flowchart TD
Start(["Storage requirement"]) --> PROTO{"Access type"}
PROTO -->|"Block (SAN)"| BLOCK
PROTO -->|"File (NAS)"| FILE
PROTO -->|"Object"| OBJECT
BLOCK --> BPERF{"Performance tier"}
BPERF -->|"Tier 0/1<br/>< 100 µs, > 1M IOPS"| BT1["Infinidat / Pure //XL<br/>Huawei Dorado V7<br/>FC-NVMe, NVMe-oF"]
BPERF -->|"Tier 2<br/>100-500 µs"| BT2["Dell PowerStore / HPE Alletra 6000<br/>Hitachi VSP / Lenovo DM<br/>FC 32G, iSCSI 25GbE"]
BPERF -->|"Tier 3<br/>SME / low-cost"| BT3["Synology UC3400<br/>Lenovo DE / Dell PowerVault<br/>iSCSI, SAS"]
BLOCK --> BECOS{"Ecosystem"}
BECOS -->|"Mainframe"| BMF["Hitachi VSP / Dell PowerMax<br/>FICON + FC simultaneously"]
BECOS -->|"VMware"| BVM["Dell PowerStore / HPE Alletra<br/>VAAI, vVols, InfoSight"]
BECOS -->|"Oracle / SQL Server"| BDB["Infinidat / Pure //X<br/>Lowest latency"]
FILE --> FSIZE{"Scaling"}
FSIZE -->|"Enterprise"| FE["HPE Alletra MP (file)<br/>Lenovo DM / Dell PowerScale<br/>NFS, SMB, multi-protocol"]
FSIZE -->|"SMB"| FS["Synology DS/RS<br/>Lenovo DE / TrueNAS<br/>Btrfs, NFS, SMB, low TCO"]
OBJECT --> OUSE{"Use case"}
OUSE -->|"Backup / archive"| OB["Pure //C / Infinidat Hybrid<br/>Lenovo DG<br/>QLC, erasure coding, low cost/TB"]
OUSE -->|"AI/ML data lake"| OM["MinIO / Pure //C<br/>High throughput S3<br/>NVMe direct, erasure coding"]
OUSE -->|"Kubernetes PVC"| OK["Ceph RBD / Longhorn / Linstor<br/>SDS on K8s<br/>CSI, replication, snapshots"]
OpenStack Storage
OpenStack offers three main storage services:
| Service | Type | Description |
|---|---|---|
| Cinder | Block storage | Persistent volumes for instances (iSCSI, NFS, Ceph RBD) |
| Swift | Object storage | RESTful object store (S3-compatible via middleware) |
| Manila | File storage | Shared file systems (NFS, CIFS) as a managed service |
Cinder (Block Storage)
- Multi-backend support: LVM, Ceph RBD, NFS, iSCSI, Fibre Channel
- Snapshotting, cloning, encryption at rest
- Cinder scheduler for volume distribution across backends
- QoS specs for IOPS/bandwidth limits
Swift (Object Storage)
- Alternative to S3 for on-prem object storage
- Ring-based data distribution (consistent hashing)
- Multi-region replication (container sync)
- Stateless REST API with no single point of failure
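The ring-based distribution can be sketched as follows. This is a toy model loosely based on how Swift maps an object path to a partition and a partition to devices; the partition power, device names, and the round-robin table are assumptions for illustration. A real ring also mixes a per-cluster secret into the hash and balances partitions by device weight, zone, and region.

```python
import hashlib
import struct

PART_POWER = 10                  # 2**10 = 1024 partitions (hypothetical value)
PART_SHIFT = 32 - PART_POWER


def partition_for(account: str, container: str, obj: str) -> int:
    """Map an object path to a partition using the top bits of its MD5 hash."""
    path = f"/{account}/{container}/{obj}".encode()
    top32 = struct.unpack(">I", hashlib.md5(path).digest()[:4])[0]
    return top32 >> PART_SHIFT


def build_table(devices: list[str], replicas: int = 3) -> list[list[str]]:
    """Toy replica-to-partition-to-device table using round-robin assignment."""
    if len(devices) < replicas:
        raise ValueError("need at least as many devices as replicas")
    parts = 2 ** PART_POWER
    return [
        [devices[(p + r) % len(devices)] for p in range(parts)]
        for r in range(replicas)
    ]


def devices_for(table: list[list[str]], part: int) -> list[str]:
    """Return the devices holding every replica of a partition."""
    return [row[part] for row in table]


devices = ["sdb@node1", "sdb@node2", "sdb@node3", "sdb@node4"]  # hypothetical
table = build_table(devices)
part = partition_for("AUTH_demo", "photos", "cat.jpg")
print(part, devices_for(table, part))
```

Because every proxy node holds the same table and computes the same hash, any of them can locate an object without consulting a central metadata service.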
Manila (Shared File Systems)
- Managed NFS/CIFS for sharing between instances
- Backends: NetApp, Dell EMC, CephFS, GlusterFS
- Access rules (IP-based, cert-based, user-based)
- Use case: HPC cluster home directories, NAS for legacy apps
Container storage (OpenStack + Ceph)
Ceph is the most common storage backend for OpenStack: Cinder (RBD), Swift-compatible object API (RGW), Manila (CephFS), Glance (RBD images).
Big Data storage
HDFS cluster
HDFS is the primary storage for the Hadoop ecosystem (on-prem). Typical configuration:
| Parameter | Value | Note |
|---|---|---|
| Disks per DataNode | 8–24 × HDD (14–22 TB) + 2× NVMe (metadata, cache) | Balance capacity / performance |
| Replication factor | 3× | Rack-aware |
| Network | 2× 25/100 GbE (data) + 1× 1 GbE (management) | Data + replication traffic |
| RAM | 64–256 GB (OS cache + metadata) | HDFS cache + OS buffer cache |
| CPU | 16–32 cores | HDFS overhead is low |
| NameNode HA | Active + Standby + JN (JournalNode) | Quorum-based HA |
| Use case | Sequential read/write, large files, Spark on YARN | |
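Model cluster (~0.72 PB usable):
- 10× DataNode (12× 18 TB HDD, 2× 1.9 TB NVMe)
- 2× NameNode (HA, 256 GB RAM)
- 3× JournalNode (small VMs)
- Raw capacity ~2.2 PB (10 × 12 × 18 TB); replication 3× → ~0.72 PB usable (about 14 DataNodes would be needed for 1 PB usable)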
- Network: 25 GbE for data, 100 GbE for shuffle-heavy Spark
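The relation between raw and usable capacity under replication can be checked with a short calculation. This is a rough sketch: the `headroom` parameter (share of raw space kept free for temporary data and rebalancing) is an assumption added for illustration and defaults to zero.

```python
import math


def usable_tb(nodes: int, disks_per_node: int, disk_tb: float,
              replication: int = 3, headroom: float = 0.0) -> float:
    """Usable capacity in TB after replication and optional free-space headroom."""
    raw = nodes * disks_per_node * disk_tb
    return raw * (1 - headroom) / replication


def nodes_needed(target_usable_tb: float, disks_per_node: int, disk_tb: float,
                 replication: int = 3, headroom: float = 0.0) -> int:
    """Smallest DataNode count that reaches the target usable capacity."""
    per_node = disks_per_node * disk_tb * (1 - headroom) / replication
    return math.ceil(target_usable_tb / per_node)


# Hardware from the list above: 10 DataNodes, 12 × 18 TB HDD each.
print(usable_tb(10, 12, 18))        # 720.0 TB usable from 2160 TB raw
print(nodes_needed(1000, 12, 18))   # 14 DataNodes for 1 PB usable
```

In practice some headroom is usually reserved, which pushes the node count higher still.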
Object storage as Data Lake (S3/GCS/MinIO)
For new projects (Spark on K8s, Iceberg/Delta, lakehouse), object storage is preferred over HDFS:
| Platform | Advantages | Limits |
|---|---|---|
| MinIO (on-prem) | S3 API, erasure coding, NVMe direct, high throughput | Single tenant (per cluster) |
| Pure //C (on-prem) | QLC NVMe, dedupe, S3 + NFS | Higher $/TB |
| AWS S3 (cloud) | Unlimited capacity, Iceberg/Delta support | Egress fees |
| Azure ADLS (cloud) | Hierarchical namespace (HNS), POSIX-like ACLs | Vendor lock-in |
| GCP GCS (cloud) | Uniform + fine-grained ACLs, object versioning | Region restrictions |
Comparison: HDFS vs Object Storage for Big Data
| Criteria | HDFS | Object Storage (S3/MinIO) |
|---|---|---|
| Architecture | Master/worker (NameNode is a SPOF unless HA is configured) | Distributed, no SPOF (erasure coding) |
| Consistency | Strong (single writer per file) | Strong read-after-write (AWS S3 since December 2020, MinIO) |
| Throughput | High (rack-aware, locality) | High (network-bound) |
| Scaling | Horizontal (DataNode) | Horizontal (stateless) |
| Cost | Low (HDD) | Medium (S3 API) |
| Metadata | NameNode (1M blocks ~ 1 GB RAM) | Object-level (flat namespace) |
| Spark integration | Native (locality-optimized) | S3A connector (Hadoop-compatible file system) |
| 2026 trend | Legacy, declining | Standard for new projects |
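The metadata rule of thumb in the table (about 1 GB of NameNode RAM per million blocks) can be turned into a quick estimate. This is a minimal sketch, assuming the HDFS default block size of 128 MB and files that fill whole blocks; many small files raise the block count, and therefore the heap requirement, well above this lower bound.

```python
def namenode_heap_gb(data_tb: float, block_mb: int = 128,
                     gb_per_million_blocks: float = 1.0) -> float:
    """Lower-bound NameNode heap estimate for a given amount of stored data.

    `data_tb` is logical data before replication, treated as binary units
    (1 TB = 1024 * 1024 MB) for simplicity.
    """
    blocks = data_tb * 1024 * 1024 / block_mb
    return blocks / 1_000_000 * gb_per_million_blocks


print(round(namenode_heap_gb(720), 1))   # heap estimate in GB for 720 TB of data
```

Object stores avoid this bottleneck because metadata is kept per object across the cluster rather than in the memory of a single node.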
For more information about Big Data see BIG-DATA.en.md.
Sources
Links, books and standards: sources/infrastructure/sources.en.md
Recommended reading
| Book | Authors | ISBN | Description |
|---|---|---|---|
| Storage Systems | Ganger, Gibson | 978-1680837540 | Textbook covering the design, implementation and operation of storage systems — from device characteristics through OS, databases and networking to server distribution and large-scale systems. An essential resource for storage infrastructure architects. |
Last revision: 2026-06-03