**Ceph** is a unified distributed storage platform that provides block storage (RBD), object storage (RadosGW), and a POSIX-compliant shared filesystem (CephFS) from a single cluster. It is the storage backbone of most production [[OpenStack]] deployments and the closest thing open source has to AWS EBS + S3 + EFS in one system.

---

### First Principle: Storage should be software-defined, self-healing, and hardware-agnostic.

Ceph runs on commodity hardware and replicates data across nodes with no single point of failure. When a disk or node fails, Ceph automatically re-replicates the affected data onto the remaining nodes, restoring redundancy without operator intervention. This is the same principle behind Dynamo and GFS, implemented in open source.

---

### Key Considerations

- **RADOS Foundation**: Everything in Ceph is built on RADOS (Reliable Autonomic Distributed Object Store), a distributed object store that handles placement (via the CRUSH algorithm), replication, and self-healing.
- **RBD (RADOS Block Device)**: Provides block volumes that back [[OpenStack]] Cinder. Supports thin provisioning, snapshots, and cloning.
- **RadosGW (RGW)**: An object storage gateway exposing S3- and Swift-compatible APIs, which lets Ceph serve as the backend for S3-style workloads (backups, ML datasets, application assets).
- **CephFS**: A POSIX-compliant distributed filesystem for shared workloads that need traditional file semantics.
- **Deployment**: A production cluster typically needs at least 3 nodes so that the default 3-way replication can place each copy on a separate host. `cephadm` and [[Ansible]]-based `ceph-ansible` automate deployment.
- **Performance**: NVMe-backed Ceph with BlueStore can achieve hundreds of thousands of IOPS per cluster. Network bandwidth is the typical bottleneck at scale.

---

### How It Fits

```
[[OpenStack]] (Cinder / Swift API layer)
  → Ceph (RBD for block, RGW for object, CephFS for file)
    → RADOS (distributed object foundation)
      → Commodity drives on bare metal nodes
```

[[OpenStack]] | [[MinIO]] | [[Longhorn]] | [[Kubernetes]] | [[Open Source Hyperscaler MoC]]
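
---

### Sketch: Replica Placement and Self-Healing

The replication and self-healing behaviour described under the first principle can be illustrated with a small, self-contained sketch. It is not Ceph code and does not implement CRUSH: it uses rendezvous (highest-random-weight) hashing as a simplified stand-in for deterministic placement. The OSD names, object names, cluster size, and the helper functions `score` and `place` are all hypothetical, chosen only for illustration.

```python
import hashlib


def score(object_name: str, osd: str) -> int:
    """Deterministic pseudo-random weight for an (object, OSD) pair."""
    digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, osds: list[str], replicas: int = 3) -> list[str]:
    """Pick the `replicas` highest-scoring OSDs for this object.

    Simplified stand-in for CRUSH: any client can compute the placement
    from the object name and the list of live OSDs, with no lookup table.
    """
    ranked = sorted(osds, key=lambda osd: score(object_name, osd), reverse=True)
    return ranked[:replicas]


# Hypothetical 6-OSD cluster holding 1000 objects, 3 replicas each.
osds = [f"osd.{i}" for i in range(6)]
objects = [f"obj-{i}" for i in range(1000)]
before = {name: place(name, osds) for name in objects}

# Simulate one OSD failing, then recompute placement on the survivors.
failed = "osd.2"
survivors = [osd for osd in osds if osd != failed]
after = {name: place(name, survivors) for name in objects}

# Objects that had a replica on the failed OSD, and objects whose placement changed.
affected = [name for name in objects if failed in before[name]]
moved = [name for name in objects if before[name] != after[name]]

print(f"objects with a replica on {failed}: {len(affected)}")
print(f"objects whose placement changed:   {len(moved)}")
```

In this sketch only the objects that lost a replica get a new placement, and each of them keeps its two surviving copies while gaining one new one. That mirrors the idea of re-replicating only the affected data, although a real cluster additionally accounts for failure domains, device weights, and placement groups.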