Software Defined Storage (SDS) frameworks offer storage services that provide high availability, durability, and reliability guarantees. However, these guarantees come at a performance cost: while drives can offer microsecond latencies and millions of IOPS, SDS services typically deliver millisecond access latencies and 10-100K IOPS.
We present an architecture that decouples the data read path from the SDS stack, allowing reads to bypass this bottleneck. The SDS stack shares its logical-to-physical storage mappings with a companion module, which caches them to construct direct I/O paths to the drives under SDS control. We also describe mechanisms by which the SDS stack can forcibly evict the shared storage mappings as needed.
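To make the mapping-sharing concrete, the sketch below models the companion module's cache in C under an assumed fixed-size-extent layout; every name in it (extent_map, mapping_cache, cache_lookup, cache_evict, EXTENT_SIZE) is hypothetical and not part of the Ceph or SPDK APIs. A lookup hit lets the caller issue a direct read to the drive; a forced eviction from the SDS stack invalidates the entry so the next read falls back to the normal SDS path.

```c
/* Minimal sketch of the companion module's mapping cache. All names are
 * illustrative; extents are assumed to be fixed-size. */
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define EXTENT_SIZE (4ULL << 20)   /* assumed 4 MiB extents */
#define CACHE_SLOTS 1024

/* One cached logical-to-physical translation. */
struct extent_map {
    uint64_t logical_off;   /* extent-aligned offset in the logical volume */
    uint64_t physical_off;  /* offset on the backing drive                 */
    uint32_t drive_id;      /* which drive under SDS control               */
    bool     valid;         /* cleared when the SDS stack evicts the entry */
};

struct mapping_cache {
    struct extent_map slots[CACHE_SLOTS];
    pthread_rwlock_t  lock;
};

static size_t slot_of(uint64_t logical_off)
{
    return (size_t)((logical_off / EXTENT_SIZE) % CACHE_SLOTS);
}

/* Install a mapping shared by the SDS stack. */
static void cache_insert(struct mapping_cache *c, struct extent_map m)
{
    pthread_rwlock_wrlock(&c->lock);
    m.valid = true;
    c->slots[slot_of(m.logical_off)] = m;
    pthread_rwlock_unlock(&c->lock);
}

/* On a hit the caller may issue a direct read to the drive, bypassing the
 * SDS stack; on a miss it takes the normal SDS read path and re-fetches
 * the mapping. */
static bool cache_lookup(struct mapping_cache *c, uint64_t logical_off,
                         struct extent_map *out)
{
    bool hit = false;

    pthread_rwlock_rdlock(&c->lock);
    struct extent_map *e = &c->slots[slot_of(logical_off)];
    if (e->valid && e->logical_off == logical_off) {
        *out = *e;
        hit = true;
    }
    pthread_rwlock_unlock(&c->lock);
    return hit;
}

/* Forced eviction, invoked on a message from the SDS stack, e.g. before
 * it moves or rewrites the extent. */
static void cache_evict(struct mapping_cache *c, uint64_t logical_off)
{
    pthread_rwlock_wrlock(&c->lock);
    struct extent_map *e = &c->slots[slot_of(logical_off)];
    if (e->logical_off == logical_off)
        e->valid = false;
    pthread_rwlock_unlock(&c->lock);
}

int main(void)
{
    struct mapping_cache c = { .lock = PTHREAD_RWLOCK_INITIALIZER };
    struct extent_map m = { .logical_off = 0,
                            .physical_off = 1 * EXTENT_SIZE,
                            .drive_id = 3 };
    struct extent_map out;

    cache_insert(&c, m);
    printf("hit: %d\n", cache_lookup(&c, 0, &out));   /* hit: 1 */
    cache_evict(&c, 0);                               /* SDS-forced eviction */
    printf("hit: %d\n", cache_lookup(&c, 0, &out));   /* hit: 0 */
    return 0;
}
```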
We share our experience implementing a proof-of-concept (PoC) of this approach in Ceph. The PoC uses SPDK to create direct NVMe-over-Fabrics (NVMe-oF) I/O paths into Ceph logical volumes (i.e., RADOS Block Device (RBD) images). Our microbenchmark results with this PoC demonstrate an order-of-magnitude reduction in read latency and a 4-10x increase in throughput. Finally, we discuss future opportunities for reducing CPU usage, utilizing hardware assists such as IPUs, DPUs, and SmartNICs, and applying workload-specific optimizations to the fast read paths into SDS.
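A hedged sketch of how such a read dispatch might look, building on the cache sketch above: on a cached mapping the read goes straight to the drive, otherwise it falls back to the ordinary SDS/RBD read. Here pread() on a raw block-device fd stands in for the PoC's SPDK NVMe-oF path, and sds_fallback_read() and drive_fds are hypothetical stand-ins, not Ceph or SPDK functions.

```c
/* Illustrative fast-path read dispatch, assuming the mapping_cache,
 * extent_map, cache_lookup, and EXTENT_SIZE definitions sketched above.
 * For simplicity, the read is assumed not to cross an extent boundary. */
#include <stdint.h>
#include <unistd.h>

/* Stub for the ordinary SDS read path (e.g., an RBD read), which can
 * also return a fresh mapping for the cache. */
extern ssize_t sds_fallback_read(uint64_t logical_off, void *buf, size_t len);

ssize_t fast_read(struct mapping_cache *c, int drive_fds[],
                  uint64_t logical_off, void *buf, size_t len)
{
    struct extent_map m;
    uint64_t ext = logical_off - (logical_off % EXTENT_SIZE);

    if (cache_lookup(c, ext, &m)) {
        /* Direct path: translate and read the drive, bypassing SDS. */
        uint64_t drive_off = m.physical_off + (logical_off - ext);
        return pread(drive_fds[m.drive_id], buf, len, (off_t)drive_off);
    }
    /* Miss (or a forcibly evicted entry): take the normal SDS path. */
    return sds_fallback_read(logical_off, buf, len);
}
```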