Non-Volatile Memory express (NVMe) is a storage access protocol that lets the CPU access SSD memory through the Peripheral Component Interconnect Express (PCIe). Through a set of protocols and technologies, NVMe dramatically accelerates the way data is transmitted, stored and retrieved. With NVMe, the CPU accesses data on SSDs directly, enabling maximum SSD utilization and flexible scalability. NVMe allows for Storage Disaggregation and can be combined with Kubernetes for scale-out applications.
This blog explores how NVMe redefines storage orchestration in Kubernetes.
By using the PCIe interface to connect CPUs to SSDs, NVMe removes layers connecting compute to storage, allowing efficient storage abstraction and disaggregation. This offers various benefits for modern data centers, including:
Non Volatile Memory express-over Fabrics (NVMe-oF) is a specification that allows CPUs to connect to SSD Storage devices across a network fabric. This is designed to harness the benefits of the NVMe protocol over a Storage Area Network (SAN). The host computer can target an SSD storage device using an MSI-X based command while the network can be implemented using various networking protocols, including Fiber Channel, Ethernet or Infiniband.
NVMe-oF has found wider popularity in modern networks since it allows software organizations to implement scaled out storage for highly-distributed, highly-available applications. By extending the NVMe protocol to SAN devices, NVMe-oF makes CPU usage efficient while improving connection speeds between applications on servers and storage.
NVMe-oF supports various data transfer mechanisms, such as:
NVMe-oF interfaces networked flash storage with compute servers, enabling applications to run on shared network storage, thereby providing additional network consolidation for data centers. The SSD targets can be shared dynamically among application workloads, allowing for the efficient consumption of resources, flexibility and scalability.
While containers are transient, Kubernetes enables stateful applications by providing abstractions that reference a physical storage device. A containerized application is virtually isolated from other processes and applications running on other containers. This makes the Kubernetes environment highly flexible and scalable, as it allows applications to run in virtual machines, bare metal systems, supported cloud systems, or a combination of various deployments. While there are benefits to this approach, it also presents a challenge when there is the need to store and share data between containers.
Kubernetes offers various abstractions and options for attaching container PODs to physical storage, such as:
While Direct Attached Storage (DAS) offers a simple, highly available and quick storage, DAS alone is not sufficient to run Kubernetes clusters. This is because DAS devices have a limited storage capacity that cannot be dynamically provisioned to match stateful Kubernetes workloads. Additionally, DAS doesn’t incorporate networking capabilities or facilitate data access by different user groups since storage is only directly accessible to individual servers/desktop machines, while Kubernetes orchestrates on distributed clusters.
NVMe extends the low latency of DAS to Network Attached Storage devices by connecting servers to SSDs over a high-speed PCIe-oF interface. This makes NVMe an efficient option to provide storage for dynamic, extensible and flexible stateful applications running on Kubernetes. The Container Storage Interface (CSI) standard connects these pooled NVMe devices to Kubernetes clusters running stateful applications. By combining the low-latency networked storage offered by NVMe-oF and the flexibility of the CSI plugin, organizations can provide an efficient, agile and demand driven storage solution for Kubernetes applications.
To avoid the bottlenecks of running NVMe SSDs on a single, local server, several organizations are working to enable an NVMe-oF plugin for Kubernetes storage. Kubernetes enables the use of REST APIs to allow control of the storage provisioner through the NVMe-oF protocol. The storage provisioner then creates standard Volume API objects that can be used to attach a portion of pooled NVMe SSDs to a POD. Kubernetes PODs and other resources can then read and write data onto this pooled storage like any persistent volume object.
OpenEBS created by MayaData is a popular agile storage stack for stateful Kubernetes applications that need minimal latency. The software infrastructure and plugins from OpenEBS integrate perfectly with the rapid, disaggregated physical storage offered by NVMe-oF. Integrating NVMe SSDs with OpenEBS plugins allows for simpler storage configurations for loosely coupled applications with stateful workloads.
OpenEBS is one of the popular open-source, agile storage stacks for performance-sensitive databases orchestrated by Kubernetes. Mayastor, OpenEBS’s latest storage engine, delivers very low overhead versus the performance capabilities of underlying devices. While OpenEBS Mayastor does not require NVMe devices and does not require the workloads to access data via NVMe, an end to end deployment from a workload running a container supporting NVMe over TCP through the low overhead OpenEBS Mayastor and ultimately NVMe devices will understandably perform as close as possible to the theoretical maximum performance of the underlying devices.To learn more about how OpenEBS Mayastor, leveraging NVMe as a protocol, performs when leveraging some of the fastest NVMe devices currently available on the market, visit this article.
OpenEBS Mayastor builds a foundational layer that enables workloads to coalesce and control storage as needed in a declarative, Kubernetes-native way. While doing so, the user can focus on what's important, that is, deploying and operating stateful workloads.
If you’re interested in trying out Mayastor for yourself, instructions for how to set up your own cluster, and run a benchmark like `fio` may be found at https://docs.openebs.io/docs/next/mayastor.html.
References
Mayastor NVMe-oF TCP performance - https://openebs.io/blog/mayastor-nvme-of-tcp-performance/
Lightning-fast storage solutions with OpenEBS Mayastor and Intel Optane -
https://mayadata.io/assets/pdf/product/intel-and-mayadata-benchmarking-of-openEBS-mayastor.pdf