MayaData Blog

Turning Kubernetes into a data plane

Written by Evan Powell | Nov 18, 2019 2:00:00 PM

Some updates and observations from three years of growth

Several years ago, a couple of veterans of building enterprise-class storage with I/O control in user space met up with other data management veterans who had extensive experience building open-source DevOps projects, and boom, OpenEBS was born.

Our vision then and now is a simple one. What if we used containers and container orchestration to deliver a fundamentally better way of managing data: one that was truly cloud-native in architecture, easy to use and self-adopt, and 100% open source? Could we deliver a better experience for developers and SREs? Would they trust containers and Kubernetes as their data plane?

We wrote about much of this vision and approach in our ironically titled Secret Plan of World Domination at KubeCon two years ago.

In this blog, I’d like to give a quick update.

How is Kubernetes being used as a data plane?

Adoption of OpenEBS has increased dramatically in 2019, and image pulls from Docker and Quay now average a few thousand per day. We estimate that 550 users adopt OpenEBS daily. We get to know some of these users through the community and through their use of the free version of OpenEBS Director, which includes monitoring and visualization for stateful workloads, plus data migration and more.

Get your free account here: https://director.mayadata.io/

Public references in the community for the CI/CD use case now include Arista, Comcast, and others.

Every large financial institution we know in North America and Europe is adopting Kubernetes for a variety of use cases, frequently driven by a need to get better at code and data pipelines to increase their agility. Some of these are paying customers of MayaData, where we sell OpenEBS support via a set of software that also includes the Litmus Chaos project and OpenEBS Director, whether as hosted SaaS or deployed on-premises.

We also see many telcos using OpenEBS at the edge, and the community has been extremely helpful in keeping up with minimalistic ARM builds of OpenEBS and so on. If you are a service provider planning on approximately 30,000 Kubernetes nodes at the edge as part of your 5G rollout, do you want yet another system such as Ceph that itself requires multiple external nodes to be resilient, or would you rather run something like OpenEBS that runs on Kubernetes itself? A system based on external storage could easily cost 3x as much in hardware and power and be much more complicated to manage.

New directions and old directions confirmed

We've learned a lot over the last couple of years. A few examples:

  1. LocalPV - The rise of LocalPV makes perfect sense. Many workloads have adapted to the horrible state of traditional storage by writing to disk directly, with little indirection. OpenEBS, thanks to delivering per-workload data management, fits into this paradigm, especially now that the project supports LocalPV as a first-class citizen. Workloads that use the LocalPV engine lose some of the capabilities of cStor or Jiva, but they can use, for example, ZFS as a local file system and can gain performance. We recently started adding ZFS support to LocalPV; more information is available here: https://github.com/openebs/zfs-localpv.

  2. Litmus Chaos - Litmus started because we wrote quite a bit of tooling ourselves to shift left and accelerate the polishing and hardening of OpenEBS. Over the last two years it grew into a project that can insert chaos into Kubernetes environments, thereby improving their resilience. It is now emerging as a favorite Kubernetes-native chaos engineering project, and the CNCF recently published a blog authored by our co-founder and COO explaining it in more detail: https://www.cncf.io/blog/2019/11/06/cloud-native-chaos-engineering-enhancing-kubernetes-application-resiliency/.

  3. MayaStor - Last but not least, MayaStor, the newest storage engine of OpenEBS, addresses one of the most frequently reported concerns about OpenEBS: relatively low performance. While we considered the general idea of building a higher-performance engine early on, it was a lower priority than making OpenEBS easy to use initially.

Our CTO, Jeffry Molanus, provides an update about this project here.

As Jeffry explains, in addition to extremely high performance, MayaStor delivers true end-to-end encryption, without any decryption and re-encryption cycles when data transits from one cloud to another.

Conclusion

It has been a great few years helping to build OpenEBS and the OpenEBS community. OpenEBS itself has matured, and we’ve learned an enormous amount about how Kubernetes is being used as a data plane.

We’re at the beginning of a transformation of how software is built and operated, and I’m not sure many organizations, users, or vendors have grasped what is possible today. Pipeline portability, along with multi-cloud functionality, is emerging as a reality. It boosts developer productivity, which we call data agility, while reducing the risks of cloud or vendor lock-in.