I'm an engineer at MayaData, a company dedicated to enabling data agility through the use of Kubernetes as a data layer, and the sponsor of the CNCF project OpenEBS as well as other open source projects.
You have probably seen many articles on how to design Kubernetes operators. The same cannot be said about ways of testing them. In this article, I will cover some of the standard methods for testing Kubernetes operators, as well as a few innovative techniques that open source communities have been working on to fill this ever-widening gap.
The Kubernetes client-go library provides a fake clientset that responds to configured Kubernetes resources, backed by a simple object tracker that handles creates, updates, and deletes as-is, without applying any validations or defaults. In its own words, it shouldn't be considered a replacement for a real clientset and is mostly useful in simple unit tests. The tracker accepts reaction functions registered against specific actions; when a matching action is triggered during a test, the corresponding reaction runs, which lets you exercise Kubernetes custom controller code in isolation.
In this mode, a Kubernetes setup is built from the kube-apiserver, etcd, and kubectl binaries. Wired to talk to each other, these binaries represent a Kubernetes cluster with limited features that are good enough to test various controllers, orchestrated by logic that is also responsible for setting up and tearing down custom test cases. This, among other things, is what the Kubernetes e2e testing framework is made of. The concept has been elevated to a first-class citizen in the controller-runtime codebase, where it is referred to as the envtest library. Once this setup is ready, its configured kubectl binary can be used to create, delete, update, and patch Kubernetes resources (custom as well as native). Developers of various operators combine this with Ginkgo, Gomega, and Go's testing package so that everything can be invoked via the familiar go test command. In other words, this is the canonical way to test a Kubernetes controller using Go.
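As a rough sketch, an envtest-based test boots the throwaway control plane like this. It assumes the envtest control-plane binaries (kube-apiserver, etcd) are installed locally and that CRD manifests live under config/crd/bases, so treat it as illustrative rather than a complete runnable test:

```go
// Inside a Go test function, assuming the envtest binaries are on disk.
testEnv := &envtest.Environment{
	// Path to the CRD YAMLs for the operator under test (assumed location).
	CRDDirectoryPaths: []string{"config/crd/bases"},
}

cfg, err := testEnv.Start() // boots etcd and kube-apiserver
if err != nil {
	t.Fatal(err)
}
defer testEnv.Stop()

// cfg is a *rest.Config pointing at the temporary control plane;
// build a client from it and exercise the controller under test.
k8sClient, err := client.New(cfg, client.Options{})
if err != nil {
	t.Fatal(err)
}
_ = k8sClient
```

Many projects wrap this setup inside Ginkgo's BeforeSuite/AfterSuite so that the control plane is started once per test suite.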
Metacontroller is the first project that brought out the Kubernetes control plane approach as an effective strategy for testing Kubernetes operators. Unlike others, it sticks to the Kubernetes control plane alone, avoiding the Ginkgo and Gomega libraries; instead, it relies on idiomatic Go testing practices to keep things simple. Of late, it has innovated further to make testing more developer-friendly: it has implemented a wrapper over the Kubernetes control plane that lets integration tests be designed as a series of steps, where a step is defined as an action against an unstructured instance. The steps outlined in an integration test are run in sequence to determine the result. The resulting test code is simple for a Go developer to understand and paves the way for an increased focus on testing alongside adding features. Among other things, it has eliminated the boilerplate code around loops, condition-based checks, retries, and error handling, to name a few.
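To give a feel for the step-as-action idea, here is a plain-Go sketch of running steps in sequence against a shared unstructured-style object. All names here (step, runSteps) are my own illustration, not Metacontroller's actual wrapper API:

```go
package main

import "fmt"

// step is a named action run against a shared unstructured-style
// object, standing in for a Kubernetes unstructured instance.
type step struct {
	name string
	run  func(obj map[string]interface{}) error
}

// runSteps executes the steps in order, stopping at the first
// failure, and returns how many steps completed successfully.
func runSteps(obj map[string]interface{}, steps []step) (int, error) {
	for i, s := range steps {
		if err := s.run(obj); err != nil {
			return i, fmt.Errorf("step %q failed: %w", s.name, err)
		}
	}
	return len(steps), nil
}

func main() {
	obj := map[string]interface{}{"kind": "Demo"}
	steps := []step{
		{"set replicas", func(o map[string]interface{}) error {
			o["replicas"] = 3
			return nil
		}},
		{"assert replicas", func(o map[string]interface{}) error {
			if o["replicas"] != 3 {
				return fmt.Errorf("want 3, got %v", o["replicas"])
			}
			return nil
		}},
	}
	n, err := runSteps(obj, steps)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // number of steps that ran successfully
}
```

In Metacontroller's real integration tests, each step would instead create, update, or assert against live resources on the test control plane; the point is that loops, retries, and error handling live inside the runner, not in every test.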
Full disclosure: I am a core maintainer of Metacontroller, and I tend to like the way it works as a means to test Kubernetes controllers. To me, it seems the best way to put test development in the Go developer's hands.
The KUbernetes Test TooL (KUTTL) takes a different approach to testing Kubernetes operators. In short, it tests Kubernetes controllers declaratively. KUTTL's documentation says it all: it lets you focus on testing using the YAML that the Kubernetes community is by now most familiar with. KUTTL allows teams to design end-to-end tests, integration tests, or conformance tests using the same set of YAML definitions. In addition to supporting declarative intents, KUTTL supports invoking commands, e.g. kubectl invocations or arbitrary bash scripts.
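For illustration, a KUTTL test step is conventionally split across files such as 00-install.yaml (resources to apply) and 00-assert.yaml (state to wait for); a minimal pair using a ConfigMap might look like this:

```yaml
# 00-install.yaml — resources KUTTL applies in this step
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-config
data:
  key: value
```

```yaml
# 00-assert.yaml — the state KUTTL polls for before the step passes
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-config
data:
  key: value
```

Each numbered prefix (00, 01, ...) forms one step, and the steps in a test case directory run in order.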
One of the things that irks me is KUTTL's dependence on so many files to test a feature instead of relying on a single one. Likewise, invoking terminal commands or scripts might seem handy to begin with, but can soon become unmanageable when testing at scale. KUTTL needs to ensure these commands do not hang midway, and if they do, it should provide a proper bailout mechanism.
Recently I’ve led an effort at MayaData to address the need for better testability of Kubernetes controllers by writing and sharing some software that we call D-Operators. We derived D-Operators from Metacontroller’s techniques by transforming those steps or actions into a Kubernetes custom resource. In other words, D-Operators uses Kubernetes as a substrate to handle the testing needs of Kubernetes; hence, features such as multi-tenancy, a container-native design, an API-driven workflow, and autoscaling come out of the box. Since test intents can be specified as custom resources, it shares the familiarity of KUTTL’s declarative approach. However, D-Operators differs primarily in its vision and architecture.

D-Operators is essentially a bunch of Kubernetes custom controllers whose intent is to simplify Kubernetes operations, and making Kubernetes controller testing simple is one of its primary goals. Being close to Kubernetes enables it to execute thousands of test cases in minutes, or perhaps seconds, just by increasing the number of its controller instances. We avoid out-of-band approaches like mixing YAML with bash, templating, or invoking shell commands; D-Operators strives to be purely declarative and expects any out-of-band feature to be a separate custom resource that can in turn be actioned in test specifications. While D-Operators is a new project, it has worked well so far. Please take a look and open an issue on GitHub, or otherwise share your feedback.
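As a rough illustration only — the field names below are a sketch and not necessarily D-Operators' exact schema — a test intent expressed as a custom resource might look like this, with apply and assert tasks taking the place of imperative test steps:

```yaml
# Hypothetical sketch of a declarative test resource in the
# D-Operators style; consult the project's docs for the real schema.
apiVersion: dope.mayadata.io/v1
kind: Recipe
metadata:
  name: test-demo-controller
spec:
  tasks:
  - name: create-demo-configmap
    apply:
      state:
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: demo-config
          namespace: default
  - name: assert-demo-configmap-exists
    assert:
      state:
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: demo-config
          namespace: default
```

Because the whole test is itself a Kubernetes resource, running more tests in parallel becomes a matter of running more controller instances rather than more scripts.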
I put together a matrix that should help a team decide on the best possible library or tool to test their Kubernetes controllers. I exclude client-go from this matrix since it is purely for unit testing purposes, and I did not include the raw Kubernetes control plane approach since it is a subset of the more developer-friendly Metacontroller style.
I would advise readers not to be too fixated on the above matrix and instead to explore the reference links below. On a lighter note, the approach we take to testing Kubernetes operators says a lot about us as individuals and about the teams we are part of. It can indicate whether we are deep into programming or care just as much about operations. Testing can indeed provide enough hints about our involvement in development, operations, or both. A well-balanced approach can propel our Kubernetes story with much-needed tailwinds, or slow it down significantly as unwieldy headwinds.
Once again, please provide your feedback on D-Operators, and on this write-up as well!