Hardening Prometheus and OpenEBS using Litmus

If you follow the principles of chaos engineering, application testing happens both in CI pipelines and in production. Chaos engineering has become so widespread among DevOps teams that it is now part of the application development process.
Chaos Engineering for Prometheus on Kubernetes using Litmus

At MayaData, we use Litmus to practice chaos engineering for validating each commit of OpenEBS against a few Prometheus releases. We also extend the same testing to conduct real-world chaos engineering on our production clusters where Prometheus is being used for monitoring our GitLab production system.

In this article, I will describe how OpenEBS is used as persistent storage for Prometheus and discuss how we verify the stability of such a deployment. Before diving into the details, let’s talk about what Litmus is and why OpenEBS is used as storage for the Prometheus TSDB.

What is Litmus?

Litmus is an open source framework for chaos-engineering-based qualification of Kubernetes environments running stateful applications. For a good introduction to Litmus and how to get started, see the Litmus docs (https://docs.litmuschaos.io/).

Litmus books are broadly categorized into four types:

  1. K8s infrastructure books
  2. Stateful applications deployment books
  3. Stateful applications chaos books
  4. Deployers for providers such as OpenEBS.

Next, I will focus on what Litmus deployers and chaos jobs are available to help build a CI/CD pipeline to harden Prometheus applications on OpenEBS and Kubernetes.

Introduction to OpenEBS

OpenEBS is the leading open source Container Attached Storage software and has become a common part of many Kubernetes deployments since its first release in early 2017. OpenEBS has been accepted into the Cloud Native Computing Foundation as a Cloud Native Sandbox Project and is listed among the CNCF Sandbox Projects. You can read more about OpenEBS in the OpenEBS docs.

OpenEBS architecture

Because OpenEBS has a pluggable, containerized architecture, it can easily use different storage engines that write data to a disk or to underlying cloud volumes. The two primary storage engines are Jiva and cStor. With WAL support, the write performance of Prometheus increases significantly.

Why use OpenEBS Volume as Prometheus TSDB?

One of the challenges with Prometheus is determining how to set up and manage storage. The default behavior of Prometheus is to simply have each node store data locally. However, this of course exposes the user to potential data loss if the local node goes down.

Here are some issues with Prometheus storage:

  1. When using local storage, Prometheus keeps time series in memory and on the pod's local disk, so metrics are lost if the pod restarts.
  2. If we configure the persistent volume as local and the pod is rescheduled to another node of the cluster, it loses all the data that was persisted on the previous node.
  3. When using remote storage, read and write operations are quite slow.

Using OpenEBS volumes as the local storage for Prometheus on Kubernetes clusters addresses each of the above drawbacks directly. OpenEBS volumes are replicated synchronously, so the data stays protected and available through either a node outage or a disk outage.

Using OpenEBS as storage for Prometheus on Kubernetes clusters is an easy and viable solution for production-grade deployments.
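
As an illustration of what this looks like from the Prometheus side, the sketch below builds a PersistentVolumeClaim manifest that requests an OpenEBS-backed volume. The storage class name `openebs-cstor-prometheus`, the claim name, the namespace, and the 50Gi size are all assumptions made for illustration; substitute the values from your own cluster. Kubernetes accepts JSON manifests, so the printed output can be piped to `kubectl apply -f -`.

```python
import json


def prometheus_pvc(name, namespace, storage_class, size):
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The storage class decides which provisioner serves the claim.
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # All four values below are hypothetical placeholders.
    pvc = prometheus_pvc(
        name="prometheus-data",
        namespace="monitoring",
        storage_class="openebs-cstor-prometheus",
        size="50Gi",
    )
    print(json.dumps(pvc, indent=2))
```

The Prometheus pod would then mount this claim at its data directory, so the time series survive a pod restart or a reschedule to another node.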

OpenEBS as highly available TSDB for Prometheus

Elements of a Prometheus CI/CD pipeline

We have successfully implemented GitLab stages for Prometheus and system validation. A full implementation of such a pipeline is shown below as an example.

GitLab CI pipeline for Prometheus on OpenShift using OpenEBS as persistent storage

The figure above shows a sample GitLab pipeline running OpenShift EE 3.10 and Prometheus:v2.3.0 with Litmus. It has the following stages:

  • CLUSTER-Setup
  • OpenEBS-Setup
  • FUNCTIONAL
  • CHAOS
  • CLEANUP

Litmus provides almost-ready books for every stage except FUNCTIONAL, where developers and DevOps admins should spend their time creating tests for their own applications. The other stages are generic enough that Litmus can do the job for you once the parameters are tuned.
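
To make the stage ordering concrete, here is a minimal Python sketch of a runner that walks the five stages in order. It is not how GitLab or Litmus implements the pipeline: the `run_pipeline` function, the stage callables, and the rule that CLEANUP runs even after an earlier failure are assumptions made for illustration.

```python
STAGES = ["CLUSTER-Setup", "OpenEBS-Setup", "FUNCTIONAL", "CHAOS", "CLEANUP"]


def run_pipeline(stage_fns, stages=STAGES):
    """Run stages in order; return a dict of stage name -> status.

    stage_fns maps a stage name to a zero-argument callable that returns
    True on success. After the first failure the remaining stages are
    skipped, except CLEANUP, which always runs (an assumption of this
    sketch, so that test clusters are not left behind).
    """
    results = {}
    failed = False
    for stage in stages:
        if failed and stage != "CLEANUP":
            results[stage] = "skipped"
            continue
        ok = bool(stage_fns[stage]())
        results[stage] = "passed" if ok else "failed"
        failed = failed or not ok
    return results


if __name__ == "__main__":
    # Hypothetical stage bodies: every stage passes except CHAOS.
    fns = {name: (lambda: True) for name in STAGES}
    fns["CHAOS"] = lambda: False
    for name, status in run_pipeline(fns).items():
        print(f"{name}: {status}")
```

In a real pipeline each callable would launch the corresponding Litmus job and report whether it completed successfully.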

Reference Implementation:

The Prometheus GitLab pipeline implementation for the OpenShift EE platform and the corresponding Litmus books are all available in the following GitHub repository:

mayadata-io/e2e-openshift (github.com): Automation of OpenEBS E2E testing on OpenShift On-Premise

Example Litmus Jobs for Prometheus on OpenEBS

App Deployers

Litmus job for deploying Prometheus using OpenEBS volumes for storing metrics:

https://raw.githubusercontent.com/litmuschaos/litmus/master/apps/prometheus/deployers/run_litmus_test.yml

Loadgen

Litmus job for load generation in Prometheus using Avalanche load generator:

https://raw.githubusercontent.com/litmuschaos/litmus/master/apps/prometheus/loadgen/run_litmus_test.yml

Liveness

Litmus job to check the liveness of Prometheus app:

https://raw.githubusercontent.com/litmuschaos/litmus/master/apps/prometheus/liveness/run_litmus_test.yml
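
The Litmus liveness job is linked rather than reproduced here. As a rough illustration of the idea, the Python sketch below checks the `/-/healthy` management endpoint that Prometheus exposes. The base URL and the injectable `fetch` parameter are assumptions of this sketch and are not taken from the Litmus job.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def prometheus_is_alive(base_url, fetch=http_status):
    """True if the Prometheus health endpoint answers with HTTP 200."""
    return fetch(base_url.rstrip("/") + "/-/healthy") == 200


if __name__ == "__main__":
    # Hypothetical address, e.g. reached through `kubectl port-forward`.
    print(prometheus_is_alive("http://localhost:9090"))
```

Passing `fetch` as a parameter keeps the check easy to exercise without a running Prometheus instance.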

Chaos Jobs — Storage

Litmus job that deletes an OpenEBS cStor pool pod and verifies that the application remains available.
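
The Litmus chaos job itself is not reproduced here. The following is a hypothetical Python sketch of the same idea: confirm the application is healthy, delete a pod, then poll until the application reports healthy again. The function names, the timeout values, and the pass criterion (recovery within a timeout) are assumptions of this sketch, not details of the Litmus job. Only `kubectl delete pod` is a real command.

```python
import subprocess
import time


def kubectl_delete_pod(name, namespace):
    """Delete one pod with kubectl; raises CalledProcessError on failure."""
    subprocess.run(
        ["kubectl", "delete", "pod", name, "--namespace", namespace],
        check=True,
    )


def pod_delete_chaos(delete_pod, is_alive, timeout_s=120.0, interval_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Delete a pod, then poll is_alive() until it succeeds or time runs out.

    Returns True if the application was healthy before the fault and
    reported healthy again within timeout_s seconds after it.
    """
    if not is_alive():
        return False  # do not inject chaos into an already unhealthy app
    delete_pod()
    deadline = clock() + timeout_s
    while True:
        if is_alive():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Dry run with stand-ins: the app is healthy, goes down once after the
    # delete, then recovers. Swap in kubectl_delete_pod and a real probe.
    answers = iter([True, False, True])
    ok = pod_delete_chaos(
        delete_pod=lambda: print("deleting pool pod (simulated)"),
        is_alive=lambda: next(answers),
        interval_s=0.0,
    )
    print("recovered:", ok)
```

Against a real cluster, `delete_pod` could wrap `kubectl_delete_pod` with the name and namespace of a cStor pool pod, and `is_alive` could be any liveness probe for Prometheus.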

Summary:

Building CI/CD pipelines for stateful applications like Prometheus on OpenEBS and Kubernetes/OpenShift is quick and easy. Most of the pipeline is readily available through Litmus. Users can apply the readily available Litmus books to build Chaos Engineering into their GitLab pipelines with ease.

 

This article was first published on May 21, 2018 on MayaData's Medium Account.

Don Williams
Don is the CEO of MayaData and has led the company for the last year. He has an exceptional record of accomplishments leading technology teams for organizations ranging from private equity-backed start-ups to large, global corporations. He has deep experience in engineering, operations, and product development in highly technical and competitive marketplaces. His extensive professional network in several industries, large corporations and government agencies is a significant asset to early stage businesses, often essential to achieve product placement, growth and position for potential exit strategies.
Kiran Mova
Kiran evangelizes open culture and open-source execution models and is a lead maintainer of and contributor to the OpenEBS project. He is passionate about Kubernetes and storage orchestration, and is Co-founder and Chief Architect at MayaData Inc.
Murat Karslioglu
VP @OpenEBS & @MayaData_Inc. Murat Karslioglu is a serial entrepreneur, technologist, and startup advisor with over 15 years of experience in storage, distributed systems, and enterprise hardware development. Prior to joining MayaData, Murat worked at Hewlett Packard Enterprise / 3PAR Storage in various advanced development projects including storage file stack performance optimization and the storage management stack for HPE’s Hyper-converged solution. Before joining HPE, Murat led virtualization and OpenStack integration projects within the Nexenta CTO Office. Murat holds a Bachelor’s Degree in Industrial Engineering from the Sakarya University, Turkey, as well as a number of IT certifications. When he is not in his lab, he loves to travel, advise startups, and spend time with his family. Lives to innovate! Opinions my own!