LitmusChaos 0.7 Streamlines Kubernetes Chaos Engineering

As a project, Litmus has grown significantly and its vibrant community has provided sustained feedback over the last few months.


Introduction

I would like to express my heartfelt gratitude to both contributors and users for that feedback. To date, we have garnered 350+ GitHub stars and 138+ forks! Through many interactions with developer and DevOps teams at meetups and events such as GitLab Commit and DevOpsDays, I have learned that the need for chaos engineering practices (and a firm commitment to the Litmus architecture) has never been greater. The consensus view is: “As applications turn more cloud-native (read: Kubernetes-native), the practices and tooling around chaos engineering should too. Chaos CRDs are key!” To help spread this message more broadly, we have created a channel on Kubernetes Slack called #litmus.

The Litmus 0.7 release equips our users with more experiments and integrates infrastructure components to ease onboarding into the world of open, collaborative chaos engineering. In this blog, we will delve into some of my favorite features and peek at the immediate roadmap for subsequent releases. You can find the full list of changes here.

Override Experiment Tunables via ChaosEngine

The chaos charthub was introduced in version 0.6. It lets users browse for the chaos experiment custom resources (CRs) of their choice and install them on a cluster, then create a ChaosEngine CR to execute them against the desired application. The ChaosExperiment CRs serve as base specifications for chaos parameters and are available to a given namespace. Because more than one application in a namespace may be subjected to chaos, there was a need to isolate the tunables for each instance of chaos without changing them at a namespace-wide level. To reinforce the status of the ChaosEngine as “the” single source of truth (the one resource the user needs to edit), the chaos executor can now override the experiment defaults with values specified in the ChaosEngine.

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: chaos
  namespace: default
spec:
  monitoring: false
  appinfo:
    appkind: deployment
    applabel: app=nginx
    appns: default
  chaosServiceAccount: nginx
  experiments:
  - name: container-kill
    spec:
      components:
      - name: TARGET_CONTAINER
        value: nginx
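
The override behaviour described above can be pictured as a name-keyed merge: values from the ChaosEngine win, and anything the engine does not mention keeps the experiment's default. The following is only a minimal sketch of that idea, not the chaos executor's actual code. The function name `merge_tunables`, the default value `app`, and the `CHAOS_INTERVAL` entry are hypothetical and used purely for illustration.

```python
from typing import Dict, List

EnvList = List[Dict[str, str]]


def merge_tunables(experiment_defaults: EnvList, engine_overrides: EnvList) -> EnvList:
    """Return the experiment env list with engine-level values taking precedence."""
    # Index the defaults by name; dicts keep insertion order, so the
    # original ordering of the experiment's tunables is preserved.
    merged = {item["name"]: item["value"] for item in experiment_defaults}
    # Engine entries replace matching defaults or are appended if new.
    for item in engine_overrides:
        merged[item["name"]] = item["value"]
    return [{"name": name, "value": value} for name, value in merged.items()]


# Hypothetical defaults, standing in for a ChaosExperiment CR.
defaults = [
    {"name": "TARGET_CONTAINER", "value": "app"},
    {"name": "CHAOS_INTERVAL", "value": "10"},
]
# Override taken from the ChaosEngine example above.
overrides = [{"name": "TARGET_CONTAINER", "value": "nginx"}]

effective = merge_tunables(defaults, overrides)
print(effective)
```

Because the merge builds a new list, the namespace-wide defaults stay untouched and each ChaosEngine sees only its own effective values.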

 

Experiment Results Available as Status on ChaosEngine  

The status of chaos experiments executed by the chaos operator is now published in the status field of the ChaosEngine. Note that the ChaosResult CR continues to exist, with scope for further schema development on result specifics.

kind: ChaosEngine
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: | <stripped>
  creationTimestamp: 2019-10-09T11:27:09Z
  generation: 1
  name: engine-nginx
  namespace: default
  resourceVersion: "6854030"
  selfLink: /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/engine-nginx
  uid: bb48b201-ea87-11e9-bb68-0050569846e3
spec:
  appinfo:
    applabel: run=nginx
    appns: default
  chaosServiceAccount: nginx
  experiments:
  - name: pod-delete
    spec:
      components: null
status:
  experimentStatuses:
  - instance: pod-delete-792363
    name: pod-delete
    status:
    verdict: pass
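
Once the verdict lives on the ChaosEngine, a script or CI step can read it from a single resource. The sketch below assumes the engine has already been fetched and parsed into a plain dictionary (for example from JSON output), and that the status layout matches the example above. The helper name `collect_verdicts` and the `"unknown"` fallback are illustrative choices, not part of Litmus.

```python
from typing import Any, Dict


def collect_verdicts(engine: Dict[str, Any]) -> Dict[str, str]:
    """Map each experiment name to its verdict, tolerating a missing status."""
    # A freshly created engine may not have a status block yet.
    statuses = (engine.get("status") or {}).get("experimentStatuses") or []
    return {entry["name"]: entry.get("verdict", "unknown") for entry in statuses}


# Trimmed-down version of the ChaosEngine shown above.
engine = {
    "kind": "ChaosEngine",
    "metadata": {"name": "engine-nginx", "namespace": "default"},
    "status": {
        "experimentStatuses": [
            {"instance": "pod-delete-792363", "name": "pod-delete", "verdict": "pass"}
        ]
    },
}

verdicts = collect_verdicts(engine)
# Experiments whose verdict is anything other than "pass".
failed = [name for name, verdict in verdicts.items() if verdict != "pass"]
print(verdicts, failed)
```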

Integration with PowerfulSeal

Litmus is inherently a community-driven chaos engineering project. It aims to reuse the many excellent chaos-injection tools already available while orchestrating them all in a Kubernetes-native way. PowerfulSeal is one such tool. With Litmus 0.7, you can choose to kill pods randomly via PowerfulSeal.

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: chaos
  namespace: default
spec:
  monitoring: false
  appinfo:
    appkind: deployment
    applabel: app=nginx
    appns: default
  chaosServiceAccount: nginx
  experiments:
  - name: pod-delete
    spec:
      components:
      - name: FORCE
        value: "true"
      - name: LIB
        value: powerfulseal

Increased Chaos Experiments

Additional chaos charts enable injecting pod-level “network” chaos (packet loss & latency) and have been added to the “generic” experiment category. In addition, this release adds OpenEBS data plane chaos (storage target and storage pool pods) experiments.

  • Storage pool pods
  • Storage target pods

 

Improved CI for LitmusChaos Components

Litmus 0.7 improves upon the existing CI with increased unit and BDD test coverage (chaos-operator, chaos-exporter), and puts CI in place for the charthub and chaos-charts repos (the latter is the canonical backend for the CRs listed on the hub).

PASS: TestNewRunnerPodForCR/Test_Positive-2 (0.00s)
PASS: TestNewRunnerPodForCR/Test_Negative-1 (0.00s)
PASS: TestNewRunnerPodForCR/Test_Negative-2_ (0.00s)
PASS: TestNewRunnerPodForCR/Test_Negative-3_ (0.00s)
PASS: TestNewRunnerPodForCR/Test_Positive-1 (0.00s)
PASS: TestNewMonitorServiceForCR (0.00s)
PASS: TestNewMonitorServiceForCR/Test_Positive (0.00s)
PASS: TestNewMonitorServiceForCR/Test_Negative (0.00s)
PASS: TestNewMonitorPodForCR (0.00s)
PASS: TestNewMonitorPodForCR/Test_Positive (0.00s)
PASS: TestNewMonitorPodForCR/Test_Negative (0.00s)
PASS: TestInitializeApplicationInfo (0.00s)
PASS: TestInitializeApplicationInfo/Test_Negative (0.00s)
PASS: TestInitializeApplicationInfo/Test_Positive (0.00s)

RUN   TestChaos
Running Suite: BDD test
=======================
Random Seed: 1571131836
Will run 2 of 2 specs

chaos-operator created successfully
ChaosExperiment created successfully...
Chaosengine created successfully...
name :  engine-nginx-runner
• [SLOW TEST:100.090 seconds]

Ran 2 of 2 Specs in 140.025 seconds
SUCCESS! -- 2 Passed | 0 Failed | 0 Pending | 0 Skipped
PASS: TestChaos (140.03s)

Improved Documentation

Importantly, this release includes completely rewritten user documentation, with simpler getting-started guides, improved examples, and an upgraded Docusaurus version, to help users start their chaos engineering journey with Litmus.

https://docs.litmuschaos.io

Conclusion

The strength of any open source project is in its community. I would like to give a huge shoutout to @jayadeepkm, @aswathkk, and a host of other contributors for helping us roll out this release.

Here is a quick peek into the 0.8 release. Some of the high-level backlog features include:

  • Increased chaos experiment charts
  • Upgraded chaos-operator with ability to select job cleanup/retention, executor image selection, etc.
  • Improved developer docs for chaos chart contributors
  • Improved project maintenance guidelines

Do try out Litmus charts. As always, we look forward to your valuable feedback & comments.
