Kubernetes Label Selector and Field Selector

The resources that we create in a Kubernetes cluster can be organised using labels. Before we talk about the field selector in Kubernetes, let’s quickly walk through what labels are.

Labels are key-value pairs that can be used to identify or group the resources in Kubernetes. In other words, labels let you select a subset of resources from a list. You can label Kubernetes-native resources as well as Custom Resources. To better understand this, let us do some hands-on practice with labels.


This tutorial will assume that you have a working minikube setup or a Kubernetes cluster setup.

The following is a link to the yaml. Applying it will create a pod.

https://raw.githubusercontent.com/sonasingh46/artifacts/master/samples/sample-pod.yaml

The yaml looks like this:

 apiVersion: v1
 kind: Pod
 metadata:
   name: example-pod
   labels:
     env: development
 spec:
   containers:
   - name: label-example
     image: sonasingh46/node-web-app:latest
     ports:
     - containerPort: 8000

Notice the labels block under metadata in this yaml. That is one way to add labels to a resource: through the specification in yaml.

Let us now create a pod by executing the following command:

kubectl apply -f https://raw.githubusercontent.com/sonasingh46/artifacts/master/samples/sample-pod.yaml

You can use the above command directly, or copy the content and save it on your local machine in a file, say sample-pod.yaml.

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl apply -f sample-pod.yaml
pod/example-pod created

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get po
NAME READY STATUS RESTARTS AGE
example-pod 1/1 Running 0 3m

Now, we will run the following command to check for labels in the pod:

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod example-pod --show-labels
NAME           READY   STATUS    RESTARTS   AGE     LABELS
example-pod    1/1     Running     0        3m      env=development

As you can see in the above output, example-pod carries the label env=development (key env, value development).
You can also run kubectl get pod example-pod -o yaml to see all of the fields and labels.

Let us now add another label to the above pod using the kubectl command.

Adding a label:

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl label pod example-pod tier=backend
pod/example-pod labeled

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod example-pod --show-labels
NAME        READY   STATUS  RESTARTS   AGE         LABELS
example-pod  1/1   Running   0       13m    env=development,tier=backend

Removing a label:

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl label pod example-pod tier-
pod/example-pod labeled

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod example-pod --show-labels
NAME          READY    STATUS    RESTARTS    AGE          LABELS
example-pod    1/1     Running     0         23m        env=development

Updating a label:

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl label --overwrite pods example-pod env=prod
pod/example-pod labeled

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod example-pod --show-labels
NAME         READY    STATUS   RESTARTS  AGE    LABELS
example-pod   1/1     Running    0       25m    env=prod

kubectl label --overwrite pods example-pod env=prod will update the value of the key env in the labels, and if the label does not exist, it will create one. Without --overwrite, kubectl refuses to change the value of a label that already exists.
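To make the add, remove and update behaviour concrete, here is a minimal Python sketch that mimics what kubectl label does to a pod's label map. It is only a model for illustration, not how kubectl is implemented: the function name apply_label_args and the LabelError class are hypothetical, and the error text is a paraphrase.

```python
class LabelError(Exception):
    """Raised when a label change is refused (hypothetical error type)."""


def apply_label_args(labels, args, overwrite=False):
    """Return a new label dict after applying kubectl-label-style arguments.

    "key=value" adds or updates a label; "key-" removes it.
    """
    result = dict(labels)
    for arg in args:
        if arg.endswith("-") and "=" not in arg:
            # Removal form, e.g. "tier-" drops the tier label if present.
            result.pop(arg[:-1], None)
            continue
        key, value = arg.split("=", 1)
        if key in result and result[key] != value and not overwrite:
            # Changing an existing value is refused unless overwrite is set.
            raise LabelError(f"{key!r} already has a value ({result[key]})")
        result[key] = value
    return result


pod_labels = {"env": "development"}
pod_labels = apply_label_args(pod_labels, ["tier=backend"])            # add
pod_labels = apply_label_args(pod_labels, ["tier-"])                   # remove
pod_labels = apply_label_args(pod_labels, ["env=prod"], overwrite=True)  # update
print(pod_labels)  # {'env': 'prod'}
```

The sequence at the bottom replays the three commands from this section in order, ending with the same single label, env=prod, that the last kubectl output shows.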

Now we will create one more pod by editing the above yaml and changing metadata.name to example-pod1. We will also remove the labels block from the yaml.

 apiVersion: v1
 kind: Pod
 metadata:
   name: example-pod1
 spec:
   containers:
   - name: label-example
     image: sonasingh46/node-web-app:latest
     ports:
     - containerPort: 8000

Create a yaml file with the above content, let’s say sample-pod1.yaml, and apply it.

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl apply -f sample-pod1.yaml
pod/example-pod1 created

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod
NAME           READY   STATUS    RESTARTS   AGE
example-pod    1/1     Running     0        17h
example-pod1   1/1     Running     0        6s

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods --show-labels
NAME         READY   STATUS   RESTARTS  AGE   LABELS
example-pod   1/1    Running    0       17h   env=prod
example-pod1  1/1    Running    0       1m    <none>

You can learn more about other kubectl label commands using kubectl label --help.

Now we know enough to tag our resources with labels, either by providing them in yaml or by using the kubectl command. Let us now explore how labels can help in filtering or grouping the resources.
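As a small illustration of what grouping by label means, the following Python sketch treats each pod's labels as a plain dictionary and buckets pod names by the value of one key. The helper name group_by_label is made up for this example; the pod names and labels are the two pods created so far.

```python
from collections import defaultdict


def group_by_label(pods, key):
    """Group pod names by the value of one label key.

    Pods that do not carry the key end up under None.
    """
    groups = defaultdict(list)
    for name, labels in pods.items():
        groups[labels.get(key)].append(name)
    return dict(groups)


# Labels as reported by `kubectl get pods --show-labels` at this point.
pods = {
    "example-pod": {"env": "prod"},
    "example-pod1": {},  # shown as <none> by kubectl
}

print(group_by_label(pods, "env"))
# {'prod': ['example-pod'], None: ['example-pod1']}
```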

Selection Via Labels (Label Selector)

A label selector can express the following two types of requirements:

  1. Equality-Based Requirement
  2. Set-Based Requirement

Equality-Based Requirement

An equality-based requirement filters the resources by comparing the value of a label key with a given value. The supported operators are =, ==, !=. The first two both mean equality, and != means inequality.

Suppose I have the following pods with these labels. (A fifth pod, example-pod4, with the labels env=qa,owner=Atul,status=offline,tier=backend, also exists and appears in some of the outputs further down.)

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get po --show-labels
NAME         READY   STATUS  RESTARTS    AGE    LABELS
example-pod   1/1    Running    0        17h env=prod,owner=Ashutosh,status=online,tier=backend
example-pod1  1/1    Running    0        21m env=prod,owner=Shovan,status=offline,tier=frontend
example-pod2  1/1    Running    0        8m env=dev,owner=Abhishek,status=online,tier=backend
example-pod3  1/1    Running    0        7m env=dev,owner=Abhishek,status=online,tier=frontend

Now, I want to see all pods with online status:

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods -l status=online
NAME          READY   STATUS   RESTARTS    AGE
example-pod    1/1    Running    0         17h
example-pod2   1/1    Running    0         9m
example-pod3   1/1    Running    0         9m

Similarly, go through the following commands:

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods -l status!=online
NAME          READY  STATUS   RESTARTS   AGE
example-pod1  1/1    Running   0         25m
example-pod4  1/1    Running   0         11m

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods -l status==offline
NAME          READY  STATUS   RESTARTS   AGE
example-pod1  1/1    Running   0         26m
example-pod4  1/1    Running   0         11m

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods -l status==offline,status=online
No resources found.

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods -l status==offline,env=prod
NAME          READY    STATUS   RESTARTS  AGE
example-pod1  1/1      Running  0         28m

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods -l owner=Abhishek
NAME          READY   STATUS    RESTARTS   AGE
example-pod2  1/1     Running   0          15m
example-pod3  1/1     Running   0          14m

In the above commands, requirements separated by commas are combined with a logical AND: a pod must satisfy every one of them to be selected.

Similarly, you can try other combinations using the operators (=, ==, !=) and play!
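If you want to reason about these results without a cluster, the matching rules can be modelled in a few lines of Python. This is a rough sketch of the semantics only, not the Kubernetes implementation: the helper names parse_requirement, matches and select are invented here, and the label data is copied by hand from the pod listings in this article (including example-pod4).

```python
PODS = {
    "example-pod":  {"env": "prod", "owner": "Ashutosh", "status": "online",  "tier": "backend"},
    "example-pod1": {"env": "prod", "owner": "Shovan",   "status": "offline", "tier": "frontend"},
    "example-pod2": {"env": "dev",  "owner": "Abhishek", "status": "online",  "tier": "backend"},
    "example-pod3": {"env": "dev",  "owner": "Abhishek", "status": "online",  "tier": "frontend"},
    "example-pod4": {"env": "qa",   "owner": "Atul",     "status": "offline", "tier": "backend"},
}


def parse_requirement(text):
    """Split 'key=value', 'key==value' or 'key!=value' into (key, op, value)."""
    # Try the two-character operators first so '=' does not split them.
    for op in ("!=", "==", "="):
        if op in text:
            key, value = text.split(op, 1)
            return key.strip(), ("!=" if op == "!=" else "="), value.strip()
    raise ValueError(f"not an equality-based requirement: {text!r}")


def matches(labels, selector):
    """True if the labels satisfy every comma-separated requirement (AND)."""
    for part in selector.split(","):
        key, op, value = parse_requirement(part)
        if op == "=" and labels.get(key) != value:
            return False
        # '!=' also matches pods that do not have the key at all.
        if op == "!=" and labels.get(key) == value:
            return False
    return True


def select(pods, selector):
    return [name for name, labels in pods.items() if matches(labels, selector)]


print(select(PODS, "status=online"))
print(select(PODS, "status==offline,env=prod"))
```

The two calls at the end correspond to kubectl get pods -l status=online and kubectl get pods -l status==offline,env=prod from the transcripts above.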

Set-Based Requirement

Label selectors also support set-based requirements. In other words, a label selector can filter a key against a set of values rather than a single value.

The supported operators here are in, notin and exists.

Let’s walk through kubectl commands for filtering resources using set-based requirements.

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod -l 'env in (prod)'
NAME          READY   STATUS   RESTARTS   AGE
example-pod   1/1     Running  0          18h
example-pod1  1/1     Running  0          41m

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod -l 'env in (prod,dev)'
NAME           READY   STATUS    RESTARTS   AGE
example-pod    1/1     Running    0         18h
example-pod1   1/1     Running    0         41m
example-pod2   1/1     Running    0         27m
example-pod3   1/1     Running    0         27m

Here, in env in (prod,dev), the comma inside the parentheses acts as an OR operator. That is, it will list pods whose env is prod or dev.

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod -l 'env in (prod),tier in (backend)'
NAME          READY   STATUS    RESTARTS   AGE
example-pod   1/1     Running   0          18h

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod -l 'env in (qa),tier in (frontend)'
No resources found.

Here, the comma separating env in (qa) and tier in (frontend) acts as an AND operator.

To understand the exists operator, let us add the label region=central to example-pod and example-pod1 and region=northern to example-pod2.

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod --show-labels
NAME          READY   STATUS    RESTARTS  AGE   LABELS
example-pod    1/1     Running   0         18h env=prod,owner=Ashutosh,region=central,status=online,tier=backend
example-pod1   1/1     Running   0         54m env=prod,owner=Shovan,region=central,status=offline,tier=frontend
example-pod2   1/1     Running   0         40m env=dev,owner=Abhishek,region=northern,status=online,tier=backend
example-pod3   1/1     Running   0         40m env=dev,owner=Abhishek,status=online,tier=frontend
example-pod4   1/1     Running   0         40m env=qa,owner=Atul,status=offline,tier=backend

Now, I want to view pods that are not in the central region:

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods -l 'region notin (central)'
NAME           READY  STATUS    RESTARTS  AGE
example-pod2   1/1    Running   0         42m
example-pod3   1/1    Running   0         42m
example-pod4   1/1    Running   0         41m

You can see here that example-pod2 has a region key with the value northern, and hence appears in the result. But one point to note is that the other two pods in the result do not have any region label at all, and they also satisfy the notin condition. If we want the filtering to consider only pods that have the region key, we can restrict the selection with the exists operator. Unlike in and notin, exists is not written out as a keyword in the command: we simply name the key on its own.

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pods -l 'region,region notin (central)'
NAME           READY   STATUS    RESTARTS  AGE
example-pod2   1/1     Running   0         46m

Similarly, you can play around with various combinations of set-based requirements to select a set of pods.
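The set-based rules, including the way notin treats pods that lack the key, can be sketched in Python as well. As before, this is only a model written for this article: split_requirements, matches and select are hypothetical helper names, and the labels are copied from the listing above. The sketch also accepts the !key form (key does not exist), which is part of the Kubernetes label selector syntax even though it is not used in the commands here.

```python
import re

PODS = {
    "example-pod":  {"env": "prod", "owner": "Ashutosh", "region": "central",  "status": "online",  "tier": "backend"},
    "example-pod1": {"env": "prod", "owner": "Shovan",   "region": "central",  "status": "offline", "tier": "frontend"},
    "example-pod2": {"env": "dev",  "owner": "Abhishek", "region": "northern", "status": "online",  "tier": "backend"},
    "example-pod3": {"env": "dev",  "owner": "Abhishek", "status": "online",  "tier": "frontend"},
    "example-pod4": {"env": "qa",   "owner": "Atul",     "status": "offline", "tier": "backend"},
}

# Matches "key in (a,b)" and "key notin (a,b)".
SET_RE = re.compile(r"^(\S+)\s+(in|notin)\s*\(([^)]*)\)$")


def split_requirements(selector):
    """Split on commas that are outside parentheses (those mean AND)."""
    parts, depth, current = [], 0, []
    for ch in selector:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def matches(labels, selector):
    for req in split_requirements(selector):
        m = SET_RE.match(req)
        if m:
            key, op, raw = m.groups()
            values = {v.strip() for v in raw.split(",")}  # OR within the set
            if op == "in" and labels.get(key) not in values:
                return False
            # A pod without the key passes 'notin'.
            if op == "notin" and labels.get(key) in values:
                return False
        elif req.startswith("!"):
            if req[1:].strip() in labels:   # key must not exist
                return False
        elif req not in labels:             # bare key: exists
            return False
    return True


def select(pods, selector):
    return [name for name, labels in pods.items() if matches(labels, selector)]


print(select(PODS, "region notin (central)"))
print(select(PODS, "region,region notin (central)"))
```

Comparing the two printed lists shows the effect of adding the bare region requirement: the pods without a region label drop out of the result.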

For more information about Labels and Selectors, you can visit:

https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

Selection Via Fields (Field Selector)

We can also select Kubernetes resources via a field selector, but it has very limited support as of now.

The field selector does not support set-based requirements. Even the support for equality-based requirements is not extensive.

There are a limited number of fields that can be used for selection.

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod --field-selector metadata.name=example-pod
NAME          READY   STATUS    RESTARTS   AGE
example-pod   1/1     Running   0          18h

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod --field-selector metadata.namespace=default
NAME           READY   STATUS    RESTARTS   AGE
example-pod    1/1     Running   0          18h
example-pod1   1/1     Running   0          1h
example-pod2   1/1     Running   0          1h
example-pod3   1/1     Running   0          1h
example-pod4   1/1     Running   0          1h

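Selecting on a field that is not supported results in an error. For example, the following command will fail because spec.name is not a supported field for pods: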

ashutosh@miracle:~/Desktop/artifacts/samples$ kubectl get pod --field-selector spec.name=label-example
No resources found.
Error from server (BadRequest): Unable to find {"" "v1" "pods"} that match label selector "", field selector "spec.name=label-example": field label not supported: spec.name

So, one can conclude that the field selector works for metadata.name and metadata.namespace on all resource types, and for a small, select set of additional fields on some types. For example, visit the following link to see the fields supported on pods:

https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/v1/conversion.go#L160-L167
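The behaviour seen above (a short allow-list of fields, and a BadRequest for anything else) can be modelled with the following Python sketch. It is an illustration under assumptions, not the API server's code: the names SUPPORTED_POD_FIELDS, UnsupportedFieldError, get_field and field_matches are invented here, the allow-list is only a subset of the pod fields I believe are supported, and the pod objects are trimmed-down stand-ins for real manifests.

```python
# Illustrative subset of pod fields usable in a field selector (assumption).
SUPPORTED_POD_FIELDS = {
    "metadata.name",
    "metadata.namespace",
    "spec.nodeName",
    "status.phase",
}


class UnsupportedFieldError(ValueError):
    """Stands in for the server's 'field label not supported' BadRequest."""


def get_field(obj, path):
    """Walk a dotted path through nested dicts; missing parts give ''."""
    value = obj
    for part in path.split("."):
        value = value.get(part, "") if isinstance(value, dict) else ""
    return value


def field_matches(pod, selector):
    """True if the pod satisfies every comma-separated requirement (AND)."""
    for part in selector.split(","):
        for op in ("!=", "==", "="):
            if op in part:
                path, wanted = (s.strip() for s in part.split(op, 1))
                break
        else:
            raise ValueError(f"invalid field selector: {part!r}")
        if path not in SUPPORTED_POD_FIELDS:
            raise UnsupportedFieldError(f"field label not supported: {path}")
        equal = str(get_field(pod, path)) == wanted
        if equal == (op == "!="):  # '=' needs equality, '!=' needs inequality
            return False
    return True


# Trimmed-down pod objects: only the fields this sketch looks at.
PODS = [
    {"metadata": {"name": name, "namespace": "default"},
     "status": {"phase": "Running"}}
    for name in ("example-pod", "example-pod1", "example-pod2",
                 "example-pod3", "example-pod4")
]


def select(pods, selector):
    return [p["metadata"]["name"] for p in pods if field_matches(p, selector)]


print(select(PODS, "metadata.name=example-pod"))
try:
    select(PODS, "spec.name=label-example")
except UnsupportedFieldError as err:
    print("rejected:", err)
```

The last call is rejected in the same spirit as the failing kubectl command above: the field is checked against the allow-list before any pod is compared.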

I hope that this explanation about label and field selectors will help you in understanding how they work and how they can be used.

This article was first published on Jul 14, 2018 on MayaData's Medium Account.

Utkarsh Mani Tripathi
Utkarsh is a maintainer of the jiva project and has contributed to building both the control and data planes of OpenEBS. He loves to learn about file systems, distributed systems and networking. Currently, he is mainly focusing on enhancing jiva and maya-exporter. In his free time, he loves to write poems and make lip-smacking dishes.