Understanding StatefulSets in Kubernetes

October 14, 2021

StatefulSets allow the orchestration of replicated containers based on the same specifications. Let us dive deep!

Anthony Yandell
Sr. Cloud Engineer

Much of the work in building a clustered environment of any type comes down to deploying and scaling consistently. In Kubernetes, one way to manage these actions is with StatefulSets. Here, we take a closer look at using this API object to manage stateful applications, then walk through an example of a StatefulSet in use.

StatefulSets for Stateful Applications

In Kubernetes, scaling is accomplished by managing a set of Pods. For this to happen in a standardized way, a resource is needed at the workload level. One common workload resource is a Deployment, which manages ReplicaSets. This works well for stateless applications, where any replica can stand in for any other, but it is a poor fit when each replica's data must persist through scaling and rescheduling.

A StatefulSet is intended for stateful applications in a Kubernetes cluster. Like a Deployment, it orchestrates replicated containers based on the same specification, but each Pod keeps an identity that stays with it through rescheduling. Because of this persistent identifier, the Pods are no longer interchangeable. In return, a replacement Pod can be matched to the volume its predecessor used, which allows the state to persist.
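
To make the idea of a persistent identity concrete, here is a minimal Python sketch of how a StatefulSet's Pod names and per-Pod DNS names are composed. The StatefulSet name `web`, the headless Service name `nginx`, the `default` namespace, and the `cluster.local` cluster domain are assumptions chosen for illustration, and the helper functions are hypothetical rather than part of any Kubernetes client library.

```python
def pod_name(statefulset: str, ordinal: int) -> str:
    """Pod names follow the pattern <statefulset-name>-<ordinal>."""
    return f"{statefulset}-{ordinal}"


def pod_dns(statefulset: str, ordinal: int, service: str,
            namespace: str = "default",
            cluster_domain: str = "cluster.local") -> str:
    """Each Pod gets a DNS entry under its governing headless Service.

    The namespace and cluster domain defaults are assumptions; a real
    cluster may be configured differently.
    """
    name = pod_name(statefulset, ordinal)
    return f"{name}.{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # Two replicas of a StatefulSet named "web" behind a Service named "nginx".
    for i in range(2):
        print(pod_name("web", i), "->", pod_dns("web", i, "nginx"))
```

Because the name is derived from the ordinal rather than generated randomly, a rescheduled Pod comes back with the same name and the same DNS entry.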

An Order of Stability with a Side of Persistence

As we discussed, if your application does not need to take state into consideration, a Deployment or ReplicaSet may be the better option for simplicity. Otherwise, a StatefulSet helps ensure that the requirements of a stateful app are met: it deploys and scales Pods in a graceful, ordered way, and it keeps each Pod's data reachable through that Pod's unique identity.

Among the features intrinsic to StatefulSets are the guarantees around deployment and scaling. These guarantees contribute to the stability of an application running under this method of Pod management. There are a few things to know about them:

  • For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.
  • When Pods are being deleted, they are terminated in reverse order, from {N-1..0}.
  • Before a scaling operation is applied to a Pod, all of its predecessors must be Running and Ready.
  • Before a Pod is terminated, all of its successors must be completely shut down.

Courtesy kubernetes.io Concepts Documentation
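
The four guarantees above can be modeled in a few lines of Python. This is only a sketch of the ordering rules, not the controller's actual implementation; the function names are hypothetical, and Pods are represented simply by their ordinals.

```python
from typing import List, Set


def scale_up_order(current: int, desired: int) -> List[int]:
    """Ordinals created when scaling up: lowest first."""
    return list(range(current, desired))


def scale_down_order(current: int, desired: int) -> List[int]:
    """Ordinals terminated when scaling down: highest first."""
    return list(range(current - 1, desired - 1, -1))


def can_create(ordinal: int, ready: Set[int]) -> bool:
    """A Pod may be created only when all predecessors are Running and Ready."""
    return all(i in ready for i in range(ordinal))


def can_terminate(ordinal: int, running: Set[int]) -> bool:
    """A Pod may be terminated only when all successors are shut down."""
    return all(i <= ordinal for i in running)


if __name__ == "__main__":
    print("deploy 3 replicas:", scale_up_order(0, 3))
    print("delete 3 replicas:", scale_down_order(3, 0))
    print("create Pod 2 while only Pod 0 is ready?", can_create(2, {0}))
    print("terminate Pod 1 while Pod 2 still runs?", can_terminate(1, {0, 1, 2}))
```

In this model, Pod 2 cannot be created until Pods 0 and 1 are both ready, and Pod 1 cannot be terminated while Pod 2 is still running.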

This type of Pod management (OrderedReady) is the default for StatefulSets. Alternatively, setting the StatefulSet's podManagementPolicy to Parallel lets the controller launch Pods without waiting for earlier ones to be Running and Ready. When scaling down, Pods may likewise be terminated in parallel. In other words, Parallel removes the usual wait for each Pod to finish starting up, or for each successor to finish shutting down, before the controller moves on to the next one.
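
The difference between the two policies can be pictured as a launch plan. The Python sketch below is a simplified model under the assumption that every Pod starts successfully; `launch_batches` is a hypothetical helper, and each inner list is a group of ordinals the controller may start without waiting on one another.

```python
from typing import List


def launch_batches(replicas: int, policy: str = "OrderedReady") -> List[List[int]]:
    """Group Pod ordinals into batches that can be started together.

    The policy strings mirror the two values of the StatefulSet field
    spec.podManagementPolicy; the batching itself is an illustration only.
    """
    if policy == "OrderedReady":
        # One Pod at a time: each waits for its predecessor to be Ready.
        return [[i] for i in range(replicas)]
    if policy == "Parallel":
        # All Pods may be launched at once, with no waiting between them.
        return [list(range(replicas))] if replicas > 0 else []
    raise ValueError(f"unknown pod management policy: {policy}")


if __name__ == "__main__":
    print("OrderedReady:", launch_batches(3))
    print("Parallel:    ", launch_batches(3, "Parallel"))
```

With three replicas, OrderedReady yields three sequential steps, while Parallel yields a single step containing all three Pods.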

Application Example Using nginx

Taking a cue from the documentation available at kubernetes.io, we will publish a simple web service using nginx. Along the way, we will use kubectl commands that show the state of our Pods throughout their lifecycle.

Because minikube runs on all major operating systems, it is handy for demonstrating StatefulSets. We will use it to emulate a Kubernetes cluster and set up a stateful application example. This example installs minikube on Windows with the choco CLI; if you have not had the chance to use the Chocolatey Package Manager yet, it is worth a look.

Users of other operating systems can find installation instructions in the minikube documentation.

From an administrative command line:

<terminal>choco install minikube<terminal>

After the installation, it is a good idea to close and re-open your command shell. The next command will prepare and start the cluster.

<terminal>minikube start<terminal>

The following YAML shows one way to create a headless Service for nginx together with a StatefulSet that uses it. Save this file as web.yaml for use in upcoming steps:


---
# Headless Service: gives each Pod of the StatefulSet a stable DNS entry
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None   # must be "None" (capitalized) for a headless Service
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"   # refers to the headless Service above
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: k8s.gcr.io/nginx-slim:0.8
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  # One PersistentVolumeClaim is created per Pod from this template
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
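
The volumeClaimTemplates entry above is what ties storage to identity: each Pod gets its own PersistentVolumeClaim, named after the template, the StatefulSet, and the Pod's ordinal. The Python sketch below shows that naming for the `www` template and the `web` StatefulSet from the manifest; the helper functions are hypothetical and exist only to illustrate the pattern.

```python
from typing import Dict, List


def pvc_name(template: str, statefulset: str, ordinal: int) -> str:
    """Claims are named <template-name>-<statefulset-name>-<ordinal>."""
    return f"{template}-{statefulset}-{ordinal}"


def claims_for(statefulset: str, templates: List[str],
               replicas: int) -> Dict[str, List[str]]:
    """Map each Pod name to the claims created for it."""
    return {
        f"{statefulset}-{i}": [pvc_name(t, statefulset, i) for t in templates]
        for i in range(replicas)
    }


if __name__ == "__main__":
    # Values taken from web.yaml: template "www", StatefulSet "web", 2 replicas.
    for pod, claims in claims_for("web", ["www"], 2).items():
        print(pod, "uses", ", ".join(claims))
```

By default, these claims are kept when a Pod is deleted or the StatefulSet is scaled down, so a recreated web-1 attaches to the same www-web-1 claim and finds its earlier data.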

So that we can watch the creation of the Pods, we open two command line windows. In the first window we run the following:

<terminal>kubectl get pods -w -l app=nginx<terminal>

In the second, we create the Service and StatefulSet defined in web.yaml, above:

<terminal>kubectl apply -f web.yaml<terminal>

In the first window, we can now watch the two instances of the application, web-0 and web-1, as they are created in order.

We can use kubectl to get additional information about our running containers, for example with kubectl describe pod web-0 or kubectl get pvc.

Just from this short example, it is easy to see how much manual toil is required to monitor a real-world application. Using the various tools for Kubernetes management begins to surface these complexities. Technology for cloud and hybrid cloud management is moving toward a self-healing model, driven by metrics from the underlying infrastructure.

Important metrics include but should not be limited to:

The first step in a successful Kubernetes implementation is being able to look at the big picture. Putting it all together, it is clear that having a way to visualize applications in a real-world scenario is critical. At ContainIQ, our platform helps engineers monitor, visualize, and debug Kubernetes clusters. Book a demo to learn more.

Article by

Anthony Yandell

Sr. Cloud Engineer

Anthony is a dynamic technologist who takes pride in the breadth of experience he has gained from years in the software development industry. As a pioneer of the DevOps movement, he strives to educate others on the time-saving and stability aspects of proper automation.
