While the pod is the basic deployment unit for containers, Kubernetes provides various resource objects for orchestrating multiple pod replicas. Two commonly used ones are Deployments and StatefulSets. A Deployment manages multiple pods by automating the creation, updating, and deletion of ReplicaSets. By contrast, a StatefulSet helps orchestrate stateful pods by guaranteeing the ordering and uniqueness of pod replicas.
In this article, we’ll discuss these two pod orchestration resources, how they differ, and the use cases they are most suitable for.
What Is a Deployment in Kubernetes?
A Deployment is a Kubernetes resource object that provides declarative updates for pods that encapsulate application containers. A Deployment represents a number of identical pods without unique IDs, while specifying the pods’ desired state and attributes. Deployments are typically used to scale the number of pod replicas, perform controlled rollouts of application code, and roll back to an earlier version when necessary.
Kubernetes administrators rely on Deployments to manage a containerized application’s lifecycle by defining the number of pods to be deployed, the image to be used for the application, and how to perform code updates. Deployments help automate repeatable application updates, thereby reducing the effort, time, and number of errors associated with manual updates.
Components of a Kubernetes Deployment
The primary components used to create and apply a Deployment to a cluster include:
- Deployment template: This is a JSON or YAML configuration file that defines the Deployment’s configuration specification. The Kubernetes Deployment controller relies on the desired state described in the Deployment template to create, update, and scale pods. The JSON or YAML file is static, and it includes a pod template that defines what each pod should look like, as well as other common parameters, such as:
  - Number of pod replicas
  - Name of the image running in the pods
  - Tag of the image
  - Secrets, ConfigMaps, and other settings injected into the pod
  - Service labels
- Service: Defines a single endpoint that enables network access to the workloads running on the pods within the Deployment. A Service is a REST object that points to the Deployment's pods and includes a policy for accessing them.
- Persistent Volume: Gives pods within the Deployment access to a piece of cluster storage for persisting data.
Deployment Configuration Manifest
Consider a static YAML file for a Kubernetes deployment named <terminal inline>darwin-deployment.yaml<terminal inline> with the following specifications:
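A minimal sketch of what such a manifest might contain is given below. Only the Deployment name, the replica count, the image name, and the claim name come from the text; the <terminal inline>app: darwin<terminal inline> label, the container name, the container port, and the mount path are assumptions made for illustration.

```yaml
# darwin-deployment.yaml (illustrative sketch; several fields are assumed)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: darwin-deployment
  labels:
    app: darwin                # assumed label
spec:
  replicas: 3                  # three identical pod replicas
  selector:
    matchLabels:
      app: darwin              # must match the pod template labels below
  template:
    metadata:
      labels:
        app: darwin
    spec:
      containers:
        - name: novice         # assumed container name
          image: novice        # image named in the text; no tag is given there
          ports:
            - containerPort: 80        # assumed port
          volumeMounts:
            - name: darwin-volume      # refers to the volume declared below
              mountPath: /data         # assumed mount path
      volumes:
        - name: darwin-volume
          persistentVolumeClaim:
            claimName: darwin-volume-claim   # every replica mounts this one claim
```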
The above static file represents a Deployment named <terminal inline>darwin-deployment<terminal inline> that deploys three replicas of a pod to encapsulate containers running the <terminal inline>novice<terminal inline> image workload. The pods are attached to the <terminal inline>darwin-volume-claim<terminal inline> PersistentVolumeClaim with a specification similar to:
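A claim of this kind could be sketched as follows. The access mode and the requested size are assumptions; only the claim name comes from the text.

```yaml
# darwin-volume-claim.yaml (illustrative sketch; size and access mode are assumed)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: darwin-volume-claim
spec:
  accessModes:
    - ReadWriteOnce      # assumed; mountable read-write by a single node
  resources:
    requests:
      storage: 1Gi       # assumed size
```

Because all replicas of a Deployment share one claim, <terminal inline>ReadWriteOnce<terminal inline> only works when the pods land on the same node; spreading replicas across nodes would call for a storage backend that supports <terminal inline>ReadWriteMany<terminal inline>.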
To make the Deployment reachable within the cluster, it should be exposed using a Service, such as a NodePort Service, specified in the <terminal inline>service.yaml<terminal inline> file below:
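One possible shape for such a file is sketched here. The Service name, the port numbers, and the <terminal inline>app: darwin<terminal inline> selector are hypothetical; the selector has to match whatever labels the Deployment's pod template actually carries.

```yaml
# service.yaml (illustrative sketch; name, ports, and selector are assumed)
apiVersion: v1
kind: Service
metadata:
  name: darwin-service     # assumed name
spec:
  type: NodePort           # exposes the Service on a port of every node
  selector:
    app: darwin            # assumed pod label
  ports:
    - port: 80             # port of the Service inside the cluster (assumed)
      targetPort: 80       # container port that receives the traffic (assumed)
      nodePort: 30080      # assumed; the default allowed range is 30000-32767
```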
To deploy the application, the Deployment, volume claim, and service are all applied to the cluster using the following commands:
<terminal inline>$ kubectl apply -f service.yaml<terminal inline>
<terminal inline>$ kubectl apply -f darwin-volume-claim.yaml<terminal inline>
<terminal inline>$ kubectl apply -f darwin-deployment.yaml<terminal inline>
Discovering Deployment Details
Administrators can use the kubectl command to discover details of the Deployment and the pods it controls. To check for the successful creation of the Deployment, run the command:
<terminal inline>$ kubectl get deployments<terminal inline>
Which returns a response similar to:
To check for the pods automatically created by the deployment, run the command:
<terminal inline>$ kubectl get pods<terminal inline>
Which returns the following output:
The kubectl command can also be used to scale the number of pods as the application load changes. To increase the number of pods for <terminal inline>darwin-deployment<terminal inline> to 5, run the command:
<terminal inline>$ kubectl scale deployment/darwin-deployment --replicas=5<terminal inline>
With an output similar to:
<terminal inline>deployment.apps/darwin-deployment scaled<terminal inline>
Kubernetes Deployment Strategies
Several rollout strategies are used for pod deployments. The Deployment resource natively supports Recreate and rolling update; the canary approach is a pattern built on top of Deployments. The strategies include:
- Recreate: Terminates all pods running the old version of the application before creating new pods, which causes a brief period of downtime.
- Ramped: Another name for the rolling update, in which new pods are gradually brought up while old pods are terminated.
- Rolling update: Replaces old pods with new ones incrementally, so the application can stay available throughout the update.
- Canary deployment: Runs the new version on a small subset of pods alongside the old version, and then rolls out the new version to more pods if the test deployment is a success.
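Of the strategies above, the rolling update is configured directly in the Deployment's <terminal inline>spec.strategy<terminal inline> field. The fragment below is a sketch with illustrative values rather than recommended settings; it would sit inside a Deployment manifest.

```yaml
# Fragment of a Deployment spec (values are illustrative assumptions)
spec:
  strategy:
    type: RollingUpdate      # the alternative built-in value is Recreate
    rollingUpdate:
      maxSurge: 1            # at most one extra pod above the desired count during the update
      maxUnavailable: 0      # never drop below the desired count of available pods
```

With these values, a new pod must become ready before an old one is removed, which trades rollout speed for availability.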
What Is a StatefulSet in Kubernetes?
A StatefulSet is a Kubernetes resource object that manages a set of pods with unique identities. By assigning a persistent ID that is maintained even if the pod is rescheduled, a StatefulSet helps maintain the uniqueness and ordering of pods. With unique pod identifiers, administrators can reliably reattach the same volumes to replacement pods after failures.
Although the StatefulSet controller deploys pods using the same specification, the pods are not interchangeable. A StatefulSet does not create a ReplicaSet; it manages its pods directly and tracks its revision history with ControllerRevision objects, so rollbacks work differently than they do for Deployments. StatefulSets are typically used for applications that require persistent storage for stateful workloads, and ordered, automated rolling updates.
Components of a Kubernetes StatefulSet Configuration Manifest
A Kubernetes StatefulSet configuration comprises the following:
- StatefulSet: The object that defines the pod selector, the number of replicas, and the pod template for the containers that will run on the pods.
- Headless service: A Service without a cluster IP that governs the network domain of the pods, allowing clients to connect to each pod using a stable DNS entry.
- Volume claim template: The template specification that allows administrators to provision persistent storage for each pod using persistent volumes.
StatefulSet Configuration Manifest
Consider a StatefulSet configuration named <terminal inline>statefulset.yaml<terminal inline> with the following specification:
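A minimal sketch of such a specification is shown below. The StatefulSet name <terminal inline>darwin<terminal inline> and the three replicas follow the text; the label, the headless Service name, the image, the port, the mount path, and the storage size are assumptions for illustration.

```yaml
# statefulset.yaml (illustrative sketch; several fields are assumed)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: darwin
spec:
  serviceName: darwin-service    # assumed name of the governing headless Service
  replicas: 3
  selector:
    matchLabels:
      app: darwin                # assumed label; must match the template labels
  template:
    metadata:
      labels:
        app: darwin
    spec:
      containers:
        - name: darwin           # assumed container name
          image: novice          # assumed image
          ports:
            - containerPort: 80  # assumed port
          volumeMounts:
            - name: data         # refers to the claim template below
              mountPath: /data   # assumed mount path
  volumeClaimTemplates:          # one PersistentVolumeClaim is created per pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi         # assumed size
```

Unlike a Deployment that references a single shared claim, the <terminal inline>volumeClaimTemplates<terminal inline> section gives every replica its own claim, which is what lets each pod keep its own state.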
The above StatefulSet can be attached to a PersistentVolume defined in <terminal inline>darwin-claim.yaml<terminal inline> as follows:
To expose the StatefulSet via a headless service defined in <terminal inline>darwin-service.yaml<terminal inline>, the following configuration can be used:
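A headless Service is an ordinary Service whose <terminal inline>clusterIP<terminal inline> is set to <terminal inline>None<terminal inline>. The sketch below assumes a Service name, a port, and an <terminal inline>app: darwin<terminal inline> pod label, none of which are given in the text.

```yaml
# darwin-service.yaml (illustrative sketch; name, port, and selector are assumed)
apiVersion: v1
kind: Service
metadata:
  name: darwin-service     # assumed; should match the StatefulSet's serviceName
spec:
  clusterIP: None          # this is what makes the Service headless
  selector:
    app: darwin            # assumed pod label
  ports:
    - name: web            # assumed port name
      port: 80             # assumed port
```

With no cluster IP to load-balance through, DNS resolves to the individual pods, so each replica can be addressed by its own stable hostname.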
All the above configurations can be applied to the cluster using the <terminal inline>kubectl apply<terminal inline> command, as follows:
<terminal inline>$ kubectl apply -f statefulset.yaml<terminal inline>
<terminal inline>$ kubectl apply -f darwin-claim.yaml<terminal inline>
<terminal inline>$ kubectl apply -f darwin-service.yaml<terminal inline>
The above commands create three pod replicas with ordered identities.
Discovering StatefulSet Details
Pods within the StatefulSet can be verified with the <terminal inline>get pods<terminal inline> command:
<terminal inline>$ kubectl get pods<terminal inline>
The above command returns the list of pods running, as shown below:
Quick Note: The above output shows that the StatefulSet created the pods in an ordered sequence, with the index starting at 0. As the StatefulSet controller guarantees the ordering and uniqueness of pods, and since the StatefulSet is named <terminal inline>darwin<terminal inline>, the pod replicas are automatically named <terminal inline>darwin-0<terminal inline>, <terminal inline>darwin-1<terminal inline>, and <terminal inline>darwin-2<terminal inline>.
Kubernetes Deployment vs. StatefulSet: How to Choose
The table below shows the primary differences between a StatefulSet and a Deployment:
When to Use
A StatefulSet is better suited to stateful workloads that require persistent storage for each pod, such as databases and other identity-sensitive workloads. A Deployment, on the other hand, is suitable for stateless workloads that use multiple replicas of one pod, such as web servers like Nginx and Apache.
This practical scenario demonstrates how a StatefulSet differs from a Deployment:
Consider a web app that uses a relational database to store data. When traffic to the application increases, administrators intend to scale up the number of pods to support the workload. A straightforward approach is simply to change the replica count within the Deployment’s configuration manifest; then the Deployment controller will take care of scaling. Since new pod replicas are assigned the same set of ConfigMaps and environment variables when starting, they communicate with the backend the same way as the original pod, retaining the user experience for incoming traffic.
Similar to the web servers, the relational database may also need to be scaled up to meet the increased workload. Since the master and replica pods need to implement a leader-follower pattern, the pods of the database cannot be created or deleted randomly. In addition, while each pod needs to sync its data with the previous pod, it retains its own copy of the stored data. In such an instance, a StatefulSet helps create the database pods in an ordered sequence where every new pod acquires its copy of the data from the last pod created. If a pod fails, the StatefulSet controller recreates it with the same identity and reattaches it to the same PVC.
Deployments and StatefulSets are Kubernetes API resources with different approaches to launching and managing containerized workloads. StatefulSets give pods the same storage and network identities across restarts, with every replica getting its own state and persistent volume claim. A Deployment is used to spin up and scale stateless applications; if its pod template references a persistent volume claim, all pod replicas share that same volume.
Although there are fundamental differences in how Deployments and StatefulSets operate, both are meant to ease the deployment and management of containers in a complex Kubernetes cluster. Before choosing one of them, it’s important for administrators to assess their technical use case and their objectives.