Kubernetes has gained popularity in recent years for its ability to manage containerized workloads at scale. One of the key features of Kubernetes is its support for multiple storage backends, which allows users to select the storage solution that best fits their needs.
With Kubernetes, a graduated Cloud Native Computing Foundation project, you can provision storage manually whenever your pods need it, or let the cluster do so automatically through dynamic provisioning. When your pods request storage, Kubernetes creates it using the provisioner you specify.
In this post, you’ll look at dynamic storage provisioning in Kubernetes through storage classes. You’ll also look at the components of a storage class, with examples and use cases, to see why it’s so important and how it works with the cluster. By the end of this post, you’ll understand how to use storage classes to control the type of storage used for your workloads.
An Introduction to Kubernetes Storage
Before diving into the specifics of Kubernetes storage classes, you must understand the basics of persistent volumes and persistent volume claims.
A persistent volume (PV) is a Kubernetes object that represents a piece of storage, either local to a node or hosted in the cloud. Pods use the PVs of a cluster to store their data. A persistent volume claim (PVC) is a Kubernetes object that represents a request for storage. It specifies what a PV must offer, such as capacity and access mode, and binds one-to-one to a PV that satisfies those requirements so that your pod can use it.
Whenever you need storage for your stateful application, you first provision a disk to back your PV, for example a Compute Engine persistent disk on Google Cloud, which a PV references through the <terminal inline>gcePersistentDisk<terminal inline> volume type.
After the disk provision is successful, create a PV using the kubectl command-line tool, or by creating YAML files that define the PV objects. A sample YAML file for your PV will look like this:
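A minimal sketch of such a manifest, consistent with the description that follows. The PV name <terminal inline>pv-demo<terminal inline> and the access mode are assumptions for illustration, and the disk <terminal inline>gcp-disk<terminal inline> is assumed to exist already:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo              # hypothetical name, not from the original post
spec:
  capacity:
    storage: 1Gi             # one gibibyte
  accessModes:
    - ReadWriteOnce          # assumed: mountable read-write by a single node
  # In-tree GCE persistent disk volume type; newer clusters typically
  # rely on the corresponding CSI driver instead.
  gcePersistentDisk:
    pdName: gcp-disk         # name of the pre-provisioned disk
    fsType: ext4             # fourth extended filesystem
```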
In the above manifest, you create a PV with a capacity of one gibibyte (<terminal inline>1Gi<terminal inline>) that pods in your cluster can mount. The file system is ext4 (fourth extended filesystem), and the name of the backing disk is <terminal inline>gcp-disk<terminal inline>.
The next step is creating a PVC that claims the storage. The same PVC will be used in your pod definition, which will allow your pods to use the disk. The above procedure, in which you need to provision a disk for your PV, is referred to as static storage provisioning, because you create the disks manually.
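A claim for a statically provisioned volume might look like the sketch below. The claim name <terminal inline>static-claim<terminal inline> and the PV name <terminal inline>pv-demo<terminal inline> are hypothetical. The empty <terminal inline>storageClassName<terminal inline> keeps the cluster’s default storage class from dynamically provisioning a new volume for this claim:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-claim         # hypothetical claim name
spec:
  storageClassName: ""       # empty string: do not use a storage class
  volumeName: pv-demo        # hypothetical name of the manually created PV
  accessModes:
    - ReadWriteOnce          # must be compatible with the PV's access modes
  resources:
    requests:
      storage: 1Gi           # must not exceed the PV's capacity
```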
However, it’s not convenient or practical to manually create a disk and PV every time you need storage. Kubernetes solves this problem by providing a way to make your storage provisioning more dynamic with dynamic storage provisioning, where PVs and disks are created on the fly when PVCs or pods request storage.
What Is Dynamic Storage Provisioning?
Dynamic storage provisioning is the process of automatically creating storage for pods in a Kubernetes cluster. Instead of waiting for an administrator to create a PV, the control plane provisions one when a PVC requests storage from a storage class.
The PVC names the storage class it wants, or falls back to the cluster’s default class if it names none. The provisioner of that class then creates a new persistent volume, and Kubernetes binds it to the PVC. After that, as in static provisioning, the pod mounts the volume through the PVC.
What Is a Storage Class?
StorageClass objects were introduced as a beta feature in Kubernetes 1.4, became stable in Kubernetes 1.6, and have since become essential to Kubernetes storage.
A storage class in Kubernetes describes a type of storage that the cluster offers, which allows users to request a specific type of storage for their workloads. Storage classes also allow the cluster administrator to control which types of storage are available to those workloads.
The StorageClass object contains information about the provisioner, such as Amazon EBS or Google Compute Engine Persistent Disk, as well as the capabilities the storage should have, such as replication or encryption. Kubernetes then uses the information from the storage class when it creates new persistent volumes.
Why Use Kubernetes Storage Classes?
Kubernetes storage classes enable an administrator to create and manage multiple storage configurations and bind them to individual applications or workloads. This provides greater flexibility and control over managing storage resources in a Kubernetes cluster, as you don’t need to configure and create different types of storage with various specifications every time a storage request is made.
The dynamic provisioning and class isolation of storage can be used in many different scenarios, including the following common use cases.
Different Quality of Service Levels
Different classes can correspond to varying levels of quality of service, which can mean some classes are faster or have more storage than others.
For example, one class might be designed for files that are accessed frequently, and another for files that are accessed only occasionally. A premium solid-state drive (SSD) can be provisioned for the former, while a lower-cost but slower hard disk drive (HDD) can be provisioned for the latter.
Azurefile Premium Disk
The premium disk is a high-performance disk with high input/output operations per second (IOPS), but it also costs more than the standard storage class.
Azure creates this storage class with every managed Kubernetes cluster. To inspect the object, use the command <terminal inline>kubectl describe sc azurefile-premium<terminal inline>.
The output shows details about the <terminal inline>azurefile-premium<terminal inline> StorageClass, including the storage SKU it uses, which in this instance is <terminal inline>Premium_LRS<terminal inline>. The <terminal inline>AllowVolumeExpansion<terminal inline> field indicates whether volumes of this class can be expanded when more storage is required.
The standard <terminal inline>azurefile<terminal inline> storage class is a less expensive alternative that’s best used for less intensive tasks, as it has limited performance compared to the premium class. It can be described with the command <terminal inline>kubectl describe sc azurefile<terminal inline>. This storage class also comes built in to every Azure cluster, and uses <terminal inline>Standard_LRS<terminal inline> as the storage SKU.
Different Backup Policies
Different storage classes can be created for different backup policies, allowing you to provision disks with different performance and prices. For example, one class might be designed for files that are in the production environment and need to be backed up regularly, while another class might be designed for files in the quality assurance environment that only need to be backed up occasionally.
Different classes might also be used to support other kinds of administrator-determined policies. Depending on the scenario, one class might be designed to replicate files to multiple servers, while another class might be designed to retain data on a single server, such as in test environments.
As another example, a cluster administrator might want to use one type of storage for databases and a different type for application logs.
Using Kubernetes Storage Classes
A StorageClass is simply a Kubernetes object, so like other objects, it’s defined by a YAML manifest. However, its properties and metadata are different from those associated with other objects.
A sample manifest of a StorageClass object is below:
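A minimal sketch of such a manifest, using the fields discussed in the rest of this section. It closely follows the style of the example in the Kubernetes documentation. The <terminal inline>debug<terminal inline> mount option is an assumption for illustration:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard                     # name that PVCs use to reference this class
provisioner: kubernetes.io/aws-ebs   # built-in AWS EBS provisioner
parameters:
  type: gp2                          # general purpose SSD
reclaimPolicy: Retain                # keep the volume after its PVC is deleted
allowVolumeExpansion: true           # allow PVCs of this class to grow
mountOptions:
  - debug                            # assumed option, for illustration only
volumeBindingMode: Immediate         # bind and provision as soon as a PVC exists
```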
Understanding Components of Kubernetes Storage Classes
Some keys are present in all Kubernetes objects, and some are exclusive to the object type. The first three keys, <terminal inline>apiVersion<terminal inline>, <terminal inline>kind<terminal inline>, and <terminal inline>metadata<terminal inline>, are used across all Kubernetes objects, while the others in the manifest are exclusive to StorageClass objects.
The <terminal inline>apiVersion<terminal inline> field indicates which version of the <terminal inline>storage.k8s.io<terminal inline> API is being used.
The <terminal inline>kind<terminal inline> field indicates the type of object being created. In this case, the object is a StorageClass. Other kinds include pods and deployments.
The <terminal inline>metadata<terminal inline> field contains information about the object, such as its name, labels, namespace, and creation timestamp. The only metadata field in this example is the <terminal inline>name<terminal inline> field, which specifies the name of the StorageClass, and is required. Other fields in the list are optional.
A storage class provisioner is a type of volume plug-in that handles the creation and management of storage within Kubernetes. Storage provisioners are responsible for creating and deleting storage volumes, as well as attaching and detaching them from nodes.
Some provisioners are built in and shipped with Kubernetes and some are third-party provisioners, which is helpful when internal provisioners don’t support your use case. Third-party provisioners are not shipped with Kubernetes, but are developed and maintained by third-party vendors. You can also create your own provisioner using the external provisioner libraries from the Kubernetes community.
The example uses the <terminal inline>kubernetes.io/aws-ebs<terminal inline> provisioner, which refers to the AWSElasticBlockStore volume plug-in. You can identify it as a built-in provisioner because it’s prefixed with <terminal inline>kubernetes.io/<terminal inline>. Other built-in volume plug-ins include those of other cloud providers, such as Azure, shown in the earlier examples.
More information about volume plug-ins can be found in the official documentation.
The reclaim policy describes what happens to a dynamically provisioned PV once the PVC bound to it is deleted. The options are <terminal inline>Delete<terminal inline>, which removes the PV and its underlying storage, or <terminal inline>Retain<terminal inline>, which keeps the PV and its data until an administrator removes them manually.
There are benefits and drawbacks to both approaches. <terminal inline>Delete<terminal inline> de-provisions your storage disk, while <terminal inline>Retain<terminal inline> keeps it. If the disk is kept, you can recover your data if the wrong PVC or pod is deleted, but this can result in increased costs if unused disks aren’t cleaned up regularly.
The default reclaim policy is <terminal inline>Delete<terminal inline>; in the above example, it’s set to <terminal inline>Retain<terminal inline>.
Sometimes, you may find that your application needs more storage than was initially provisioned. With the <terminal inline>allowVolumeExpansion<terminal inline> property enabled, you can expand a volume by increasing the storage request of its PVC, and the provisioner takes care of resizing the underlying storage.
When <terminal inline>allowVolumeExpansion<terminal inline> is set to true, the storage class will allow the underlying storage volumes to be expanded to add more storage if needed. This is useful when using storage solutions that will enable on-demand expansion, such as Amazon EBS in the example above.
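As a sketch of how an expansion is requested, assume a PVC that was originally created with a 1Gi request against a class that sets <terminal inline>allowVolumeExpansion: true<terminal inline>. The claim name and both sizes are hypothetical. You raise the request and reapply the manifest:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: expandable-claim       # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard   # assumed to set allowVolumeExpansion: true
  resources:
    requests:
      storage: 2Gi             # raised from the assumed original 1Gi
```

Only increases are supported; Kubernetes rejects a request that is smaller than the current size of the claim.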
<terminal inline>volumeBindingMode<terminal inline> is a policy setting that determines when volume binding and dynamic provisioning occur after a PVC is created.
There are two volume binding modes that can be used:
- Immediate: In this mode, volume binding and dynamic provisioning happen as soon as the PVC is created. The manifest example uses this mode, so whenever a new PVC with the <terminal inline>standard<terminal inline> StorageClass is created, you can see it in the bound state. This mode is universally supported.
- WaitForFirstConsumer: In this mode, the volume is not bound or provisioned until a pod that consumes the PVC is created, which lets the scheduler’s placement of the pod inform where the volume is created. Not all plug-ins support this mode; among the built-in plug-ins, Amazon Elastic Block Store, Google Compute Engine Persistent Disk, and Azure Disk support it for dynamic provisioning.
The mount options parameter allows an administrator to specify a list of options for mounting a volume. These options can be used to debug mounting issues or fine-tune a volume’s performance.
Sometimes when using the <terminal inline>WaitForFirstConsumer<terminal inline> binding mode, you want your disks to be provisioned in specific zones for proximity or cost. You can do this with the <terminal inline>allowedTopologies<terminal inline> property, for example by restricting provisioning to the zones <terminal inline>us-central-1a<terminal inline> and <terminal inline>us-central-1b<terminal inline>.
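The zone restriction described above can be sketched as follows. The class name, the <terminal inline>kubernetes.io/gce-pd<terminal inline> provisioner, and the <terminal inline>pd-standard<terminal inline> disk type are assumptions chosen for illustration; the zone names are the ones mentioned in the text:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zonal-standard                   # hypothetical class name
provisioner: kubernetes.io/gce-pd        # assumed provisioner
parameters:
  type: pd-standard                      # assumed disk type
volumeBindingMode: WaitForFirstConsumer  # delay provisioning until a pod needs it
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone # well-known node label for the zone
        values:
          - us-central-1a
          - us-central-1b
```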
With <terminal inline>parameters<terminal inline>, you can set the type of your storage disks and enable features such as encryption and replication. You can also configure the file system type and other provisioner-specific settings, which you can read about in the documentation.
In the manifest example, you used a general purpose SSD (<terminal inline>gp2<terminal inline>). By changing the type to <terminal inline>io1<terminal inline>, <terminal inline>sc1<terminal inline>, or <terminal inline>st1<terminal inline>, you can request different volume types.
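As a sketch of how parameters change what gets provisioned, the class below requests provisioned-IOPS volumes instead of <terminal inline>gp2<terminal inline>. The class name and the specific values are assumptions; the parameter names are those documented for the built-in AWS EBS provisioner:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-io                      # hypothetical class name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1                          # provisioned IOPS SSD
  iopsPerGB: "10"                    # assumed value; only applies to io1 volumes
  fsType: ext4                       # file system created on the volume
  encrypted: "true"                  # assumed; request an encrypted volume
```

Parameter values are passed to the provisioner as strings, which is why the number and the boolean are quoted.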
When dealing with other storage provisioners, check the respective documentation before adding parameters, as each provisioner has different options.
Create Storage Classes
Now that you understand the components and parameters of storage classes, let’s look at how to create a pod that uses a storage class to provision a volume.
You’ll need a basic understanding of Kubernetes to follow along, as well as an Azure subscription to use an Azure cluster. If you’re not using Azure, the general principles are universally applicable, but you’ll need to change the provisioner.
The following manifest defines the storage class that you’ll use:
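A minimal sketch of such a storage class, matching the name, SKU, region, and binding mode described in this walkthrough. The <terminal inline>kubernetes.io/azure-file<terminal inline> provisioner is an assumption; depending on your cluster version, you may need the Azure CSI driver name instead:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-storage
provisioner: kubernetes.io/azure-file  # assumed; newer clusters may use file.csi.azure.com
parameters:
  skuName: Standard_LRS                # standard, locally redundant storage
  location: eastus                     # East US region
volumeBindingMode: Immediate           # provision as soon as a PVC is created
```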
In this example, you’re creating Azure Standard_LRS storage in the East US region. To create the storage class, save the contents of the above code block as a YAML file, then use the <terminal inline>kubectl apply -f <file_name><terminal inline> command to apply the definitions:
The file name here is <terminal inline>sc-definition.yml<terminal inline>, and you can see the storage class creation is successful.
List Storage Classes
To list all the storage classes in your cluster, you can use the <terminal inline>kubectl get sc<terminal inline> command. In this command, <terminal inline>sc<terminal inline> is the short name for StorageClass:
The first listing, with the name <terminal inline>azure-storage<terminal inline>, is the one you just created. It has the <terminal inline>Immediate<terminal inline> volume binding mode, which means the volume is created as soon as a PVC is created. The other storage classes in the list are the default ones that get created with an Azure cluster.
Create a Persistent Volume Claim
To use the storage class you have created, you need to have a PVC. To create one, save the below manifest in a YAML file, and then apply it with <terminal inline>kubectl apply -f <file_name><terminal inline>.
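A minimal sketch of such a claim. The claim name and storage class name come from this walkthrough; the access mode and the requested size are assumptions:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce                   # assumed access mode
  storageClassName: azure-storage     # the storage class created earlier
  resources:
    requests:
      storage: 1Gi                    # assumed size
```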
The manifest tells Kubernetes to create a PVC with the name <terminal inline>myclaim<terminal inline>, and to use the <terminal inline>azure-storage<terminal inline> storage class.
List PVCs and PVs
To verify that your PVC has been created successfully, you can run <terminal inline>kubectl get pvc<terminal inline> to list all PVCs:
If you see the above output and the name you gave your PVC, your cluster has successfully created a PVC.
As the volume binding mode is <terminal inline>Immediate<terminal inline>, a PV is created as soon as the PVC is. You can list your newly created persistent volume using <terminal inline>kubectl get pv<terminal inline>.
Create a Pod
As a final step, you’ll create a pod that will use your persistent volume when it requires persistent storage for operational data. You can do this by applying the following manifest:
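A minimal sketch of such a pod, assuming the claim is named <terminal inline>myclaim<terminal inline> as above. The pod name, container image, volume name, and mount path are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod                      # hypothetical pod name
spec:
  containers:
    - name: app                    # hypothetical container name
      image: nginx                 # assumed image, for illustration
      volumeMounts:
        - name: data               # must match the volume name below
          mountPath: /var/www/html # assumed path where the storage appears
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: myclaim         # the PVC created in the previous step
```

Anything the container writes under the mount path is stored on the dynamically provisioned volume rather than in the container’s own file system.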
To apply it, save the manifest to a file, then run <terminal inline>kubectl apply -f pod-definition.yml<terminal inline>, updating <terminal inline>pod-definition.yml<terminal inline> to match your file name.
Now if you delete the pods, then create a new pod with the same manifest, your data will persist.
To delete the resources you created in this tutorial, you can use <terminal inline>kubectl delete <object-type> <object-name><terminal inline>. However, because the pod depends on the PVC, and the PVC on the storage class, delete them in sequence: first the pod, then the PVC, and finally the storage class.
Trying to delete the PVC while the pod still references it leaves the PVC in a <terminal inline>Terminating<terminal inline> state until the pod is removed.
StorageClasses are an essential part of Kubernetes, and can provide a great deal of flexibility and control over how your data is stored. They allow you to dynamically provision storage, fine-tune performance and management capabilities, and create custom storage solutions.
In addition, StorageClasses can be used to provision storage for specific applications, making it possible to tailor the storage infrastructure to the needs of the workload.