Kubernetes Storage Classes | In-Depth Tutorial

Storage Classes are an essential part of Kubernetes, and can provide a great deal of flexibility and control over how your data is stored. In this guide, we provide an in-depth tutorial on using storage classes effectively.

September 2, 2022
Hrittik Roy
Software Engineer

Kubernetes has gained popularity in recent years for its ability to manage containerized workloads at scale. One of the key features of Kubernetes is its support for multiple storage backends, which allows users to select the storage solution that best fits their needs.

With Kubernetes, a graduated Cloud Native Computing Foundation project, you can provision storage manually whenever your pods need it, or automatically with the help of dynamic provisioning. With dynamic provisioning, Kubernetes creates the storage on demand using the desired provisioner.

In this post, you’ll look at dynamic storage provisioning in Kubernetes using storage classes. You’ll also look at the components of a storage class, with examples and use cases, to see why it’s so important and how it works with the cluster. By the end of this post, you’ll understand how to use storage classes to control the type of storage used for your workloads.

An Introduction to Kubernetes Storage

Before diving into the specifics of Kubernetes storage classes, you must understand the basics of persistent volumes and persistent volume claims.

A persistent volume (PV) is a Kubernetes object that represents a piece of storage, either locally or on the cloud. Pods use the PV of a cluster to store their data. A persistent volume claim (PVC) is a Kubernetes object representing a claim on a PV by forming a one-to-one mapping with your persistent volume and specifying what’s required for the persistent volume to be used by your pod.

Whenever you need storage for your stateful application, you should provision a disk for your PV. For example, <terminal inline>gcePersistentDisk<terminal inline> on Google Cloud:


hrittikhere@hrittik  ~ % gcloud compute disks create gcp-disk --size=10GB --zone=us-east1-b
WARNING: You have selected a disk size of under [200GB]. This may result in poor I/O performance. For more information, see: https://developers.google.com/compute/docs/disks#performance.
Created [https://www.googleapis.com/compute/v1/projects/customer-success-244100/zones/us-east1-b/disks/gcp-disk].
NAME      ZONE        SIZE_GB  TYPE         STATUS
gcp-disk  us-east1-b  10       pd-standard  READY

New disks are unformatted. You must format and mount a disk before it
can be used. You can find instructions on how to do this at:

https://cloud.google.com/compute/docs/disks/add-persistent-disk#formatting

After the disk provision is successful, create a PV using the kubectl command-line tool, or by creating YAML files that define the PV objects. A sample YAML file for your PV will look like this:


apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-disk
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: gcp-disk
    fsType: ext4

In the above manifest, you create a PV with a capacity of one gibibyte that pods in your cluster can mount. The file system is ext4 (the fourth extended filesystem), and the name of the underlying disk is gcp-disk.

The next step is creating a PVC that claims the storage. The same PVC will be used in your pod definition, which will allow your pods to use the disk. The above procedure, in which you need to provision a disk for your PV, is referred to as static storage provisioning, because you create the disks manually.
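
As a minimal sketch of that next step, a claim along the following lines would bind to the pv-disk volume defined earlier. The name pvc-disk is hypothetical and not part of the original walkthrough. Setting storageClassName to an empty string keeps the claim out of dynamic provisioning, and volumeName pins it to the pre-created PV.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-disk            # hypothetical name, chosen for illustration
spec:
  storageClassName: ""      # empty string: no storage class, so no dynamic provisioning
  volumeName: pv-disk       # bind to the statically created PV from the manifest above
  accessModes:
    - ReadWriteOnce         # must be compatible with the PV's access modes
  resources:
    requests:
      storage: 1Gi          # must not exceed the PV's capacity
```

A pod would then reference pvc-disk under its volumes section, in the same way the tutorial later in this guide references its claim.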

However, it’s not convenient or practical to manually create a disk and PV every time you need storage. Kubernetes solves this problem by providing a way to make your storage provisioning more dynamic with dynamic storage provisioning, where PVs and disks are created on the fly when PVCs or pods request storage.

What Is Dynamic Storage Provisioning?

Dynamic storage provisioning is the process of automatically creating storage volumes on demand in a Kubernetes cluster. Instead of an administrator creating disks and PVs in advance, Kubernetes provisions them when a PVC requests storage.

The PVC names a storage class (or falls back to the cluster’s default class), and the provisioner of that class creates a new persistent volume to satisfy the claim. After that, as in static provisioning, the PV is bound to the PVC, and the pod uses the storage through that PVC.

What Is a Storage Class?

StorageClass objects were introduced as a beta feature in Kubernetes 1.4, became stable in 1.6, and have since become essential to Kubernetes storage.

A storage class in Kubernetes defines different storage types, which allows the user to request a specific type of storage for their workloads. Storage classes also allow the cluster administrator to control which type of storage is used for specific workloads by specifying a type of storage.

The StorageClass object contains information about the provisioner, such as Amazon EBS or Google Compute Engine Persistent Disk, as well as which capabilities, such as replication factor or encryption, the storage should have. Kubernetes will then use information from the storage class when it creates new persistent volumes.

Why Use Kubernetes Storage Classes?

Kubernetes storage classes enable an administrator to create and manage multiple storage configurations and bind them to individual applications or workloads. This provides greater flexibility and control over managing storage resources in a Kubernetes cluster, as you don’t need to configure and create different types of storage with various specifications every time a storage request is made.

The dynamic provisioning and class isolation of storage can be used in many different scenarios, including the following common use cases.

Different Quality of Service Levels

Different classes can correspond to varying levels of quality of service, which can mean some classes are faster or have more storage than others.

For example, one class might be designed for files that are accessed frequently, and another for files that are accessed only occasionally. A premium solid-state drive (SSD) can be provisioned for frequently accessed files, while a lower-cost but slower hard disk drive (HDD) can be provisioned for occasionally accessed files.

Azure, for instance, offers two such storage classes for Azure Files: Azurefile Premium Disk and Azurefile Basic.

Azurefile Premium Disk

The premium tier is a high-performance option with high input/output operations per second (IOPS), but it’s also costlier than the standard storage class.

Azure creates this storage class with every managed Kubernetes cluster. To inspect the object, use the command <terminal inline>kubectl describe sc azurefile-premium<terminal inline>.


Name:                  azurefile-premium
IsDefaultClass:        No
Annotations:           <none>
Provisioner:           file.csi.azure.com
Parameters:            skuName=Premium_LRS
AllowVolumeExpansion:  True
MountOptions:
  mfsymlinks
  actimeo=30
ReclaimPolicy:      Delete
VolumeBindingMode:  Immediate
Events:             <none>

In the output above, you can see details about the StorageClass <terminal inline>azurefile-premium<terminal inline>, including information about the type of storage it should use, which in this instance is <terminal inline>Premium_LRS<terminal inline>. <terminal inline>AllowVolumeExpansion<terminal inline> indicates that volumes created from this class can be expanded later if more storage is required.

Azurefile Basic

This is a less expensive alternative that’s best used for less intensive tasks, as it has limited performance when compared to the alternatives. It can be described with the command <terminal inline>kubectl describe sc azurefile<terminal inline>. This storage class also comes built in to every Azure cluster, and uses <terminal inline>Standard_LRS<terminal inline> as the storage disks, as shown below:


Name:                  azurefile
IsDefaultClass:        No
Annotations:           <none>
Provisioner:           file.csi.azure.com
Parameters:            skuName=Standard_LRS
AllowVolumeExpansion:  True
MountOptions:
  mfsymlinks
  actimeo=30
ReclaimPolicy:      Delete
VolumeBindingMode:  Immediate
Events:             <none>

Different Backup Policies

Different storage classes can be created for different backup policies, allowing you to provision disks with different performance and prices. For example, one class might be designed for files that are in the production environment and need to be backed up regularly, while another class might be designed for files in the quality assurance environment that only need to be backed up occasionally.

Arbitrary Policies

Different classes might also be used to support other kinds of administrator-determined policies. Depending on the scenario, one class might be designed to replicate files to multiple servers, while another class might be designed to retain data on a single server, such as in test environments.

As another example, a cluster administrator might want to use a specific type of storage for databases and a different type of storage for application logs.
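
The database and log scenario could be sketched as two classes like the ones below. This is an illustration under stated assumptions, not a recommended configuration: the class names database-storage and log-storage are hypothetical, and the sketch assumes the built-in Amazon EBS provisioner with its gp2 (general purpose SSD) and st1 (throughput-optimized HDD) volume types.

```yaml
# SSD-backed class intended for database volumes
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: database-storage    # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2                 # general purpose SSD
reclaimPolicy: Retain       # keep the disk if the claim is deleted
---
# HDD-backed class intended for application logs
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: log-storage         # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: st1                 # throughput-optimized HDD, cheaper than SSD
reclaimPolicy: Delete       # remove the disk together with the claim
```

Each workload then selects its class by setting storageClassName on its PVC.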

Using Kubernetes Storage Classes

A StorageClass is simply a Kubernetes object, so like other objects, it’s defined by a YAML manifest. However, its properties and metadata are different from those associated with other objects.

A sample manifest of a StorageClass object is below:


apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - debug
volumeBindingMode: Immediate

Understanding Components of Kubernetes Storage Classes

Some keys are present in all Kubernetes objects, and some are exclusive to the object type. The first three keys, <terminal inline>apiVersion<terminal inline>, <terminal inline>kind<terminal inline>, and <terminal inline>metadata<terminal inline>, are used across all Kubernetes objects, while the others in the manifest are specific to StorageClass objects.

apiVersion

The apiVersion field indicates which version of the storage.k8s.io API is being used.

kind

The kind field indicates the type of object being created. In this case, the object is a StorageClass. Other values of <terminal inline>kind<terminal inline> include Pod and Deployment.

metadata

The metadata field contains information about the object, such as its name, labels, and creation timestamp. Because StorageClass objects are cluster-scoped, they have no namespace. The only metadata field in this example is the name field, which specifies the name of the StorageClass, and is required. Other metadata fields are optional.

provisioner

A storage class provisioner is a type of volume plug-in that handles the creation and management of storage within Kubernetes. Storage provisioners are responsible for creating and deleting storage volumes, as well as attaching and detaching them from nodes.

Some provisioners are built in and shipped with Kubernetes and some are third-party provisioners, which is helpful when internal provisioners don’t support your use case. Third-party provisioners are not shipped with Kubernetes, but are developed and maintained by third-party vendors. You can also create your own provisioner using the external provisioner libraries from the Kubernetes community.

The example uses the <terminal inline>kubernetes.io/aws-ebs<terminal inline> provisioner, which refers to the AWSElasticBlockStore volume plug-in. You can identify it as a built-in provisioner because it’s prefixed with <terminal inline>kubernetes.io/<terminal inline>. Other built-in volume plug-ins target cloud providers such as Google Cloud and Azure. The Azure storage classes shown in the earlier examples instead use <terminal inline>file.csi.azure.com<terminal inline>, a CSI driver that is not shipped as part of Kubernetes itself.

More information about volume plug-ins can be found in the official documentation.

reclaimPolicy

The reclaim policy describes what happens to a dynamically provisioned PV when the PVC bound to it is deleted. The options are <terminal inline>Delete<terminal inline>, which removes the PV together with its underlying disk, and <terminal inline>Retain<terminal inline>, which keeps the PV and its data until an administrator removes them manually.

There are benefits and drawbacks to both approaches. With <terminal inline>Retain<terminal inline>, the disk still exists, so you can get your data back if a PVC is deleted by mistake, but this can result in increased costs if retained disks aren’t cleaned up regularly. With <terminal inline>Delete<terminal inline>, the disk is de-provisioned, which avoids leftover costs but makes the deletion permanent.

The default reclaim policy is <terminal inline>Delete<terminal inline>; in the above example, it’s set to <terminal inline>Retain<terminal inline>.

allowVolumeExpansion

Sometimes, you may find that your application needs more storage than was initially provisioned. With the <terminal inline>allowVolumeExpansion<terminal inline> property, you can request a larger size on the existing PVC, and the provisioner takes care of resizing the underlying volume.

When <terminal inline>allowVolumeExpansion<terminal inline> is set to true, the storage class allows volumes created from it to be expanded if needed. This is useful with storage solutions that support on-demand expansion, such as Amazon EBS in the example above.
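
As a sketch of how an expansion is requested, assume a claim named data-claim (a hypothetical name) was originally created with a 10Gi request against the standard class from the sample manifest. Raising the request and re-applying the claim asks Kubernetes to grow the volume:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim              # hypothetical claim, assumed to exist already
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard    # the class must have allowVolumeExpansion: true
  resources:
    requests:
      storage: 20Gi             # raised from the assumed original 10Gi
```

Volumes can only grow this way. Requesting a smaller size than the current one is rejected.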

volumeBindingMode

<terminal inline>volumeBindingMode<terminal inline> is a setting that controls when volume binding and dynamic provisioning occur after a PVC is created.

There are two volume binding modes that can be used:

  • Immediate: In this mode, the volume is provisioned and bound as soon as the PVC is created. The manifest example uses this mode, so whenever a new PVC with the <terminal inline>standard<terminal inline> StorageClass is created, it quickly reaches the bound state. This mode is universally supported.
  • WaitForFirstConsumer: In this mode, the volume is not provisioned or bound until a pod that consumes the PVC is created, so the scheduler can take the pod’s placement into account. Among the built-in plug-ins, only Amazon Elastic Block Store, Google Compute Engine Persistent Disk, and Azure Disk support it for dynamic provisioning. CSI drivers can support it as well, as the <terminal inline>disk.csi.azure.com<terminal inline> classes listed later in this guide show.

mountOptions

The mount options parameter allows an administrator to specify a list of options for mounting a volume. These options can be used to debug mounting issues or fine-tune a volume’s performance.
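
To illustrate where mount options live in a manifest, the sketch below restates the azurefile-premium class from the earlier describe output as YAML. The class name azurefile-premium-custom is hypothetical, and the field values are copied from that output rather than from an official Azure manifest.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-premium-custom   # hypothetical name to avoid clashing with the built-in class
provisioner: file.csi.azure.com
parameters:
  skuName: Premium_LRS
allowVolumeExpansion: true
mountOptions:
  - mfsymlinks                     # options taken from the describe output shown earlier
  - actimeo=30
reclaimPolicy: Delete
volumeBindingMode: Immediate
```

Kubernetes does not validate mount options when the class is created. An invalid option only surfaces later, when mounting the volume fails.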

Allowed Topologies

Sometimes when using the <terminal inline>WaitForFirstConsumer<terminal inline> binding mode, you want your disks to be provisioned in specific zones for proximity or cost. You can do this with the <terminal inline>allowedTopologies<terminal inline> property, as seen in the example below, where the zones are restricted to <terminal inline>us-central-1a<terminal inline> and <terminal inline>us-central-1b<terminal inline>:


apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  # topology.kubernetes.io/zone replaces the deprecated failure-domain.beta.kubernetes.io/zone label
  - key: topology.kubernetes.io/zone
    values:
    - us-central-1a
    - us-central-1b

Parameters

With parameters, you can change the type of your storage disks and enable features such as encryption and replication. You can also configure the file system type and other options, which you can read about in the documentation.

In the manifest example, you used a general purpose SSD (<terminal inline>gp2<terminal inline>). By changing the type to <terminal inline>io1<terminal inline>, <terminal inline>sc1<terminal inline>, or <terminal inline>st1<terminal inline>, you can request different volume types.

When dealing with other storage provisioners, check the respective documentation before adding parameters, as each provisioner has different options.
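
As a hedged sketch of how parameters combine, the class below requests a provisioned IOPS volume (io1) with encryption and an explicit file system. The class name fast-encrypted is hypothetical, and the parameter names are the ones the built-in Amazon EBS provisioner accepts. Other provisioners use different keys.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-encrypted      # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1                 # provisioned IOPS SSD instead of gp2
  iopsPerGB: "10"           # only meaningful for io1 volumes
  encrypted: "true"         # encrypt the EBS volume
  fsType: ext4              # file system created on the volume
```

Parameter values are passed to the provisioner as strings, which is why the number and the boolean are quoted.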

Create Storage Classes

Now that you understand the components and parameters of storage classes, let’s look at how to create a pod that uses a storage class to provision a volume.

You’ll need a basic understanding of Kubernetes to follow along, as well as an Azure subscription to use an Azure cluster. If you’re not using Azure, the general principles are universally applicable, but you’ll need to change the provisioner.

Tutorial

The following manifest defines the storage class that you’ll use:


apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-storage
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Standard_LRS
  location: eastus

In this example, you’re creating Azure Standard_LRS storage in the East US region. To create the storage class, save the contents of the above code block as a YAML file, then use the <terminal inline>kubectl apply -f <file_name><terminal inline> command to apply the definitions:


hrittik@Azure:~/storage$ kubectl apply -f sc-definition.yml
storageclass.storage.k8s.io/azure-storage created

The file name here is <terminal inline>sc-definition.yml<terminal inline>, and you can see the storage class creation is successful.

List Storage Classes

To list all the storage classes in your cluster, you can use the <terminal inline>kubectl get sc<terminal inline> command. In this command, <terminal inline>sc<terminal inline> is the short name for StorageClass:

hrittik@Azure:~/storage$ kubectl get sc

NAME                    PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
azure-storage           kubernetes.io/azure-disk   Delete          Immediate              false                  10s
azurefile               file.csi.azure.com         Delete          Immediate              true                   23d
azurefile-csi           file.csi.azure.com         Delete          Immediate              true                   23d
azurefile-csi-premium   file.csi.azure.com         Delete          Immediate              true                   23d
azurefile-premium       file.csi.azure.com         Delete          Immediate              true                   23d
default (default)       disk.csi.azure.com         Delete          WaitForFirstConsumer   true                   23d
managed                 disk.csi.azure.com         Delete          WaitForFirstConsumer   true                   23d
managed-csi             disk.csi.azure.com         Delete          WaitForFirstConsumer   true                   23d
managed-csi-premium     disk.csi.azure.com         Delete          WaitForFirstConsumer   true                   23d
managed-premium         disk.csi.azure.com         Delete          WaitForFirstConsumer   true                   23d

The first listing, with the name <terminal inline>azure-storage<terminal inline>, is the one you just created. It has the <terminal inline>Immediate<terminal inline> volume binding mode, which means the volume is created as soon as a PVC is created, and the default <terminal inline>Delete<terminal inline> reclaim policy. The other storage classes in the list are the default ones that get created with an Azure cluster.
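
The (default) marker in the listing identifies the class that Kubernetes uses for PVCs that don’t set storageClassName. As a sketch, a class is marked as the default through an annotation. The class name below is hypothetical, and the example assumes the Azure Disk CSI driver that appears in the listing.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-storage-default    # hypothetical name
  annotations:
    # marks this class as the cluster default for PVCs without a storageClassName
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: disk.csi.azure.com
parameters:
  skuName: Standard_LRS
volumeBindingMode: WaitForFirstConsumer
```

Only one class should carry this annotation at a time, so on a cluster that already ships a default class you would remove the annotation from that class first.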

Create a Persistent Volume Claim

To use the storage class you have created, you need to have a PVC. To create one, save the below manifest in a YAML file, and then apply it with <terminal inline>kubectl apply -f <file_name><terminal inline>.


kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: azure-storage
  resources:
    requests:
      storage: 500Mi

The manifest tells Kubernetes to create a PVC with the name <terminal inline>myclaim<terminal inline>, and to use the <terminal inline>azure-storage<terminal inline> storage class.


hrittik@Azure:~/storage$ kubectl apply -f pvc-definition.yml
persistentvolumeclaim/myclaim created

List PVCs and PVs

To verify that your PVC has been created successfully, you can run <terminal inline>kubectl get pvc<terminal inline> to list all PVCs:

hrittik@Azure:~/storage$ kubectl get pvc

NAME      STATUS   VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS    AGE
myclaim   Bound    pvc-fc949d34-f7af-4fdb-8ded-6589f39b2175   1Gi        RWO           azure-storage   16s

If you see the above output and the name you gave your PVC, your cluster has successfully created a PVC.

As the volume binding mode is <terminal inline>Immediate<terminal inline>, a PV is created as soon as the PVC is. You can list your newly created persistent volume using the following command:

hrittik@Azure:~/storage$ kubectl get pv

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS    REASON   AGE
pvc-fc949d34-f7af-4fdb-8ded-6589f39b2175   1Gi        RWO            Delete           Bound    default/myclaim   azure-storage            118s

Create a Pod

As a final step, you’ll create a pod that will use your persistent volume when it requires persistent storage for operational data. You can do this by applying the following manifest:


apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: frontend
      image: nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: web
  volumes:
    - name: web
      persistentVolumeClaim:
        claimName: myclaim

To apply, save it, then use the command below, updating <terminal inline>pod-definition.yml<terminal inline> to match your file name:


hrittik@Azure:~/storage$ kubectl apply -f pod-definition.yml
pod/mypod created

Now if you delete the pod and then create a new pod with the same manifest, your data will persist.

Clean Up

To delete the resources you created in this tutorial, you can use <terminal inline>kubectl delete object object-name<terminal inline>. However, because the pod uses the PVC, and the PVC was provisioned through the storage class, they should be deleted sequentially. The correct order is listed below:


hrittik@Azure:~$ kubectl delete pods mypod
pod "mypod" deleted

hrittik@Azure:~$ kubectl delete persistentvolumeclaim myclaim
persistentvolumeclaim "myclaim" deleted

hrittik@Azure:~$ kubectl delete sc azure-storage
storageclass.storage.k8s.io "azure-storage" deleted

Trying to delete the PVC while the pod is still running would leave the PVC in a <terminal inline>Terminating<terminal inline> state until the pod is deleted, because Kubernetes protects PVCs that are in active use by a pod.

Final Thoughts

StorageClasses are an essential part of Kubernetes, and can provide a great deal of flexibility and control over how your data is stored. They allow you to dynamically provision storage, fine-tune performance and management capabilities, and create custom storage solutions.

In addition, StorageClasses can be used to provision storage for specific applications, making it possible to tailor the storage infrastructure to the needs of the workload.

Hrittik Roy
Software Engineer

Hrittik is a writer and a software engineer specializing in cloud native ecosystems. He has worked on many large-scale projects and has experience in both the technical and the business aspects of cloud computing. He is a frequent speaker at conferences and has written numerous articles on software development and distributed systems. In his free time, he likes to go for long walks.
