Simple Kubernetes workloads can fail and be restarted by the kubelet in a clean state without any problem. Nontrivial workloads (for example, those whose containers need to persist state or share files with other containers) need a way to recover their previous state whenever they restart.
Persistent Volumes provide an API that lets Kubernetes administrators manage volumes in a safe and abstracted way, without needing to understand the nitty-gritty of different storage providers. They also give Pods a convenient way to store the state they need to perform their tasks.
When working with Persistent Volumes, two common errors in Kubernetes are FailedAttachVolume and FailedMount. These errors generally mean that Kubernetes failed to use the desired volume, which in turn prevented workloads from functioning as intended.
Since an underlying volume can malfunction for many different reasons, you need to dig deeper to find the root cause. In this article, you will learn how to troubleshoot the incident when you see these errors.
Understanding Persistent Volumes
Persistent Volumes are storage resources that are provisioned either statically by administrators or dynamically by the cluster, and they are managed like any other Kubernetes resource. Each one has its own life cycle, independent of the individual Pod that uses it. The Pod, however, depends strictly on its Persistent Volume: if the volume is unavailable, the workload can’t operate normally.
Once a Persistent Volume object is created, an underlying disk is also created. The disk is then attached to the node where the Pod is scheduled and mounted on the desired path. When the workload needs to move somewhere else in the cluster, the reverse process occurs: the volume is unmounted and detached from the node, then attached and mounted at its new destination.
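To see where a volume currently sits in this life cycle, you can list the claims and volumes with kubectl. The following is a minimal sketch rather than a definitive procedure: the helper name <terminal inline>show_volume_state<terminal inline> is hypothetical, and it assumes kubectl is already configured for your cluster.

```shell
# Hypothetical helper: show the claims in a namespace and the cluster's volumes.
# $1 = namespace (defaults to "default")
show_volume_state() {
  # PersistentVolumeClaims are namespaced
  kubectl get pvc --namespace "${1:-default}"
  # PersistentVolumes are cluster-scoped, so no namespace is needed
  kubectl get pv
}

# Example: show_volume_state default
```

The STATUS column of both listings shows whether a claim and its volume are bound to each other.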
When working with dynamically provisioned volumes in cloud environments (e.g., AWS, Azure, or Google Cloud Platform), it’s not uncommon for Persistent Volume life cycles to be broken, preventing the underlying disk from being correctly detached and attached. This will prevent correct workload scheduling, potentially causing downtime or data loss.
Troubleshooting the Error
The Persistent Volume life cycle can be broken for a number of reasons:
- node failure
- underlying service API call failure
- network partition
- incorrect access mode (e.g., a ReadWriteOnce volume requested by Pods on different nodes)
- new node already has too many disks attached
- new node does not have enough mount points
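Two of the causes above, access mode and the number of attached disks, can often be checked directly with kubectl. This is a hedged sketch: the helper names are hypothetical, and the second helper assumes the cluster uses a CSI driver that reports an allocatable volume count on the CSINode object, which not every driver does.

```shell
# Hypothetical helper: print the access modes declared on a PersistentVolume.
# $1 = PersistentVolume name
pv_access_modes() {
  kubectl get pv "$1" -o jsonpath='{.spec.accessModes}'
}

# Hypothetical helper: print each CSI driver on a node with the number of
# volumes it reports as allocatable (empty if the driver does not report one).
# $1 = node name
node_attach_limits() {
  kubectl get csinode "$1" \
    -o jsonpath='{range .spec.drivers[*]}{.name}{"\t"}{.allocatable.count}{"\n"}{end}'
}

# Example: pv_access_modes <pv-name>; node_attach_limits node-1
```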
These issues usually manifest themselves through Pods failing to start and becoming stuck in an endless waiting loop. To help diagnose the issue, you’ll need to <terminal inline>describe<terminal inline> the Pod with <terminal inline>kubectl describe pod<terminal inline> and try to understand what’s going on.
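As a minimal sketch of that step, the wrapper below describes a Pod in a given namespace. The function name <terminal inline>describe_pod<terminal inline> and the Pod name in the example are placeholders, not values from a real cluster.

```shell
# Hypothetical helper: describe a Pod.
# $1 = Pod name, $2 = namespace (defaults to "default")
describe_pod() {
  kubectl describe pod "$1" --namespace "${2:-default}"
}

# Example: describe_pod my-pod
```

The events appear at the end of the describe output.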
Under <terminal inline>[events](https://www.containiq.com/post/kubernetes-events)<terminal inline>, you’ll find a series of messages related to the Pod’s life cycle that can help you diagnose the issue.
The failures can generally be divided into two main categories. On one side, there are detach failures, where Kubernetes is unable to detach a disk from a specific node. On the other side, there are attach and mount failures, where Kubernetes can’t attach and/or mount a disk on the new node.
FailedAttachVolume occurs when a volume cannot be detached from its current node and attached to a new one. When Kubernetes performs the detach and attach operation, it first checks whether the volume is safe to detach and aborts the operation if the check fails; it does not force detach a volume. The error therefore usually points to a failure in the underlying storage infrastructure, which the message <terminal inline>Volume is already exclusively attached to one node and can’t be attached to another<terminal inline> confirms. There can be other causes, such as too many disks attached to a node, but these will be shown in the event message.
FailedMount means a volume can’t be mounted on a specific path. It can be a consequence of the previous error, since the mount operation happens after the attach: when the attach fails, the mount operation times out waiting for the volume. Other causes include an incorrect device path or device mount path.
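When a volume is reported as attached to another node, it can help to see which node Kubernetes believes holds it. The sketch below lists VolumeAttachment objects with the volume and node for each. It assumes the volumes are handled by a CSI driver, since that is when these objects are created, and the helper name is hypothetical.

```shell
# Hypothetical helper: list which PersistentVolume is attached to which node.
list_volume_attachments() {
  kubectl get volumeattachments \
    -o custom-columns='NAME:.metadata.name,PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName,ATTACHED:.status.attached'
}

# Example: list_volume_attachments
```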
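To find every Pod currently affected by either error, you can filter events by their reason. This is a sketch under the assumption that you are allowed to read events in all namespaces; the helper name is hypothetical. Field selectors match one value at a time, so the helper queries each reason in turn.

```shell
# Hypothetical helper: list events for both failure reasons in all namespaces.
volume_failure_events() {
  for reason in FailedAttachVolume FailedMount; do
    kubectl get events --all-namespaces --field-selector "reason=${reason}"
  done
}

# Example: volume_failure_events
```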
Recovering from the Failure
Since Kubernetes can’t always recover from FailedAttachVolume and FailedMount errors on its own, you sometimes have to take manual steps.
Failure to Detach
When Kubernetes fails to detach a disk, you can use the storage provider’s CLI or API to detach it manually. For example, when using Azure, you can detach a disk from a virtual machine with the Azure CLI’s <terminal inline>az vm disk detach<terminal inline> command.
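A minimal sketch of the Azure case follows. The helper name is hypothetical, and the resource group, virtual machine, and disk names are placeholders you must replace with your own values. It assumes the node is a standalone virtual machine; nodes that belong to a virtual machine scale set are managed through a different command group, so check the Azure CLI reference for that case.

```shell
# Hypothetical helper: detach a managed disk from an Azure virtual machine.
# $1 = resource group, $2 = VM (node) name, $3 = disk name
azure_detach_disk() {
  az vm disk detach --resource-group "$1" --vm-name "$2" --name "$3"
}

# Example: azure_detach_disk <resource-group> <vm-name> <disk-name>
```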
When using AWS EBS volumes, you can perform the same operation with the AWS CLI’s <terminal inline>aws ec2 detach-volume<terminal inline> command.
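For AWS, a hedged sketch might first inspect where the volume is attached and then detach it. The helper names are hypothetical and the volume ID is a placeholder.

```shell
# Hypothetical helper: show the current attachments of an EBS volume.
# $1 = EBS volume ID
aws_volume_attachments() {
  aws ec2 describe-volumes --volume-ids "$1" --query 'Volumes[0].Attachments'
}

# Hypothetical helper: detach an EBS volume from the instance it is attached to.
# $1 = EBS volume ID
aws_detach_volume() {
  aws ec2 detach-volume --volume-id "$1"
}

# Example: aws_volume_attachments <volume-id>; aws_detach_volume <volume-id>
```

The <terminal inline>detach-volume<terminal inline> command also accepts a <terminal inline>--force<terminal inline> flag, but forcing a detach can lose data or corrupt the file system, so treat it as a last resort.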
Failure to Attach or Mount
Sometimes Kubernetes can detach the volume but is unable to attach or mount the disk on the scheduled node. In this situation, the easiest way to overcome the issue is to force Kubernetes to schedule the workload on another node. This can be done in a few different ways.
Cordoning a Node
Cordoning marks a node as unschedulable, which means the Kubernetes scheduler will not consider it when placing new Pods. Let’s say you have a Pod scheduled to <terminal inline>node-2<terminal inline>, but it’s unable to start because the node doesn’t have enough mount points available. You can cordon the node with <terminal inline>kubectl cordon node-2<terminal inline>.
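As a small sketch, the helpers below cordon a node and later make it schedulable again. The function names are hypothetical; <terminal inline>node-2<terminal inline> is the node from the example above.

```shell
# Hypothetical helper: mark a node as unschedulable.
# $1 = node name
cordon_node() {
  kubectl cordon "$1"
}

# Hypothetical helper: make the node schedulable again once the issue is fixed.
# $1 = node name
uncordon_node() {
  kubectl uncordon "$1"
}

# Example: cordon_node node-2
```

Pods already running on a cordoned node are not evicted; only new scheduling decisions are affected.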
The Pod can then be rescheduled to another node by deleting it, provided a controller such as a Deployment or StatefulSet recreates it.
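One way to trigger the rescheduling, sketched here with a hypothetical helper and placeholder names, is to delete the stuck Pod. This assumes the Pod is managed by a controller; a bare Pod would simply be removed and not recreated.

```shell
# Hypothetical helper: delete a Pod so that its controller recreates it.
# $1 = Pod name, $2 = namespace (defaults to "default")
reschedule_pod() {
  kubectl delete pod "$1" --namespace "${2:-default}"
}

# Example: reschedule_pod my-pod
```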
Node Selectors, Affinity, and Anti-Affinity
Node selectors, affinity, and anti-affinity tell Kubernetes which nodes Pods should or should not be scheduled on. Nodes carry labels, which <terminal inline>nodeSelector<terminal inline> as well as <terminal inline>affinity<terminal inline> and <terminal inline>anti-affinity<terminal inline> rules use to force Pods to be scheduled accordingly.
The simplest mechanism is <terminal inline>nodeSelector<terminal inline>: a node is assigned a label, and the Pod is configured with a node selector that matches it. For example, if you are sure <terminal inline>node-1<terminal inline> can have another disk attached and has enough mount points available, you can give it the label <terminal inline>schedule=nginx<terminal inline> with <terminal inline>kubectl label<terminal inline>.
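A minimal sketch of labeling the node follows. The helper name is hypothetical; the label key and value, <terminal inline>schedule=nginx<terminal inline>, are the ones used in this example.

```shell
# Hypothetical helper: add the schedule=nginx label to a node.
# $1 = node name
label_node_for_nginx() {
  kubectl label nodes "$1" schedule=nginx
}

# Example: label_node_for_nginx node-1
```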
You can then configure the Pod with the <terminal inline>schedule=nginx<terminal inline> node selector by setting <terminal inline>nodeSelector<terminal inline> in its spec.
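The sketch below applies an illustrative Pod manifest that carries this node selector. The Pod name, image, mount path, and the claim name <terminal inline>nginx-data<terminal inline> are assumptions made for the example, as is the helper name; only the <terminal inline>schedule: nginx<terminal inline> selector comes from the text above.

```shell
# Hypothetical helper: apply a Pod that may only run on nodes labeled schedule=nginx.
apply_nginx_pod() {
  kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx                 # illustrative name
spec:
  nodeSelector:
    schedule: nginx           # must match the label on the node
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: nginx-data # assumed to be an existing claim
EOF
}

# Example: apply_nginx_pod
```

If no node carries a matching label, the Pod stays pending, so it is worth confirming the label on the node first.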
Persistent Volumes provide an abstraction that allows Kubernetes workloads to easily provision persistent storage that can survive restarts and scheduling to different nodes. Sometimes the Persistent Volume life cycle is broken and Kubernetes can’t perform rescheduling on its own. FailedAttachVolume and FailedMount are two common errors in this situation that mean Kubernetes is unable to detach, reattach, and mount a volume.
When this happens, you may need to manually detach a disk or instruct Kubernetes Scheduler to start the Pod in a specific node.
The first step to fixing any issue is to understand it. Unless you are proactively alerted, you’ll have to spend time finding the root cause, precious time that adds to the downtime already ticking away or, even worse, to data loss.
ContainIQ can give you a hand by monitoring your Kubernetes cluster and alerting on events whenever an error like FailedAttachVolume or FailedMount happens, making it easier to fix and perhaps paving the way for automated self-healing capabilities.