Intro to YAML: Kubernetes Objects and Configurations

October 15, 2021

Kubernetes objects can be expressed in YAML format. This article gives readers examples of various Kubernetes configurations in YAML, how to make them, and how to keep them organized.

Kasper Siig
DevOps Engineer

YAML has become a very popular choice for configuration files. It takes what many developers are used to, JSON, and increases readability. This is mainly done by removing unnecessary characters like brackets and quotes, as well as using an indentation approach, which is more intuitive for humans to parse.

You might think that the structure of YAML closely resembles that of JSON, and you would be right: YAML is a superset of JSON. This means that any valid <terminal inline>.json<terminal inline> file can be parsed by a YAML parser. Yes, this means you can write all your YAML in JSON instead, but given how popular YAML is and how widely it’s used in guides and tutorials, it’s a good idea to learn how it works. This is especially true in Kubernetes, where resources are typically defined in <terminal inline>.yaml<terminal inline> files.
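
As a quick illustration of that relationship, the sketch below writes the same data twice in one YAML file: once in JSON syntax and once in YAML’s indentation-based syntax. The top-level keys <terminal inline>json_style<terminal inline> and <terminal inline>yaml_style<terminal inline> are made up for this example and mean nothing to Kubernetes.

```yaml
# JSON syntax: a YAML parser accepts this line as-is.
json_style: {"metadata": {"name": "nginx", "labels": {"app": "nginx"}}}

# The same structure in YAML's indentation-based syntax.
yaml_style:
  metadata:
    name: nginx
    labels:
      app: nginx
```

Both keys hold an identical nested structure once parsed; only the notation differs.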

YAML Formatting

There are many possibilities within YAML, but at the base of it you only need to know two concepts: lists and maps. Although a <terminal inline>.yaml<terminal inline> file can seem complex, it all boils down to these two concepts. A map is a collection of keys and values, an example of which can be seen at the top of almost any Kubernetes configuration file. Take a look at the first two lines defining a Pod:


apiVersion: v1
kind: Pod

Notice the lack of quotes. YAML infers the data type from how the value is written. Here both values will be parsed as strings, whereas a value of <terminal inline>1<terminal inline> will be parsed as an integer, and a value of <terminal inline>true<terminal inline> will be parsed as a boolean. If you need such a value to stay a string, wrap it in quotes. Values in a map are not limited to simple data types. The value of a map can also be another map, as is the case for the <terminal inline>metadata<terminal inline> field in a Kubernetes configuration file:


metadata:
  name: nginx

You can think of maps as JSON objects: collections of key-value pairs whose values can themselves be objects.
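
To make the type rules concrete, here is a small sketch of how a parser would typically read different values. Apart from <terminal inline>name<terminal inline>, <terminal inline>replicas<terminal inline>, and <terminal inline>metadata<terminal inline>, the keys are invented for illustration and are not Kubernetes fields.

```yaml
# Unquoted values are typed by how they look.
name: nginx        # string
replicas: 1        # integer
enabled: true      # boolean (hypothetical key)

# Quoting forces a string, whatever the content looks like.
version: "1"       # string, not an integer (hypothetical key)
flag: "true"       # string, not a boolean (hypothetical key)

# A map value can itself be a map, nested by indentation.
metadata:
  labels:
    app: nginx
```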

The second concept you need to know is lists. You can find these in JSON as arrays. Here’s how they look in YAML:


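args:
- sleep
- "1000"

To make a list, start each item on a new line with a dash. A list can hold any number of items, and the items can be strings, integers, booleans, and even maps. The quotes around <terminal inline>"1000"<terminal inline> are there because Kubernetes expects every entry in <terminal inline>args<terminal inline> to be a string; unquoted, YAML would parse it as an integer.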
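
Lists of maps come up constantly in Kubernetes, so a short sketch may help. The second container below, <terminal inline>sidecar<terminal inline>, is a hypothetical addition used only to show how items are separated.

```yaml
# A list of plain strings.
args:
- sleep
- "1000"        # quoted so it is read as a string

# A list whose items are maps. Each dash starts a new item, and the
# keys of one item line up under the first key after the dash.
containers:
- name: nginx
  image: nginx
- name: sidecar   # hypothetical second container
  image: busybox
```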

Now that you know the two core concepts of any <terminal inline>.yaml<terminal inline> file, there’s one last important thing to know: indentation. Because YAML aims to be human-readable, it uses indentation rather than brackets to express structure. In the nested map example, the inner map was marked as nested by indenting it beneath its key. It’s very important to keep your indentation in order: a misaligned line either changes the structure of your data or makes the file fail to parse, and both can be tough to troubleshoot.
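
The following sketch shows how a single indentation change alters the structure. The two documents are separated by <terminal inline>---<terminal inline>, the YAML marker for a new document within the same file.

```yaml
# Document 1: labels is nested inside metadata, next to name.
metadata:
  name: nginx
  labels:
    app: nginx
---
# Document 2: labels has lost its indentation, so it is now a
# top-level key and no longer part of metadata.
metadata:
  name: nginx
labels:
  app: nginx
```

Both documents are valid YAML, which is why this kind of mistake is easy to miss: the parser is satisfied, but the second document no longer says what you meant.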

Representing Kubernetes Objects with YAML (Deployment Example)

You may be used to creating Pods, Deployments, Services, and so on in Kubernetes via the <terminal inline>kubectl create<terminal inline> command. This way of creating objects is valid and great for learning purposes. However, when running Kubernetes in production, you typically want all your objects defined in <terminal inline>.yaml<terminal inline> files. This makes it easier for others to see what’s running in the cluster and lets you keep your deployments under version control.

Kubernetes makes it easy to see how an existing object is defined in YAML. When you run <terminal inline>kubectl get<terminal inline>, add the flag <terminal inline>-o yaml<terminal inline>. This will output the objects in YAML, rather than the typical list view. Try running <terminal inline>kubectl create deployment nginx --image=nginx<terminal inline>. This will create a deployment that you can now view by running <terminal inline>kubectl get deployment nginx -o yaml<terminal inline>. You’ll see a lot of lines printed, displaying everything there is to know about the <terminal inline>nginx<terminal inline> deployment. Thankfully, you don’t need to write all of those lines in your own <terminal inline>.yaml<terminal inline> files, because <terminal inline>kubectl get<terminal inline> also shows the fields that Kubernetes generated automatically. When you remove those, you get the following configuration file:


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx

While this can seem like a lot if you’re used to only running <terminal inline>kubectl create<terminal inline> commands, it’s recommended that you start looking into configuration definitions. This will not only help you be more knowledgeable about Kubernetes in general, but will also help you in terms of keeping your deployments defined as code.

If you don’t want to go through the process of creating an object, viewing it, and trimming it down, there’s another option built into <terminal inline>kubectl<terminal inline>. For example, if you want to know what a Pod definition looks like, you can run <terminal inline>kubectl run nginx --image=nginx --dry-run=client -o yaml<terminal inline>. The <terminal inline>--dry-run=client<terminal inline> flag makes <terminal inline>kubectl<terminal inline> build the object without sending it to the cluster, which is why it’s typically used to validate a <terminal inline>create<terminal inline> command. The <terminal inline>-o yaml<terminal inline> flag prints the object as YAML, giving you the following output:


apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

Comparing these two definitions, you can see that there are similarities, like <terminal inline>apiVersion<terminal inline> and <terminal inline>kind<terminal inline>. Fields like these are required for every object, whereas others are optional or required only for certain types of objects.
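
The generated Pod definition can be trimmed further before you save it. The sketch below drops the empty placeholders (<terminal inline>creationTimestamp: null<terminal inline>, <terminal inline>resources: {}<terminal inline>, <terminal inline>status: {}<terminal inline>) and the two policies, on the assumption that you are happy with the defaults: <terminal inline>ClusterFirst<terminal inline> and <terminal inline>Always<terminal inline> are what Kubernetes applies to a Pod when those fields are left out.

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
```

Whether to keep defaulted fields is a judgment call. Leaving them out keeps the file short, while spelling them out documents the behavior you rely on.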

Required and Important Optional Fields

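As noted before, some fields have to be set in almost every configuration file: <terminal inline>apiVersion<terminal inline>, <terminal inline>kind<terminal inline>, <terminal inline>metadata<terminal inline>, and <terminal inline>spec<terminal inline>. A few kinds of objects, such as ConfigMaps, use a differently named field like <terminal inline>data<terminal inline> in place of <terminal inline>spec<terminal inline>. The official Kubernetes documentation describes these fields in more detail.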

Beyond the required fields, some optional fields are widely considered best practice. A common one is <terminal inline>labels<terminal inline> under <terminal inline>metadata<terminal inline>. When you’re about to create a new type of object, take a look at some examples online, as the important optional fields vary depending on the type of object.
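
As a rough sketch, labels are plain key-value pairs under <terminal inline>metadata<terminal inline>. Only <terminal inline>app: nginx<terminal inline> comes from the examples in this article; the other two labels are hypothetical.

```yaml
metadata:
  name: nginx
  labels:
    app: nginx
    environment: production   # hypothetical label
    team: web                 # hypothetical label
```

Labels like these let you select groups of objects, for example with <terminal inline>kubectl get pods -l app=nginx<terminal inline>.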

Another commonly used field that can affect how your service runs is <terminal inline>annotations<terminal inline>. Annotations are read by tools such as operators and agents. One example is the DataDog Agent, which is used for logging and monitoring purposes: it looks for specific annotations to figure out which deployments it should scan for logs.
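
Annotations sit next to labels under <terminal inline>metadata<terminal inline>. The key and value below are placeholders, not the annotation names any real agent uses; check the documentation of the tool you run for the exact names it expects.

```yaml
metadata:
  name: nginx
  annotations:
    # Hypothetical key and value, for illustration only.
    example.com/log-source: nginx
```

In a Deployment, annotations meant for the Pods themselves go under <terminal inline>spec.template.metadata<terminal inline> rather than the Deployment’s own <terminal inline>metadata<terminal inline>.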

YAML Best Practices

Now that you know the basics of YAML, you are ready to start writing your own configurations. However, as with anything in software, there are some best practices you should follow.

First, use spaces, not tabs. Tabs versus spaces is a hot topic among developers, but in YAML it’s not a matter of opinion: the YAML specification does not allow tab characters for indentation, so parsers will fail on a file that is indented with tabs.

Another best practice has more to do with making your life easier as an engineer. Use a monospaced font when viewing and editing <terminal inline>.yaml<terminal inline> files. This makes it a lot easier to spot any errors in indentation. Taking it a step further, consider configuring your text editor or IDE, or installing a plugin for it, so that spaces and tabs are shown visibly.

Finally, use as little indentation as possible. While keys and values on the same level must be indented the same amount, YAML isn’t too picky about how much they’re indented. These two files are both equally valid:


args:
- sleep 1000

args:
  - sleep 1000

With a small example like this, indentation may seem insignificant, but once you start nesting many maps and lists, it can become tough to manage. Keeping the width of your indentation to a minimum can help quite a bit with readability. If you do choose to indent according to the second example, make sure you’re consistent. Different indentation widths can technically be mixed in a single file, but the file quickly becomes very hard to read if you do.
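
One way to keep indentation consistent is to let a linter check it. This article does not prescribe any particular tool, so treat the following as a sketch that assumes you use yamllint and store its configuration in a <terminal inline>.yamllint<terminal inline> file at the root of your repository.

```yaml
# .yamllint: start from yamllint's default rules and tighten indentation.
extends: default
rules:
  indentation:
    spaces: 2                     # every nesting level is two spaces wide
    indent-sequences: consistent  # pick one list style per file and stick to it
```

The width of two spaces is an assumption made for this example; the point is to pick one value and have it enforced.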

Conclusion

You now know a bit more about how YAML plays into Kubernetes as a whole, and you can start writing your own <terminal inline>.yaml<terminal inline> files. You are now able to define your configurations more comprehensively, share them with others, and version control them. This also makes them easier to reuse, since tools like Kustomize and Helm let you build on top of configuration files.

Time to go ahead and take a look at how you can best implement YAML into your workflow. Remember to use spaces for indentation, and remember: less is more.


As a DevOps engineer, Kasper Siig is used to working with a variety of exciting technologies, from automating simple tasks to CI/CD to Docker.
