
Kubernetes Logging | Using Kubectl & Best Practices

October 18, 2021

Logging is important for any application. This article goes through tools and practices for effectively using logging with Kubernetes.

Tyler Charboneau
Software Engineer

Our systems are subject to numerous events, performance fluctuations, and management processes every operating hour. This is especially true within Kubernetes, a layered management system that ties together the critical software and hardware components behind your application. Logging captures these events during runtime so that engineers can analyze them later, and that record is its principal value.

Just like good governance is important during both peace and war, logging has immense value during both flawless runtime and application crises. Also, just because your application is performing as required doesn’t mean resource allocations are fully optimized. Logs can reveal areas for improvement, both obvious and obscure.

There are many ways you can collect and view logs within Kubernetes. This guide breaks down the particulars of Kubernetes logging to help steer you in the right direction. You’ll learn about methodologies, tools, and more to better manage Kubernetes.

What’s Logged in Kubernetes

By default, your Kubernetes system will self-monitor for any notable events that occur throughout uptime. Because there are different events that hold meaning at unique levels of the system, Kubernetes uses a logging architecture to keep things organized. You can leverage logging at the container, node, and cluster levels during operation, or after a set point, using <terminal inline>kubectl<terminal inline>.

Let’s take nodes, for example. Many pods live on each node and carry out processes; when these processes are completed—or resources become scarce—those pods are terminated or evicted. This is an event worthy of investigation, since it may indicate a systemic problem. Perhaps better provisioning is in order.

Similarly, the Kubernetes log might also track different scheduling behaviors as they occur throughout the environment.

You’d want to hear about a <terminal inline>FailedScheduling<terminal inline> event and understand why it occurred. <terminal inline>Rebooted<terminal inline>, <terminal inline>NodeNotReady<terminal inline>, and <terminal inline>HostPortConflict<terminal inline> also fit this bill. You may also view Failed events, Warnings, and more within your logs.
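
To surface these events directly, you might run something along these lines (the namespace is a placeholder for your own):

# List recent events in a namespace, filtered down to warnings
kubectl get events -n <namespace> --field-selector type=Warning

# Or sort every event by creation time to see the latest activity
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp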

However, this might not display as you’d expect. Kubernetes doesn’t default to a GUI where you can search through logs visually. Instead, you’d use <terminal inline>kubectl<terminal inline> commands to get log files from the Kubernetes “black box”—choosing filters or other criteria with significance to your deployment.

Viewing Logs With Kubectl

There are a number of commands you might use to summon log files. Varying widely in complexity, each taps into background details on specific resources within Kubernetes. That might mean simply grabbing the log for a single pod, or something relatively complex, like leveraging arguments to access labeled logs for pods running multiple containers.

Let’s tackle the first example via the following command:


kubectl -n <namespace> logs <podname>

This effectively summons any relevant system logs for a specific pod. <terminal inline>kubectl<terminal inline> outputs the result of that command immediately beneath it. Thankfully, Kubernetes is pretty good about spitting this information out in a format that’s human-readable.

The readability of these results makes it much simpler to understand your system’s inner workings. It’s typically assumed that Kubernetes will produce a log reliably during a given period of time. However, that’s not always the case.

It’s possible that time elapses where no critical events—or processes—are observed. You might see this when attempting to view logs during a predefined window of time. The terminal window will be blank to signal this result.

That example brings us to syntax additions; what exactly are they, and how do they function?

You might think of specialized syntax as an added level of complexity, or a portion of your <terminal inline>kubectl<terminal inline> command which provides deeper instruction. One such example is the double dash. On its own, <terminal inline>--<terminal inline> acts as a separator between segments of a single command: anything typed after the symbol is treated as a positional argument, which is useful for defining specificity in your outputs. As a prefix, the double dash also introduces long-form flags such as <terminal inline>--since<terminal inline>, which is pivotal in viewing time-based pod logs:


kubectl -n kube-system logs --since=2h hyperpod2

The above command would pull all logs for the pod, <terminal inline>hyperpod2<terminal inline>, from the last two hours. Again, when applying this to a legitimate Kubernetes setup, there may or may not be a relevant output.

Note the equal sign within that command. This is used to attach a specific value to a flag. That value might be a period of time, a specific timestamp, or an existing label. Overall, syntax also refers to the general structure of a command, which is vital to successful execution: a logical ordering of components.
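
To make those pieces concrete, here’s a hedged sketch; the pod name comes from the example above, while the label and paths are placeholders:

# Attach a specific value with the equal sign: logs written since an RFC3339 timestamp
kubectl -n kube-system logs --since-time=2021-06-08T00:00:00Z hyperpod2

# Or select pods by an existing label
kubectl -n kube-system logs --selector=app=hyperpod --since=2h

# The bare double dash separates kubectl's own flags from a command passed to the container
kubectl -n kube-system exec hyperpod2 -- ls /var/log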

So, what’s an argument? Your YAML configuration file might contain a field called <terminal inline>args<terminal inline>, consisting of strings applicable to processes or commands within Kubernetes. Your argument might act as a shortcut within a command—telling Kubernetes to perform an action without requiring excessive character entry.

These arguments apply to your pods and their associated containers.
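
As a minimal, hypothetical sketch of the idea—the pod name, image, and strings below are illustrative only:

---
apiVersion: v1
kind: Pod
metadata:
  name: args-demo
spec:
  containers:
  - name: demo
    image: busybox
    # command overrides the image's entrypoint; args supplies its parameters
    command: ["echo"]
    args: ["hello", "from", "kubernetes"]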

Built-in Logging Options

Note that there are two things you can monitor in Kubernetes:

  • the Kubernetes system itself
  • the various applications supported within it

Divorced somewhat from your applications are the Kubernetes system components. These are the default components that drive the system’s overall functionality. Some run in containers (kube-proxy and Kubernetes scheduler) while others don’t (kubelet and container runtime).

There’s no cluster-level logging specifically built into Kubernetes. Achieving similar results requires a little bit of creativity. You might run a node-level logging agent on each node, implement a logging sidecar container, or push logs to a dedicated backend. These are reliable ways to shift information back and forth, while retaining it for later analysis.

How are system component logs stored? That answer depends on the type of machine you’re running.

Those with <terminal inline>systemd<terminal inline> use <terminal inline>journald<terminal inline> as a logging destination, which the kubelet and container runtime write to. Otherwise, these components create <terminal inline>.log<terminal inline> files within the <terminal inline>/var/log<terminal inline> directory. System components that run inside containers also write to <terminal inline>/var/log<terminal inline>, but use the <terminal inline>klog<terminal inline> library to get this done, bypassing the default logging mechanism.
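
Checking these system logs on a node might look something like the following; exact paths and unit names vary between clusters and distributions:

# On systemd machines, the kubelet writes to the journal
journalctl -u kubelet

# Elsewhere (and for containerized control plane components), check the log directory
ls /var/log/*.log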

Once logs are stored, they don’t simply stagnate. It’s the responsibility of your configuration to rotate these periodically, or once they surpass the 100MB size threshold. These files are compressed, relocated, renamed, or deleted automatically to counteract gradual bloating. Just imagine if you run hundreds of containers or nodes—those file sizes would add up quickly!

You’ll want to know what’s happening to your specific applications. Application logging is crucial for processes like debugging or monitoring activity across clusters. Luckily, the majority of today’s applications do include some sort of logging functionality, which really helps out developers and administrators alike.

Container engines support continuous logging. These logs are written to streams, which automatically transport data from source to repository. Standard output and standard error are the most common and simplest streams available for applications.

However, no system is perfect. Native logging lifecycles are tied to nodes, pods, and containers. Should these terminate, fail, or get deleted, those logs are possibly lost forever. This isn’t ideal, as it puts an unpredictable expiration date on all log files contained within. You might very well miss opportunities to debug or conduct a thorough investigation into crippling issues.

Cluster-based logging can provide an answer to this problem, as the fate of those logs is independent of the components they’re tied to. You have to supply your own backend to make this possible; Kubernetes provides no baked-in storage solution.

Logging via Standard Output

What if we want to specifically target the standard output stream? That process involves two puzzle pieces:

  • the pod specification file
  • <terminal inline>kubectl<terminal inline> commands

Your file might resemble the following:


---
apiVersion: v1
kind: Pod
metadata:
  name: bugger
spec:
  containers:
  - name: bugger
    image: busybox
    args: [argument strings]

Within your argument field, you’ll want to specify how often your pod writes to standard output. This establishes consistent behavior.
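
One way to fill in that field, as a sketch matching the five-second cadence shown in the output below, is a small shell loop:

  # Write a numbered, timestamped line to stdout every five seconds
  args:
  - /bin/sh
  - -c
  - 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 5; done'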

Next, you’d want to run this pod by invoking the following command: <terminal inline>kubectl apply -f https://examplepod.com/examples/debugging/bugger-pod.yaml<terminal inline>. A confirmation message will appear signifying successful pod creation.

From there, grabbing the pod’s logs is quite easy:


kubectl logs bugger

Your output might resemble the following, if you choose to write a log line every five seconds:

0: Tue Jun 8 00:00:01 UTC 2021
1: Tue Jun 8 00:00:06 UTC 2021
2: Tue Jun 8 00:00:11 UTC 2021
3: Tue Jun 8 00:00:16 UTC 2021

However, you must incorporate a third-party solution in order to collect (and view) these logs effectively. DevOps teams regularly turn to solutions like ContainIQ, Prometheus, Grafana, the ELK Stack, and Fluentd (among others) for comprehensive monitoring.

Not only do these tools link with storage databases, they also include visualizations and processing functions to simplify digestion. Google Cloud’s operations suite (formerly Stackdriver) is another popular freemium tool for monitoring memory and CPU usage across your clusters.

Logging Nodes

A key element of node logging is the logging driver, which receives the streams from standard output and standard error and writes output files in JSON. This structured format is easy to parse and keeps data organized for computers and human readers alike.
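
For reference, a single entry written by the JSON file driver generally looks something like this; the values are illustrative, borrowed from the earlier output:

{"log": "0: Tue Jun 8 00:00:01 UTC 2021\n", "stream": "stdout", "time": "2021-06-08T00:00:01.000000000Z"}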

If you opt for the Docker JSON logging driver, note that only single-line messages are directly supported. Messages with multiple lines might require management at the logging agent and upwards.

Kubernetes cannot rotate created logs itself, so you must implement some sort of rotation solution to remain storage conscious. Thankfully, the <terminal inline>logrotate<terminal inline> tool automatically runs every hour within a given cluster, should you enable it. You might also accomplish this via container runtime. For management purposes, the <terminal inline>configure-helper<terminal inline> and <terminal inline>kube-up.sh<terminal inline> scripts are highly useful.
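
If rotation is handled by the kubelet and container runtime rather than <terminal inline>logrotate<terminal inline>, the relevant kubelet configuration fields look roughly like this; the values shown are illustrative rather than required:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Rotate a container's log file once it reaches this size
containerLogMaxSize: 10Mi
# Keep at most this many rotated log files per container
containerLogMaxFiles: 5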

When writing logs from a node, leveraging a logging agent is essential for uncovering and pushing logs externally. This process points to an aggregation tool and database of your choosing that interfaces with Kubernetes.

Be sure to run each logging agent as a <terminal inline>DaemonSet<terminal inline> since they’ll be running on every node. This logging is considered a recurring background task, hence why these <terminal inline>DaemonSets<terminal inline> are appropriate.
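
A skeletal <terminal inline>DaemonSet<terminal inline> for such an agent might resemble the following; the name and the Fluentd-style image are assumptions to swap for your own agent and backend:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: log-agent
  template:
    metadata:
      labels:
        name: log-agent
    spec:
      containers:
      - name: log-agent
        image: fluent/fluentd:v1.14-1   # example agent image; substitute your own
        volumeMounts:
        # Give the agent read access to the node's log directory
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:
          path: /var/log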

The node logging agent setup has five moving parts that interact seamlessly: pods emit runtime information via <terminal inline>stdout<terminal inline> and <terminal inline>stderr<terminal inline>, which is captured into a log file. A rotation tool interacts periodically with these logs to keep things fresh and optimized. A logging agent then reads directly from that log directory, without an intermediary, and communicates back and forth with an established backend.

The five parts of the node logging agent

If you want to add a sidecar container, this container would coexist with a given app container in the relevant pod. Your app container communicates with the sidecar, and vice versa, enabling an ongoing exchange of information. This allows your node’s kubelet to do some heavy lifting, while the sidecar ships logging information via its own standard output and error streams.

This segregation allows for organization of logs, and compartmentalizes certain application pieces as needed. One stream of data might have standalone importance. Additionally, not all application parts can write using the default <terminal inline>stdout<terminal inline> and <terminal inline>stderr<terminal inline> pathways.
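
A hedged sketch of the pattern: the app container writes to a file on a shared volume, and a sidecar tails that file to its own standard output. Names and paths here are illustrative:

---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
  - name: app
    image: busybox
    # The app writes its log lines to a shared file rather than stdout
    args: [/bin/sh, -c, 'while true; do echo "$(date) app event" >> /var/log/app/app.log; sleep 5; done']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  - name: log-sidecar
    image: busybox
    # The sidecar tails that file, shipping it via its own stdout stream
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/app/app.log']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  volumes:
  - name: app-logs
    emptyDir: {}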

What if you want to access logs following crashes? The <terminal inline>--previous<terminal inline> flag is ideal for smaller collections of containers and instances.
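
In practice, that might look like the following; the pod and container names are placeholders:

# Fetch logs from the previous, crashed instance of a pod's container
kubectl logs <podname> --previous

# For multi-container pods, name the container explicitly
kubectl logs <podname> -c <container> --previous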

Accessing logs is pretty straightforward. Use the file pathway <terminal inline>/var/log<terminal inline> on your machine to view the log directory. Otherwise, you can enter the <terminal inline>journalctl<terminal inline> command to pull up systemd logs, which outputs a long list of log lines. Nodes without systemd are subject to the logging rules of traditional servers.

Closing Thoughts

Logging in Kubernetes is undoubtedly essential. The process unlocks greater visibility into your system, and allows for much faster issue remediation than if you were flying blind. While Kubernetes native logging does present some unique challenges, there are multiple ways to approach logging that are both reliable and relatively user friendly.

Both cluster health and application monitoring practices can sharply improve via intelligent logging. By using the tools available—externally and internally—you’ll be on your way to maintaining a performant, stable application infrastructure in the long term.

Article by

Tyler Charboneau

Software Engineer

Tyler is a hardware-software devotee and researcher. He specializes in simplifying the complex while speaking effectively to all audiences.
