Kubernetes is an open source system for managing containerized applications. The orchestrator provides a platform for automating the deployment, scaling, and operation of applications, which makes services easier to manage and more reliable to run. Currently hosted by the CNCF, it’s one of the most popular orchestration systems available, and is used by organizations across many industries.
Meanwhile, PostgreSQL has become a popular choice for organizations looking for a database management system that is both powerful and easy to use. Postgres is known for its reliability, flexibility, and performance, making it an excellent choice for mission-critical applications.
Though Kubernetes was originally developed to primarily support stateless applications, it’s grown to support stateful applications, as well, and PostgreSQL has been widely adopted inside enterprises as a database to maintain the state. In this article, you’ll walk through combining these popular technologies, Kubernetes and Postgres, with a step-by-step guide to deploying PostgreSQL on Kubernetes. You’ll also look at when you may want to deploy Postgres on Kubernetes, as well as some best practices for doing so.
Why PostgreSQL on Kubernetes?
The combination of PostgreSQL and Kubernetes provides a scalable and highly available (HA) database solution that’s well suited for modern application development and deployment practices. While creating an HA solution is out of the scope of this article, you’ll learn how to set up a simple PostgreSQL container, which already offers a number of benefits.
Modern cloud-native applications are often built using microservices, which are small, self-contained services that can be individually deployed and scaled. PostgreSQL can be used as the database for each microservice, and Kubernetes can be used to manage the deployment and scaling of the application as a whole.
Easier Disaster Recovery
You don’t want to lose your operational or user data in any environment, but user error or technical failure can still cause data loss. PostgreSQL’s write-ahead logging (WAL) allows for easier disaster recovery by ensuring that every change is recorded in the log before it is written to the database files. This eases data recovery when required, and allows even updates that had not yet reached the data files to be salvaged.
Better Utilization of Resources
Kubernetes handles scaling efficiently, and allows for use cases like scaling pods up during peak hours and down afterwards without service interruption. Scaling helps optimize resource utilization and save on cost, as you use only the resources necessary instead of overprovisioning to accommodate an infrequent or irregular surge in demand.
Deploying PostgreSQL on Kubernetes
To deploy PostgreSQL on Kubernetes, you need the following:
- A working Kubernetes cluster. For this tutorial, a DigitalOcean cluster is used, but the steps will be the same for any cluster. To work locally, you can use something like kind or minikube to set up your cluster.
- A basic understanding of psql.
- kubectl installed and authenticated on your environment. You’ll also need some working knowledge of the tool.
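ConfigMaps help you separate configuration data from code, so you can deploy and update applications without editing the application’s source. Note that ConfigMaps are stored unencrypted. This tutorial keeps the database credentials in a ConfigMap for simplicity, but Kubernetes Secrets are the object intended for sensitive values such as passwords.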
Create a ConfigMap by pasting the following code into your terminal:
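A minimal sketch of such a command follows. The file name postgres-configmap.yaml, the app label, and the values postgresdb, admin, and test123 are placeholders assumed here for illustration; only the ConfigMap name postgres-config comes from this tutorial.

```shell
# Write the ConfigMap manifest to a file.
# The file name and all values under "data" are illustrative assumptions.
cat <<'EOF' > postgres-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
  labels:
    app: postgres
data:
  # Environment variables read by the official postgres image at first start.
  POSTGRES_DB: postgresdb
  POSTGRES_USER: admin
  POSTGRES_PASSWORD: test123
EOF
```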
The fields <terminal inline>POSTGRES_DB<terminal inline>, <terminal inline>POSTGRES_USER<terminal inline>, and <terminal inline>POSTGRES_PASSWORD<terminal inline> hold your database name and credentials, and you can change the values according to your preference. You can edit these values using a text editor such as vim or nano.
The command below creates a new ConfigMap for your PostgreSQL deployment from that configuration:
Use the following command to verify that your ConfigMap is present, and make sure <terminal inline>postgres-config<terminal inline> appears in the output:
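Assuming the manifest was saved as postgres-configmap.yaml (a file name chosen here for illustration), applying it would look like this:

```shell
# Create (or update) the ConfigMap from the manifest file.
# The file name is an assumption; use whatever name you saved the manifest under.
kubectl apply -f postgres-configmap.yaml
```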
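One way to run this check with standard kubectl:

```shell
# List the ConfigMaps in the current namespace; postgres-config should be among them.
kubectl get configmap
```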
Create and Apply Persistent Storage Volume and Persistent Volume Claim
To ensure data persistence, you should use a persistent volume (PV) and a persistent volume claim (PVC). A PV is a durable volume that stores data and remains in place even if the pod using it is deleted.
A PVC is how users request and consume PV resources. Think of it as requesting a PV with parameters such as the size of the storage, the access modes, and the storage class.
To deploy a stateful application such as a PostgreSQL database, you’ll need to create a PVC for the database data. You can then create a pod that mounts the PVC and runs the PostgreSQL database.
For this tutorial, you will move forward with a local volume, using <terminal inline>/mnt/data<terminal inline> as the path to the volume:
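A minimal sketch of such a manifest is shown here. The path /mnt/data comes from this tutorial; the file name, the object names postgres-pv-volume and postgres-pv-claim, the manual storage class, and the 5Gi size are assumptions made for illustration.

```shell
# Write a PV backed by a directory on the node, plus a PVC that requests it.
# Names, storage class, and size are illustrative assumptions.
cat <<'EOF' > postgres-storage.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv-volume
  labels:
    type: local
    app: postgres
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pv-claim
  labels:
    app: postgres
spec:
  # Must match the PV's storage class and access mode for the claim to bind.
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF
```

Because the volume is a directory on one node, the data is only available to pods scheduled on that node.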
Run the following command to create a new PVC and PV for your PostgreSQL deployment:
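Assuming both objects were saved together in a file named postgres-storage.yaml (a hypothetical name), the command would be:

```shell
# Create the PersistentVolume and PersistentVolumeClaim defined in the file.
# The file name is an assumption.
kubectl apply -f postgres-storage.yaml
```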
Use the command below to check whether the PVC is bound to the PV:
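With standard kubectl, this check can be done as follows:

```shell
# List PVCs in the current namespace; the STATUS column shows whether each claim is bound.
kubectl get pvc
```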
If the <terminal inline>STATUS<terminal inline> is “Bound”, you can use it for your deployments.
Create and Apply PostgreSQL Deployment
Deployments are a way to manage rolling out and updating applications in a Kubernetes cluster. They provide a declarative way to define how an application should be deployed and updated, and can be used to roll back to previous versions if needed.
After creating the PV, PVC, and ConfigMap, you can define a Deployment that runs a PostgreSQL pod using them, as follows:
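A minimal sketch of such a Deployment follows. It assumes the ConfigMap is named postgres-config as in this tutorial, and that the PVC is named postgres-pv-claim; the file name, the Deployment name, the app label, the volume name, and the postgres:16 image tag are likewise assumptions for illustration.

```shell
# Write a single-replica Deployment that runs the official postgres image.
# Names, labels, and the image tag are illustrative assumptions.
cat <<'EOF' > postgres-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5432
          # Load POSTGRES_DB, POSTGRES_USER, and POSTGRES_PASSWORD from the ConfigMap.
          envFrom:
            - configMapRef:
                name: postgres-config
          # Mount the persistent volume at the image's data directory.
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgres-data
      volumes:
        - name: postgres-data
          persistentVolumeClaim:
            claimName: postgres-pv-claim
EOF
```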
The following command will create a new PostgreSQL deployment:
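Assuming the Deployment manifest was saved as postgres-deployment.yaml (a hypothetical file name), the command would be:

```shell
# Create the Deployment; Kubernetes then creates the ReplicaSet and pod for it.
# The file name is an assumption.
kubectl apply -f postgres-deployment.yaml
```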
Use the following command to check whether your deployment and its child objects, such as pods, were created successfully:
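One way to check both with standard kubectl:

```shell
# Show Deployments and how many of their replicas are ready.
kubectl get deployments

# Show the pods created for them; the PostgreSQL pod should reach the Running state.
kubectl get pods
```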
Create and Apply PostgreSQL Service
Kubernetes services help you expose ports in various ways, including through a NodePort. A NodePort service opens the same port on every node in the cluster, which makes the service reachable from outside the cluster. To keep things simple for this tutorial, you’ll expose the database using NodePort with the help of the following manifest:
The command below will create a new PostgreSQL service, which lets you connect to the database with <terminal inline>psql<terminal inline>:
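A minimal sketch of a NodePort service manifest follows. The file name, the service name, and the app: postgres selector are assumptions; the selector has to match the labels on your PostgreSQL pods.

```shell
# Write a NodePort Service that forwards to port 5432 on the PostgreSQL pods.
# Names and labels are illustrative assumptions.
cat <<'EOF' > postgres-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  type: NodePort
  ports:
    - port: 5432
      targetPort: 5432
  selector:
    app: postgres
EOF
```

Because no nodePort value is set, Kubernetes assigns one from its NodePort range (30000-32767 by default).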
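Assuming the service manifest was saved as postgres-service.yaml (a hypothetical file name), the command would be:

```shell
# Create the Service that exposes the PostgreSQL pods.
# The file name is an assumption.
kubectl apply -f postgres-service.yaml
```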
List All Objects
You can list all the objects using the following command:
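With standard kubectl, this is:

```shell
# List the pods, services, deployments, and replica sets in the current namespace.
kubectl get all
```

Note that this output does not include every object type; ConfigMaps, PVs, and PVCs have to be listed separately.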
Connect to PostgreSQL
The Kubernetes command line client ships with a feature that lets you connect to a pod directly from your host command line. The <terminal inline>kubectl exec<terminal inline> command accepts a pod name, any commands that should be executed, and an interactive flag that lets you launch a shell. You’ll use <terminal inline>kubectl exec<terminal inline> to connect to the PostgreSQL pod:
When prompted, enter the password from the ConfigMap you created earlier. The command uses the <terminal inline>-it<terminal inline> options:
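A sketch of such a command follows. It assumes the pod carries the label app=postgres and that the ConfigMap set the user to admin and the database to postgresdb; substitute your own values.

```shell
# Look up the name of the PostgreSQL pod (the app=postgres label is an assumption).
POD_NAME=$(kubectl get pods -l app=postgres -o jsonpath='{.items[0].metadata.name}')

# Start an interactive psql session inside the pod.
# --password makes psql prompt for the password before connecting.
kubectl exec -it "$POD_NAME" -- psql -h localhost -U admin --password -p 5432 postgresdb
```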
- -i: Stands for interactive.
- -t: Attaches a tty (terminal) to the running command.
With PostgreSQL running, you’re now able to connect to the database and start writing some data to the tables.
Best Practices for Deploying PostgreSQL on Kubernetes
When deploying PostgreSQL on Kubernetes, there are some best practices that you should follow to ensure the security and stability of your application.
Run the Container as Unprivileged User
You should always run the database container as an unprivileged user. This helps secure your data and avoid unauthorized access to your database. The two most important steps are:
- Make sure your container image launches as a user other than root (e.g. ensure USER is not 0 or root).
- Make sure your Pod Security Context is set to non-root by setting runAsNonRoot to true.
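The second point can be sketched as a patch to the pod template. The file name and the UID are assumptions: 999 is the ID of the postgres user in the Debian-based official image, so check the value for the image you use.

```shell
# Write a patch that adds a pod-level security context to a Deployment.
# The file name and the UID are illustrative assumptions.
cat <<'EOF' > postgres-security-patch.yaml
spec:
  template:
    spec:
      securityContext:
        # Refuse to start any container in the pod that would run as root.
        runAsNonRoot: true
        # Assumed UID of the postgres user in the image.
        runAsUser: 999
EOF
```

Assuming the Deployment is named postgres, the patch could then be applied with kubectl patch deployment postgres --patch-file postgres-security-patch.yaml. The directory backing the volume must be writable by that user, or the database will fail to start.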
Encrypt Your Data
You should always encrypt your data to protect it against theft and unauthorized access. Make sure your data is encrypted both in transit and at rest, so that a misconfiguration or breach does not expose it in plain text. To learn more about how to encrypt your data, check out this CNCF webinar on the subject.
Create a Separate Namespace for Your Database
You should create a separate namespace for your database so that it’s isolated from other applications and services. Apply RBAC to the namespace through Roles (or ClusterRoles) and RoleBindings to prevent unauthorized access.
A new database namespace also helps monitor resources, and you can apply limits if you need to balance resources.
You can create a namespace as follows:
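For example, with a namespace named database (a name assumed here for illustration):

```shell
# Create a dedicated namespace for the database objects (the name is an assumption).
kubectl create namespace database

# Confirm that the namespace exists.
kubectl get namespaces
```

To place objects in this namespace, pass -n database to kubectl apply, or set metadata.namespace in each manifest.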
In this tutorial, you’ve deployed a PostgreSQL database running on Kubernetes. This setup is a good starting point, but it’s important to remember you have configured it to store data in node-local storage, which is tied to a single node. The official documentation provides more details about support for cloud volumes, NFS, Ceph, and more.
Finally, keep an eye on your resource usage, and scale your deployment accordingly to avoid performance issues. You can do this with the help of ContainIQ, an out-of-the-box solution that allows you to monitor the health of your cluster and your objects. It monitors events, logs, and traces, making troubleshooting easier, and offers user-friendly, pre-built dashboards that give you a view of your cluster health at a glance.