
Using etcd Clusters with Kubernetes

June 23, 2021

etcd is a distributed key-value store that can back anything from a single node to an entire Kubernetes cluster. Here’s what to be aware of when setting up an etcd cluster, and how to upgrade your version.

Eze Sunday
Software Developer

etcd is a critical part of Kubernetes—it controls the storage and management of important information that your Kubernetes cluster needs to work properly, such as configuration data, state data, and metadata. It’s a reliable, single source of truth for your Kubernetes cluster data at any point in time.

Since Kubernetes is a distributed system, it makes sense that it uses a distributed data store to ensure reliability, consistency, and the availability of the entire system. This makes etcd the perfect tool for the job.

There are a number of things to be aware of when setting up your etcd cluster, or upgrading your etcd version. In this article, we’ll walk through both of these scenarios.

How Does etcd Work?

etcd stores both the actual and the desired state of the distributed system, and Kubernetes watches for disparities between the two, making changes to reconcile them. For example, whenever you run <terminal inline>kubectl create<terminal inline> or <terminal inline>kubectl apply<terminal inline>, or whenever a component crashes, the etcd data store is updated automatically.

etcd is built on top of the Raft algorithm, a leader-based consensus protocol for distributed systems. An etcd cluster consists of a leader node and follower nodes. When the leader receives a write request from a client, it replicates the update to its followers; once a majority of nodes have acknowledged the update, the leader commits it and sends a successful response back to the client.
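The majority rule above is simple integer arithmetic: a cluster of n members needs n/2 + 1 acknowledgments to commit, so it tolerates the loss of the rest. A quick shell sketch (the function names are ours, for illustration only):

```shell
# Sketch: quorum size and fault tolerance for a Raft cluster of n members.
quorum() { echo $(( $1 / 2 + 1 )); }
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5; do
  echo "members=$n quorum=$(quorum $n) tolerates=$(tolerated_failures $n)"
done
# members=1 quorum=1 tolerates=0
# members=3 quorum=2 tolerates=1
# members=5 quorum=3 tolerates=2
```

This is why etcd clusters are typically sized at three or five members: adding an even-numbered member raises the quorum without improving fault tolerance.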

If the leader node dies, the followers conduct an election to choose the next leader. If a request is routed to a follower node that has not been updated yet, it forwards the request to the leader node, which responds with the up-to-date data. This is neatly illustrated in “Understanding Distributed Consensus with Raft” by Kasun Indrasiri.

Leader follower architecture

Setting Up an etcd Cluster

First, let’s download etcd by running the following command:

<terminal>wget -q --show-progress --https-only --timestamping "https://github.com/etcd-io/etcd/releases/download/v3.4.0/etcd-v3.4.0-linux-amd64.tar.gz"<terminal>

It might take a couple of minutes depending on your internet speed. Once it’s done, extract it by running:

<terminal>tar -xvf etcd-v3.4.0-linux-amd64.tar.gz<terminal>

Now, you have etcd in your current directory but that’s not where it ought to be, as Kubernetes will be looking for it in <terminal inline>/usr/local/bin<terminal inline>. So move it into that directory:

<terminal>sudo mv etcd-v3.4.0-linux-amd64/etcd* /usr/local/bin/<terminal>

Now, you need to create the directories etcd expects: <terminal inline>/etc/etcd<terminal inline> for SSL certificates and <terminal inline>/var/lib/etcd<terminal inline> for its data:

<terminal>sudo mkdir -p /etc/etcd /var/lib/etcd<terminal>

<terminal>sudo cp ca.pem kubernetes-key.pem kubernetes.pem /etc/etcd/<terminal>

If you’re having problems creating the certificates, you can check out this comprehensive guide to help with creating one for yourself.

Next, get your current compute instance’s IP address, which etcd will use to serve client requests and communicate with the other members of the cluster:

<terminal>INTERNAL_IP=$(curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/ip)<terminal>

Your etcd name should match the name of the current compute instance, so make it happen by running this line on the terminal:

<terminal>ETCD_NAME=$(hostname -s)<terminal>
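The unit file below interpolates both of these variables, so it’s worth failing fast if either is empty. This guard is our own addition, not part of the official setup; the 127.0.0.1 fallback is only for local testing, since real nodes get <terminal inline>INTERNAL_IP<terminal inline> from the metadata service:

```shell
# Sanity check (a sketch): abort before generating the unit file
# if either interpolated variable is empty.
ETCD_NAME=$(hostname -s)
: "${INTERNAL_IP:=127.0.0.1}"  # fallback for local testing only

if [ -z "$ETCD_NAME" ] || [ -z "$INTERNAL_IP" ]; then
  echo "ETCD_NAME or INTERNAL_IP is empty; aborting" >&2
  exit 1
fi
echo "name=$ETCD_NAME ip=$INTERNAL_IP"
```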

Create the etcd.service systemd unit file:

cat <<EOF | sudo tee /etc/systemd/system/etcd.service
[Unit]
Description=etcd
Documentation=https://github.com/coreos

[Service]
Type=notify
ExecStart=/usr/local/bin/etcd \\
  --name ${ETCD_NAME} \\
  --cert-file=/etc/etcd/kubernetes.pem \\
  --key-file=/etc/etcd/kubernetes-key.pem \\
  --peer-cert-file=/etc/etcd/kubernetes.pem \\
  --peer-key-file=/etc/etcd/kubernetes-key.pem \\
  --trusted-ca-file=/etc/etcd/ca.pem \\
  --peer-trusted-ca-file=/etc/etcd/ca.pem \\
  --peer-client-cert-auth \\
  --client-cert-auth \\
  --initial-advertise-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-client-urls https://${INTERNAL_IP}:2379,https://127.0.0.1:2379 \\
  --advertise-client-urls https://${INTERNAL_IP}:2379 \\
  --initial-cluster-token etcd-cluster-0 \\
  --initial-cluster controller-0=https://10.240.0.10:2380,controller-1=https://10.240.0.11:2380,controller-2=https://10.240.0.12:2380 \\
  --initial-cluster-state new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Now that your setup is complete, start the etcd server by running the following commands:

<terminal>sudo systemctl daemon-reload<terminal>

<terminal>sudo systemctl enable etcd<terminal>

<terminal>sudo systemctl start etcd<terminal>

Now, on your different controllers—controller-0, controller-1, controller-2—repeat the previous commands.

You can now go ahead to list the etcd cluster members:

sudo ETCDCTL_API=3 etcdctl member list \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.pem \
  --cert=/etc/etcd/kubernetes.pem \
  --key=/etc/etcd/kubernetes-key.pem

And your output will look like so:

3a57933972cb5131, started, controller-2, https://10.240.0.12:2380, https://10.240.0.12:2379

f98dc20bce6225a0, started, controller-0, https://10.240.0.10:2380, https://10.240.0.10:2379

ffed16798470cab5, started, controller-1, https://10.240.0.11:2380, https://10.240.0.11:2379

Congratulations, you’ve got your etcd cluster all set up.
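Each line of that output is comma-separated: member ID, status, name, peer URLs, client URLs. If you want to script a health gate on it, a small awk pass over the fields works; the sample data below is the output above, baked in for illustration (in practice you would pipe the <terminal inline>etcdctl member list<terminal inline> command itself):

```shell
# Sketch: pull member names and count how many report "started".
members='3a57933972cb5131, started, controller-2, https://10.240.0.12:2380, https://10.240.0.12:2379
f98dc20bce6225a0, started, controller-0, https://10.240.0.10:2380, https://10.240.0.10:2379
ffed16798470cab5, started, controller-1, https://10.240.0.11:2380, https://10.240.0.11:2379'

# Field 2 is the status, field 3 the member name.
printf '%s\n' "$members" | awk -F', ' '$2 == "started" { print $3 }'

started=$(printf '%s\n' "$members" | grep -c ', started,')
echo "started members: $started"
# started members: 3
```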

Security Concerns

Because of the sensitive nature of the data in your etcd cluster, it’s a primary target for hackers. Someone with access to the information in the etcd database can wreak havoc with it: ransomware attacks, domain takeovers, identity theft, and more.

Well, the bad guys will always be there. The most important thing is that you should always be ten steps ahead of them and focus on securing your Kubernetes cluster right now. To secure your etcd cluster, you should:

  1. Ensure you implement a proper SSL certificate specifically for your etcd instance. etcd supports TLS and certificate-based authentication for both client-to-server and server-to-server (peer) communication, and it’s recommended to create and sign individual certificates for every member of the etcd cluster. This way, only your etcd members hold your certificates and can communicate with each other securely.
  2. Encrypt your etcd cluster. Encrypting your etcd cluster means even if your Kubernetes cluster is compromised your private keys will still stay safe. The hacker will need to break into your etcd cluster to gain access to your credentials.
  3. Authenticate. etcd v2 introduced an authentication API to allow only authorized users to edit key-values in the etcd cluster. By default this authentication is turned off. You can turn it on as a root user by sending a PUT request to the <terminal inline>/v2/auth/enable<terminal inline> endpoint.
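For point 2, Kubernetes encrypts Secrets before they reach etcd via an EncryptionConfiguration file passed to the API server with its <terminal inline>--encryption-provider-config<terminal inline> flag. A minimal sketch, written as a heredoc in the same style as the unit file earlier (the file path is our choice, and the key is generated fresh; keep it somewhere safe):

```shell
# Sketch: generate an EncryptionConfiguration so the API server
# encrypts Secrets at rest in etcd.
key=$(head -c 32 /dev/urandom | base64)

cat > /tmp/encryption-config.yaml <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${key}
      - identity: {}   # fallback so existing plaintext data stays readable
EOF
```

In production you would place the file under a protected path such as <terminal inline>/etc/kubernetes/<terminal inline> and point the kube-apiserver flag at it.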

Backing Up

Backing up your etcd database is as important as backing up your conventional database. In case of unrecoverable system failures, you can always get your system up and running again. Other reasons to back up your etcd could be to replicate your production environment into staging before a major upgrade or a migration from one environment to another.

You can set up a cron job to regularly back up your etcd clusters.
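One way to do that, sketched below, is to wrap the snapshot command in a small script and point cron at it. The paths, date-stamped filename, and schedule are our choices, and the certificate flags reuse the paths from the setup section earlier:

```shell
# Sketch: a backup script for cron to run. Writes one snapshot per day.
cat > /tmp/etcd-backup.sh <<'EOF'
#!/bin/sh
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.pem \
  --cert=/etc/etcd/kubernetes.pem \
  --key=/etc/etcd/kubernetes-key.pem \
  snapshot save "/data/backup-$(date +%F).db"
EOF
chmod +x /tmp/etcd-backup.sh

# Then schedule it daily at 02:00 via `crontab -e`:
#   0 2 * * * /tmp/etcd-backup.sh
```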

The command below will back up an etcd cluster:

ETCDCTL_API=3 etcdctl --endpoints=https://172.31.98.230:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /data/backup.db

With the previous command, you can back up your etcd cluster using the etcdctl command. You’ll be required to provide your SSL certificates as shown in the snippet above; you created those earlier, so just point the flags at their paths on your instance.

To restore your etcd cluster, it’s recommended that you first stop your etcd cluster, Kube API server and your schedulers, and then run the following commands:

ETCDCTL_API=3 etcdctl snapshot restore /data/backup.db --data-dir=/var/lib/myetcd --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --name kubemaster --initial-cluster=kubemaster=https://172.31.98.230:2380 --initial-advertise-peer-urls=https://172.31.98.230:2380 --initial-cluster-token kubemaster

Note the name of the backup file, and make sure that both the backup and the certificate files are in the paths the command references.

Upgrading etcd

etcd has gone through several versions, with each one delivering vital bug fixes and new features. The current recommended etcd version is 3.x. Depending on the version you’re upgrading from, you might have a little bit of downtime or none at all.

It’s recommended that you migrate between minor versions incrementally. For example, if you are in version 2.0, you should migrate to 2.3 first before migrating to 3.0 or 3.4. This will help reduce the complications that might come with an upgrade.
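That incremental path can be made mechanical. The helper below is hypothetical, not an etcd tool, and its version list is hard-coded for illustration; it simply walks the ordered minor series and prints every stop between two versions:

```shell
# Sketch: print each intermediate minor version between two etcd releases.
upgrade_path() {
  versions="2.0 2.1 2.2 2.3 3.0 3.1 3.2 3.3 3.4"
  from=$1; to=$2; emit=0
  for v in $versions; do
    if [ "$v" = "$from" ]; then emit=1; fi
    if [ "$emit" -eq 1 ]; then printf '%s\n' "$v"; fi
    if [ "$v" = "$to" ]; then break; fi
  done
}

upgrade_path 2.0 3.0
# 2.0
# 2.1
# 2.2
# 2.3
# 3.0
```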

To upgrade from one version to another, simply modify the command in the etcd manifest file to target the new version and start it:

TARGET_STORAGE=etcd3
ETCD_IMAGE=3.0.17
TARGET_VERSION=3.0.17
STORAGE_MEDIA_TYPE=application/json

For pre-upgrade, you’ll need a script to stop etcd, wait until the cluster is healthy, and then run the official migration tool created by the etcd team via <terminal inline>etcdctl migrate<terminal inline>.

Conclusion

etcd is a core component of Kubernetes. Knowing how to use it, protecting it from the bad guys, upgrading, and backing it up is important, and hopefully you now know how to do all of that. This article should serve as a starting point for your etcd exploration, as there’s a lot more to learn. You can read more about etcd by looking at the etcd documentation.

Article by

Eze Sunday

Software Developer

Eze is a software developer and technical writer trying to make sense of the world, building amazing stuff and documenting every step of the journey.