Getting Started with Terraform and Kubernetes

October 18, 2021

Terraform has been a popular way to scale virtual machine deployments on cloud hosting providers. This tutorial walks users through managing a Kubernetes deployment using Terraform.

Sean Bangalore
Software Engineer

Since its conception as an internal Google project, Kubernetes has evolved to simplify container orchestration and provide scale, portability, and standardization for services worldwide.

While the processes to migrate to and manage Kubernetes are simple and well-documented, there are certain concepts to consider that can impact your flexibility with deployments in the future. For example, how are your Kubernetes configuration and infrastructure changes managed?

Like regular cloud services such as compute and triggers, your Kubernetes deployment can be set up through ad hoc processes or through a CI/CD integration pipeline (think Azure or AWS CDK). These pipelines can be used for Kubernetes deployments and cloud services; however, they don’t support managing and sharing resources with other providers.

Therefore, you’ve got three options during deployment:

  1. Manage your services through one cloud provider and risk vendor lock-in.
  2. Administer services without a CDK, introducing an operational load.
  3. Use a service to manage multi-cloud architecture through code.

The first two aren’t feasible if you’re hosting on, or planning to expand to, a multi-provider architecture with multiple stages and services, and they are generally not recommended even for smaller architectures.

Managing a Multi-cloud Kubernetes Architecture with Terraform

A few providers offer support for this type of multi-cloud architecture and Kubernetes through configuration files that can be deployed through a CLI. Among them, Terraform by HashiCorp is the most complete and has the highest market share, making it enterprise friendly. It’s used by companies like Uber, Twitch, and Robinhood.

Through Terraform, you can manage your infrastructure through declarative configuration files. In these files, you perform a platform-agnostic deployment by defining your AMI, instance type, region or AZ, security groups, services, and much more through code. For example, you can host your data on Firebase and connect it to a Kubernetes deployment in EKS while using Azure for authentication. Each set of such configuration files in a single directory, including the main configuration, inputs, and outputs, is referred to as a module. A typical structure for a module within a Terraform project is:


modules
→ aws-s3-static-website-bucket
 → LICENSE
 → README.md
 → main.tf
 → variables.tf
 → outputs.tf
LICENSE
README.md
main.tf
outputs.tf
terraform.tfstate
terraform.tfstate.backup
variables.tf
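
To make the layout above more concrete, here is a minimal sketch of how the root configuration might call the nested module. The input `bucket_name` and the module output `name` are assumptions for illustration; the real names depend on what the module declares in its own variables.tf and outputs.tf.

```hcl
# Root main.tf: call the local module from the directory tree above.
module "website_s3_bucket" {
  source = "./modules/aws-s3-static-website-bucket"

  # Hypothetical input; it must match a variable declared in the
  # module's variables.tf.
  bucket_name = "example-static-website-bucket"
}

# Root outputs.tf: re-export a value that the module exposes.
output "website_bucket_name" {
  # "name" is an assumed output declared in the module's outputs.tf.
  value = module.website_s3_bucket.name
}
```

Keeping inputs in variables.tf and results in outputs.tf is what lets the same module be reused from several root configurations.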

While you could use <terminal inline>kubectl<terminal inline> to deploy resources on Kubernetes using the CLI, it’s not a Terraform competitor. <terminal inline>kubectl<terminal inline> doesn’t offer commands to manage the full Kubernetes lifecycle nor does it model relationships between resources. With Terraform, you can track and attribute ownership to configuration changes, maintain version control, and enable automated pipeline management.

Like <terminal inline>kubectl<terminal inline>, Terraform works with Kubernetes through the Kubernetes API. HashiCorp recently introduced an alpha provider for Kubernetes that builds on a newer Kubernetes merge algorithm. The alpha provider enables access to all Kubernetes resources, including custom resource definitions. The merge algorithm, called Server-Side Apply, lets the provider detect differences between your Terraform HCL configuration files and the state of the cluster when using <terminal inline>terraform plan<terminal inline> and <terminal inline>terraform apply<terminal inline>.
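
As an illustration of managing an arbitrary Kubernetes object from HCL, the sketch below uses the `kubernetes_manifest` resource. It assumes a recent version of the hashicorp/kubernetes provider that includes this resource, a reachable cluster, and a kubeconfig at `~/.kube/config`. The ConfigMap name and data are hypothetical.

```hcl
terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
  }
}

# Assumes an existing cluster reachable through this kubeconfig file.
provider "kubernetes" {
  config_path = "~/.kube/config"
}

# Describe a Kubernetes object as an HCL map that mirrors its YAML manifest.
resource "kubernetes_manifest" "example_configmap" {
  manifest = {
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "example-config" # hypothetical name
      namespace = "default"
    }
    data = {
      greeting = "hello"
    }
  }
}
```

The same pattern should extend to custom resources by changing `apiVersion` and `kind`, provided the corresponding custom resource definition is already installed in the cluster.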

Through these applications, Terraform brings the advantages of Infrastructure as Code (IAC) and native support for Kubernetes deployments with multiple cloud providers.

Infrastructure as Code (IAC) for Kubernetes Deployments

At a high level, IAC allows for faster, multistage deployment, reduces the single point of failure associated with knowledge transfer, and improves service ownership as changes can be documented, commented on, logged, tracked and reviewed.

Faster, More Efficient Multistage Deployment

Through IAC, you can set up your development, staging, and production environments concurrently, eliminating redundant manual processes and abstracting out infrastructure changes. This helps maintain consistency in configurations, especially important for companies using a large variety of resources. Manually iterating changes through configurations for 20-30 services in 2-3 stages through the console or through separate CLIs costs developer time and motivation.
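
One way this can look in practice is a single configuration parameterized by stage. The sketch below is an assumption about how you might structure it, not a prescribed layout: the variable name `environment` and the `example` prefix are hypothetical, and a configured `google` provider is assumed.

```hcl
# The stage to deploy, supplied at plan/apply time.
variable "environment" {
  type        = string
  description = "Deployment stage: dev, staging, or prod."

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of dev, staging, or prod."
  }
}

locals {
  # Prefix every resource name with the stage so stages do not collide.
  name_prefix = "example-${var.environment}"
}

# Assumes a configured google provider.
resource "google_compute_network" "vpc_network" {
  name = "${local.name_prefix}-network"
}
```

Running `terraform apply -var="environment=staging"` would then create the staging copy. Each stage also needs its own state, for example through Terraform workspaces or separate backends.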

IAC also allows developers to set up their own child (testing) environments rapidly in which they can replicate dev/test/prod stages to test changes locally, leading to a quicker and more isolated development process.

Reduces Single Point of Failure

Alongside these technical improvements, IAC reduces the single point of failure (SPOF) associated with engineering expertise. Without IAC, effectively reviewing and monitoring configuration changes without a thorough understanding of multiple services is difficult. With IAC, configurations are clearly laid out through the code. There is also an operational risk when the engineer who implemented the architecture, or who understands it best, leaves. That risk becomes critical when coupled with security issues, and it is a strategic weakness when you need to scale or migrate your services.

Improves Service Ownership

As a part of improving knowledge transfer, IAC is self-documenting. It improves the code review and commit process, advancing your engineering excellence initiatives.

IAC enables developers who are unfamiliar with your service configurations to learn about the service’s architecture. They can learn about the larger architecture without directly accessing credentials (these can be stored in an encrypted secrets management service such as GCP Secret Manager or AWS Secrets Manager/Parameter Store). Using code also allows you to integrate configuration with your version control system. This allows developers to identify the reason and the owner behind a particular configuration change, which is useful for understanding business logic.
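
A minimal sketch of keeping a credential out of the configuration files, assuming the secret already exists in AWS Secrets Manager under the hypothetical name `example/db-password` and that AWS credentials are available to Terraform:

```hcl
provider "aws" {
  region = "us-east-1" # assumed region
}

# Look up the current version of an existing secret instead of
# hard-coding its value in the repository.
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "example/db-password" # hypothetical secret name
}

# Pass the value on to other configuration without printing it
# in CLI output.
output "db_password" {
  value     = data.aws_secretsmanager_secret_version.db_password.secret_string
  sensitive = true
}
```

Values read this way are generally still recorded in the Terraform state, so access to the state file should be restricted as well.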

Overall, IAC improves ownership, reduces SPOF, and improves development speed and efficiency.

Deploying a Kubernetes Cluster with Terraform

IAC is most widely implemented through Terraform due to its support, features, and adoption. HashiCorp has some self-service introductory tutorials on basic Terraform implementations which guide you through a number of cloud providers.

In this section, you’ll walk through an abbreviated example of using Terraform to provision a network on GCP from macOS.

To begin, set up a GCP account with a project that you’ll be using for this tutorial and note down your project ID, region, and zone.

Install Terraform:


brew tap hashicorp/tap
brew install hashicorp/tap/terraform

Set up your Terraform project:


mkdir learn-terraform-gcp
cd learn-terraform-gcp

Set up your configuration in <terminal inline>main.tf<terminal inline>. Replace <terminal inline><NAME><terminal inline> with the name of your service account key file and <terminal inline><PROJECT_ID><terminal inline> with your project ID from the GCP console, and change the region and zone values if yours differ.


terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "3.5.0"
    }
  }
}

# Configure the Google Cloud provider with your credentials and defaults.
provider "google" {
  credentials = file("<NAME>.json")

  project = "<PROJECT_ID>"
  region  = "us-central1"
  zone    = "us-central1-c"
}

# Create a VPC network named "terraform-network".
resource "google_compute_network" "vpc_network" {
  name = "terraform-network"
}

Initialize the Terraform directory:


terraform init
# validate your configuration
terraform validate

Create your infrastructure on GCP:

<terminal>terraform apply<terminal>

Note that <terminal inline>terraform plan<terminal inline> is another useful command that is not covered in this tutorial. It is run before <terminal inline>terraform apply<terminal inline> and allows you to view, share, and validate the difference between the present and the proposed configuration in a particular change. Its output can also be stored as an artifact for history and for code reviews.

This was a simple exercise in provisioning a VPC network on GCP. There are tutorials for other providers such as Azure and AWS, with more resources and connected modules.
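
The walkthrough stops at the network, so here is a hedged sketch of how a small Kubernetes (GKE) cluster might be added to the same <terminal inline>main.tf<terminal inline>. It assumes the Kubernetes Engine API is enabled for your project and reuses the `vpc_network` resource from the configuration above. The cluster name and node count are hypothetical choices.

```hcl
# A minimal zonal GKE cluster attached to the network created earlier.
resource "google_container_cluster" "primary" {
  name     = "terraform-gke-cluster" # hypothetical name
  location = "us-central1-c"         # same zone as the provider block

  # Reference the VPC network defined in main.tf so Terraform creates
  # the network before the cluster.
  network = google_compute_network.vpc_network.name

  # Smallest reasonable default node pool for experimentation.
  initial_node_count = 1
}
```

Because the cluster refers to the network resource, Terraform orders the two automatically. Running <terminal inline>terraform apply<terminal inline> again would add only the cluster.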

Limitations of Terraform

Terraform does have shortcomings, such as resource and dependency management and limits to customization.

Some resources cannot be managed smoothly through configuration files alone. Relational databases (e.g., AWS RDS) are one example. You can’t change the encryption or instance type of a relational database through Terraform without significant database downtime. Additionally, if engineers set up branches of configuration files or modify resources outside of Terraform, there could be issues with syncing and merging the configurations in different branches. These challenges could be mitigated through operational procedures, education, or limits on your version control system.
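
Beyond the procedural mitigations above, Terraform’s `lifecycle` block offers one possible guardrail; this is an addition for illustration rather than something the walkthrough relies on. The sketch assumes a configured `aws` provider, and the database identifier, sizing, and username are hypothetical.

```hcl
variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_db_instance" "example" {
  identifier          = "example-db" # hypothetical
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "example_admin"
  password            = var.db_password
  skip_final_snapshot = true

  lifecycle {
    # Fail the plan instead of destroying (and recreating) the database.
    prevent_destroy = true

    # Do not try to revert an instance type that was changed outside
    # of Terraform.
    ignore_changes = [instance_class]
  }
}
```

With `prevent_destroy` set, a change that would force replacement of the database produces an error at plan time rather than unexpected downtime.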

Next, while one-line deploys for each module and ease of setup are impressive, having all your resources connected and deployed through one command involves tradeoffs. If you use more than a few resources in your configuration with nested modules, Terraform’s dependency graph can become unwieldy. You might need to split the modules across multiple environments, which introduces complexity: namely, dependency resolution and version and environment management. This challenge is tough to solve after the fact, but thorough up-front architecture can address it preemptively.
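
When a configuration is split, one common way to reconnect the pieces is the `terraform_remote_state` data source. The sketch below assumes the network configuration stores its state in a Google Cloud Storage bucket and exports an output named `network_name`. The bucket name, prefix, and output name are all hypothetical.

```hcl
# Read the outputs of a separately managed "network" configuration.
data "terraform_remote_state" "network" {
  backend = "gcs"

  config = {
    bucket = "example-terraform-state" # hypothetical state bucket
    prefix = "network"                 # hypothetical state path
  }
}

locals {
  # "network_name" is an assumed output of the network configuration.
  network_name = data.terraform_remote_state.network.outputs.network_name
}
```

This keeps each dependency graph small, at the cost of an explicit contract (the exported outputs) between the configurations.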

Customization, while impressive, is not comprehensive. For example, running a command across multiple modules at once is not supported, and neither are pre- and post-command setup and teardown hooks; working with multiple AWS accounts in one project is also cumbersome. To mitigate some of these issues, there are third-party tools that wrap Terraform, such as Terragrunt, which could be worth considering.

Conclusion

Despite these limitations, Terraform has been updated consistently, with releases every other week. For example, secrets handling was not a day-zero feature; secrets were exposed through state files before one such update. Major versions and new integrations, such as the alpha provider for Kubernetes, have consistently improved the service.

Terraform remains the leading choice for IAC multi-provider deployments and can help improve your engineering operations significantly if well architected and implemented. It’s a choice that should be seriously considered when migrating to or managing Kubernetes to reduce your workload in the long term.

Sean is a Software Development Engineer at Amazon and interested in software, artificial intelligence, and quantitative finance.
