10/29/2022
This is by no means a comprehensive introduction to Kubernetes. However, here's what you need to know to follow along.
From the docs:
A Kubernetes cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node. The worker node(s) host the Pods that are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster.
Google Kubernetes Engine (GKE) takes care of the control plane, so we don't care about it as long as it works. The nodes in our cluster are the two VMs in our node pool that host the containers we deploy. Pods run on those nodes.
Again from the docs:
A Pod is similar to a set of containers with shared namespaces and shared filesystem volumes.
If you like, you can think of a pod as a container. Just know that it's not quite right even if it's a useful simplification. A pragmatist like me won't judge you, but someone probably will.
Well... we don't. At least not directly. Pods are at the heart of Kubernetes and we'll be interacting with them a lot, but on their own they're too basic for most use cases. For example, a bare pod can't scale, and it won't be restarted if its host node goes down.
Instead, we use other Kubernetes resources to describe workloads at a higher level of abstraction. These resources can own more basic resources like pods to wrap them in richer capabilities. One commonly used resource - the one we'll employ here - is called a deployment.
Deployments build on the functionality of pods to enable features such as:
- running multiple replicas of a pod
- rolling out updates gradually
- replacing pods that fail or whose node goes down
An object is an instance of a Kubernetes resource like a deployment. The Kubernetes control plane exposes a REST API to manage these objects. We'll be interacting with that API via a command-line tool called `kubectl`.
We describe objects in YAML files using the spec defined by a resource. `kubectl` can create, delete, and update these objects to manage the state of the cluster.
With all that out of the way, let's move on to the main event.
Now that we have a cluster, let's get something up and running on it. We'll use hashicorp/http-echo as a starting place. It's a small web server that serves the text we specify.
First let's create a file called `deployment.yaml` to describe the deployment.
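Here's a minimal sketch of what that manifest might look like. The names and labels (`echo1`, `app: echo1`) match the ones used throughout this post, but treat it as an illustration rather than the exact original file.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo1
spec:
  # Run two copies of the pod.
  replicas: 2
  selector:
    matchLabels:
      app: echo1
  template:
    metadata:
      labels:
        app: echo1
    spec:
      containers:
        - name: echo1
          image: hashicorp/http-echo:alpine
          args:
            - "-text=echo1"
          ports:
            - containerPort: 5678
```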
This is a bog standard deployment. Here's the rub:
- `app: echo1` is specified in the deployment's `spec.selector` and `template.metadata`. An unfortunately verbose requirement.
- `spec.replicas` is set to create two instances of the application.
- `spec.image` is set to use the `hashicorp/http-echo:alpine` image (from Docker Hub by default). More on the `alpine` tag later.
- `-text=echo1` tells the application to serve the text "echo1".
- `containerPort` is set to 5678 to match the application default.

Now let's create the deployment object described in that file.
We should see two pods since we specified two replicas.
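Listing the pods should show something like the following (the name suffixes are generated by Kubernetes, so yours will differ):

```shell
$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
echo1-5d8f6c9b4d-abcde   1/1     Running   0          30s
echo1-5d8f6c9b4d-fghij   1/1     Running   0          30s
```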
Let's forward a port to one of the pods to test the deployment.
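One way to do that, grabbing whichever `echo1` pod comes back first from the label selector:

```shell
# Pick one of the echo1 pods and forward local port 8080 to the
# container's port 5678. This blocks until interrupted.
POD=$(kubectl get pods -l app=echo1 -o jsonpath='{.items[0].metadata.name}')
kubectl port-forward "$POD" 8080:5678
```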
Then in a separate terminal, we can make a request to `localhost:8080` since that's the port we forwarded to the pod.
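For example, with curl; http-echo replies with the text it was configured to serve:

```shell
$ curl localhost:8080
echo1
```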
With this setup, we can forward a port to make requests to one pod or the other. What if we want to distribute requests across the pods?
We can front a deployment with a service to spread traffic across its replicas. To illustrate the point, let's adjust the deployment.
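The adjustment relies on the Kubernetes downward API to expose the pod's name as an environment variable, plus `$(VAR)` substitution in `args`. A sketch of the relevant part of the pod template (the rest of the deployment is unchanged):

```yaml
      containers:
        - name: echo1
          image: hashicorp/http-echo:alpine
          env:
            # The downward API injects this pod's own name.
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          args:
            # Kubernetes expands $(POD_NAME) before starting the container.
            - "-text=$(POD_NAME)"
```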
Now we're injecting the name of the pod into the container as the environment variable `POD_NAME`. Then we use that environment variable to serve the name of the pod instead of some hardcoded text. Now we can tell which pod is handling our request. Here's the new response in action.
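After re-applying the manifest and forwarding a port again, a request now returns the pod's generated name (the suffix below is illustrative):

```shell
$ curl localhost:8080
echo1-5d8f6c9b4d-abcde
```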
Now let's create a service to expose both pods on one port.
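A sketch of a matching manifest, assuming we put it in a file called `service.yaml` (the filename is my choice):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: echo1
spec:
  # Route to any pod carrying the deployment's label.
  selector:
    app: echo1
  ports:
    # Listen on port 80; send traffic to the container's port 5678.
    - port: 80
      targetPort: 5678
```

Created the same way as the deployment, with `kubectl apply -f service.yaml`.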
That will forward all traffic sent to the `echo1` service on port 80 to port 5678 on any pod that matches the label `app: echo1`. Since the `echo1` deployment applies that label to all its replicas, this traffic will be routed across all the replicas. By default the service "uses the standard behavior of routing to all endpoints evenly." I'm assuming this uses a scheme like round-robin, but I haven't been able to confirm that.
`kubectl port-forward` forwards ports only to specific pods. And we haven't yet covered publicly exposing our cluster over the internet. For now the easiest way to test this is from within the cluster.
Our GKE cluster comes with a cluster-aware DNS server called kube-dns out of the box. This DNS server automatically creates DNS records to allow pods to resolve services by their name. So inside the cluster, we can make requests to the `echo1` service.
The only pods we have on our cluster at the moment are the `echo1` pods themselves. However contrived, we can test that our service works as expected from inside one of those pods. Let's use `kubectl exec` to start an interactive shell in one of the `echo1` pods.
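Something along these lines, again picking whichever pod the label selector returns first:

```shell
# Grab the name of one of the echo1 pods and open an interactive shell in it.
POD=$(kubectl get pods -l app=echo1 -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$POD" -- /bin/ash
```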
Note the `/bin/ash` command given to `kubectl exec` here. That's not a typo for `/bin/bash`. Remember the `alpine` tag specified in the `hashicorp/http-echo:alpine` image? Without that tag, the `http-echo` image is built from scratch. We can't shell into such an image because there is no shell! `ash` is the shell that comes with Alpine Linux by default.
Now let's make some HTTP requests to "echo1" and see what happens. Although the container exposes port 5678, the service listens on port 80. No need to specify that since it's the default.
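From that shell inside the pod, busybox `wget` (which ships with Alpine) does the job; each request prints the name of whichever pod handled it:

```shell
# Run from inside the pod: hit the echo1 service a few times.
for i in 1 2 3 4; do wget -qO- http://echo1; done
```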
Traffic is distributed across pods as expected! Awesome 🎉