10/29/2022
This is by no means a comprehensive introduction to Kubernetes. However, here's what you need to know to follow along.
From the docs:
A Kubernetes cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node. The worker node(s) host the Pods that are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster.
Google Kubernetes Engine (GKE) takes care of the control plane, so we don't care about it as long as it works. The nodes in our cluster are the two VMs in our node pool that host the containers we deploy. Pods run on those nodes.
Again from the docs:
A Pod is similar to a set of containers with shared namespaces and shared filesystem volumes.
If you like, you can think of a pod as a container. Just know that it's not quite right even if it's a useful simplification. A pragmatist like me won't judge you, but someone probably will.
Well... we don't. At least not directly. Pods are at the heart of Kubernetes and we'll be interacting with them a lot, but on their own they're too basic for most use cases. For example, a bare pod can't scale, and it won't be restarted if its host node goes down.
Instead, we use other Kubernetes resources to describe workloads at a higher level of abstraction. These resources can own more basic resources like pods to wrap them in richer capabilities. One commonly used resource - the one we'll employ here - is called a deployment.
Deployments build on the functionality of pods to enable features such as:
- running multiple replicas of a pod
- rolling out updates gradually
- replacing pods that fail or whose node goes down
An object is an instance of a Kubernetes resource like a deployment. The Kubernetes control plane exposes a REST API to manage these objects. We'll be interacting with that API via a command-line tool called `kubectl`.
We describe objects in YAML files using the spec defined by a resource. `kubectl` can create, delete, and update these objects to manage the state of the cluster.
With all that out of the way, let's move on to the main event.
Now that we have a cluster, let's get something up and running on it. We'll use hashicorp/http-echo as a starting place. It's a small web server that serves the text we specify.
First let's create a file called `deployment.yaml` to describe the deployment.
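Here's a minimal sketch of what that manifest might look like. The names and labels (`echo1`, `app: echo1`) match the ones used throughout this post, but treat it as an illustration rather than the exact original file.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo1
spec:
  # Run two copies of the pod.
  replicas: 2
  selector:
    matchLabels:
      app: echo1
  template:
    metadata:
      labels:
        app: echo1
    spec:
      containers:
        - name: echo1
          image: hashicorp/http-echo:alpine
          args:
            - "-text=echo1"
          ports:
            - containerPort: 5678
```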
This is a bog standard deployment. Here's the rub:
- `app: echo1` is specified in the deployment's `spec.selector` and `template.metadata`. An unfortunately verbose requirement.
- `spec.replicas` is set to create two instances of the application.
- `spec.image` is set to use the `hashicorp/http-echo:alpine` image (from Docker Hub by default). More on the `alpine` tag later.
- `-text=echo1` tells the application to serve the text "echo1".
- `containerPort` is set to 5678 to match the application default.

Now let's create the deployment object described in that file.
We should see two pods since we specified two replicas.
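Listing the pods should show something like the following (the name suffixes are generated by Kubernetes, so yours will differ):

```shell
$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
echo1-5d8f6c9b4d-abcde   1/1     Running   0          30s
echo1-5d8f6c9b4d-fghij   1/1     Running   0          30s
```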
Let's forward a port to one of the pods to test the deployment.
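One way to do that, grabbing whichever `echo1` pod comes back first from the label selector:

```shell
# Pick one of the echo1 pods and forward local port 8080 to the
# container's port 5678. This blocks until interrupted.
POD=$(kubectl get pods -l app=echo1 -o jsonpath='{.items[0].metadata.name}')
kubectl port-forward "$POD" 8080:5678
```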
Then in a separate terminal, we can make a request to `localhost:8080` since that's the port we forwarded to the pod.
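For example, with curl; http-echo replies with the text it was configured to serve:

```shell
$ curl localhost:8080
echo1
```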
With this setup, we can forward a port to make requests to one pod or the other. What if we want to distribute requests across the pods?
We can front a deployment with a service to spread traffic across its replicas. To illustrate the point, let's adjust the deployment.
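The adjustment relies on the Kubernetes downward API to expose the pod's name as an environment variable, plus `$(VAR)` substitution in `args`. A sketch of the relevant part of the pod template (the rest of the deployment is unchanged):

```yaml
      containers:
        - name: echo1
          image: hashicorp/http-echo:alpine
          env:
            # The downward API injects this pod's own name.
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          args:
            # Kubernetes expands $(POD_NAME) before starting the container.
            - "-text=$(POD_NAME)"
```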
Now we're injecting the name of the pod into the container as the environment variable `POD_NAME`. Then we use that environment variable to serve the name of the pod instead of some hardcoded text. Now we can tell which pod is handling our request. Here's the new response in action.
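After re-applying the manifest and forwarding a port again, a request now returns the pod's generated name (the suffix below is illustrative):

```shell
$ curl localhost:8080
echo1-5d8f6c9b4d-abcde
```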
Now let's create a service to expose both pods on one port.
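A sketch of a matching manifest, assuming we put it in a file called `service.yaml` (the filename is my choice):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: echo1
spec:
  # Route to any pod carrying the deployment's label.
  selector:
    app: echo1
  ports:
    # Listen on port 80; send traffic to the container's port 5678.
    - port: 80
      targetPort: 5678
```

Created the same way as the deployment, with `kubectl apply -f service.yaml`.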
That will forward all traffic sent to the `echo1` service on port 80 to port 5678 on any pod that matches the label `app: echo1`. Since the `echo1` deployment applies that label to all its replicas, this traffic will be routed across all the replicas. By default the service "uses the standard behavior of routing to all endpoints evenly." I'm assuming this uses a scheme like round-robin, but I haven't been able to confirm that.
`kubectl port-forward` forwards ports only to specific pods. And we haven't yet covered publicly exposing our cluster over the internet. For now the easiest way to test this is from within the cluster.
Our GKE cluster comes with a cluster-aware DNS server called kube-dns out of the box. This DNS server automatically creates DNS records to allow pods to resolve services by their name. So inside the cluster, we can make requests to the `echo1` service.
The only pods we have on our cluster at the moment are the `echo1` pods themselves. However contrived, we can test that our service works as expected from inside one of those pods. Let's use `kubectl exec` to start an interactive shell in one of the `echo1` pods.
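Something along these lines, again picking whichever pod the label selector returns first:

```shell
# Grab the name of one of the echo1 pods and open an interactive shell in it.
POD=$(kubectl get pods -l app=echo1 -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$POD" -- /bin/ash
```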
Note the `/bin/ash` command given to `kubectl exec` here. That's not a typo for `/bin/bash`. Remember the `alpine` tag specified in the `hashicorp/http-echo:alpine` image? Without that tag, the `http-echo` image is built from scratch. We can't shell into such an image because there is no shell! `ash` is the shell that comes with Alpine Linux by default.
Now let's make some HTTP requests to "echo1" and see what happens. Although the container exposes port 5678, the service listens on port 80. No need to specify that since it's the default.
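From that shell inside the pod, busybox `wget` (which ships with Alpine) does the job; each request prints the name of whichever pod handled it:

```shell
# Run from inside the pod: hit the echo1 service a few times.
for i in 1 2 3 4; do wget -qO- http://echo1; done
```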
Traffic is distributed across pods as expected! Awesome 🎉