Authenticating Kubernetes Pods with Service Accounts

Most of the time when you're running a pod on Kubernetes, you're concerned with reaching the other pods which make up your application. Usually that's through Services, their cluster-local DNS names, or a service mesh. When your application needs to interact with Kubernetes itself, however, things get complicated.

Some k8s hosting providers, such as DigitalOcean, will provide you with a credentials file called a kubeconfig. This file gives whoever has it admin privileges on the cluster in question, allowing them to do anything they want, to whatever they want, in whatever namespace they want. It's full access.

Many people assume that the way to get access to the underlying k8s API is to put their kubeconfig in a secret, mount that secret into a pod, and then use it for any API calls needed. And while you can do that, it's probably the worst way to do it.

Once that level of access persists in the cluster, mounted and available in a pod, it becomes a source of vulnerability. This is magnified if that pod also hosts an external application: a nefarious actor could exploit that application, use it to get a shell, then dump the contents of your kubeconfig to compromise the entire cluster. Even if the secret were kept to a purely internal pod, the level of access represented by the kubeconfig is simply too much. It's better to grant the minimum permissions necessary, limiting the possible damage if an exploit is found.

You might not know it, but every pod on your cluster runs under a Kubernetes identity called a ServiceAccount. The service account acts as the pod's identity and can be associated with specific permissions. Just like how there's a default namespace, there's also a default ServiceAccount.
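You can confirm this for any pod already on your cluster by pulling the serviceAccountName field out of its spec. A quick sketch -- the pod name here is hypothetical, and the check is guarded so it degrades gracefully if kubectl or a cluster isn't on hand:

```shell
# Hypothetical pod name; substitute one from your own cluster.
POD=some-pod

if command -v kubectl >/dev/null 2>&1; then
  # jsonpath extracts just the serviceAccountName from the pod's spec
  OUT=$(kubectl get pod "$POD" -o jsonpath='{.spec.serviceAccountName}' 2>/dev/null)
  [ -n "$OUT" ] || OUT="could not reach a cluster (or pod not found)"
else
  OUT="kubectl not found"
fi
echo "$OUT"
```

On a typical pod created without any special configuration, this prints "default".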

To test this all out, let's create a bare pod -- one not managed by any controller -- which will do nothing but run the sleep command:

apiVersion: v1
kind: Pod
metadata:
  labels:
    app: web
  name: manual-pod
spec:
  containers:
  - image: ten7/flightdeck-util
    imagePullPolicy: Always
    name: mr-sleep
    command:
      - sleep
      - "604800"

The above is the kind of bare pod you might create as a beachhead within your cluster to test out your application. Since k8s controllers such as Deployments, StatefulSets, and CronJobs all contain a pod template, what we do here will work just as well in those best-practice controllers. You should never use a bare pod as a load-bearing part of your application.

When k8s creates our pod, it populates the spec with additional values from internal defaults. This includes the one we're most interested in, the serviceAccountName:

apiVersion: v1
kind: Pod
metadata:
  labels:
    app: web
  name: manual-pod
spec:
  serviceAccountName: default
  containers:
  - image: ten7/flightdeck-util
    imagePullPolicy: Always
    name: mr-sleep
    command:
      - sleep
      - "604800"

The default ServiceAccount shouldn't have much access to the k8s API, if any. What we really want is to create our own ServiceAccount and have our pod run under that. I was expecting this to be a rather complicated process, but it was surprisingly simple. Just like everything else in k8s, a ServiceAccount is defined using YAML:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-service-account
  namespace: my-namespace

Once created, we can update our pod's serviceAccountName to run under our new ServiceAccount:

apiVersion: v1
kind: Pod
metadata:
  labels:
    app: web
  name: manual-pod
spec:
  serviceAccountName: my-service-account
  containers:
  - image: ten7/flightdeck-util
    imagePullPolicy: Always
    name: mr-sleep
    command:
      - sleep
      - "604800"

So we should have access to the k8s API now in our pod, right? Well, no. The ServiceAccount by itself doesn't grant anything. For that, we need to give it permissions through role-based access control (RBAC).

In k8s, permissions are granted to Roles. ServiceAccounts are then linked to Roles using a RoleBinding.

To grant a permission, we'll make a Role, again, using YAML:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-pod-reader
  namespace: my-namespace
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - list

Roles are where a lot of the complexity of the system comes in. Each rule in the Role specifies a list of apiGroups, a list of resources in those apiGroups, and a list of verbs -- what you can do with those resources.

Most of the time:

  • The apiGroup corresponds to the group portion of the apiVersion at the top of the target object's YAML definition (so apps/v1 means the apps group; core objects like pods, whose apiVersion is just v1, use the empty string "").
  • The resource corresponds to the kind, lowercased and pluralized (Pod becomes pods).

This isn't always true, as there are some odd "subresources" like pods/exec or pods/log, which correspond to kubectl exec and kubectl logs respectively (kubectl cp is built on pods/exec, too). The Kubernetes documentation has more details, but you may often find you have to hunt around for just the right constellation of permissions to do the operation you require.
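For instance, allowing kubectl exec into pods means granting the create verb on the pods/exec subresource -- a sketch, using a hypothetical Role name:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-pod-exec        # hypothetical name
  namespace: my-namespace
rules:
- apiGroups:
  - ""
  resources:
  - pods/exec              # the "exec" subresource of pods
  verbs:
  - create                 # kubectl exec issues a create against pods/exec
```

You'd typically pair this with get/list on pods themselves, since kubectl exec needs to look the pod up first.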

The Role doesn't link to the ServiceAccount in any way, though. You can't usermod -a -G a ServiceAccount, as service accounts have no innate field to store role associations. Instead, k8s relegates that job to yet another object, the RoleBinding. This actually works better than you might think, as each object in the chain (ServiceAccount, RoleBinding, Role) has a specific job and can be managed like any other k8s object.

Defining a RoleBinding is pretty straightforward:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-role-binding
  namespace: my-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: my-pod-reader
subjects:
- kind: ServiceAccount
  name: my-service-account
  namespace: my-namespace

You can see that the roleRef specifies which Role we're binding. Interestingly, a RoleBinding can link a single Role to multiple ServiceAccounts via the subjects list. A single RoleBinding can't, however, reference multiple Roles.
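Once the binding is applied, you can check that it took effect with kubectl auth can-i, impersonating the ServiceAccount's full username. A sketch, guarded so it degrades gracefully when kubectl or a cluster isn't available:

```shell
# system:serviceaccount:<namespace>:<name> is the ServiceAccount's full username
if command -v kubectl >/dev/null 2>&1; then
  RESULT=$(kubectl auth can-i list pods \
    --as=system:serviceaccount:my-namespace:my-service-account \
    -n my-namespace 2>/dev/null)
  # can-i prints "yes" or "no"; an empty result means no cluster was reachable
  [ -n "$RESULT" ] || RESULT="unable to reach a cluster"
else
  RESULT="kubectl not found"
fi
echo "$RESULT"
```

With the Role and RoleBinding above in place, this should answer "yes".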

One thing you might have noticed is that both the Role and RoleBinding are namespaced objects. That is, they work within, and exist in, a single namespace. To have permissions in another namespace, you'd need Roles and RoleBindings in that namespace. ServiceAccounts are namespaced too, but the subjects list in a RoleBinding lets you specify the ServiceAccount's namespace, so a binding can grant access to a ServiceAccount from elsewhere.

What if you need these permissions across all namespaces? Do we need to keep creating new Roles and RoleBindings? Actually, no: there's a ClusterRole and a ClusterRoleBinding, which are not namespaced. They look almost the same when you define them; only the kind changes, and there's no namespace in the metadata.
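A cluster-wide version of our pod-reader setup might look like this -- a sketch reusing the names from above:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: my-pod-reader      # no namespace; ClusterRoles are cluster-wide
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: my-pod-reader
subjects:
- kind: ServiceAccount
  name: my-service-account
  namespace: my-namespace  # the ServiceAccount itself is still namespaced
```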

Even after setting all of this up and restarting your pod with the appropriate serviceAccountName, how does kubectl -- or any application using the k8s API -- know how to authenticate?

You might assume that (if your pod is running on Linux) a KUBECONFIG environment variable or a ~/.kube/config file was created for you with all the appropriate values. When you check, however, you see nothing of the sort. Poke around a bit more, though, and you'll notice something interesting in your pod's disk mounts:

$ df -h 
Filesystem                Size      Used Available Use% Mounted on
overlay                  78.7G     24.4G     51.1G  32% /
...
tmpfs                     1.9G     12.0K      1.9G   0% /run/secrets/kubernetes.io/serviceaccount

Huh...now where did that come from? It looks like a k8s secret, but it's mounted in a directory we didn't specify, from a secret we didn't create.

If you look at that directory, you'll notice something even more interesting:

$ ls /run/secrets/kubernetes.io/serviceaccount/
ca.crt     namespace  token

There are three files in that mounted directory: a certificate, a file containing the namespace in plaintext, and a file called token containing a seemingly random string.

It turns out that this token file is, in fact, our authorization credential for the cluster. Better yet, kubectl and most software built on the k8s API check for this directory and token transparently. This token was created as a secret implicitly when you created your ServiceAccount (on newer clusters it's a short-lived projected token instead, but the mount looks the same). You may often see a secret starting with default-token-, which is the secret for the default ServiceAccount.

If you run kubectl with no explicit credentials, it picks up the ServiceAccount token without complaint:

$ kubectl get pods
NAME         READY   STATUS    RESTARTS   AGE
manual-pod   1/1     Running   0          1h
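You can also use the token yourself, making raw HTTP calls against the API server, which is always reachable in-cluster at kubernetes.default.svc. A sketch, guarded so it only fires when the ServiceAccount mount actually exists:

```shell
SA_DIR=/run/secrets/kubernetes.io/serviceaccount
if [ -d "$SA_DIR" ]; then
  TOKEN=$(cat "$SA_DIR/token")
  NAMESPACE=$(cat "$SA_DIR/namespace")
  # Roughly equivalent to 'kubectl get pods': ca.crt verifies the
  # API server's certificate, and the token authenticates us.
  curl --cacert "$SA_DIR/ca.crt" \
    -H "Authorization: Bearer $TOKEN" \
    "https://kubernetes.default.svc/api/v1/namespaces/$NAMESPACE/pods"
else
  MSG="no serviceaccount mount; run this inside a pod"
  echo "$MSG"
fi
```

Inside our pod, this returns a JSON PodList -- the same data kubectl formatted into the table above.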

Of course it works with Ansible! Let's make a quick playbook to list out some pods:

---
- hosts: all
  tasks:
    - name: lookup some pods
      debug:
        msg: "{{ lookup('k8s', kind='Pod', namespace=lookup('file', '/run/secrets/kubernetes.io/serviceaccount/namespace')) }}"

When we run this, we get a large amount of YAML back describing the pods in the namespace we have access to -- no kubeconfigs or weird hacks required!

This is only the briefest introduction to how RBAC works on Kubernetes, but hopefully it answers your most critical questions as a beginner. With a few YAML definitions, you can grant your pods permissions to work with the Kubernetes API, allowing you to perform in-cluster operations.