Determine Pod Escape (ipynb)#


Install Dependencies#

Install the dependencies ipywidgets, pandas and kubectl. Skip the next cell if they have already been installed.

# install ipywidgets, pandas
!pip3 install ipywidgets pandas
# install kubectl
!gcloud components install kubectl --quiet

Imports and Configuration#

import ipywidgets as widgets
import json
import os
import pandas as pd

from IPython.display import HTML, display

# extend width of widgets
display(HTML('''<style>
    .widget-label { min-width: 18ex !important; font-weight:bold; }
</style>'''))
# extend width of cells
display(HTML("<style>.container { width:100% !important; }</style>"))
display(HTML("<style>.output_result { max-width:100% !important; }</style>"))

# extend width and max rows of pandas output
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)
# [OPTIONAL] authenticate using your service account
!gcloud auth activate-service-account --key-file <json_key_file>

Define Environment Variables#

Specify the following information:

  • Source Project - Project ID of the target project that contains the k8s cluster

  • Cluster Name - Name of the k8s cluster

  • Cluster Type - Type of k8s cluster (Regional or Zonal)

# create text boxes for user input
src_project = widgets.Text(description = "Source Project: ", disabled=False)
cluster_name = widgets.Text(description = "Cluster Name: ", disabled=False)
cluster_type = widgets.Dropdown(options=['Regional', "Zonal"], value='Zonal', description="Cluster Type: ", disabled=False)

display(src_project, cluster_name, cluster_type)

If Cluster Type is Regional, specify the Cluster Region (e.g. asia-southeast1).
Otherwise, if Cluster Type is Zonal, specify the Cluster Zone (e.g. asia-southeast1-b).

# create text boxes for user input
if cluster_type.value == 'Regional':
    cluster_region = widgets.Text(description = "Cluster Region: ", disabled=False)
    display(cluster_region)
elif cluster_type.value == 'Zonal':
    cluster_zone = widgets.Text(description = "Cluster Zone: ", disabled=False)
    display(cluster_zone)
# store user input in environment variables for use in subsequent commands
os.environ['SRC_PROJECT'] = src_project.value
os.environ['CLUSTER_NAME'] = cluster_name.value

if cluster_type.value == 'Regional':
    os.environ['CLUSTER_REGION'] = cluster_region.value
elif cluster_type.value == 'Zonal':
    os.environ['CLUSTER_ZONE'] = cluster_zone.value

Get Cluster nodeConfig#

if cluster_type.value == 'Regional':
    !gcloud container clusters describe $CLUSTER_NAME --region $CLUSTER_REGION --project $SRC_PROJECT --format='json' > cluster_descr.json
elif cluster_type.value == 'Zonal':
    !gcloud container clusters describe $CLUSTER_NAME --zone $CLUSTER_ZONE --project $SRC_PROJECT --format='json' > cluster_descr.json

with open('./cluster_descr.json') as infile:
    cluster_descr = json.load(infile)
cluster_descr_df = pd.json_normalize(cluster_descr['nodeConfig'])

columns = ['metadata.disable-legacy-endpoints', 'serviceAccount', 'oauthScopes']
display(cluster_descr_df[columns]
       .rename(columns={'metadata.disable-legacy-endpoints': 'disable-legacy-endpoints'}))

disable-legacy-endpoints

  • Ensure that the value is true

  • When true, requests to the GCP metadata service must include the Metadata-Flavor: Google header, and querying of the v1beta1 endpoints is disabled

serviceAccount

  • Service account attached to the cluster

  • The default value is default, which is the <12-digit>-compute@developer.gserviceaccount.com service account

  • If it is not default, it is worthwhile to check the IAM roles/permissions granted to this service account

oauthScopes

  • OAuth scopes of the service account attached to the cluster

  • Ensure that they DO NOT include https://www.googleapis.com/auth/cloud-platform, which enables authentication to any GCP API and allows the full set of IAM permissions assigned to the service account to be leveraged

  • The defaults are devstorage.read_only, logging.write, monitoring, servicecontrol, service.management.readonly and trace.append, which prevent the full set of IAM permissions assigned to the service account from being leveraged

  • If the scopes are neither of the above, they have been user-customised

  • Scopes DO NOT matter if the access token of the service account is obtained from the metadata service and used outside of the cluster (see the sketch below)
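
To illustrate the last point, below is a minimal sketch of how the node service account's access token could be requested from the GCP metadata service from inside a pod. It is not part of the original notebook flow: it assumes the cluster credentials have already been fetched (see the next section), that curl is available in the chosen container image, and that <pod_name> and <namespace> are placeholders to be substituted. The Metadata-Flavor: Google header is required when disable-legacy-endpoints is true.

# [ILLUSTRATION] request the node service account's access token from inside a pod
# <pod_name> and <namespace> are placeholders; requires curl in the container image
!kubectl exec <pod_name> -n <namespace> -- \
    curl -s -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"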

Connect to Cluster#

if cluster_type.value == 'Regional':
    !gcloud container clusters get-credentials $CLUSTER_NAME --region $CLUSTER_REGION --project $SRC_PROJECT
elif cluster_type.value == 'Zonal':
    !gcloud container clusters get-credentials $CLUSTER_NAME --zone $CLUSTER_ZONE --project $SRC_PROJECT
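
As an optional sanity check (not part of the original notebook), the connection can be verified by listing the cluster nodes.

# [OPTIONAL] verify that kubectl is now pointed at the cluster
!kubectl get nodes -o wide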

Get Pods’ Security Context#

def highlight_not_na(value):
    if pd.isna(value):
        return None
    else:
        return 'color:white; background-color:purple'

!kubectl get pods -A --output=json > pods_sc.json

with open('./pods_sc.json') as infile:
    pods_sc = json.load(infile)
pods_sc_df = pd.json_normalize(pods_sc['items'], max_level=3)

desired_columns=['metadata.name', 'metadata.namespace', 'spec.securityContext.runAsNonRoot', 'spec.securityContext.runAsGroup', 'spec.securityContext.runAsUser', 'spec.securityContext.seLinuxOptions']
columns = list(set(pods_sc_df.columns) & set(desired_columns))
pods_sc_df_formatted = pods_sc_df[columns].rename(columns={'metadata.name': 'Pod Name', 
                     'metadata.namespace': 'Namespace',
                     'spec.securityContext.runAsNonRoot': 'runAsNonRoot',
                     'spec.securityContext.runAsGroup': 'runAsGroup',
                     'spec.securityContext.runAsUser': 'runAsUser',
                     'spec.securityContext.seLinuxOptions': 'seLinuxOptions'}).sort_index(axis=1)
        
unwanted_columns = ['Namespace', 'Pod Name']
columns = [x for x in list(pods_sc_df_formatted.columns) if x not in unwanted_columns]
display(pods_sc_df_formatted
        .dropna(thresh=3)
        .style.format(precision=0).applymap(highlight_not_na, subset=pd.IndexSlice[:, columns]))

Because the kubectl command can return an overwhelming amount of output, it has been parsed to show only values that are not NA. Check against the following descriptions to determine whether these values are of concern.

runAsNonRoot

  • Indicates that the container must run as a non-root user

runAsGroup

  • GID to run the entrypoint of the container process

  • Uses runtime default if unset

  • Often set up in conjunction with volume mounts containing files that have the same ownership IDs

  • In GKE, it is normal for event-exporter-gke, konnectivity-agent and konnectivity-agent-autoscaler to have a runAsGroup value of 1000

runAsUser

  • UID to run the entrypoint of the container process

  • Defaults to the user specified in the image metadata if unspecified

  • Enables the viewing of environment variables or file descriptors of processes with the specified UID

  • Often set up in conjunction with volume mounts containing files that have the same ownership IDs

  • Check /etc/passwd on the host/node to map the UID to a username

  • In GKE, it is normal for event-exporter-gke, konnectivity-agent and konnectivity-agent-autoscaler to have a runAsUser value of 1000

seLinuxOptions

  • SELinux is a policy driven system to control access to apps, processes and files on a Linux system

  • Implements the Linux Security Modules framework in the Linux kernel

  • Based on the concept of labels: it applies these labels to all the elements in the system, which groups elements together

  • Labels are also known as the security context (not to be confused with the Kubernetes securityContext)

  • Labels consist of a user, role, type, and an optional field level in the format user:role:type:level

  • SELinux then uses policies to define which processes of a particular context can access other labelled objects in the system

  • SELinux can be strictly enforced, in which case access will be denied, or it can be configured in permissive mode where it will log access

  • In containers, SELinux typically labels the container process and the container image in such a way as to restrict the process to only access files within the image

  • Changing the SELinux labeling for a container could potentially allow the containerized process to escape the container image and access the host filesystem
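
Building on the descriptions above, the following optional sketch (not part of the original flow) reuses pods_sc_df_formatted from the cell above to flag pods whose pod-level securityContext explicitly sets runAsUser to 0 (root) or runAsNonRoot to false.

# [ILLUSTRATION] flag pods whose pod-level securityContext is explicitly risky
mask = pd.Series(False, index=pods_sc_df_formatted.index)
if 'runAsUser' in pods_sc_df_formatted.columns:
    mask |= pods_sc_df_formatted['runAsUser'].eq(0)         # runs as root
if 'runAsNonRoot' in pods_sc_df_formatted.columns:
    mask |= pods_sc_df_formatted['runAsNonRoot'].eq(False)  # explicitly allows root
display(pods_sc_df_formatted[mask])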

Get Containers’ Security Context (Precedence over Pods’)#

def highlight_not_na(value):
    if pd.isna(value):
        return None
    else:
        return 'color:white; background-color:purple'

with open('./pods_sc.json') as infile:
    ctrs_sc = json.load(infile)

frames = list()
for item in ctrs_sc['items']:
    for ctr in item['spec']['containers']:
        ctr_series = dict()
        ctr_series['Namespace'] = item['metadata']['namespace']
        ctr_series['Pod Name'] = item['metadata']['name']
        ctr_series['Container Name'] = ctr['name']
        if 'securityContext' in ctr:
            securityContext = ctr['securityContext']
            if 'privileged' in securityContext: ctr_series['privileged'] = securityContext['privileged']
            if 'allowPrivilegeEscalation' in securityContext: ctr_series['allowPrivilegeEscalation'] = securityContext['allowPrivilegeEscalation']
            if 'capabilities' in securityContext: ctr_series['capabilities'] = securityContext['capabilities']
            if 'procMount' in securityContext: ctr_series['procMount'] = securityContext['procMount']
            if 'readOnlyRootFilesystem' in securityContext: ctr_series['readOnlyRootFilesystem'] = securityContext['readOnlyRootFilesystem']
            if 'runAsGroup' in securityContext: ctr_series['runAsGroup'] = securityContext['runAsGroup']
            if 'runAsNonRoot' in securityContext: ctr_series['runAsNonRoot'] = securityContext['runAsNonRoot']
            if 'runAsUser' in securityContext: ctr_series['runAsUser'] = securityContext['runAsUser']
            if 'seLinuxOptions' in securityContext: ctr_series['seLinuxOptions'] = securityContext['seLinuxOptions']
            if 'windowsOptions' in securityContext: ctr_series['windowsOptions'] = securityContext['windowsOptions']   
        ctr_series = pd.Series(ctr_series)
        frames.append(ctr_series)
ctrs_sc_df = pd.DataFrame(frames)

unwanted_columns = ['Namespace', 'Pod Name', 'Container Name']
columns = [x for x in list(ctrs_sc_df.columns) if x not in unwanted_columns]
display(ctrs_sc_df
        .dropna(thresh=4)
        .style.format(precision=0).applymap(highlight_not_na, subset=pd.IndexSlice[:, columns]))

Because the kubectl command can return an overwhelming amount of output, it has been parsed to show only values that are not NA. Among the displayed output are pods in the kube-system namespace, which ship with the GKE cluster by default and can be ignored. For the others, check against the following to determine whether the values are of concern.

privileged

  • Runs container in privileged mode

  • Processes in privileged containers are essentially equivalent to root on the node/host

  • Provides access to /dev on the host, which enables the mounting of the node/host filesystem into the privileged pod

    • But provides a limited view of the filesystem - files that require privilege escalation (e.g. to root) are not accessible

  • Enables multiple options for gaining RCE with root privileges on the node/host

allowPrivilegeEscalation

  • Controls whether a process can gain more privileges than its parent process

  • This bool directly controls if the no_new_privs flag will be set on the container process

  • Always true when the container: 1) is run as privileged, or 2) has CAP_SYS_ADMIN

capabilities

  • Kernel-level permissions that allow more granular control over kernel calls than simply running as root

  • Capabilities include things like the ability to change file permissions, control the network subsystem, and perform system-wide administration functions

  • Can be configured to drop or add capabilities

procMount

  • By default, container runtimes mask certain parts of the /proc filesystem from inside a container in order to prevent potential security issues

  • However, there are times when access to those parts of /proc is required, particularly when using nested containers as part of an in-cluster build process

  • There are only two valid options for this entry:

    • Default, which maintains the standard container runtime behavior, or

    • Unmasked, which removes all masking for the /proc filesystem.

readOnlyRootFilesystem

  • Default is false (represented by nan in the output)

  • If true, limits the actions that an attacker can perform on the container filesystem

windowsOptions

  • Windows-specific settings applied to all containers
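
As an optional illustration (not part of the original flow), the sketch below reuses ctrs_sc_df from the cell above to surface containers that explicitly set privileged or allowPrivilegeEscalation to true.

# [ILLUSTRATION] surface containers that explicitly request privileged mode
# or allowPrivilegeEscalation
mask = pd.Series(False, index=ctrs_sc_df.index)
for col in ['privileged', 'allowPrivilegeEscalation']:
    if col in ctrs_sc_df.columns:
        mask |= ctrs_sc_df[col].eq(True)
cols = ['Namespace', 'Pod Name', 'Container Name'] + \
       [c for c in ['privileged', 'allowPrivilegeEscalation'] if c in ctrs_sc_df.columns]
display(ctrs_sc_df.loc[mask, cols])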

Check Pods’ hostPID, hostIPC, hostNetwork Config#

!kubectl get pods -A --field-selector=metadata.namespace!=kube-system \
    -o custom-columns=Name:.metadata.name,Namespace:.metadata.namespace,HostPID:.spec.hostPID,HostIPC:.spec.hostIPC,HostNetwork:.spec.hostNetwork

hostPID

  • Unable to get privileged code execution on the host directly with only hostPID: true

  • If true, possible options for an attacker

    • View processes on the host, including processes running in each pod

    • View environment variables for each pod on the host (which may contain credentials)

      • Applies only to processes running within pods that share the same UID as the hostPID pod

      • To get the environment variables from processes that do not share the same UID, the hostPID pod needs to run with runAsUser set to the desired UID

    • View file descriptors for each pod on the host (which may contain credentials)

      • The same UID restrictions as for environment variables above apply here as well

    • Kill processes on the node

hostIPC

  • Unable to get privileged code execution on the host directly with only hostIPC: true

  • If any process on the host or any processes in a pod uses the host’s inter-process communication mechanisms (shared memory, semaphore arrays, message queues, etc), these mechanisms can be read or written to

  • If true, possible options for an attacker

    • Access data used by any pods that also use the host’s IPC namespace by inspecting /dev/shm

      • /dev/shm is shared between any pod with hostIPC: true and the host

      • Look for any files in this shared memory location

    • Inspect existing IPC facilities - Check to see if any IPC facilities are being used with /usr/bin/ipcs -a

hostNetwork

  • Unable to get privileged code execution on the host directly with only hostNetwork: true

  • If true, possible options for an attacker

    • Sniff traffic - Use tcpdump to sniff unencrypted traffic on any interface on the host

    • Access services bound to localhost - Can reach services that only listen on the host’s loopback interface or that are otherwise blocked by network policies

    • Bypass network policy - Pod would be bound to the host’s network interfaces and not the pods’/namespaces’
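
As an optional complement to the kubectl command above (an illustration, not part of the original flow), the sketch below derives the same information from the pods_sc.json file captured earlier, flagging non-kube-system pods that set any of hostPID, hostIPC or hostNetwork.

# [ILLUSTRATION] flag non-kube-system pods that share a host namespace,
# using the pods_sc.json captured earlier
host_ns_df = pd.json_normalize(pods_sc['items'])
host_cols = [c for c in ['spec.hostPID', 'spec.hostIPC', 'spec.hostNetwork']
             if c in host_ns_df.columns]
if host_cols:
    flagged = host_ns_df[(host_ns_df['metadata.namespace'] != 'kube-system')
                         & host_ns_df[host_cols].eq(True).any(axis=1)]
    display(flagged[['metadata.namespace', 'metadata.name'] + host_cols])
else:
    print('No pods set hostPID, hostIPC or hostNetwork')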

Check Pods’ hostPath Config#

!kubectl get pods -A --field-selector=metadata.namespace!=kube-system \
    -o custom-columns=Name:.metadata.name,Namespace:.metadata.namespace,HostPath:.spec.volumes[].hostPath

  • No results are returned if no pods have hostPath configured

  • If the administrator has not limited what can be mounted, the entire host filesystem can be mounted

  • Provides read/write access to the host’s filesystem (limited to what the administrator defined)

  • If configured, possible options for an attacker

    • Look for kubeconfig files on the host filesystem (may find a cluster-admin config with full access to everything)

      • Not applicable to GKE as GKE by default DOES NOT store kubeconfig files (i.e. .kube/config) on the node hosting the pod

    • Add persistence

      • Add own SSH key

      • Add own cron job

    • Crack hashed passwords in /etc/shadow

  • Mount point can be found with

    • kubectl describe pod hostpath-exec-pod | sed -ne '/Mounts/,/Conditions/p'
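
As an optional complement (an illustration, not part of the original flow), the sketch below walks the pods_sc.json file captured earlier and lists each non-kube-system hostPath volume together with the host path it mounts.

# [ILLUSTRATION] list hostPath volumes per pod, using the pods_sc.json captured earlier
rows = []
for item in pods_sc['items']:
    if item['metadata']['namespace'] == 'kube-system':
        continue
    for vol in item['spec'].get('volumes', []):
        if 'hostPath' in vol:
            rows.append({'Namespace': item['metadata']['namespace'],
                         'Pod Name': item['metadata']['name'],
                         'Volume': vol['name'],
                         'hostPath': vol['hostPath'].get('path')})
display(pd.DataFrame(rows))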