Playbook

Note

Exposure of the Kubernetes Dashboard or the kubelet API (ports 10250/10255), and attacks on the control plane, are unlikely under GKE’s default security configuration. Hence, they are given little or no focus in this playbook.

../_images/playbook_1.png

Breakdown

Phase | Activity | Description

Detection

Incident escalated involving GKE

Possible escalation sources:
SOC - Detections from use cases
CTI - Threat intel reports
End user - Noticed unusual activity (e.g. higher billing costs, unfamiliar deployments)
Third party - Reported public exposure of sensitive information or an unpatched vulnerability

If the root cause is not known from the escalation source, proceed to Determine root cause in parallel.

Analysis

Identify GCP project, cluster and owner/PIC

Identify GCP Project & Cluster
One way is to use the "Search products and resources" box in the GCP console menu bar to look up resources by IP address or name. Note that the search only covers resources you have access to.

Identify Owner/PIC
One way is to open https://console.cloud.google.com/iam-admin/iam?project=<project_id> and look for the user account(s) with the Owner role.
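The same lookup can also be scripted. The following is a minimal sketch, assuming the project's IAM policy has already been exported as JSON (for example with `gcloud projects get-iam-policy <project_id> --format=json`). The function name `find_owners` and the sample policy are hypothetical and only illustrate the shape of the data.

```python
import json


def find_owners(policy: dict) -> list[str]:
    """Return the members bound to roles/owner in an IAM policy document."""
    owners = []
    for binding in policy.get("bindings", []):
        if binding.get("role") == "roles/owner":
            owners.extend(binding.get("members", []))
    # De-duplicate and sort so the result is stable across runs.
    return sorted(set(owners))


# Hypothetical sample, shaped like an exported IAM policy.
sample_policy = json.loads("""
{
  "bindings": [
    {"role": "roles/owner", "members": ["user:alice@example.com"]},
    {"role": "roles/viewer", "members": ["user:bob@example.com"]}
  ]
}
""")

print(find_owners(sample_policy))
```

Note that a project-level policy only lists bindings set directly on the project. Owners inherited from a folder or organisation would need to be checked separately.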

Request the Owner/PIC to grant the IR team the necessary permissions on the project and SSH access to the nodes in the GKE cluster under investigation.

Request for preservation of evidence

This is especially important because the self-healing and auto-scaling capabilities of Kubernetes make evidence volatile.

Request the Owner/PIC NOT to:

  • Delete/restart the impacted cluster/deployment/node/pod(s)
  • Perform any remediation activities unless they have been cleared by the IR team

Discover cluster

Refer to Discover Cluster

Artifacts [logs] in persistent storage?

This determines how the artifacts are collected for analysis.

Potential persistent storage locations:
gcePersistentDisk - Google Compute Engine Volume
PersistentVolumeClaim - Makes a claim from the cluster for an allocation of persistent storage (provided by a matching PersistentVolume object)
hostPath - Uses a file or directory on the node to emulate network-attached storage

Refer to Application Logs
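To check which pods use the storage types listed above, the pod specifications can be inspected programmatically. The sketch below assumes the pod list has been exported as JSON (for example with `kubectl get pods --all-namespaces -o json`). The function name `persistent_volumes` and the sample data are hypothetical.

```python
# Volume source keys that correspond to the storage locations listed above.
STORAGE_KEYS = ("gcePersistentDisk", "persistentVolumeClaim", "hostPath")


def persistent_volumes(pod_list: dict) -> list[tuple[str, str, str, dict]]:
    """Return (namespace, pod name, volume type, volume source) per matching volume."""
    found = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        for volume in pod.get("spec", {}).get("volumes", []):
            for key in STORAGE_KEYS:
                if key in volume:
                    found.append(
                        (meta.get("namespace", ""), meta.get("name", ""), key, volume[key])
                    )
    return found


# Hypothetical sample, shaped like a Kubernetes PodList.
sample_pods = {
    "items": [
        {
            "metadata": {"namespace": "default", "name": "web-0"},
            "spec": {
                "volumes": [
                    {"name": "data", "persistentVolumeClaim": {"claimName": "web-data"}},
                    {"name": "logs", "hostPath": {"path": "/var/log/app"}},
                    {"name": "cfg", "configMap": {"name": "web-cfg"}},
                ]
            },
        }
    ]
}

for row in persistent_volumes(sample_pods):
    print(row)
```

Volumes of other types (such as the `configMap` entry in the sample) are ignored. Any pod that returns no rows is likely writing its logs only to the container filesystem or to stdout/stderr.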

Acquire persistent storage

Refer to Application Logs

Acquire node persistent storage

Refer to Application Logs

Perform live response

Refer to Live Response

Determine root cause

Compromised Credentials

  • Compromised Google Account
  • Access key (e.g. service account key or SSH key) exposed to public
  • Adversary obtains a valid kubeconfig file (e.g. via a compromised endpoint) and uses it to access the clusters
Compromised Image
  • Adversary plants their own compromised image(s) in a private registry (e.g. within the GCP project) they obtained access to. These images can then be pulled by a user
  • Using untrusted images from public registries (e.g. Docker Hub) that may be malicious
Unintended Exposure
  • Some interfaces are not intended to be exposed to the Internet and therefore don’t require (secure) authentication by default. Exposing them to the Internet may allow an adversary to run code or deploy containers in the cluster
  • Examples of such interfaces that were seen exploited include Apache NiFi, Kubeflow, Argo Workflows, Weave Scope, and the Kubernetes dashboard
Vulnerable Application
  • Running a vulnerable Internet-facing application in a cluster can enable initial access to the cluster, especially if it has an exploitable remote code execution (RCE) vulnerability
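For the Unintended Exposure and Vulnerable Application categories, a useful first step is to list the Services that are reachable from outside the cluster. The sketch below assumes the Service list has been exported as JSON (for example with `kubectl get services --all-namespaces -o json`). The function name `exposed_services` and the sample data are hypothetical.

```python
def exposed_services(service_list: dict) -> list[dict]:
    """Return Services of type LoadBalancer or NodePort with their ports and IPs."""
    exposed = []
    for svc in service_list.get("items", []):
        meta = svc.get("metadata", {})
        spec = svc.get("spec", {})
        if spec.get("type") not in ("LoadBalancer", "NodePort"):
            continue
        ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress", [])
        exposed.append(
            {
                "namespace": meta.get("namespace", ""),
                "name": meta.get("name", ""),
                "type": spec["type"],
                "ports": [p.get("port") for p in spec.get("ports", [])],
                "external_ips": [i["ip"] for i in ingress if i.get("ip")],
            }
        )
    return exposed


# Hypothetical sample, shaped like a Kubernetes ServiceList.
sample_services = {
    "items": [
        {
            "metadata": {"namespace": "default", "name": "dashboard"},
            "spec": {"type": "LoadBalancer", "ports": [{"port": 443}]},
            "status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}},
        },
        {
            "metadata": {"namespace": "default", "name": "backend"},
            "spec": {"type": "ClusterIP", "ports": [{"port": 8080}]},
            "status": {"loadBalancer": {}},
        },
    ]
}

for svc in exposed_services(sample_services):
    print(svc)
```

The result is only a list of leads: a LoadBalancer Service may be internal-only, and exposure through Ingress objects or firewall rules is not covered by this sketch.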

Analyse collected artifacts

Start by mounting the container filesystem

Determine Impact

Pod Escape

Data Destruction
  • Adversary may attempt to destroy data and resources in the cluster. This includes deleting deployments, configurations, storage and compute resources
Resource Hijacking
  • Adversary may abuse a compromised resource (e.g. pod) for running tasks such as cryptocurrency mining
  • Adversary may also create new pods for such activities
Denial of Service
  • Includes attempts to disrupt the availability of the pods themselves, the underlying nodes, or the API server
Exfiltration
  • Adversary may attempt to extract and steal data that is being processed or stored by cluster resources
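Because an adversary may create new pods for resource hijacking, one way to scope the impact is to list the pods created after the suspected time of compromise. The sketch below assumes the pod list has been exported as JSON (for example with `kubectl get pods --all-namespaces -o json`). The function names and the sample data are hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a Kubernetes creationTimestamp such as 2024-05-01T10:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def pods_created_since(pod_list: dict, since: datetime) -> list[tuple[datetime, str, str]]:
    """Return (created, namespace, name) for pods created at or after `since`."""
    recent = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        created = parse_timestamp(meta["creationTimestamp"])
        if created >= since:
            recent.append((created, meta.get("namespace", ""), meta.get("name", "")))
    # Oldest first, so the timeline reads top to bottom.
    return sorted(recent)


# Hypothetical sample, shaped like a Kubernetes PodList.
sample_pods = {
    "items": [
        {"metadata": {"namespace": "default", "name": "web-0",
                      "creationTimestamp": "2024-04-01T08:00:00Z"}},
        {"metadata": {"namespace": "default", "name": "worker-x",
                      "creationTimestamp": "2024-05-02T03:15:00Z"}},
    ]
}

# Assumed time of compromise, for illustration only.
cutoff = datetime(2024, 5, 1, tzinfo=timezone.utc)
for created, namespace, name in pods_created_since(sample_pods, cutoff):
    print(created.isoformat(), namespace, name)
```

Legitimate controllers also recreate pods, so each result needs to be checked against expected deployments before it is treated as adversary activity.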

Containment

Compromised Credentials | Revoke access and/or disable/reset credentials

Google Account - Follow your company’s Account Compromise playbook
Access Key - Follow your company’s Key Exposure playbook
kubeconfig - Revoke the access token (and refresh token) contained in the kubeconfig

Compromised Image | Isolate the resource and take down the image

Refer to Containment

Take down the image
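Before taking the image down, it helps to know which pods are running it so they can be isolated. The sketch below assumes the pod list has been exported as JSON (for example with `kubectl get pods --all-namespaces -o json`). The function names, the image reference and the sample data are hypothetical.

```python
def image_matches(container_image: str, target: str) -> bool:
    """True if container_image is target, or target followed by a tag or digest."""
    return container_image == target or container_image.startswith(
        (target + ":", target + "@")
    )


def pods_using_image(pod_list: dict, target: str) -> list[tuple[str, str, str]]:
    """Return (namespace, pod name, container name) for containers running target."""
    hits = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        spec = pod.get("spec", {})
        # Check regular containers and init containers alike.
        containers = spec.get("containers", []) + spec.get("initContainers", [])
        for container in containers:
            if image_matches(container.get("image", ""), target):
                hits.append(
                    (meta.get("namespace", ""), meta.get("name", ""), container.get("name", ""))
                )
    return hits


# Hypothetical sample, shaped like a Kubernetes PodList.
sample_pods = {
    "items": [
        {
            "metadata": {"namespace": "prod", "name": "api-1"},
            "spec": {"containers": [{"name": "api", "image": "gcr.io/example/api:1.4"}]},
        },
        {
            "metadata": {"namespace": "prod", "name": "job-7"},
            "spec": {
                "initContainers": [{"name": "setup", "image": "gcr.io/example/tools:2.0"}],
                "containers": [{"name": "main", "image": "gcr.io/example/api-gateway:3.1"}],
            },
        },
    ]
}

print(pods_using_image(sample_pods, "gcr.io/example/api"))
```

Passing a repository without a tag matches every tag and digest of that repository, while similarly named repositories (such as `api-gateway` in the sample) are not matched. The comparison uses the image reference from the pod spec, so a mutable tag may not reflect what was actually pulled; the resolved image ID in the pod status is a more reliable indicator.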

Unintended Exposure | Close the exposure

Depending on the type of exposure, consider the following steps:

  • Rectify the ACL of the resource
  • Close/Filter the exposed network port
  • Implement secure authentication (e.g. MFA, PKI)

Vulnerable Application | Isolate the resource and take down the application

Refer to Containment

Remediation / Eradication

Remove any persistence

Based on persistence mechanisms identified during analysis

Reset other credentials accessed by adversary

Google Account - Follow your company’s Account Compromise playbook
Access Key - Follow your company’s Key Exposure playbook

Patch any exploited vulnerabilities

Download patches and apply them according to the vendor’s advisories or instructions

Reset unauthorised modifications

Based on the analysis performed; this includes all impacted resources