@namishelex01
Last active December 17, 2025 20:23
Kubernetes security guide for cloud-native environments, covering control plane and workload security from a security engineer’s perspective. Explains Kubernetes threat models, API server risks, RBAC and ServiceAccount security, admission controllers, pod security standards, container runtime security, kubelet exposure, etcd protection, network …

Kubernetes Security — Detailed Notes (Basic → Advanced)

1. Core Kubernetes Security Principles

1.1 Kubernetes is an API-driven operating system

Kubernetes behaves like a distributed OS where the API server is the system call interface.

  • You don’t “log into servers and run commands” as the primary control mechanism.
  • You declare intent to the API; controllers reconcile the world to match it.

Security implication:

  • If an attacker can use the API legitimately (stolen token, weak RBAC), they can often achieve objectives without exploits.

1.2 Shared responsibility

Security ownership is split across layers:

  • Cloud provider (managed K8s): control plane availability, base platform patching, some hardening defaults
  • Platform team: cluster configuration, networking, RBAC, admission controls, node management
  • Application teams: images, manifests, secrets usage patterns, app-layer controls
  • Security team: guardrails, detection, response, risk prioritization

Failure mode:

  • Many breaches exploit gaps between teams, where each team assumes another is enforcing a control.

1.3 Threat model shift

Kubernetes changes the dominant threat model:

  • Perimeter controls matter less than identity and authorization
  • Lateral movement is easier due to flat networking and shared nodes
  • Ephemeral workloads make forensic evidence short-lived
  • Attackers chain legitimate features: exec, Jobs/CronJobs, admission webhooks, RBAC grants

2. Architecture & Trust Boundaries

2.1 Control plane components

API server

  • Entry point for all cluster operations
  • Handles authentication and authorization
  • Runs admission chain (mutating/validating)

Security implications:

  • Compromise or misuse of API access enables:

    • workload creation/modification
    • secret access (if permitted)
    • persistence via controllers/RBAC

etcd

  • Key-value store for all cluster state
  • Often contains Secrets and sensitive configuration

Security implications:

  • etcd is a crown jewel: full state visibility + potential cluster control.
  • Encrypting Secrets at rest helps, but access to etcd still provides high-value metadata.

scheduler

  • Decides node placement

Security implications:

  • Less directly attacked, but scheduling constraints can be abused for:

    • co-location attacks
    • noisy-neighbor disruption
    • targeting specific nodes (GPU nodes, special taints)

controller manager

  • Runs controllers that reconcile desired state

Security implications:

  • Controllers are excellent persistence mechanisms because they continually re-apply state.

2.2 Node components

kubelet

  • Node agent; executes pod specs and manages containers via runtime

Security implications:

  • kubelet has high authority on the node; if exposed or misconfigured it can enable:

    • remote command execution against pods
    • node filesystem access via mounts
    • running privileged workloads

container runtime (containerd/CRI-O + runc)

  • Pulls images, configures namespaces/cgroups, starts processes

Security implications:

  • Runtime bugs can produce escape-class issues.
  • Runtime sockets are dangerous: access often equals host control.

CNI / kube-proxy

  • Implements networking model

Security implications:

  • Without policy enforcement, attackers get easy east-west reach.

3. Identity, Authentication, Authorization

3.1 Authentication

Common sources of identity:

  • x509 client certs (kubeconfigs)
  • OIDC (enterprise SSO)
  • cloud IAM integration
  • service account tokens (pod identity)

Security implications:

  • Authentication is not the hard part; authorization mistakes are.
  • Stolen credentials typically lead to low-noise compromise.

3.2 Authorization (RBAC)

RBAC decides what actions an identity can take.

Core objects:

  • Role / ClusterRole: permissions (verbs + resources + scope)
  • RoleBinding / ClusterRoleBinding: who gets those permissions

Common high-risk RBAC patterns:

  • Wildcards on resources or verbs (*)

  • Binding human users or service accounts to cluster-admin

  • Granting permission to create/modify:

    • roles, rolebindings, clusterroles, clusterrolebindings

Why this matters:

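  • If an attacker can create a binding that grants themselves broad permissions, they can escalate without exploiting anything. The RBAC API refuses to grant permissions the caller does not already hold unless the caller has the bind or escalate verb, so treat those verbs as equally sensitive.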
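
The high-risk patterns above can be checked mechanically. The following is a minimal sketch, assuming Role/ClusterRole manifests have already been loaded into Python dicts; the function name `risky_rules` and the role name `ci-deployer` are hypothetical and used only for illustration.

```python
# Resources whose write access lets a subject reshape RBAC itself.
RBAC_RESOURCES = {"roles", "rolebindings", "clusterroles", "clusterrolebindings"}
# Verbs that create or change objects, plus the RBAC-specific bind/escalate verbs.
WRITE_VERBS = {"create", "update", "patch", "bind", "escalate", "*"}


def risky_rules(role: dict) -> list[str]:
    """Return human-readable findings for one Role/ClusterRole manifest."""
    findings = []
    for i, rule in enumerate(role.get("rules", [])):
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        if "*" in verbs:
            findings.append(f"rule {i}: wildcard verb")
        if "*" in resources:
            findings.append(f"rule {i}: wildcard resource")
        if resources & RBAC_RESOURCES and verbs & WRITE_VERBS:
            findings.append(f"rule {i}: can modify RBAC objects")
    return findings


# Hypothetical manifest used only to exercise the check.
example_role = {
    "kind": "ClusterRole",
    "metadata": {"name": "ci-deployer"},
    "rules": [
        {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": ["get", "list"]},
        {"apiGroups": ["rbac.authorization.k8s.io"], "resources": ["rolebindings"], "verbs": ["create"]},
        {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
    ],
}

for finding in risky_rules(example_role):
    print(finding)
```

A check like this only inspects the role; whether it is dangerous in practice also depends on which subjects are bound to it.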

3.3 Service accounts and token abuse

Service account tokens are one of the most common bridges from a compromised container to the Kubernetes API.

Typical runtime chain:

  1. App vulnerability → container execution
  2. Read service account token from filesystem
  3. Call API using token
  4. Use RBAC permissions to expand access

Hardening concepts:

  • Avoid token auto-mount unless required
  • Use narrowly scoped service accounts per workload
  • Prefer short-lived projected tokens where possible
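
As a rough illustration of the first two hardening concepts, the sketch below flags pod manifests that leave token auto-mount enabled or run under the namespace's default service account. It assumes the pod is available as a Python dict; `token_findings` and `payments-api` are hypothetical names.

```python
def token_findings(pod: dict) -> list[str]:
    """Flag service account settings on a pod manifest that widen token exposure."""
    spec = pod.get("spec", {})
    findings = []
    # When the field is absent, the pod inherits the ServiceAccount's setting,
    # which mounts a token unless auto-mount was disabled there.
    if spec.get("automountServiceAccountToken") is not False:
        findings.append("token auto-mount is not explicitly disabled")
    # Pods without serviceAccountName run as the namespace's "default" account.
    if spec.get("serviceAccountName", "default") == "default":
        findings.append("uses the namespace default service account")
    return findings


# Hypothetical pod that needs no API access at all.
hardened_pod = {
    "spec": {
        "serviceAccountName": "payments-api",
        "automountServiceAccountToken": False,
        "containers": [{"name": "app", "image": "example.invalid/app:1.0"}],
    }
}

print(token_findings(hardened_pod))
print(token_findings({"spec": {"containers": []}}))
```

Workloads that do need API access would keep the mount and rely on a narrowly scoped service account instead, so the first finding is a prompt for review rather than a hard failure.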

4. Workload and Pod Security

4.1 Pod abstraction and trust

Pods are the real execution unit.

  • Containers in a pod share network namespace
  • They share volumes

Security implication:

  • Compromise of one container often equals pod compromise.
  • Sidecars increase attack surface if they are privileged or have broad access.

4.2 Pod Security Standards and enforcement

Pod Security Standards define three profiles (privileged, baseline, restricted); baseline and restricted define safe defaults.

Key controls you want enforced broadly:

  • no privileged pods
  • no host namespace sharing unless justified
  • restrict hostPath
  • require non-root execution

In practice:

  • Enforcement usually requires admission controls (built-in mechanisms or policy engines).
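
One built-in mechanism is Pod Security Admission, which reads `pod-security.kubernetes.io/*` labels on a namespace. The sketch below builds such a Namespace manifest as a Python dict; the helper name `namespace_with_pss` and the namespace name `payments` are hypothetical, and applying the manifest to a cluster is left out.

```python
import json

PSS_LEVELS = ("privileged", "baseline", "restricted")


def namespace_with_pss(name: str, enforce: str = "restricted") -> dict:
    """Build a Namespace manifest labeled for Pod Security Admission."""
    if enforce not in PSS_LEVELS:
        raise ValueError(f"unknown Pod Security Standards level: {enforce!r}")
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                # enforce rejects violating pods; warn and audit only report.
                "pod-security.kubernetes.io/enforce": enforce,
                "pod-security.kubernetes.io/warn": enforce,
                "pod-security.kubernetes.io/audit": enforce,
            },
        },
    }


print(json.dumps(namespace_with_pss("payments"), indent=2))
```

Setting `warn` and `audit` to a stricter level than `enforce` is a common way to preview the impact of tightening a namespace before rejecting pods.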

4.3 Linux runtime hardening inside pods

Important settings and why they matter:

  • runAsNonRoot / runAsUser: reduces default privilege
  • readOnlyRootFilesystem: blocks many persistence techniques
  • allowPrivilegeEscalation: false: prevents gaining extra privs (e.g., setuid)
  • capabilities drop/add: removes dangerous CAPs; do not rely only on UID
  • seccompProfile: reduces syscall surface
  • AppArmor/SELinux: adds mandatory access control

Key insight:

  • Most real container breakouts are enabled by privilege + mounts, not by “containers being insecure.”
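
The settings listed above map to fields in a container's `securityContext`. Here is a minimal sketch of a checker, assuming the container spec is a Python dict and ignoring pod-level `securityContext` defaults (a simplification); `container_findings` is a hypothetical name.

```python
def container_findings(container: dict) -> list[str]:
    """List hardening gaps in one container's securityContext."""
    sc = container.get("securityContext", {})
    findings = []
    if sc.get("privileged") is True:
        findings.append("privileged container")
    if sc.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not true")
    if sc.get("readOnlyRootFilesystem") is not True:
        findings.append("root filesystem is writable")
    # The field defaults to allowing escalation when it is left unset.
    if sc.get("allowPrivilegeEscalation") is not False:
        findings.append("privilege escalation is allowed")
    if "ALL" not in sc.get("capabilities", {}).get("drop", []):
        findings.append("capabilities are not dropped")
    if sc.get("seccompProfile", {}).get("type") not in ("RuntimeDefault", "Localhost"):
        findings.append("no seccomp profile")
    return findings


# Hypothetical container that satisfies every check above.
hardened = {
    "name": "app",
    "securityContext": {
        "runAsNonRoot": True,
        "readOnlyRootFilesystem": True,
        "allowPrivilegeEscalation": False,
        "capabilities": {"drop": ["ALL"]},
        "seccompProfile": {"type": "RuntimeDefault"},
    },
}

print(container_findings(hardened))
print(container_findings({"name": "legacy"}))
```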

5. Networking and Lateral Movement

5.1 Default networking model

Kubernetes typically provides:

  • pod-to-pod connectivity by default
  • each pod has an IP

Security implication:

  • Once an attacker lands in one pod, east-west movement is often trivial.

5.2 NetworkPolicy and enforcement gaps

NetworkPolicy is the primary in-cluster segmentation tool.

Common failures:

  • No default-deny policies
  • Policies only on ingress, not egress
  • Policies not enforced by CNI (or inconsistently enforced)

Advanced considerations:

  • DNS egress becomes a major channel for data exfiltration if not constrained
  • Egress controls matter more than many teams realize
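
To make the default-deny and ingress-only failures concrete, the sketch below builds a default-deny NetworkPolicy as a Python dict and reports which directions a given policy denies for every pod in its namespace. The function names are hypothetical, and the checker only recognizes the literal empty `podSelector` form.

```python
def default_deny(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches every pod in the namespace
            "policyTypes": ["Ingress", "Egress"],  # no rules listed, so nothing is allowed
        },
    }


def denied_directions(policy: dict) -> set[str]:
    """Return the directions this policy denies for all pods in its namespace."""
    spec = policy.get("spec", {})
    if spec.get("podSelector") != {}:
        return set()  # applies to a subset of pods only, or has no selector
    types = spec.get("policyTypes", ["Ingress"])
    denied = set()
    if "Ingress" in types and not spec.get("ingress"):
        denied.add("Ingress")
    if "Egress" in types and not spec.get("egress"):
        denied.add("Egress")
    return denied


ingress_only = {"spec": {"podSelector": {}, "policyTypes": ["Ingress"]}}
print(denied_directions(default_deny("payments")))
print(denied_directions(ingress_only))
```

A policy like `ingress_only` leaves egress, including DNS, fully open, which is the gap the notes above describe. None of this has any effect unless the cluster's CNI plugin enforces NetworkPolicy.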

6. Secrets and Sensitive Data

6.1 Kubernetes Secrets are not magic

  • Kubernetes Secrets are base64-encoded objects stored in etcd. Base64 is an encoding, not encryption.
  • Unless encryption at rest is enabled, Secrets in etcd are readable by anyone with etcd access.

Exposure paths:

  • kubectl get secrets via over-permissive RBAC
  • exposing Secrets as env vars (easy to leak in logs)
  • reading mounted files from compromised containers
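
A short sketch of why base64 offers no protection: anyone who can read a Secret object can recover its values with a standard-library call. The Secret name `db-credentials` and its value are made up for the example.

```python
import base64

# Hypothetical Secret manifest, shaped like what the API returns for a Secret.
secret = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "db-credentials"},
    "data": {"password": base64.b64encode(b"s3cr3t-value").decode()},
}


def decode_secret_data(secret: dict) -> dict[str, str]:
    """Decode every value in a Secret's data map; no key is required."""
    return {
        key: base64.b64decode(value).decode()
        for key, value in secret.get("data", {}).items()
    }


print(secret["data"]["password"])   # the stored, encoded form
print(decode_secret_data(secret))   # the recovered plaintext
```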

6.2 Better secret strategies

  • Encrypt secrets at rest (KMS)
  • Use external secret stores (Vault / cloud secret managers)
  • Use short-lived credentials and rotation
  • Minimize secret distribution across namespaces

7. Admission Control and Policy

7.1 Admission chain

Admission happens before objects are persisted and acted upon.

  • Mutating admission: can modify the incoming object (for example, inject defaults or sidecars into a pod spec)
  • Validating admission: can allow/deny based on rules

Security implications:

  • Admission is your last preventive gate.
  • It can block the most dangerous configs: privileged, host mounts, risky capabilities, untrusted registries.
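
The core of a validating admission webhook is a function from an AdmissionReview request to an allow/deny response. The sketch below shows only that decision logic for pods, under the assumption that the HTTPS server and webhook registration exist elsewhere; `review_pod` is a hypothetical name, and ephemeral containers are not inspected.

```python
def review_pod(admission_review: dict) -> dict:
    """Deny pods that use host namespaces, hostPath volumes, or privileged containers."""
    request = admission_review["request"]
    spec = request.get("object", {}).get("spec", {})
    reasons = []

    if spec.get("hostNetwork") or spec.get("hostPID") or spec.get("hostIPC"):
        reasons.append("host namespaces are not allowed")
    for volume in spec.get("volumes", []):
        if "hostPath" in volume:
            reasons.append(f"hostPath volume {volume.get('name')!r} is not allowed")
    for container in spec.get("containers", []) + spec.get("initContainers", []):
        if container.get("securityContext", {}).get("privileged"):
            reasons.append(f"container {container.get('name')!r} is privileged")

    # The response must echo the request uid so the API server can match it.
    response = {"uid": request["uid"], "allowed": not reasons}
    if reasons:
        response["status"] = {"message": "; ".join(reasons)}
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}


# Hypothetical request carrying a privileged pod.
bad_request = {
    "request": {
        "uid": "example-uid",
        "object": {"spec": {"containers": [{"name": "app", "securityContext": {"privileged": True}}]}},
    }
}
print(review_pod(bad_request)["response"])
```

In many clusters the same rules are expressed declaratively through Pod Security Admission or a policy engine rather than a hand-written webhook; the function above is only meant to show what such a gate evaluates.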

7.2 Policy failure modes

Why admission policies fail in practice:

  • policies applied only to “prod namespaces”
  • exemptions accumulate and become permanent
  • drift: new workload types appear; policies don’t cover them
  • “break-glass” accounts become normal workflow

Security engineering goal:

  • Reduce the gap between policy intent and actual enforcement.

8. Runtime Security and Detection

8.1 Why runtime monitoring is mandatory

Static scanning helps, but it cannot detect:

  • exploited running apps
  • credential theft
  • API misuse
  • in-memory or fileless behaviors

Runtime security focuses on behavioral signals.


8.2 High-signal runtime indicators

Workload-level signals:

  • unexpected shell execution (sh, bash)
  • unexpected package manager use (apt, yum, apk)
  • access to service account token paths
  • unexpected outbound connections (especially to metadata services or control plane endpoints)

Cluster-level signals:

  • creation of privileged pods
  • new RoleBindings/ClusterRoleBindings
  • creation of admission webhooks
  • unusual API verbs from unusual identities

Key practice:

  • Correlate runtime (process/syscall/network) with API audit logs.
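
Several of the cluster-level signals can be pulled straight from API audit logs. The sketch below assumes the log is available as JSON lines in the `audit.k8s.io/v1` Event shape (fields `verb`, `user.username`, `objectRef.resource`, `objectRef.subresource`); the function names and sample events are hypothetical.

```python
import json

SENSITIVE_RESOURCES = {
    "rolebindings",
    "clusterrolebindings",
    "mutatingwebhookconfigurations",
    "validatingwebhookconfigurations",
}
WRITE_VERBS = {"create", "update", "patch"}


def is_high_signal(event: dict) -> bool:
    """Return True for audit events worth a closer look."""
    ref = event.get("objectRef", {})
    if ref.get("resource") in SENSITIVE_RESOURCES and event.get("verb") in WRITE_VERBS:
        return True
    # Interactive exec into a pod is recorded on the pods/exec subresource.
    return ref.get("resource") == "pods" and ref.get("subresource") == "exec"


def triage(lines):
    """Yield (username, verb, resource) for each high-signal audit log line."""
    for line in lines:
        event = json.loads(line)
        if is_high_signal(event):
            ref = event.get("objectRef", {})
            yield event.get("user", {}).get("username"), event.get("verb"), ref.get("resource")


# Hypothetical audit log lines.
sample_log = [
    json.dumps({"verb": "list", "user": {"username": "alice"}, "objectRef": {"resource": "pods"}}),
    json.dumps({"verb": "create", "user": {"username": "system:serviceaccount:web:default"},
                "objectRef": {"resource": "clusterrolebindings"}}),
    json.dumps({"verb": "create", "user": {"username": "bob"},
                "objectRef": {"resource": "pods", "subresource": "exec"}}),
]

for hit in triage(sample_log):
    print(hit)
```

Joining these hits with runtime events by pod name and time window is where the correlation mentioned above happens; that step depends on the runtime sensor in use and is not shown.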

9. Common Attack Chains (What Actually Happens)

9.1 Typical compromise path

  1. Exploit application vulnerability
  2. Obtain code execution in container
  3. Credential harvest (service account token, cloud creds, config files)
  4. API abuse to expand permissions or deploy new workloads
  5. Persistence via RBAC, controllers, webhooks, or GitOps pipeline

Why this matters:

  • Most attackers do not need kernel exploits to win in Kubernetes.

9.2 Persistence patterns

  • Hidden RoleBindings granting access
  • Malicious admission webhook that injects sidecars
  • Controller/operator modifications
  • CronJobs that reintroduce malicious pods

10. Supply Chain and GitOps

10.1 Image supply chain risks

  • untrusted base images
  • mutable tags (e.g., latest)
  • compromised registries

Security outcomes:

  • fleet-wide compromise through “legitimate” image pulls
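
Mutable tags are easy to detect in manifests. The sketch below classifies an image reference as digest-pinned, `latest`, or otherwise tag-based; it uses simple string handling rather than a full reference parser, and the function name and image names are hypothetical.

```python
def image_findings(image: str) -> list[str]:
    """Flag image references that can change content without the manifest changing."""
    if "@sha256:" in image:
        return []  # pinned by digest: the content is immutable
    # The tag follows a colon in the last path segment; a colon earlier in the
    # string belongs to a registry port such as registry.example.com:5000.
    last_segment = image.rsplit("/", 1)[-1]
    tag = last_segment.split(":", 1)[1] if ":" in last_segment else None
    if tag is None or tag == "latest":
        return ["resolves to the mutable 'latest' tag"]
    return [f"tag {tag!r} is mutable; consider pinning by digest"]


for ref in (
    "nginx",
    "registry.example.com:5000/team/app:1.4.2",
    "registry.example.com/team/app@sha256:" + "0" * 64,
):
    print(ref, image_findings(ref))
```

Digest pinning addresses mutable tags only; it does not help if the pinned image itself came from an untrusted base or a compromised registry.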

10.2 GitOps and pipeline trust

GitOps makes Git and CI/CD a control plane.

Attack paths:

  • compromise Git credentials
  • compromise CI runners
  • steal deploy tokens

Security insight:

  • You can “secure the cluster” and still lose through the pipeline.

11. etcd and Control Plane Persistence

11.1 Why etcd is a crown jewel

etcd contains:

  • full cluster configuration
  • Secrets and sensitive references
  • recent revision history of cluster objects (until compaction)

If etcd is exposed or weakly protected, attackers gain system-wide visibility.


11.2 Control-plane-level backdoors

High-impact persistence techniques:

  • admission webhooks
  • CRDs and controllers
  • impersonation and RBAC grants

These attacks survive node reboots and pod redeploys.


12. Observability and Response

12.1 Logging sources to prioritize

  • Kubernetes API audit logs (who did what)
  • runtime signals (process, network, syscall)
  • node logs and kubelet events

12.2 Response basics

When you suspect compromise:

  • identify the identity and its RBAC
  • contain by revoking tokens/permissions
  • inspect recent bindings/webhooks/controllers
  • rotate secrets potentially exposed
  • hunt for persistence mechanisms

13. Common Production Security Gaps

  • Assuming namespaces provide isolation
  • Over-permissive RBAC and shared service accounts
  • No default-deny NetworkPolicy
  • Privileged/host-mounted workloads allowed “temporarily”
  • Admission policy drift and exemptions
  • Lack of runtime detection and API audit correlation

14. Strategic Takeaways

  • Kubernetes security is control-plane-centric.
  • Identity is the real perimeter; network boundaries are secondary.
  • Misconfiguration dominates real-world risk.
  • Runtime visibility is mandatory.
  • Security must be automated to avoid drift.

Final Mental Model

Securing Kubernetes means controlling intent, identity, and behavior across a distributed system that continuously reconciles state. If you secure only containers, you miss the real threat surface: the API.
