Kubernetes behaves like a distributed OS where the API server is the system call interface.
- You don’t “log into servers and run commands” as the primary control mechanism.
- You declare intent to the API; controllers reconcile the world to match it.
Security implication:
- If an attacker can use the API legitimately (stolen token, weak RBAC), they can often achieve objectives without exploits.
Security ownership is split across layers:
- Cloud provider (managed K8s): control plane availability, base platform patching, some hardening defaults
- Platform team: cluster configuration, networking, RBAC, admission controls, node management
- Application teams: images, manifests, secrets usage patterns, app-layer controls
- Security team: guardrails, detection, response, risk prioritization
Failure mode:
- Most breaches exploit gaps between teams, where each assumes the other is enforcing a control.
Kubernetes changes the dominant threat model:
- Perimeter controls matter less than identity and authorization
- Lateral movement is easier due to flat networking and shared nodes
- Ephemeral workloads make forensic evidence short-lived
- Attackers chain legitimate features:
exec, Jobs/CronJobs, admission webhooks, RBAC grants
API server:
- Entry point for all cluster operations
- Handles authentication and authorization
- Runs admission chain (mutating/validating)
Security implications:
- Compromise or misuse of API access enables:
- workload creation/modification
- secret access (if permitted)
- persistence via controllers/RBAC
etcd:
- Key-value store for all cluster state
- Often contains Secrets and sensitive configuration
Security implications:
- etcd is a crown jewel: full state visibility + potential cluster control.
- Encrypting Secrets at rest helps, but access to etcd still provides high-value metadata.
Scheduler:
- Decides node placement
Security implications:
- Less directly attacked, but scheduling constraints can be abused for:
- co-location attacks
- noisy-neighbor disruption
- targeting specific nodes (GPU nodes, special taints)
Controller manager:
- Runs controllers that reconcile desired state
Security implications:
- Controllers are excellent persistence mechanisms because they continually re-apply state.
kubelet:
- Node agent; executes pod specs and manages containers via the runtime
Security implications:
- kubelet has high authority on the node; if exposed or misconfigured it can enable:
- remote command execution against pods
- node filesystem access via mounts
- running privileged workloads
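A minimal hardening sketch for the kubelet's own API, assuming the `kubelet.config.k8s.io/v1beta1` config format (values illustrative; distributions often set some of these already):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false     # reject unauthenticated requests to the kubelet API
  webhook:
    enabled: true      # validate bearer tokens against the API server
authorization:
  mode: Webhook        # authorize kubelet API calls via SubjectAccessReview
readOnlyPort: 0        # disable the legacy unauthenticated read-only port
```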
Container runtime:
- Pulls images, configures namespaces/cgroups, starts processes
Security implications:
- Runtime bugs can produce escape-class issues.
- Runtime sockets are dangerous: access often equals host control.
CNI plugin:
- Implements the networking model
Security implications:
- Without policy enforcement, attackers get easy east-west reach.
Common sources of identity:
- x509 client certs (kubeconfigs)
- OIDC (enterprise SSO)
- cloud IAM integration
- service account tokens (pod identity)
Security implications:
- Authentication is not the hard part; authorization mistakes are.
- Stolen credentials typically lead to low-noise compromise.
RBAC decides what actions an identity can take.
Core objects:
- Role / ClusterRole: permissions (verbs + resources + scope)
- RoleBinding / ClusterRoleBinding: who gets those permissions
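As a sketch, a narrowly scoped pair might look like this (the name, namespace, and resource list are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-reader          # illustrative name
  namespace: payments       # namespaced scope: permissions apply here only
rules:
- apiGroups: [""]
  resources: ["pods", "configmaps"]
  verbs: ["get", "list", "watch"]   # explicit verbs; no wildcards
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-reader-binding
  namespace: payments
subjects:
- kind: ServiceAccount
  name: app-sa              # illustrative workload identity
  namespace: payments
roleRef:
  kind: Role
  name: app-reader
  apiGroup: rbac.authorization.k8s.io
```

The same two-object pattern with `ClusterRole`/`ClusterRoleBinding` grants cluster-wide permissions, which is where most of the high-risk patterns below appear.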
Common high-risk RBAC patterns:
- Wildcards (`*`) on resources or verbs
- Binding human users or service accounts to `cluster-admin`
- Granting permission to create/modify: `roles`, `rolebindings`, `clusterroles`, `clusterrolebindings`
Why this matters:
- If an attacker can create a binding granting themselves broad permissions, they can escalate without exploiting anything.
Service accounts are the most common Kubernetes escalation bridge.
Typical runtime chain:
- App vulnerability → container execution
- Read service account token from filesystem
- Call API using token
- Use RBAC permissions to expand access
Hardening concepts:
- Avoid token auto-mount unless required
- Use narrowly scoped service accounts per workload
- Prefer short-lived projected tokens where possible
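The last two points can be combined in a manifest sketch (the names and the `example-api` audience are illustrative assumptions):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: batch-worker
  namespace: jobs
automountServiceAccountToken: false   # no token unless a pod asks for one
---
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker-pod
  namespace: jobs
spec:
  serviceAccountName: batch-worker
  containers:
  - name: worker
    image: example.com/worker:1.4.2   # illustrative image
    volumeMounts:
    - name: sa-token
      mountPath: /var/run/secrets/tokens
      readOnly: true
  volumes:
  - name: sa-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 3600     # short-lived, auto-rotated by kubelet
          audience: example-api       # usable only against this audience
```

A stolen projected token ages out and is rejected by anything that checks the audience claim, which shrinks the window in the escalation chain above.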
Pods are the real execution unit.
- Containers in a pod share network namespace
- They share volumes
Security implication:
- Compromise of one container often equals pod compromise.
- Sidecars increase attack surface if they are privileged or have broad access.
Pod Security Standards (restricted/baseline) define safe defaults.
Key controls you want enforced broadly:
- no privileged pods
- no host namespace sharing unless justified
- restrict hostPath
- require non-root execution
In practice:
- Enforcement usually requires admission controls (built-in mechanisms or policy engines).
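With the built-in Pod Security admission controller, enforcement is a matter of namespace labels; a sketch (the namespace name is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted  # reject non-compliant pods
    pod-security.kubernetes.io/audit: restricted    # record violations in audit log
    pod-security.kubernetes.io/warn: restricted     # warn clients on apply
```

Running `audit`/`warn` at a stricter level than `enforce` is a common way to measure the gap before tightening.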
Important settings and why they matter:
- runAsNonRoot / runAsUser: reduces default privilege
- readOnlyRootFilesystem: blocks many persistence techniques
- allowPrivilegeEscalation=false: prevents processes gaining extra privileges (e.g., via setuid binaries)
- capabilities drop/add: removes dangerous CAPs; do not rely only on UID
- seccompProfile: reduces syscall surface
- AppArmor/SELinux: adds mandatory access control
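A pod spec sketch combining these settings (image and UID are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  securityContext:
    runAsNonRoot: true            # kubelet refuses to start root containers
    runAsUser: 10001              # illustrative non-root UID
    seccompProfile:
      type: RuntimeDefault        # runtime's default syscall filter
  containers:
  - name: app
    image: example.com/app:2.1.0  # illustrative image
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]             # drop everything; add back only proven needs
```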
Key insight:
- Most real container breakouts are enabled by privilege + mounts, not by “containers being insecure.”
Kubernetes typically provides:
- pod-to-pod connectivity by default
- each pod has an IP
Security implication:
- Once an attacker lands in one pod, east-west movement is often trivial.
NetworkPolicy is the primary in-cluster segmentation tool.
Common failures:
- No default-deny policies
- Policies only on ingress, not egress
- Policies not enforced by CNI (or inconsistently enforced)
Advanced considerations:
- DNS egress becomes a major channel for data exfiltration if not constrained
- Egress controls matter more than many teams realize
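A common starting point is a namespace-wide default deny plus a narrow DNS egress allowance; a sketch (namespace illustrative, and the DNS rule assumes cluster DNS runs in `kube-system`):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}                        # every pod in the namespace
  policyTypes: ["Ingress", "Egress"]     # no allow rules: all traffic denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: payments
spec:
  podSelector: {}
  policyTypes: ["Egress"]
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

Per-workload allow rules are then layered on top; anything without an explicit rule stays unreachable.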
- Kubernetes Secrets are base64-encoded (not encrypted) objects stored in etcd.
- Unless encryption at rest is enabled, Secrets in etcd are readable by anyone with etcd (or etcd backup) access.
Exposure paths:
- `kubectl get secrets` via over-permissive RBAC
- mounting as env vars (easy to leak in logs)
- reading mounted files from compromised containers
Mitigations:
- Encrypt Secrets at rest (KMS)
- Use external secret stores (Vault / cloud secret managers)
- Use short-lived credentials and rotation
- Minimize secret distribution across namespaces
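Encryption at rest is configured on the API server via an `EncryptionConfiguration` file; a minimal sketch using the `aescbc` provider (in production a `kms` provider is usually preferable, and the key value below is a placeholder, never a real key):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources: ["secrets"]
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>   # placeholder only
  - identity: {}   # fallback so existing plaintext objects stay readable
```

Existing Secrets are only rewritten encrypted when they are next updated, so a bulk rewrite (e.g., re-applying all Secrets) is part of the rollout.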
Admission happens before objects are persisted and acted upon.
- Mutating admission: can change pods/specs
- Validating admission: can allow/deny based on rules
Security implications:
- Admission is your last preventive gate.
- It can block the most dangerous configs: privileged, host mounts, risky capabilities, untrusted registries.
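As one concrete mechanism, recent Kubernetes versions ship CEL-based `ValidatingAdmissionPolicy`; a sketch that rejects privileged containers (checks regular containers only, for brevity; names illustrative, and availability depends on cluster version):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: deny-privileged
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  - expression: >-
      object.spec.containers.all(c,
        !has(c.securityContext) ||
        !has(c.securityContext.privileged) ||
        c.securityContext.privileged == false)
    message: "privileged containers are not allowed"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: deny-privileged-binding
spec:
  policyName: deny-privileged
  validationActions: ["Deny"]
```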
Why admission policies fail in practice:
- policies applied only to “prod namespaces”
- exemptions accumulate and become permanent
- drift: new workload types appear; policies don’t cover them
- “break-glass” accounts become normal workflow
Security engineering goal:
- Reduce the gap between policy intent and actual enforcement.
Static scanning helps, but it cannot detect:
- exploited running apps
- credential theft
- API misuse
- in-memory or fileless behaviors
Runtime security focuses on behavioral signals.
Workload-level signals:
- unexpected shell execution (`sh`, `bash`)
- unexpected package manager use (`apt`, `yum`, `apk`)
- access to service account token paths
- unexpected outbound connections (especially to metadata services or control plane endpoints)
Cluster-level signals:
- creation of privileged pods
- new RoleBindings/ClusterRoleBindings
- creation of admission webhooks
- unusual API verbs from unusual identities
Key practice:
- Correlate runtime (process/syscall/network) with API audit logs.
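On the audit side, a policy sketch that captures full request/response bodies for the high-risk objects listed above and metadata for everything else:

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse   # full bodies for escalation/persistence objects
  resources:
  - group: rbac.authorization.k8s.io
    resources: ["rolebindings", "clusterrolebindings"]
  - group: admissionregistration.k8s.io
    resources:
    - mutatingwebhookconfigurations
    - validatingwebhookconfigurations
- level: Metadata          # who/what/when for all other API traffic
```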
Typical end-to-end attack chain:
- Exploit application vulnerability
- Obtain code execution in container
- Credential harvest (service account token, cloud creds, config files)
- API abuse to expand permissions or deploy new workloads
- Persistence via RBAC, controllers, webhooks, or GitOps pipeline
Why this matters:
- Most attackers do not need kernel exploits to win in Kubernetes.
Common persistence mechanisms:
- Hidden RoleBindings granting access
- Malicious admission webhook that injects sidecars
- Controller/operator modifications
- CronJobs that reintroduce malicious pods
Image supply chain risks:
- untrusted base images
- mutable tags (e.g., `latest`)
- compromised registries
Security outcomes:
- fleet-wide compromise through “legitimate” image pulls
GitOps makes Git and CI/CD a control plane.
Attack paths:
- compromise Git credentials
- compromise CI runners
- steal deploy tokens
Security insight:
- You can “secure the cluster” and still lose through the pipeline.
etcd contains:
- full cluster configuration
- Secrets and sensitive references
- historical cluster evolution
If etcd is exposed or weakly protected, attackers gain system-wide visibility.
High-impact persistence techniques:
- admission webhooks
- CRDs and controllers
- impersonation and RBAC grants
These attacks survive node reboots and pod redeploys.
Detection data sources:
- Kubernetes API audit logs (who did what)
- runtime signals (process, network, syscall)
- node logs and kubelet events
When you suspect compromise:
- identify the identity and its RBAC
- contain by revoking tokens/permissions
- inspect recent bindings/webhooks/controllers
- rotate secrets potentially exposed
- hunt for persistence mechanisms
Common mistakes:
- Assuming namespaces provide isolation
- Over-permissive RBAC and shared service accounts
- No default-deny NetworkPolicy
- Privileged/host-mounted workloads allowed “temporarily”
- Admission policy drift and exemptions
- Lack of runtime detection and API audit correlation
Key takeaways:
- Kubernetes security is control-plane-centric.
- Identity is the real perimeter; network boundaries are secondary.
- Misconfiguration dominates real-world risk.
- Runtime visibility is mandatory.
- Security must be automated to avoid drift.
Securing Kubernetes means controlling intent, identity, and behavior across a distributed system that continuously reconciles state. If you secure only containers, you miss the real threat surface: the API.