Kubernetes behaves like a distributed OS where the API server is the system call interface.
- You don’t “log into servers and run commands” as the primary control mechanism.
- You declare intent to the API; controllers reconcile the world to match it.
Security implication:
- If an attacker can use the API legitimately (stolen token, weak RBAC), they can often achieve objectives without exploits.
Security ownership is split across layers:
- Cloud provider (managed K8s): control plane availability, base platform patching, some hardening defaults
- Platform team: cluster configuration, networking, RBAC, admission controls, node management
- Application teams: images, manifests, secrets usage patterns, app-layer controls
- Security team: guardrails, detection, response, risk prioritization
Failure mode:
- Most breaches exploit gaps between teams, where each assumes the other is enforcing a control.
Kubernetes changes the dominant threat model:
- Perimeter controls matter less than identity and authorization
- Lateral movement is easier due to flat networking and shared nodes
- Ephemeral workloads make forensic evidence short-lived
- Attackers chain legitimate features:
exec, Jobs/CronJobs, admission webhooks, RBAC grants
API server:
- Entry point for all cluster operations
- Handles authentication and authorization
- Runs admission chain (mutating/validating)
Security implications:
- Compromise or misuse of API access enables:
- workload creation/modification
- secret access (if permitted)
- persistence via controllers/RBAC
etcd:
- Key-value store for all cluster state
- Often contains Secrets and sensitive configuration
Security implications:
- etcd is a crown jewel: full state visibility + potential cluster control.
- Encrypting Secrets at rest helps, but access to etcd still provides high-value metadata.
Scheduler:
- Decides node placement
Security implications:
- Less directly attacked, but scheduling constraints can be abused for:
- co-location attacks
- noisy-neighbor disruption
- targeting specific nodes (GPU nodes, special taints)
Controller manager:
- Runs controllers that reconcile desired state
Security implications:
- Controllers are excellent persistence mechanisms because they continually re-apply state.
kubelet:
- Node agent; executes pod specs and manages containers via the runtime
Security implications:
- kubelet has high authority on the node; if exposed or misconfigured it can enable:
- remote command execution against pods
- node filesystem access via mounts
- running privileged workloads
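A minimal hardening sketch for the kubelet's own API, assuming the `kubelet.config.k8s.io/v1beta1` config format (values illustrative; distributions often set some of these already):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false     # reject unauthenticated requests to the kubelet API
  webhook:
    enabled: true      # validate bearer tokens against the API server
authorization:
  mode: Webhook        # authorize kubelet API calls via SubjectAccessReview
readOnlyPort: 0        # disable the legacy unauthenticated read-only port
```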
Container runtime:
- Pulls images, configures namespaces/cgroups, starts processes
Security implications:
- Runtime bugs can produce escape-class issues.
- Runtime sockets are dangerous: access often equals host control.
CNI plugin:
- Implements the networking model
Security implications:
- Without policy enforcement, attackers get easy east-west reach.
Common sources of identity:
- x509 client certs (kubeconfigs)
- OIDC (enterprise SSO)
- cloud IAM integration
- service account tokens (pod identity)
Security implications:
- Authentication is not the hard part; authorization mistakes are.
- Stolen credentials typically lead to low-noise compromise.
RBAC decides what actions an identity can take.
Core objects:
- Role / ClusterRole: permissions (verbs + resources + scope)
- RoleBinding / ClusterRoleBinding: who gets those permissions
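As a sketch, a narrowly scoped pair might look like this (the name, namespace, and resource list are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-reader          # illustrative name
  namespace: payments       # namespaced scope: permissions apply here only
rules:
- apiGroups: [""]
  resources: ["pods", "configmaps"]
  verbs: ["get", "list", "watch"]   # explicit verbs; no wildcards
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-reader-binding
  namespace: payments
subjects:
- kind: ServiceAccount
  name: app-sa              # illustrative workload identity
  namespace: payments
roleRef:
  kind: Role
  name: app-reader
  apiGroup: rbac.authorization.k8s.io
```

The same two-object pattern with `ClusterRole`/`ClusterRoleBinding` grants cluster-wide permissions, which is where most of the high-risk patterns below appear.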
Common high-risk RBAC patterns:
- Wildcards (`*`) on resources or verbs
- Binding human users or service accounts to `cluster-admin`
- Granting permission to create/modify: `roles`, `rolebindings`, `clusterroles`, `clusterrolebindings`
Why this matters:
- If an attacker can create a binding granting themselves broad permissions, they can escalate without exploiting anything.
Service accounts are the most common Kubernetes escalation bridge.
Typical runtime chain:
- App vulnerability → container execution
- Read service account token from filesystem
- Call API using token
- Use RBAC permissions to expand access
Hardening concepts:
- Avoid token auto-mount unless required
- Use narrowly scoped service accounts per workload
- Prefer short-lived projected tokens where possible
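The last two points can be combined in a manifest sketch (the names and the `example-api` audience are illustrative assumptions):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: batch-worker
  namespace: jobs
automountServiceAccountToken: false   # no token unless a pod asks for one
---
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker-pod
  namespace: jobs
spec:
  serviceAccountName: batch-worker
  containers:
  - name: worker
    image: example.com/worker:1.4.2   # illustrative image
    volumeMounts:
    - name: sa-token
      mountPath: /var/run/secrets/tokens
      readOnly: true
  volumes:
  - name: sa-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 3600     # short-lived, auto-rotated by kubelet
          audience: example-api       # usable only against this audience
```

A stolen projected token ages out and is rejected by anything that checks the audience claim, which shrinks the window in the escalation chain above.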
Pods are the real execution unit.
- Containers in a pod share network namespace
- They share volumes
Security implication:
- Compromise of one container often equals pod compromise.
- Sidecars increase attack surface if they are privileged or have broad access.
Pod Security Standards (restricted/baseline) define safe defaults.
Key controls you want enforced broadly:
- no privileged pods
- no host namespace sharing unless justified
- restrict hostPath
- require non-root execution
In practice:
- Enforcement usually requires admission controls (built-in mechanisms or policy engines).
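With the built-in Pod Security admission controller, enforcement is a matter of namespace labels; a sketch (the namespace name is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted  # reject non-compliant pods
    pod-security.kubernetes.io/audit: restricted    # record violations in audit log
    pod-security.kubernetes.io/warn: restricted     # warn clients on apply
```

Running `audit`/`warn` at a stricter level than `enforce` is a common way to measure the gap before tightening.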
Important settings and why they matter:
- runAsNonRoot / runAsUser: reduces default privilege
- readOnlyRootFilesystem: blocks many persistence techniques
- allowPrivilegeEscalation=false: prevents processes gaining extra privileges (e.g., via setuid binaries)
- capabilities drop/add: removes dangerous CAPs; do not rely only on UID
- seccompProfile: reduces syscall surface
- AppArmor/SELinux: adds mandatory access control
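A pod spec sketch combining these settings (image and UID are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  securityContext:
    runAsNonRoot: true            # kubelet refuses to start root containers
    runAsUser: 10001              # illustrative non-root UID
    seccompProfile:
      type: RuntimeDefault        # runtime's default syscall filter
  containers:
  - name: app
    image: example.com/app:2.1.0  # illustrative image
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]             # drop everything; add back only proven needs
```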
Key insight:
- Most real container breakouts are enabled by privilege + mounts, not by “containers being insecure.”
Kubernetes typically provides:
- pod-to-pod connectivity by default
- each pod has an IP
Security implication:
- Once an attacker lands in one pod, east-west movement is often trivial.
NetworkPolicy is the primary in-cluster segmentation tool.
Common failures:
- No default-deny policies
- Policies only on ingress, not egress
- Policies not enforced by CNI (or inconsistently enforced)
Advanced considerations:
- DNS egress becomes a major channel for data exfiltration if not constrained
- Egress controls matter more than many teams realize
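A common starting point is a namespace-wide default deny plus a narrow DNS egress allowance; a sketch (namespace illustrative, and the DNS rule assumes cluster DNS runs in `kube-system`):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}                        # every pod in the namespace
  policyTypes: ["Ingress", "Egress"]     # no allow rules: all traffic denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: payments
spec:
  podSelector: {}
  policyTypes: ["Egress"]
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

Per-workload allow rules are then layered on top; anything without an explicit rule stays unreachable.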
- Kubernetes Secrets are base64-encoded (not encrypted) objects stored in etcd.
- Unless encryption at rest is enabled, Secrets in etcd are readable by anyone with etcd (or etcd backup) access.
Exposure paths:
- `kubectl get secrets` via over-permissive RBAC
- mounting as env vars (easy to leak in logs)
- reading mounted files from compromised containers
Mitigations:
- Encrypt Secrets at rest (KMS)
- Use external secret stores (Vault / cloud secret managers)
- Use short-lived credentials and rotation
- Minimize secret distribution across namespaces
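Encryption at rest is configured on the API server via an `EncryptionConfiguration` file; a minimal sketch using the `aescbc` provider (in production a `kms` provider is usually preferable, and the key value below is a placeholder, never a real key):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources: ["secrets"]
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>   # placeholder only
  - identity: {}   # fallback so existing plaintext objects stay readable
```

Existing Secrets are only rewritten encrypted when they are next updated, so a bulk rewrite (e.g., re-applying all Secrets) is part of the rollout.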
Admission happens before objects are persisted and acted upon.
- Mutating admission: can change pods/specs
- Validating admission: can allow/deny based on rules
Security implications:
- Admission is your last preventive gate.
- It can block the most dangerous configs: privileged, host mounts, risky capabilities, untrusted registries.
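As one concrete mechanism, recent Kubernetes versions ship CEL-based `ValidatingAdmissionPolicy`; a sketch that rejects privileged containers (checks regular containers only, for brevity; names illustrative, and availability depends on cluster version):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: deny-privileged
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  - expression: >-
      object.spec.containers.all(c,
        !has(c.securityContext) ||
        !has(c.securityContext.privileged) ||
        c.securityContext.privileged == false)
    message: "privileged containers are not allowed"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: deny-privileged-binding
spec:
  policyName: deny-privileged
  validationActions: ["Deny"]
```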
Why admission policies fail in practice:
- policies applied only to “prod namespaces”
- exemptions accumulate and become permanent
- drift: new workload types appear; policies don’t cover them
- “break-glass” accounts become normal workflow
Security engineering goal:
- Reduce the gap between policy intent and actual enforcement.
Static scanning helps, but it cannot detect:
- exploited running apps
- credential theft
- API misuse
- in-memory or fileless behaviors
Runtime security focuses on behavioral signals.
Workload-level signals:
- unexpected shell execution (`sh`, `bash`)
- unexpected package manager use (`apt`, `yum`, `apk`)
- access to service account token paths
- unexpected outbound connections (especially to metadata services or control plane endpoints)
Cluster-level signals:
- creation of privileged pods
- new RoleBindings/ClusterRoleBindings
- creation of admission webhooks
- unusual API verbs from unusual identities
Key practice:
- Correlate runtime (process/syscall/network) with API audit logs.
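On the audit side, a policy sketch that captures full request/response bodies for the high-risk objects listed above and metadata for everything else:

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse   # full bodies for escalation/persistence objects
  resources:
  - group: rbac.authorization.k8s.io
    resources: ["rolebindings", "clusterrolebindings"]
  - group: admissionregistration.k8s.io
    resources:
    - mutatingwebhookconfigurations
    - validatingwebhookconfigurations
- level: Metadata          # who/what/when for all other API traffic
```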
Typical end-to-end attack chain:
- Exploit application vulnerability
- Obtain code execution in container
- Credential harvest (service account token, cloud creds, config files)
- API abuse to expand permissions or deploy new workloads
- Persistence via RBAC, controllers, webhooks, or GitOps pipeline
Why this matters:
- Most attackers do not need kernel exploits to win in Kubernetes.
Common persistence mechanisms:
- Hidden RoleBindings granting access
- Malicious admission webhook that injects sidecars
- Controller/operator modifications
- CronJobs that reintroduce malicious pods
Image supply chain risks:
- untrusted base images
- mutable tags (e.g., `latest`)
- compromised registries
Security outcomes:
- fleet-wide compromise through “legitimate” image pulls
GitOps makes Git and CI/CD a control plane.
Attack paths:
- compromise Git credentials
- compromise CI runners
- steal deploy tokens
Security insight:
- You can “secure the cluster” and still lose through the pipeline.
etcd contains:
- full cluster configuration
- Secrets and sensitive references
- historical cluster evolution
If etcd is exposed or weakly protected, attackers gain system-wide visibility.
High-impact persistence techniques:
- admission webhooks
- CRDs and controllers
- impersonation and RBAC grants
These attacks survive node reboots and pod redeploys.
Detection data sources:
- Kubernetes API audit logs (who did what)
- runtime signals (process, network, syscall)
- node logs and kubelet events
When you suspect compromise:
- identify the identity and its RBAC
- contain by revoking tokens/permissions
- inspect recent bindings/webhooks/controllers
- rotate secrets potentially exposed
- hunt for persistence mechanisms
Common mistakes:
- Assuming namespaces provide isolation
- Over-permissive RBAC and shared service accounts
- No default-deny NetworkPolicy
- Privileged/host-mounted workloads allowed “temporarily”
- Admission policy drift and exemptions
- Lack of runtime detection and API audit correlation
Key takeaways:
- Kubernetes security is control-plane-centric.
- Identity is the real perimeter; network boundaries are secondary.
- Misconfiguration dominates real-world risk.
- Runtime visibility is mandatory.
- Security must be automated to avoid drift.
Securing Kubernetes means controlling intent, identity, and behavior across a distributed system that continuously reconciles state. If you secure only containers, you miss the real threat surface: the API.