Security & Compliance

Securing Kubernetes at enterprise scale requires a comprehensive approach spanning infrastructure, workloads, data, and access controls. This guide outlines security best practices and compliance strategies for production Kubernetes environments.

Defense-in-Depth Security Model

Enterprise Kubernetes security follows a layered approach:

┌─────────────────────────────────────────────────────┐
│ Cluster Infrastructure Security                     │
├─────────────────────────────────────────────────────┤
│ Kubernetes Control Plane Security                   │
├─────────────────────────────────────────────────────┤
│ Network Security & Segmentation                     │
├─────────────────────────────────────────────────────┤
│ Workload Security (Pods & Containers)               │
├─────────────────────────────────────────────────────┤
│ Data Security & Secrets Management                  │
├─────────────────────────────────────────────────────┤
│ Authentication & Authorization (IAM)                │
├─────────────────────────────────────────────────────┤
│ Audit Logging & Monitoring                          │
├─────────────────────────────────────────────────────┤
│ Compliance & Governance                             │
└─────────────────────────────────────────────────────┘

Cluster Infrastructure Hardening

Private Cluster Architecture

Implement security best practices at the infrastructure level:

┌─────────────────────────────────────────────────────────┐
│                     Private VPC/VNET                    │
│                                                         │
│  ┌─────────────────┐                                    │
│  │ Bastion Host/   │                                    │
│  │ VPN Gateway     │                                    │
│  └────────┬────────┘                                    │
│           │                                             │
│  ┌────────▼─────────────────────────────────────┐       │
│  │           Private Kubernetes Cluster         │       │
│  │                                              │       │
│  │   ┌──────────────┐      ┌──────────────┐     │       │
│  │   │ Control Plane│      │ Worker Nodes │     │       │
│  │   └──────────────┘      └──────────────┘     │       │
│  └──────────────────────────────────────────────┘       │
│                                                         │
└─────────────────────────────────────────────────────────┘

Cloud-Specific Recommendations

AWS EKS:

Enable envelope encryption of EKS secrets using AWS KMS
Use Security Groups to restrict traffic between nodes
Implement private endpoint access for the EKS API
Use EC2 instances with IMDSv2 for node groups

Azure AKS:

Deploy AKS with Azure Private Link
Implement Azure Service Endpoints for service connections
Use Azure Policy for AKS security controls
Enable Azure Defender for Kubernetes

Google GKE:

Deploy private GKE clusters
Use VPC Service Controls to restrict API access
Enable Shielded GKE Nodes
Implement Binary Authorization

Kubernetes-Native Security Controls

Pod Security Standards

Enforce pod security using the built-in Pod Security Standards:

apiVersion: v1
kind: Namespace
metadata:
  name: restricted-namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Policy Enforcement with OPA Gatekeeper

Deploy policy guardrails with OPA Gatekeeper:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["team", "environment", "application"]

Image Scanning and Admission Control

apiVersion: v1
kind: ConfigMap
metadata:
  name: trivy-operator-policies
  namespace: trivy-system
data:
  policy.yaml: |
    package trivy
    
    deny[msg] {
      input.vulnerabilities[_].Severity == "CRITICAL"
      msg := "Critical vulnerabilities are not allowed"
    }
    
    deny[msg] {
      input.securityIssues[_].Severity == "HIGH"
      msg := "Images with high security issues are not allowed"
    }

Network Security

Core Network Security Components

┌──────────────────────────────────────────┐
│              Egress Firewall             │
└──────────────────┬───────────────────────┘
                   │
┌──────────────────▼───────────────────────┐
│              Ingress Controller          │
│              with WAF/DDoS               │
└──────────────────┬───────────────────────┘
                   │
┌──────────────────▼───────────────────────┐
│              Service Mesh                │
│         (East-West Traffic Control)      │
└──────────────────┬───────────────────────┘
                   │
┌──────────────────▼───────────────────────┐
│           Network Policies               │
│           (Pod-level Firewalls)          │
└──────────────────────────────────────────┘

Network Policy Implementation

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          purpose: database
      podSelector:
        matchLabels:
          app: postgres
    ports:
    - protocol: TCP
      port: 5432

Service Mesh Security (Istio Example)

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-specific-methods
  namespace: production
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/frontend-service-account"]
    to:
    - operation:
        methods: ["GET", "POST"]
        paths: ["/api/v1/*"]

Secret Management

External Secret Management Integration

# Using External Secrets Operator with AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: "15m"
  secretStoreRef:
    name: aws-secretsmanager
    kind: ClusterSecretStore
  target:
    name: database-credentials
  data:
  - secretKey: username
    remoteRef:
      key: production/database
      property: username
  - secretKey: password
    remoteRef:
      key: production/database
      property: password

Sealed Secrets for GitOps

apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: api-key
  namespace: production
spec:
  encryptedData:
    api-key: AgBy8hCNjjSa...truncated...P8kQ9H3mAyxF3A

Authentication & Authorization

RBAC Implementation Best Practices

Principle of Least Privilege:

# Team-specific role with limited permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: team-a-developer
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "secrets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apps"]
  resources: ["deployments", "statefulsets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Bind role to team group from identity provider
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-developers
  namespace: team-a
subjects:
- kind: Group
  name: "ad:team-a-developers"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: team-a-developer
  apiGroup: rbac.authorization.k8s.io

SSO Integration with OIDC

# Using OIDC with Azure AD for AKS
apiVersion: v1
kind: ConfigMap
metadata:
  name: azure-ad-oidc-config
  namespace: kube-system
data:
  oidc-client-id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  oidc-issuer-url: "https://sts.windows.net/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/"
  oidc-username-claim: "email"
  oidc-groups-claim: "groups"

Audit Logging & Monitoring

Enhanced Audit Policy

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log all requests at the Metadata level
- level: Metadata
  # Long-running requests like watches that aren't recorded at RequestReceived
  omitStages:
    - "RequestReceived"
# Log pod changes at RequestResponse level
- level: RequestResponse
  resources:
  - group: ""
    resources: ["pods"]
# Log auth at RequestResponse level
- level: RequestResponse
  resources:
  - group: "authentication.k8s.io"
    resources: ["*"]
# Log all other resources at the Metadata level
- level: Metadata
  # Long-running requests like watches that aren't recorded at RequestReceived
  omitStages:
    - "RequestReceived"

Security-Focused Monitoring

# Falco security monitoring
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: falco
  namespace: security
spec:
  selector:
    matchLabels:
      app: falco
  template:
    metadata:
      labels:
        app: falco
    spec:
      containers:
      - name: falco
        image: falcosecurity/falco:latest
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-socket
        - mountPath: /host/var/run/docker.sock
          name: docker-socket
        - mountPath: /host/dev
          name: dev-fs
        - mountPath: /host/proc
          name: proc-fs
          readOnly: true
        - mountPath: /host/boot
          name: boot-fs
          readOnly: true
        - mountPath: /host/lib/modules
          name: lib-modules
          readOnly: true
        - mountPath: /host/usr
          name: usr-fs
          readOnly: true
        - mountPath: /etc/falco
          name: falco-config
      volumes:
      - name: docker-socket
        hostPath:
          path: /var/run/docker.sock
      - name: dev-fs
        hostPath:
          path: /dev
      - name: proc-fs
        hostPath:
          path: /proc
      - name: boot-fs
        hostPath:
          path: /boot
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: usr-fs
        hostPath:
          path: /usr
      - name: falco-config
        configMap:
          name: falco-config

Compliance Automation

Continuous Compliance Validation

# Kyverno policy for PCI-DSS compliance
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: pci-dss-restricted
spec:
  validationFailureAction: Enforce
  rules:
  - name: host-path-volumes-restricted
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "Host path volumes are not allowed for PCI-DSS compliance"
      pattern:
        spec:
          =(volumes):
            - =(hostPath): "null"
  - name: require-pod-probes
    match:
      resources:
        kinds:
        - Deployment
    validate:
      message: "Liveness and readiness probes are required"
      pattern:
        spec:
          template:
            spec:
              containers:
              - livenessProbe:
                  =(httpGet):
                    path: "*"
                    port: "*"
                readinessProbe:
                  =(httpGet):
                    path: "*"
                    port: "*"

Compliance Scanning and Reporting

# Trivy Operator configuration for vulnerability reporting
apiVersion: aquasecurity.github.io/v1alpha1
kind: VulnerabilityReport
metadata:
  name: app-vulnerability-report
  namespace: compliance
spec:
  target:
    resource:
      apiVersion: apps/v1
      kind: Deployment
      name: web-application
      namespace: production
  scanner:
    name: Trivy
    parameters:
      - name: "ignoreUnfixed"
        value: "true"
      - name: "severity"
        value: "CRITICAL,HIGH"
  schedule: "0 */6 * * *" # Every 6 hours

Disaster Recovery & Security Incident Response

Security Incident Response Plan

Detection: Monitor security alerts from:
- Kubernetes audit logs
- Container runtime security tools (Falco)
- Cloud provider security services

Containment:

# Isolate compromised namespace
kubectl label namespace compromised security=isolated
kubectl apply -f isolate-network-policy.yaml
   
# Force restart suspicious pods
kubectl delete pod suspicious-pod-xyz -n compromised
   
# Temporarily disable compromised service account
kubectl patch serviceaccount -n compromised compromised-sa \
  -p '{"metadata":{"annotations":{"security.alpha.kubernetes.io/disabled":"true"}}}'

Eradication & Recovery:

# Rotate credentials
kubectl create secret generic app-credentials --from-literal=password=$(openssl rand -base64 32) -n compromised --dry-run=client -o yaml | kubectl apply -f -
   
# Apply updated security policies
kubectl apply -f updated-pod-security-policies.yaml
   
# Restore from known good state
kubectl apply -f https://gitops-repo/known-good-state.yaml

Post-Incident Analysis:
- Forensic analysis of compromised containers
- Audit log review
- Root cause identification and remediation

Cloud-Specific Compliance Controls

AWS EKS Compliance

Requirement	Implementation
Access Logging	AWS CloudTrail + EKS audit logs to CloudWatch
Data Encryption	EBS encryption with KMS for PVs
Network Segmentation	Security Groups, NACLs, and K8s NetworkPolicies
Vulnerability Management	Amazon Inspector + ECR image scanning
Compliance Reporting	AWS Config Rules + AWS Security Hub

Azure AKS Compliance

Requirement	Implementation
Access Logging	Azure Monitor + AKS diagnostic settings
Data Encryption	Azure Disk Encryption + Azure Key Vault
Network Segmentation	NSGs, Azure Firewall, and K8s NetworkPolicies
Vulnerability Management	Microsoft Defender for Containers
Compliance Reporting	Azure Policy for AKS + Azure Security Center

GCP GKE Compliance

Requirement	Implementation
Access Logging	Cloud Audit Logs + GKE audit logging
Data Encryption	Application-layer encryption with Cloud KMS
Network Segmentation	VPC Firewalls and K8s NetworkPolicies
Vulnerability Management	GKE container threat detection + Binary Authorization
Compliance Reporting	Security Command Center + Compliance Reports