Security & Compliance
Securing Kubernetes at enterprise scale requires a comprehensive approach spanning infrastructure, workloads, data, and access controls. This guide outlines security best practices and compliance strategies for production Kubernetes environments.
Defense-in-Depth Security Model
Enterprise Kubernetes security follows a layered approach:
┌─────────────────────────────────────────────────────┐
│           Cluster Infrastructure Security           │
├─────────────────────────────────────────────────────┤
│          Kubernetes Control Plane Security          │
├─────────────────────────────────────────────────────┤
│           Network Security & Segmentation           │
├─────────────────────────────────────────────────────┤
│        Workload Security (Pods & Containers)        │
├─────────────────────────────────────────────────────┤
│         Data Security & Secrets Management          │
├─────────────────────────────────────────────────────┤
│        Authentication & Authorization (IAM)         │
├─────────────────────────────────────────────────────┤
│             Audit Logging & Monitoring              │
├─────────────────────────────────────────────────────┤
│               Compliance & Governance               │
└─────────────────────────────────────────────────────┘
Cluster Infrastructure Hardening
Private Cluster Architecture
Keep the control plane and worker nodes inside a private network, and reach them only through a bastion host or VPN gateway:
┌─────────────────────────────────────────────────────────┐
│                    Private VPC/VNET                     │
│                                                         │
│  ┌─────────────────┐                                    │
│  │  Bastion Host/  │                                    │
│  │   VPN Gateway   │                                    │
│  └────────┬────────┘                                    │
│           │                                             │
│  ┌────────▼─────────────────────────────────────┐       │
│  │          Private Kubernetes Cluster          │       │
│  │                                              │       │
│  │  ┌──────────────┐      ┌──────────────┐      │       │
│  │  │ Control Plane│      │ Worker Nodes │      │       │
│  │  └──────────────┘      └──────────────┘      │       │
│  └──────────────────────────────────────────────┘       │
│                                                         │
└─────────────────────────────────────────────────────────┘
Cloud-Specific Recommendations
AWS EKS:
- Enable envelope encryption of EKS secrets using AWS KMS
- Use Security Groups to restrict traffic between nodes
- Implement private endpoint access for the EKS API
- Require IMDSv2 on the EC2 instances in node groups
Azure AKS:
- Deploy AKS with Azure Private Link
- Implement Azure Service Endpoints for service connections
- Use Azure Policy for AKS security controls
- Enable Microsoft Defender for Containers (formerly Azure Defender for Kubernetes)
Google GKE:
- Deploy private GKE clusters
- Use VPC Service Controls to restrict API access
- Enable Shielded GKE Nodes
- Implement Binary Authorization
Kubernetes-Native Security Controls
Pod Security Standards
Enforce the built-in Pod Security Standards per namespace by setting Pod Security Admission labels:
apiVersion: v1
kind: Namespace
metadata:
  name: restricted-namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
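To make the effect of the `restricted` level concrete, the following is a minimal sketch of a Pod that should be admitted into the namespace above. The Pod name, image reference, and user ID are placeholders chosen for illustration; `readOnlyRootFilesystem` is optional extra hardening rather than a requirement of the profile.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted-example            # hypothetical name
  namespace: restricted-namespace
spec:
  securityContext:
    runAsNonRoot: true                # restricted: containers must not run as root
    runAsUser: 10001                  # assumed non-root UID; adjust to the image
    seccompProfile:
      type: RuntimeDefault            # restricted: seccomp profile must be set
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # placeholder image
      securityContext:
        allowPrivilegeEscalation: false       # restricted: must be false
        capabilities:
          drop: ["ALL"]                       # restricted: drop all capabilities
        readOnlyRootFilesystem: true          # optional, not required by the profile
```

A Pod that omits any of the fields marked "restricted" would be rejected at admission in this namespace, and the `warn` and `audit` labels surface the same violations as client warnings and audit annotations.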
Policy Enforcement with OPA Gatekeeper
Deploy policy guardrails with OPA Gatekeeper:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["team", "environment", "application"]
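A constraint of kind `K8sRequiredLabels` only works once a matching ConstraintTemplate has been installed. The sketch below follows the basic required-labels template from the Gatekeeper documentation and assumes the `labels` parameter is a plain list of strings, as in the constraint above; other published versions of this template use a different parameter schema.

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels       # the kind referenced by the constraint
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        # Report a violation when any required label is missing
        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("you must provide labels: %v", [missing])
        }
```

Apply the template first and wait for Gatekeeper to register the generated CRD before creating the constraint, otherwise the constraint's kind is unknown to the API server.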
Image Scanning and Admission Control
apiVersion: v1
kind: ConfigMap
metadata:
  name: trivy-operator-policies
  namespace: trivy-system
data:
  policy.yaml: |
    package trivy
    deny[msg] {
      input.vulnerabilities[_].Severity == "CRITICAL"
      msg := "Critical vulnerabilities are not allowed"
    }
    deny[msg] {
      input.securityIssues[_].Severity == "HIGH"
      msg := "Images with high security issues are not allowed"
    }
Network Security
Core Network Security Components
┌──────────────────────────────────────────┐
│             Egress Firewall              │
└──────────────────┬───────────────────────┘
                   │
┌──────────────────▼───────────────────────┐
│            Ingress Controller            │
│              with WAF/DDoS               │
└──────────────────┬───────────────────────┘
                   │
┌──────────────────▼───────────────────────┐
│               Service Mesh               │
│       (East-West Traffic Control)        │
└──────────────────┬───────────────────────┘
                   │
┌──────────────────▼───────────────────────┐
│             Network Policies             │
│          (Pod-level Firewalls)           │
└──────────────────────────────────────────┘
Network Policy Implementation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              purpose: database
          podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
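Two companion policies usually accompany a policy like the one above. Because `backend-policy` lists `Egress` in `policyTypes`, any egress it does not explicitly allow is dropped, including DNS lookups. The sketch below adds a namespace-wide default deny and a DNS allowance. It assumes cluster DNS runs in `kube-system` and that the namespace carries the automatic `kubernetes.io/metadata.name` label (set by Kubernetes 1.22 and later); the policy names are illustrative.

```yaml
# Deny all traffic for every pod in the namespace unless another policy allows it
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all              # hypothetical name
  namespace: production
spec:
  podSelector: {}                     # empty selector = all pods in the namespace
  policyTypes:
    - Ingress
    - Egress
---
# Allow DNS lookups so that service names still resolve
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress              # hypothetical name
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system   # assumed DNS namespace
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Network policies are additive, so these combine with `backend-policy`: a connection is allowed if any policy selecting the pod permits it. They take effect only when the cluster's CNI plugin enforces NetworkPolicy.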
Service Mesh Security (Istio Example)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-specific-methods
  namespace: production
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/default/sa/frontend-service-account"]
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/v1/*"]
Secret Management
External Secret Management Integration
# Using External Secrets Operator with AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: "15m"
  secretStoreRef:
    name: aws-secretsmanager
    kind: ClusterSecretStore
  target:
    name: database-credentials
  data:
    - secretKey: username
      remoteRef:
        key: production/database
        property: username
    - secretKey: password
      remoteRef:
        key: production/database
        property: password
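The ExternalSecret above refers to a ClusterSecretStore named `aws-secretsmanager`, which must exist before the secret can sync. The following is a minimal sketch of such a store. The region, service account name, and namespace are assumptions for illustration, and the service account is assumed to be mapped to an IAM role (for example via IRSA) that can read the referenced secrets.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secretsmanager            # matches secretStoreRef.name above
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1               # assumed region
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa       # hypothetical service account
            namespace: external-secrets     # hypothetical namespace
```

Scoping the IAM role to the `production/*` secret path keeps a compromised operator from reading unrelated secrets.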
Sealed Secrets for GitOps
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: api-key
  namespace: production
spec:
  encryptedData:
    api-key: AgBy8hCNjjSa...truncated...P8kQ9H3mAyxF3A
Authentication & Authorization
RBAC Implementation Best Practices
Principle of Least Privilege:
# Team-specific role with limited permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: team-a-developer
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Bind role to team group from identity provider
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-developers
  namespace: team-a
subjects:
  - kind: Group
    name: "ad:team-a-developers"
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: team-a-developer
  apiGroup: rbac.authorization.k8s.io
SSO Integration with OIDC
# Using OIDC with Azure AD for AKS
# The keys mirror the kube-apiserver --oidc-* flags; managed services
# normally apply these settings through the provider's own integration.
apiVersion: v1
kind: ConfigMap
metadata:
  name: azure-ad-oidc-config
  namespace: kube-system
data:
  oidc-client-id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  oidc-issuer-url: "https://sts.windows.net/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/"
  oidc-username-claim: "email"
  oidc-groups-claim: "groups"
Audit Logging & Monitoring
Enhanced Audit Policy
apiVersion: audit.k8s.io/v1
kind: Policy
# Rules are evaluated in order and the first match wins,
# so specific rules come first and the catch-all comes last.
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods"]
  # Log auth at RequestResponse level
  - level: RequestResponse
    resources:
      - group: "authentication.k8s.io"
        resources: ["*"]
  # Log all other requests at the Metadata level
  - level: Metadata
    # Do not generate an event at the RequestReceived stage
    # (reduces noise from long-running requests such as watches)
    omitStages:
      - "RequestReceived"
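On self-managed control planes, an audit policy takes effect only when the API server is started with the audit flags. The fragment below sketches the relevant part of a kube-apiserver static Pod manifest; the file paths and retention values are assumptions, and the policy file and log directory also need matching volume mounts that are omitted here. Managed services (EKS, AKS, GKE) expose audit logging through their own settings instead.

```yaml
# Fragment of a kube-apiserver static Pod manifest (not a complete manifest)
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --audit-policy-file=/etc/kubernetes/audit-policy.yaml   # assumed path
        - --audit-log-path=/var/log/kubernetes/audit/audit.log    # assumed path
        - --audit-log-maxage=30       # days to retain old log files
        - --audit-log-maxbackup=10    # number of rotated files to keep
        - --audit-log-maxsize=100     # megabytes before rotation
```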
Security-Focused Monitoring
# Falco security monitoring
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: falco
  namespace: security
spec:
  selector:
    matchLabels:
      app: falco
  template:
    metadata:
      labels:
        app: falco
    spec:
      containers:
        - name: falco
          image: falcosecurity/falco:latest  # pin a specific version tag in production
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /var/run/docker.sock
              name: docker-socket
            - mountPath: /host/var/run/docker.sock
              name: docker-socket
            - mountPath: /host/dev
              name: dev-fs
            - mountPath: /host/proc
              name: proc-fs
              readOnly: true
            - mountPath: /host/boot
              name: boot-fs
              readOnly: true
            - mountPath: /host/lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /host/usr
              name: usr-fs
              readOnly: true
            - mountPath: /etc/falco
              name: falco-config
      volumes:
        - name: docker-socket
          hostPath:
            path: /var/run/docker.sock
        - name: dev-fs
          hostPath:
            path: /dev
        - name: proc-fs
          hostPath:
            path: /proc
        - name: boot-fs
          hostPath:
            path: /boot
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: usr-fs
          hostPath:
            path: /usr
        - name: falco-config
          configMap:
            name: falco-config
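The DaemonSet above loads its rules from the `falco-config` ConfigMap. As an illustration of what a custom rule in that configuration might look like, the sketch below flags shells started inside containers. It assumes Falco's default ruleset is also loaded, since the `spawned_process` and `container` macros are defined there; the rule name, shell list, and priority are illustrative choices.

```yaml
# Custom Falco rule (for example in a local rules file within falco-config)
- rule: Shell Spawned in Container    # hypothetical rule name
  desc: Detect a shell process started inside a container
  condition: spawned_process and container and proc.name in (bash, sh, zsh)
  output: "Shell started in container (user=%user.name container=%container.name cmd=%proc.cmdline)"
  priority: WARNING
  tags: [container, shell]
```

Rules like this tend to be noisy at first, so it is common to add exceptions for known maintenance workflows before routing alerts to on-call staff.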
Compliance Automation
Continuous Compliance Validation
# Kyverno policy for PCI-DSS compliance
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: pci-dss-restricted
spec:
  validationFailureAction: Enforce
  rules:
    - name: host-path-volumes-restricted
      match:
        resources:
          kinds:
            - Pod
      validate:
        message: "Host path volumes are not allowed for PCI-DSS compliance"
        pattern:
          spec:
            =(volumes):
              - =(hostPath): "null"
    - name: require-pod-probes
      match:
        resources:
          kinds:
            - Deployment
      validate:
        message: "Liveness and readiness probes are required"
        pattern:
          spec:
            template:
              spec:
                containers:
                  - livenessProbe:
                      =(httpGet):
                        path: "*"
                        port: "*"
                    readinessProbe:
                      =(httpGet):
                        path: "*"
                        port: "*"
Compliance Scanning and Reporting
# Trivy Operator configuration for vulnerability reporting
# Illustrative only: Trivy Operator normally generates VulnerabilityReport
# resources itself, so treat the fields below as a sketch of the intended
# scan scope rather than the operator's actual schema.
apiVersion: aquasecurity.github.io/v1alpha1
kind: VulnerabilityReport
metadata:
  name: app-vulnerability-report
  namespace: compliance
spec:
  target:
    resource:
      apiVersion: apps/v1
      kind: Deployment
      name: web-application
      namespace: production
  scanner:
    name: Trivy
    parameters:
      - name: "ignoreUnfixed"
        value: "true"
      - name: "severity"
        value: "CRITICAL,HIGH"
  schedule: "0 */6 * * *" # Every 6 hours
Disaster Recovery & Security Incident Response
Security Incident Response Plan
- Detection: Monitor security alerts from:
- Kubernetes audit logs
- Container runtime security tools (Falco)
- Cloud provider security services
- Containment:
  # Isolate compromised namespace
  kubectl label namespace compromised security=isolated
  kubectl apply -f isolate-network-policy.yaml
  # Force restart suspicious pods
  kubectl delete pod suspicious-pod-xyz -n compromised
  # Temporarily disable compromised service account
  # (Kubernetes does not act on this annotation by itself; it takes effect
  # only if an admission policy or controller enforces it)
  kubectl patch serviceaccount -n compromised compromised-sa \
    -p '{"metadata":{"annotations":{"security.alpha.kubernetes.io/disabled":"true"}}}'
- Eradication & Recovery:
  # Rotate credentials
  kubectl create secret generic app-credentials \
    --from-literal=password="$(openssl rand -base64 32)" \
    -n compromised --dry-run=client -o yaml | kubectl apply -f -
  # Apply updated security policies
  kubectl apply -f updated-pod-security-policies.yaml
  # Restore from known good state
  kubectl apply -f https://gitops-repo/known-good-state.yaml
- Post-Incident Analysis:
- Forensic analysis of compromised containers
- Audit log review
- Root cause identification and remediation
Cloud-Specific Compliance Controls
AWS EKS Compliance
| Requirement | Implementation |
|---|---|
| Access Logging | AWS CloudTrail + EKS audit logs to CloudWatch |
| Data Encryption | EBS encryption with KMS for PVs |
| Network Segmentation | Security Groups, NACLs, and K8s NetworkPolicies |
| Vulnerability Management | Amazon Inspector + ECR image scanning |
| Compliance Reporting | AWS Config Rules + AWS Security Hub |
Azure AKS Compliance
| Requirement | Implementation |
|---|---|
| Access Logging | Azure Monitor + AKS diagnostic settings |
| Data Encryption | Azure Disk Encryption + Azure Key Vault |
| Network Segmentation | NSGs, Azure Firewall, and K8s NetworkPolicies |
| Vulnerability Management | Microsoft Defender for Containers |
| Compliance Reporting | Azure Policy for AKS + Microsoft Defender for Cloud (formerly Azure Security Center) |
GCP GKE Compliance
| Requirement | Implementation |
|---|---|
| Access Logging | Cloud Audit Logs + GKE audit logging |
| Data Encryption | Application-layer encryption with Cloud KMS |
| Network Segmentation | VPC Firewalls and K8s NetworkPolicies |
| Vulnerability Management | GKE container threat detection + Binary Authorization |
| Compliance Reporting | Security Command Center + Compliance Reports |