Service Mesh
A service mesh is an infrastructure layer that manages service-to-service communication in a microservices architecture. It provides traffic management, security, observability, and reliability features without requiring changes to application code.
Why Use a Service Mesh?
- Traffic Management: Fine-grained control over routing, retries, timeouts, and circuit breaking
- Security: mTLS encryption, service authentication, and policy enforcement
- Observability: Distributed tracing, metrics, and logging for all service traffic
- Reliability: Automatic retries, failover, and health checks
- Zero-Trust Networking: Enforce least-privilege and secure-by-default communication
Pros and Cons
| Pros | Cons |
|---|---|
| Enhanced security (mTLS, RBAC) | Added complexity and resource overhead |
| Consistent traffic policies | Steep learning curve for teams |
| Deep observability and tracing | May impact latency/performance |
| Platform-agnostic (multi-cloud) | Debugging can be harder |
| Enables progressive delivery (canary, blue/green) |
Popular Service Mesh Providers
- Istio (open source, works on any Kubernetes, supported by GKE, AKS, EKS)
- Linkerd (lightweight, easy to install, CNCF project)
- Consul Connect (HashiCorp, integrates with VMs and Kubernetes)
- AWS App Mesh (managed for EKS, ECS, EC2)
- Azure Service Mesh (preview, managed for AKS)
- Anthos Service Mesh (GCP, managed Istio)
Example: Installing Istio on Kubernetes (Cloud-Agnostic)
curl -L https://istio.io/downloadIstio | sh -
cd istio-*
export PATH=$PWD/bin:$PATH
istioctl install --set profile=demo -y
kubectl label namespace default istio-injection=enabled
- For AKS: Use Azure CLI to create the cluster, then follow the above steps
- For EKS: Use AWS CLI and eksctl to create the cluster, then follow the above steps
- For GKE: Use gcloud to create the cluster, then follow the above steps
Example: Deploying a Sample App with Istio
kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml
kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml
Access the app via Istio ingress gateway (see Istio docs for cloud-specific instructions).
Example: Enabling mTLS for All Services
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: default
namespace: istio-system
spec:
mtls:
mode: STRICT
Best Practices (2025)
- Start with a minimal mesh (e.g., Linkerd or Istio demo profile) and scale up
- Use GitOps (ArgoCD, Flux) to manage mesh configuration and CRDs
- Monitor mesh health with Prometheus, Grafana, and Jaeger
- Use LLMs (Copilot, Claude) to generate and review mesh policies and manifests
- Document mesh usage and onboarding for your team
Common Pitfalls
- Overcomplicating the mesh with too many features at once
- Not monitoring mesh resource usage (can impact cluster performance)
- Failing to secure the mesh dashboard and control plane
- Manual changes outside Git (causes drift in GitOps setups)