Service Mesh: Middle
A service mesh moves service-to-service concerns — mutual TLS, retries, timeouts, load balancing, and telemetry — out of application code and into a network of proxies that sit next to every workload. This tier explains how that works mechanically: what runs where, how a request actually traverses the proxy path, and how configuration reaches the proxies without redeploying your app.
Table of Contents
- The two-plane model
- The sidecar and injection
- Tracing a request through the mesh
- How config reaches the data plane: xDS
- Identity and mTLS
- Traffic management: shifting, retries, timeouts, circuit breaking
- Telemetry for free
- Where this goes next
1. The two-plane model
Every mesh splits cleanly into two planes with distinct jobs and lifecycles.
| Aspect | Data plane | Control plane |
|---|---|---|
| What it is | Proxies (Envoy in Istio; a purpose-built Rust micro-proxy in Linkerd) co-located with each workload | Central component (istiod, Linkerd's destination/identity controllers) |
| Job | Actually move packets: terminate connections, apply mTLS, route, retry, load-balance, emit metrics | Compute intent from config and push it to proxies; issue certificates; track endpoints |
| Where it runs | One instance per pod/workload (sidecar) | A few replicas, cluster-wide |
| On the request path? | Yes — every request goes through it | No — a proxy that loses its control plane keeps serving with its last-known config |
| Reacts to | Live traffic | API changes (routing rules, policies) and service-registry changes (pods coming and going) |
The key property: the control plane is off the hot path. It configures proxies asynchronously. A control-plane outage stops new config from propagating and prevents newly started proxies from fetching their initial config, but it does not stop traffic between already-configured proxies, which keep running on cached state. This is why traffic keeps flowing while the control plane is restarted or upgraded.
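The "cached state" property can be illustrated with a toy model. This is a minimal sketch, not how any real mesh is implemented: `ControlPlane`, `Proxy`, and the addresses are hypothetical names, and the proxy pulls config here for brevity, whereas real proxies receive pushes over a long-lived stream.

```python
from dataclasses import dataclass, field


class ControlPlaneUnavailable(Exception):
    """Raised by the toy control plane while it is down."""


@dataclass
class ControlPlane:
    routes: dict
    up: bool = True

    def fetch_routes(self) -> dict:
        if not self.up:
            raise ControlPlaneUnavailable("control plane unreachable")
        return dict(self.routes)


@dataclass
class Proxy:
    control_plane: ControlPlane
    cached_routes: dict = field(default_factory=dict)

    def sync(self) -> bool:
        """Try to refresh config; keep the last-known config on failure."""
        try:
            self.cached_routes = self.control_plane.fetch_routes()
            return True
        except ControlPlaneUnavailable:
            return False

    def route(self, service: str) -> str:
        """Serve from cache only: the control plane is never called per request."""
        return self.cached_routes[service]


cp = ControlPlane(routes={"orders": "10.0.0.7:8080"})
proxy = Proxy(cp)
proxy.sync()                             # initial config arrives

cp.up = False                            # control-plane outage begins
cp.routes["orders"] = "10.0.0.9:8080"    # a change made during the outage
synced = proxy.sync()                    # refresh fails, cache is untouched
```

During the simulated outage `proxy.route("orders")` still answers from the last-known config; the new address only becomes visible after a later successful sync.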
2. The sidecar and injection
The data-plane proxy runs as a sidecar: a second container in the same pod as your application container. Same network namespace, same lifecycle, but a separate process. Because they share the namespace, the proxy can transparently intercept the app's inbound and outbound traffic.
Injection is how the sidecar gets there. Two mechanisms:
- Automatic (admission webhook): you label a namespace (e.g. `istio-injection=enabled`). When Kubernetes admits a new pod, a mutating admission webhook rewrites the pod spec to add the proxy container plus an init container.
- Manual: you run a CLI (`istioctl kube-inject`, `linkerd inject`) that emits the mutated YAML ahead of time.
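Both mechanisms perform the same mutation on the pod spec. The sketch below shows that mutation on a plain dictionary, under stated assumptions: the container names `mesh-proxy` and `mesh-init` and the image `example/proxy:1.0` are hypothetical, and a real webhook returns a patch inside an admission response rather than a whole rewritten object.

```python
import copy


def inject_sidecar(pod: dict, proxy_image: str = "example/proxy:1.0") -> dict:
    """Return a copy of the pod with a proxy container and an init container added.

    All names and the image are illustrative, not what any real mesh injects.
    """
    mutated = copy.deepcopy(pod)          # never modify the caller's object
    spec = mutated.setdefault("spec", {})
    containers = spec.setdefault("containers", [])

    # Idempotent: injecting an already-injected pod changes nothing.
    if any(c.get("name") == "mesh-proxy" for c in containers):
        return mutated

    containers.append({"name": "mesh-proxy", "image": proxy_image})
    spec.setdefault("initContainers", []).append(
        {"name": "mesh-init", "image": proxy_image}
    )
    return mutated


pod = {
    "metadata": {"name": "orders-abc"},
    "spec": {"containers": [{"name": "orders", "image": "example/orders:2.3"}]},
}
injected = inject_sidecar(pod)
```

The application container is left exactly as written; the proxy and init container are appended alongside it.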
Traffic capture is the second half. An init container (or a CNI plugin) installs iptables rules in the pod's network namespace that redirect all inbound and outbound TCP to the sidecar's ports. The application makes an ordinary connect() to orders:8080; the kernel silently reroutes that connection to the local proxy. The application code is unchanged and unaware. That transparency is the whole point — no SDK, no library, no per-language client.
3. Tracing a request through the mesh
Consider service A calling service B. Both have sidecars. Here is the full path, including the mTLS handshake and one retry.
Read the diagram as three concerns layered on one request:
- Interception (steps 3, 8, 13): `iptables` makes the app-to-proxy and proxy-to-app hops look like ordinary local connections.
- Security (steps 4–7): the wire between the two pods is an mTLS tunnel. The app sends and receives plaintext on loopback; encryption happens entirely in the proxies.
- Resilience (steps 9–12): sidecar A owns the retry. The app never sees the transient 503; it sees a single successful call.
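The three concerns can be replayed in a small simulation. This is a sketch under assumptions: the hop order is a plausible reconstruction and does not reproduce the step numbers cited in the list, the function names are hypothetical, and app B is scripted to fail once with a 503 before succeeding.

```python
def app_b_handler(attempt: int) -> int:
    """Scripted upstream: the first attempt fails, later attempts succeed."""
    return 503 if attempt == 1 else 200


def sidecar_b(attempt: int, trace: list) -> int:
    trace.append("sidecar B: terminate mTLS, forward plaintext to app B on loopback")
    status = app_b_handler(attempt)
    trace.append(f"app B: respond {status}")
    return status


def sidecar_a(max_attempts: int, trace: list) -> int:
    trace.append("sidecar A: mTLS handshake with sidecar B")
    status = 503
    for attempt in range(1, max_attempts + 1):
        trace.append(f"sidecar A: send attempt {attempt} through the mTLS tunnel")
        status = sidecar_b(attempt, trace)
        if status != 503:
            break                      # success: stop retrying
    return status


def app_a_call() -> tuple:
    trace = ["app A: connect() to B, redirected to sidecar A by iptables"]
    status = sidecar_a(max_attempts=2, trace=trace)
    trace.append(f"app A: receives {status}")
    return status, trace


status, trace = app_a_call()
for line in trace:
    print(line)
```

The trace contains two responses from app B, but the only status app A ever observes is the final one.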
4. How config reaches the data plane: xDS
The control plane and Envoy sidecars speak xDS — the x Discovery Service protocol, a gRPC streaming API. Each proxy opens a long-lived stream to the control plane and receives typed configuration resources, pushed whenever they change:
| Resource | Answers the question |
|---|---|
| LDS (Listener Discovery) | What ports/filters do I listen on? |
| RDS (Route Discovery) | Given a request, which cluster does it go to? |
| CDS (Cluster Discovery) | What upstream services exist, and how do I load-balance them? |
| EDS (Endpoint Discovery) | What are the current healthy IPs behind each cluster? |
| SDS (Secret Discovery) | What certificates/keys do I use for mTLS? |
The flow: you write high-level intent as Kubernetes resources (an Istio VirtualService, DestinationRule, or a Gateway API HTTPRoute). The control plane watches the Kubernetes API and the service registry, translates that intent plus live endpoint data into concrete xDS resources, and streams them to exactly the proxies that need them. When a pod for service B scales up, EDS pushes the new endpoint to every proxy that talks to B — no restart, no polling.
This is the mechanism behind "no redeploy to change routing": your app's binary is fixed, but its network behavior is reconfigured live through xDS.
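The push model behind an endpoint update can be sketched as a tiny publish/subscribe loop. This is an illustration only: `ToyControlPlane` and `ToyProxy` are hypothetical classes using in-process method calls, not the gRPC xDS API, and only the endpoint (EDS-like) resource is modeled.

```python
from collections import defaultdict


class ToyProxy:
    def __init__(self, name: str):
        self.name = name
        self.endpoints = {}            # cluster name -> list of addresses

    def on_endpoint_update(self, cluster: str, addresses: list) -> None:
        """Apply a pushed update in place: no restart, no polling."""
        self.endpoints[cluster] = list(addresses)


class ToyControlPlane:
    def __init__(self):
        self.endpoints = defaultdict(list)      # cluster -> addresses
        self.subscribers = defaultdict(list)    # cluster -> interested proxies

    def subscribe(self, proxy: ToyProxy, cluster: str) -> None:
        self.subscribers[cluster].append(proxy)
        proxy.on_endpoint_update(cluster, self.endpoints[cluster])  # initial state

    def endpoint_added(self, cluster: str, address: str) -> None:
        """Called when the registry reports a new pod for a service."""
        self.endpoints[cluster].append(address)
        for proxy in self.subscribers[cluster]:  # only proxies that talk to it
            proxy.on_endpoint_update(cluster, self.endpoints[cluster])


control_plane = ToyControlPlane()
proxy_a = ToyProxy("a")        # talks to service b
proxy_c = ToyProxy("c")        # does not talk to service b
control_plane.subscribe(proxy_a, "b")

control_plane.endpoint_added("b", "10.0.0.1:8080")
control_plane.endpoint_added("b", "10.0.0.2:8080")   # service b scales up
```

After the scale-up, `proxy_a` holds both addresses for `b`, while `proxy_c` was never sent anything.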
5. Identity and mTLS
Mesh security rests on workload identity, not IP addresses. Each workload gets a cryptographic identity, commonly a SPIFFE ID encoded in an X.509 certificate — for example spiffe://cluster.local/ns/prod/sa/orders, derived from the Kubernetes service account.
- The control plane runs (or fronts) a certificate authority. On startup, each sidecar requests a cert for its workload's identity; the CA signs it. Certs are short-lived and rotated automatically (hours, not months), delivered over SDS.
- On every connection, both sidecars perform a mutual TLS handshake: each presents its cert and verifies the peer's cert against the mesh CA. This authenticates both ends — the caller proves who it is, not just the server.
- Because identity is verified per connection, you can write authorization policies in terms of identity ("`payments` may call `ledger`; nothing else may") rather than brittle network ACLs.
The application never handles keys, never manages certs, and never rotates anything. mTLS is a property of the mesh, applied uniformly.
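An identity-based policy check might look like the sketch below. It assumes the mTLS handshake has already verified the peer certificate, so the SPIFFE ID can be trusted; `ALLOWED_CALLERS`, `parse_spiffe_id`, and `is_allowed` are hypothetical names, and the parser only accepts the Kubernetes-style `ns/<namespace>/sa/<service-account>` layout shown above.

```python
from typing import NamedTuple


class WorkloadIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_spiffe_id(spiffe_id: str) -> WorkloadIdentity:
    """Parse spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>."""
    prefix = "spiffe://"
    if not spiffe_id.startswith(prefix):
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    parts = spiffe_id[len(prefix):].split("/")
    if len(parts) != 5 or parts[1] != "ns" or parts[3] != "sa":
        raise ValueError(f"unexpected SPIFFE ID layout: {spiffe_id!r}")
    return WorkloadIdentity(parts[0], parts[2], parts[4])


# Hypothetical policy: destination service -> identities allowed to call it.
ALLOWED_CALLERS = {
    "ledger": {"spiffe://cluster.local/ns/prod/sa/payments"},
}


def is_allowed(caller_spiffe_id: str, destination: str) -> bool:
    """Default deny: a caller is allowed only if explicitly listed."""
    parse_spiffe_id(caller_spiffe_id)   # reject malformed identities outright
    return caller_spiffe_id in ALLOWED_CALLERS.get(destination, set())
```

The decision depends only on who the caller is, so it keeps working when pods are rescheduled and their IP addresses change.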
6. Traffic management: shifting, retries, timeouts, circuit breaking
All of these are proxy behaviors configured through the control plane. The app is not involved.
Traffic shifting / canary. Route rules split traffic by weight across subsets of a service. A canary release sends, say, 95% to reviews:v1 and 5% to reviews:v2. You adjust the weights (or gate on headers, e.g. only internal users hit v2) and the change takes effect via RDS/CDS push — no deploy. This is how meshes decouple release (getting code onto machines) from rollout (directing traffic to it).
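The weighted split with a header gate can be sketched as a single routing function. This is a minimal illustration, not a real proxy's routing logic: the header name `x-internal-user` is hypothetical, and the random generator is passed in so the behavior is reproducible.

```python
import random


def pick_subset(weights: dict, headers: dict, rng: random.Random) -> str:
    """Choose a subset for one request.

    A header match takes priority; otherwise pick by weight.
    """
    if headers.get("x-internal-user") == "true":   # hypothetical gating header
        return "v2"
    subsets = list(weights)
    return rng.choices(subsets, weights=[weights[s] for s in subsets], k=1)[0]


rng = random.Random(42)
weights = {"v1": 95, "v2": 5}

sample = [pick_subset(weights, {}, rng) for _ in range(10_000)]
v2_share = sample.count("v2") / len(sample)
print(f"v2 share: {v2_share:.3f}")
```

Changing the rollout means changing only the `weights` data, which is what a route push amounts to; the function, like the application, is untouched.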
Retries. The sidecar retries failed requests on retriable conditions (connection failures, 5xx, specific gRPC statuses) up to a configured limit, ideally with a per-try timeout and a bounded overall budget. Because the caller's sidecar owns this, retry policy is consistent across every client language.
Timeouts. A request timeout caps how long the caller's proxy waits for the upstream before failing fast. This prevents a slow dependency from holding connections open and exhausting the caller.
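Retries and timeouts combine into one loop in the caller's proxy. The sketch below simulates that loop without real waiting: `upstream(attempt)` is a hypothetical callable returning a status and a simulated latency, and the retriable set and default limits are illustrative values, not any mesh's defaults.

```python
from typing import Callable, Tuple

RETRIABLE = {502, 503, 504}


def call_with_retries(
    upstream: Callable[[int], Tuple[int, float]],
    max_attempts: int = 3,
    per_try_timeout: float = 0.5,
    overall_timeout: float = 1.2,
) -> Tuple[int, int]:
    """Return (final_status, attempts_made).

    Time is simulated: latencies are added up instead of slept.
    """
    elapsed = 0.0
    status = 504
    attempts = 0
    for attempt in range(1, max_attempts + 1):
        remaining = overall_timeout - elapsed
        if remaining <= 0:
            break                              # overall budget exhausted
        attempts = attempt
        limit = min(per_try_timeout, remaining)
        status, latency = upstream(attempt)
        if latency > limit:
            status, latency = 504, limit       # attempt cut off by its timeout
        elapsed += latency
        if status not in RETRIABLE:
            break                              # success or non-retriable error
    return status, attempts


def flaky(attempt: int) -> Tuple[int, float]:
    return (503, 0.1) if attempt == 1 else (200, 0.1)


def slow(attempt: int) -> Tuple[int, float]:
    return (200, 2.0)                          # always slower than any limit


flaky_result = call_with_retries(flaky)
slow_result = call_with_retries(slow)
```

The per-try limit keeps one slow attempt from consuming the whole budget, and the overall limit bounds how long the caller can be held regardless of how many attempts are configured.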
Circuit breaking. Configured as connection-pool and outlier-detection limits on a cluster: cap concurrent connections/requests to an upstream, and eject an endpoint that returns errors past a threshold, temporarily removing it from load balancing. This stops a failing instance from absorbing traffic and prevents cascading overload.
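The two halves of circuit breaking, a concurrency cap and outlier ejection, can be sketched together. This is a simplified model with hypothetical names and thresholds: `ToyCluster` ejects on consecutive 5xx responses and omits the timed return of ejected endpoints that a real proxy performs.

```python
class ToyCluster:
    """Per-upstream limits as seen by one caller-side proxy."""

    def __init__(self, endpoints, max_requests=2, consecutive_5xx=3):
        self.max_requests = max_requests
        self.consecutive_5xx = consecutive_5xx
        self.in_flight = 0
        self.failures = {endpoint: 0 for endpoint in endpoints}
        self.ejected = set()

    def try_acquire(self) -> bool:
        """Connection-pool limit: refuse instead of queueing without bound."""
        if self.in_flight >= self.max_requests:
            return False
        self.in_flight += 1
        return True

    def release(self) -> None:
        self.in_flight -= 1

    def record(self, endpoint: str, status: int) -> None:
        """Outlier detection: eject after N consecutive 5xx responses."""
        if status >= 500:
            self.failures[endpoint] += 1
            if self.failures[endpoint] >= self.consecutive_5xx:
                self.ejected.add(endpoint)
        else:
            self.failures[endpoint] = 0        # any success resets the streak

    def healthy_endpoints(self) -> list:
        return [e for e in self.failures if e not in self.ejected]


cluster = ToyCluster(["10.0.0.1:8080", "10.0.0.2:8080"])
for _ in range(3):
    cluster.record("10.0.0.1:8080", 503)
```

After three consecutive failures the first endpoint drops out of `healthy_endpoints()`, so load balancing only considers the second.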
A caution that senior tiers develop: retries and timeouts interact. Naive retries at every hop multiply load during an incident. The mesh gives you the knobs; using them safely (budgets, jitter, per-try caps) is a discipline, not a default.
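The multiplication is easy to quantify for the worst case. The sketch assumes a linear call chain in which every hop fails and every caller uses all of its attempts; the function name is hypothetical.

```python
def worst_case_attempts(attempts_per_hop: int, hops: int) -> int:
    """Requests reaching the deepest service for one original request.

    Each caller makes up to `attempts_per_hop` tries, and every try
    fans out again at the next hop, so the counts multiply.
    """
    return attempts_per_hop ** hops


# Chain A -> B -> C -> D is three hops; each caller makes up to 3 attempts.
deepest_load = worst_case_attempts(attempts_per_hop=3, hops=3)
print(deepest_load)
```

With three attempts at each of three hops, service D can receive 27 requests for a single user request, at the moment it is least able to handle them.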
7. Telemetry for free
Because every request passes through proxies on both ends, the mesh is the ideal place to observe traffic uniformly, without instrumenting each service:
- Metrics: each proxy emits request count, error rate, and latency percentiles per source/destination pair — the "golden signals" for every edge in the call graph, in a consistent schema across all languages.
- Distributed tracing: proxies can start and propagate trace context (spans), giving you a request's path across services. One caveat that matters: the app must forward the incoming trace headers on its outbound calls, or the trace breaks at each hop. The proxy cannot correlate incoming and outgoing requests for you.
- Access logs: per-request logs from the proxy, including which route matched, whether a retry fired, and the response flags (e.g. why a request failed or why no healthy endpoint was available).
The result is a live topology of who-calls-whom with health data on every link — obtained by adding a proxy, not by editing services.
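The one piece the application must still do, forwarding trace headers, is small. The sketch below copies a fixed set of headers from an incoming request to an outgoing one; the header list covers W3C Trace Context and B3 names and is an assumption, since the exact set depends on the tracing backend in use.

```python
# Assumed header set: W3C Trace Context plus B3. Check your tracer's docs.
TRACE_HEADERS = (
    "x-request-id",
    "traceparent",
    "tracestate",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def forward_trace_headers(incoming: dict) -> dict:
    """Return only the trace headers, ready to attach to an outbound call.

    Header names are matched case-insensitively and emitted in lowercase.
    """
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


incoming_headers = {
    "Traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "X-Request-Id": "req-123",
    "Authorization": "Bearer secret",      # must not leak to the next hop
}
outbound_headers = forward_trace_headers(incoming_headers)
```

Only the correlation headers are copied; everything else on the incoming request, such as credentials, stays behind.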
8. Where this goes next
You now have the mechanics: two planes, a transparently injected sidecar, the request path with its mTLS tunnel and proxy-owned retry, and xDS as the config pipeline. The senior tier weighs the costs this machinery imposes — the added latency and resource overhead of two proxy hops per request, the operational burden of running and upgrading the control plane, the debugging complexity of a transparent proxy in the path — and covers the alternatives (proxyless/gRPC, ambient/per-node data planes) and when a mesh is not worth it.
Canonical references: istio.io, linkerd.io, envoyproxy.io.
Next step: Service Mesh — Senior