Ztunnel (L4)
Set up and test basic L4 load balancing and failover with ztunnel in an ambient mesh.
About this guide
In an ambient mesh, ztunnel handles Layer 4 (L4) load balancing for in-mesh traffic. When traffic flows between services without a waypoint in the path, ztunnel distributes connections across available backend endpoints using round-robin load balancing.
This guide shows you how to deploy a service with multiple replicas, observe the default load balancing behavior, and test failover when an endpoint becomes unavailable. For conceptual information about how L4 load balancing works with ztunnel, see the load balancing and failover overview.
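To make the round-robin distribution described above concrete, the following shell sketch maps the Nth new connection onto a fixed list of endpoints. It is a simplified illustration, not ztunnel's implementation: the `pick_endpoint` function is hypothetical, and the endpoint names are the example pod names that appear later in this guide.

```shell
# Illustrative only: a fixed endpoint list, using this guide's example pod names.
ENDPOINTS="in-ambient-7d8f9b6c54-abc12 in-ambient-7d8f9b6c54-def34 in-ambient-7d8f9b6c54-ghi56"

# pick_endpoint N: print the endpoint that the Nth connection (0-based) maps to.
pick_endpoint() {
  n=$1
  set -- $ENDPOINTS        # load the endpoints as positional parameters
  shift $(( n % $# ))      # wrap around after the last endpoint
  printf '%s\n' "$1"
}
```

For example, `pick_endpoint 0` and `pick_endpoint 3` both select the first endpoint, because the fourth connection wraps back to the start of the list.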
Before you begin
Set up an ambient mesh in one cluster by using the Gloo Operator or Helm.
Step 1: Deploy sample apps
Deploy a client app and a backend service with multiple replicas to test L4 load balancing.
Traffic flow: The following diagram shows how traffic flows from the client through ztunnel to the backend service replicas. ztunnel performs L4 (TCP) load balancing and distributes connections across all available endpoints using round-robin.
graph LR
Client[client-in-ambient] -->|1. Request| Ztunnel[ztunnel<br/>L4 load balancer]
Ztunnel -->|2. Round-robin| Backend1[in-ambient pod 1]
Ztunnel -->|2. Round-robin| Backend2[in-ambient pod 2]
Ztunnel -->|2. Round-robin| Backend3[in-ambient pod 3]
Deploy the in-ambient httpbin sample app. This manifest creates the httpbin namespace with an in-ambient backend service. The app is already labeled for inclusion in the ambient mesh with istio.io/dataplane-mode: ambient.
kubectl apply -f https://raw.githubusercontent.com/solo-io/doc-examples/main/istio/sample-apps/in-ambient.yaml
Deploy the client-in-ambient app in the same namespace. This app is also already labeled for the ambient mesh.
kubectl apply -f https://raw.githubusercontent.com/solo-io/doc-examples/main/istio/sample-apps/client-in-ambient.yaml
Scale the in-ambient deployment to 3 replicas so that you can observe load balancing across multiple endpoints.
kubectl scale deployment in-ambient -n httpbin --replicas=3
Verify that all pods are running.
kubectl get pods -n httpbin
Example output, in which in-ambient runs as 3 replicas:
NAME                                 READY   STATUS    RESTARTS   AGE
client-in-ambient-6b5c96c4f8-x2j9k   1/1     Running   0          30s
in-ambient-7d8f9b6c54-abc12          1/1     Running   0          45s
in-ambient-7d8f9b6c54-def34          1/1     Running   0          20s
in-ambient-7d8f9b6c54-ghi56          1/1     Running   0          20s
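If you prefer a scripted check over reading the pod list by eye, a small helper can count the ready replicas. The following is a minimal sketch: `count_running` is a hypothetical helper name, and it assumes the default `kubectl get pods` column layout shown in the example output (NAME, READY, STATUS).

```shell
# count_running PREFIX: read `kubectl get pods` output on stdin and print how
# many pods whose name starts with PREFIX are fully ready and Running.
count_running() {
  awk -v p="$1" '
    NR > 1 && index($1, p) == 1 && $3 == "Running" {
      split($2, r, "/")            # READY column, for example 1/1
      if (r[1] == r[2]) n++
    }
    END { print n + 0 }'
}

# Usage against a live cluster (expected to print 3 after the scale-up):
#   kubectl get pods -n httpbin | count_running in-ambient-
```

The trailing hyphen in the prefix keeps the client-in-ambient pod out of the count. As an alternative, `kubectl rollout status deployment/in-ambient -n httpbin` blocks until the rollout completes.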
Step 2: Test L4 load balancing
Send requests from the client to the backend service to observe the default round-robin load balancing behavior.
Send multiple curl requests from the client to the in-ambient service. The /hostname endpoint returns the pod hostname of the replica that handled the request. Verify that the requests are distributed across all three replicas in a round-robin pattern.
kubectl exec -n httpbin deploy/client-in-ambient -- sh -c "
for i in \$(seq 1 12); do
  curl -s http://in-ambient:8000/hostname
done"
Example output:
in-ambient-7d8f9b6c54-abc12
in-ambient-7d8f9b6c54-def34
in-ambient-7d8f9b6c54-ghi56
in-ambient-7d8f9b6c54-abc12
...
Review the ztunnel logs to verify traffic flow. The connection events show traffic from the client to the backend pods.
kubectl logs -n istio-system -l app=ztunnel --tail=20 | grep "in-ambient"
Example output:
2025-03-06T16:52:48.095517Z info access connection complete src.addr=10.10.0.14:40292 src.workload="client-in-ambient-6b5c96c4f8-x2j9k" src.namespace="httpbin" src.identity="spiffe://cluster.local/ns/httpbin/sa/client-in-ambient" dst.addr=10.10.0.15:8080 dst.service="in-ambient.httpbin.svc.cluster.local" dst.workload="in-ambient-7d8f9b6c54-abc12" dst.namespace="httpbin" direction="outbound" bytes_sent=78 bytes_recv=45 duration="12ms"
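The access log lines can also be tallied per destination pod to cross-check the distribution that the client observed. The following sketch assumes the log format shown in the example output, in particular the dst.workload="..." field; `count_dst_workloads` is a hypothetical helper name.

```shell
# count_dst_workloads: read ztunnel access log lines on stdin and print the
# number of logged connections per destination workload as "COUNT NAME".
count_dst_workloads() {
  grep -o 'dst\.workload="[^"]*"' |   # keep only the destination field
    cut -d'"' -f2 |                   # extract the quoted workload name
    LC_ALL=C sort | uniq -c |
    awk '{ print $1, $2 }'            # normalize the uniq -c padding
}

# Usage against a live cluster:
#   kubectl logs -n istio-system -l app=ztunnel --tail=200 | count_dst_workloads
```

Because each node's ztunnel logs its own connections, the counts reflect only the log lines that the `--tail` window returns, so treat them as a rough check rather than an exact request count.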
Step 3: Test L4 failover
Test how ztunnel handles failover when a backend endpoint becomes unavailable.
Failover behavior: The following diagram shows how ztunnel handles failover when one backend endpoint fails. Traffic automatically redistributes to the remaining healthy endpoints.
graph LR
Client[client-in-ambient] -->|Request| Ztunnel[ztunnel<br/>L4 load balancer]
Ztunnel -->|Traffic| Backend1[in-ambient-...-abc12<br/>✓ Healthy]
Ztunnel -->|Traffic| Backend2[in-ambient-...-def34<br/>✓ Healthy]
Ztunnel -.->|No traffic| Backend3[in-ambient-...-ghi56<br/>✗ Failed]
Get one of the in-ambient pod names to simulate a failure.
FAILING_POD=$(kubectl get pods -n httpbin -l app=in-ambient -o jsonpath='{.items[0].metadata.name}')
echo "Will make pod unavailable: $FAILING_POD"
Make the pod unavailable by blocking incoming traffic on port 8080. This simulates a pod that becomes unresponsive due to network issues or process hangs.
kubectl exec -n httpbin $FAILING_POD -- sh -c "apt-get update -qq && apt-get install -y -qq iptables > /dev/null 2>&1 && iptables -A INPUT -p tcp --dport 8080 -j DROP"
Immediately send requests to observe the failover behavior. During the initial detection window, you might see some failed requests as ztunnel detects the unhealthy endpoint. After detection completes, traffic is distributed only among the remaining healthy endpoints.
kubectl exec -n httpbin deploy/client-in-ambient -- sh -c "
for i in \$(seq 1 12); do
  curl -s --max-time 2 http://in-ambient:8000/hostname || echo 'request failed'
done"
Example output showing the failover transition:
- The pod that you made unavailable (for example, -ghi56) is not shown in successful responses.
- Initial requests show request failed as ztunnel detects the unhealthy endpoint via TCP health checks.
- After detection completes, traffic flows only to the 2 healthy pods (for example, -abc12 and -def34).
in-ambient-7d8f9b6c54-abc12
request failed
in-ambient-7d8f9b6c54-def34
in-ambient-7d8f9b6c54-abc12
in-ambient-7d8f9b6c54-def34
request failed
in-ambient-7d8f9b6c54-abc12
in-ambient-7d8f9b6c54-def34
in-ambient-7d8f9b6c54-abc12
in-ambient-7d8f9b6c54-def34
in-ambient-7d8f9b6c54-abc12
in-ambient-7d8f9b6c54-def34
When ztunnel detects an unhealthy endpoint via L4 health checks (TCP), it closes the connection to that endpoint and opens a new TCP connection to a different endpoint. Detection can take some time, so you might see brief connection errors until failover completes. For faster failover with HTTP-aware detection, use a waypoint with a DestinationRule.
Restore the pod by removing the iptables rule. This simulates recovery of the unhealthy endpoint.
kubectl exec -n httpbin $FAILING_POD -- iptables -F
Send requests again to confirm that load balancing is restored across all three replicas.
kubectl exec -n httpbin deploy/client-in-ambient -- sh -c "
for i in \$(seq 1 9); do
  curl -s http://in-ambient:8000/hostname
done" | sort | uniq -c
Example output, in which the number at the beginning of each line is the count from uniq -c that shows how many requests each replica handled:
3 in-ambient-7d8f9b6c54-abc12
3 in-ambient-7d8f9b6c54-def34
3 in-ambient-7d8f9b6c54-ghi56
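The failover run earlier in this step mixes successful responses with request failed lines, which is harder to read than the restored output. The following sketch summarizes such a run. It assumes one response per line, as in the example output, and `summarize_failover` is a hypothetical helper name.

```shell
# summarize_failover: read the request loop output on stdin and print the
# number of failed requests, followed by the successful responses per pod.
summarize_failover() {
  awk '
    $0 == "request failed" { failed++; next }   # count curl failures
    NF                     { ok[$1]++ }         # count responses per pod
    END {
      printf "failed %d\n", failed
      for (pod in ok) printf "%s %d\n", pod, ok[pod]
    }' | LC_ALL=C sort                          # stable, readable ordering
}

# Usage: pipe the failover request loop from this step into the helper:
#   kubectl exec -n httpbin deploy/client-in-ambient -- sh -c "..." | summarize_failover
```

A pod that is absent from the summary received no successful requests during the run, which is what you expect for the pod that you made unavailable.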
Step 4 (optional): Observe ztunnel outlier detection
In the Solo distribution of Istio, ztunnel includes built-in outlier detection that helps identify and deprioritize unhealthy endpoints. This feature uses Exponentially Weighted Moving Average (EWMA) and circuit breaking to improve failover behavior. For more information about ztunnel outlier detection configuration options, including how to adjust EWMA and circuit breaker settings, see ztunnel outlier detection.
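As background, an exponentially weighted moving average blends each new observation into a running score, so recent results count more than older ones. The following sketch shows the general formula only, score = alpha * sample + (1 - alpha) * previous. The smoothing factor 0.35 and the encoding of a failure as a sample of 0 are assumptions chosen for illustration, not documented ztunnel settings, and `ewma` is a hypothetical helper name.

```shell
# ewma PREV SAMPLE ALPHA: print alpha*SAMPLE + (1-alpha)*PREV to two decimals.
# ALPHA and the sample encoding are illustrative assumptions, not ztunnel defaults.
ewma() {
  awk -v prev="$1" -v x="$2" -v a="$3" \
    'BEGIN { printf "%.2f\n", a * x + (1 - a) * prev }'
}

# One failed connection (sample 0) against a healthy score of 1.0:
#   ewma 1.0 0 0.35
# A second consecutive failure lowers the score further:
#   ewma 0.65 0 0.35
```

With these assumed values, each failure multiplies the score by 0.65, so the score drops quickly under repeated failures and recovers gradually as successful samples arrive.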
Review the ztunnel logs for outlier detection activity.
kubectl logs -n istio-system -l app=ztunnel --tail=50 | grep -i "health\|ewma\|circuit"
Example output showing detection of the unhealthy pod:
- health_check shows connection failures to the pod you made unavailable (for example, ghi56).
- ewma shows the health score decreasing as failures are detected.
- circuit_breaker shows the endpoint being marked unhealthy and deprioritized.
2025-03-11T18:32:15.421Z warn health_check connection failed to endpoint
dst.addr=10.10.0.15:8080 dst.workload="in-ambient-7d8f9b6c54-ghi56"
dst.service="in-ambient.httpbin.svc.cluster.local" error="connection timeout"
2025-03-11T18:32:15.422Z info ewma updating endpoint health score
dst.workload="in-ambient-7d8f9b6c54-ghi56" previous_score=1.0 new_score=0.65
2025-03-11T18:32:16.103Z warn circuit_breaker endpoint marked unhealthy
dst.workload="in-ambient-7d8f9b6c54-ghi56" consecutive_failures=3 status="open"
Cleanup
You can optionally remove the resources that you created in this guide. If you want to continue to the other load balancing and failover guides, you can keep the namespace and apps for use in those guides as well.
kubectl delete namespace httpbin
Next steps
- For HTTP-aware failover with outlier detection, see Waypoints (L7).
- For multicluster load balancing and failover, see Multicluster zone and region failover.
- For information about ztunnel outlier detection settings, see ztunnel outlier detection.