Load balancing and consistent hashing
Decide how to load balance incoming requests to an upstream service and enforce sticky sessions.
About simple load balancing
Gloo Mesh Gateway supports multiple load balancing algorithms for selecting upstream services to forward incoming requests to. By default, Gloo Mesh Gateway forwards incoming requests to the instance with the least requests. You can change this behavior and instead use a round robin algorithm to forward the request to an upstream. For more information about available load balancing options, see Configure load balancer policies.
To configure simple load balancing for incoming requests, you use the spec.config.simple setting in the load balancer policy. To learn more about this setting, see the Istio Destination Rule documentation.
About session affinity and consistent hashing
Session affinity, also referred to as sticky session, allows you to route requests for a particular session to the same upstream service instance that served the initial request. This setup is particularly useful if you have an upstream service that performs expensive operations and caches the output or data for subsequent requests. With session affinity, you make sure that the expensive operation is performed once and that subsequent requests can be served from the upstream’s cache, which can significantly improve operational cost and response times for your clients.
The load balancer policy allows you to set up soft session affinity between a client and an upstream service by using a consistent hashing algorithm based on HTTP headers, cookies, or other properties, such as the source IP address or a query parameter. Ringhash and MagLev hash algorithms are also supported. For example, if you have 3 upstream hosts that can serve the request and you use consistent hashing based on headers or cookies, each host is hashed with the header or the cookie that the client provides. If a subsequent request uses the same header or cookie, the hash values are the same and the request is forwarded to the same upstream host that served the initial request. To configure consistent hashing, you use the spec.config.consistentHash setting in the load balancer policy.
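As a sketch of what a cookie-based configuration could look like, the following snippet writes a policy manifest to a local file. The httpCookie field with its name and ttl sub-fields is assumed to mirror the Istio destination rule setting of the same name, and the policy name and cookie name are hypothetical. Check the load balancer policy API reference before you apply it.

```shell
# Write a hypothetical cookie-based policy to a local file for review.
# Assumption: consistentHash.httpCookie mirrors the Istio field of the same name.
cat > loadbalancer-cookie-policy.yaml <<'EOF'
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  name: loadbalancer-cookie-policy   # hypothetical name
  namespace: bookinfo
spec:
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: reviews
  config:
    consistentHash:
      httpCookie:
        name: session-cookie         # hypothetical cookie name
        ttl: 10s                     # lifetime of the generated cookie
EOF
```

After you review the file, you could apply it with kubectl apply -f loadbalancer-cookie-policy.yaml.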
Consistent hashing is less reliable than a common sticky session implementation, in which the upstream service is encoded in a cookie and affinity can be maintained for as long as the upstream service is available. With consistent hashing, affinity might be lost when an upstream service is added or removed.
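To see why affinity can be lost, consider this simplified sketch. It uses plain modulo hashing with cksum rather than the ring hash or Maglev algorithms that the proxy uses, and the pick_host function is hypothetical. The same value always maps to the same host index while the host count is unchanged, but the index can change when a host is added or removed. Ring hash and Maglev reduce how many keys move in that case, but do not eliminate the effect.

```shell
# pick_host VALUE HOST_COUNT
# Prints the 0-based index of the host that a header or cookie value maps to.
# Simplification: hash modulo host count, not a real ring hash or Maglev table.
pick_host() {
  value=$1
  host_count=$2
  # cksum prints "<crc> <byte count>"; keep only the CRC.
  hash=$(printf '%s' "$value" | cksum | cut -d' ' -f1)
  echo $(( hash % host_count ))
}

# Same value, same host count: the index is stable across calls.
pick_host "x-user=me" 3
pick_host "x-user=me" 3

# After a fourth host is added, the same value might map to a different index.
pick_host "x-user=me" 4
```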
If you configured locality-based routing, such as with a failover and outlier detection policy, you can use consistent hashing only if all endpoints are in the same locality. If your services are spread across localities, consistent hashing might not work, as session affinity from or to unknown endpoints cannot be created.
When using consistent hashing for virtual destinations in a multicluster setup, you must set spec.clientMode.tlsTermination to {} on the virtual destination to ensure proper hashing through the east-west gateway. The gateway is the only component that is aware of the upstream service instances that can fulfill the request. When terminating TLS traffic at the gateway, the gateway can access the cookie and headers that were used to build the consistent hash, and forward the request to the correct upstream service.
Keep in mind the following considerations when enabling TLS termination on a virtual destination:
- With TLS termination enabled, the gateway allows traffic to be forwarded to both mTLS and non-mTLS workloads. For mTLS workloads, a new TLS connection is established with the destination before traffic is forwarded. However, unencrypted traffic is forwarded to non-mTLS workloads.
- The SPIFFE ID of the request changes from the client to the east-west gateway. An Istio authorization policy is automatically created that allows traffic from the east-west gateway. This change can impact metrics collection, because you might not be able to determine the client that a request came from.
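A virtual destination with TLS termination enabled might look like the following sketch, which writes the manifest to a local file. Only the clientMode.tlsTermination setting comes from this guide. The resource name, host, port, and service selector are hypothetical values for the Bookinfo reviews app, and the remaining fields are assumed from the virtual destination API, so verify them against the API reference.

```shell
# Write a hypothetical virtual destination with TLS termination to a local file.
# Assumption: hosts, services, and ports follow the VirtualDestination API.
cat > reviews-virtual-destination.yaml <<'EOF'
apiVersion: networking.gloo.solo.io/v2
kind: VirtualDestination
metadata:
  name: reviews-vd                   # hypothetical name
  namespace: bookinfo
spec:
  clientMode:
    tlsTermination: {}               # terminate TLS at the east-west gateway
  hosts:
  - reviews.mesh.internal.com        # hypothetical internal host
  services:
  - labels:
      app: reviews
  ports:
  - number: 9080
    protocol: HTTP
EOF
```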
To learn more about this setting, see the Istio Destination Rule documentation.
Other load balancing settings
Learn about other load balancing options that you can set in the load balancer policy.
All settings in this section can be set only in conjunction with a simple load balancing mode or consistent hash algorithm.
Healthy panic threshold
By default, Gloo Mesh Gateway only considers services that are healthy and available when load balancing incoming requests among upstream services. If the number of healthy upstream services becomes too low, you can instruct Gloo Mesh Gateway to disregard the upstream health status and either load balance requests among all or no hosts by using the healthy_panic_threshold setting. If not set, the threshold defaults to 50%. To disable panic mode, set this field to 0.
To learn more about this setting and when to use it, see the Envoy documentation.
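The panic decision can be pictured with the following simplified sketch. The in_panic function is hypothetical and only models the comparison that the Envoy documentation describes: panic mode starts when the percentage of healthy hosts drops below the threshold.

```shell
# in_panic HEALTHY TOTAL THRESHOLD_PERCENT
# Prints "panic" when the healthy percentage is below the threshold,
# otherwise "normal". Integer math avoids floating point in the shell.
in_panic() {
  healthy=$1
  total=$2
  threshold=$3
  if [ $(( healthy * 100 )) -lt $(( threshold * total )) ]; then
    echo panic
  else
    echo normal
  fi
}

in_panic 2 5 50   # 40% healthy, default 50% threshold: prints "panic"
in_panic 3 5 50   # 60% healthy: prints "normal"
in_panic 0 5 0    # a threshold of 0 never triggers panic mode: prints "normal"
```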
Update merge window
Sometimes, your deployments might have health checks and metadata updates that use a lot of CPU and memory. In such cases, you can use the update_merge_window setting. This way, Gloo Mesh Gateway merges all updates together within a specific timeframe. For more information about this setting, see the Envoy documentation. If not set, the update merge window defaults to 1000ms. To disable the update merge window, set this field to 0s.
Warm up duration
If you have new upstream services that need time to get ready for traffic, use the warmupDurationSecs setting. This way, Gloo Mesh Gateway gradually increases the amount of traffic for the service. This setting is effective in scaling events, such as when new replicas are added to handle increased load. However, if all services start at the same time, this setting might not be as effective as all endpoints receive the same amount of requests.
Note that the warmupDurationSecs field can only be set if the load balancing mode (spec.config.simple) is set to ROUND_ROBIN or LEAST_REQUEST.
To learn more about this setting, see the Istio Destination Rule documentation.
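The settings in this section could be combined in one policy, as in the following sketch that writes the manifest to a local file. The camelCase spelling healthyPanicThreshold is an assumption based on how updateMergeWindow and warmupDurationSecs are spelled in the policy, and the policy name and all values are hypothetical.

```shell
# Write a hypothetical policy that combines the additional settings.
# Assumption: healthyPanicThreshold is the policy field for the panic threshold.
cat > loadbalancer-settings-policy.yaml <<'EOF'
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  name: loadbalancer-settings-policy   # hypothetical name
  namespace: bookinfo
spec:
  applyToDestinations:
  - selector:
      labels:
        app: reviews
  config:
    simple: ROUND_ROBIN                # warm-up requires ROUND_ROBIN or LEAST_REQUEST
    healthyPanicThreshold: 30          # percent of healthy hosts (assumed field name)
    updateMergeWindow: 50s
    warmupDurationSecs: 10s
EOF
```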
Before you begin
This guide assumes that you use the same names for components like clusters, workspaces, and namespaces as in the getting started guide. If you have different names, make sure to update the sample configuration files in this guide.
- Set up Gloo Mesh Gateway in a single cluster.
- Install Bookinfo and other sample apps.
- Configure an HTTP listener on your gateway and set up basic routing for the sample apps.
Configure load balancer policies
You can apply a load balancer policy at the destination level. For more information, see the following resources:
When you apply this custom resource to your cluster, Gloo Mesh Gateway automatically checks the configuration against validation rules and value constraints. You can also run a pre-admission validation check by using the meshctl x validate resources command. For more information, see the resource validation overview and the CLI command reference.
Verify load balancer policies
Create a simple load balancer policy that uses round robin to select an upstream service.
kubectl apply --context $REMOTE_CONTEXT1 -f- <<EOF
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: loadbalancer-policy
  namespace: bookinfo
spec:
  applyToDestinations:
  - selector:
      labels:
        app: reviews
  config:
    simple: ROUND_ROBIN
    updateMergeWindow: 50s
EOF
Get the Istio destination rule and Envoy filter that were created for you.
kubectl get destinationrule -n bookinfo -o yaml
kubectl get envoyfilter -n bookinfo -o yaml
Verify that you can see the round robin load balancing algorithm in the Istio destination rule and the update merge window setting in the Envoy filter.
Example output for the Istio destination rule:
...
spec:
  exportTo:
  - .
  host: reviews.bookinfo.svc.cluster.local
  trafficPolicy:
    portLevelSettings:
    - loadBalancer:
        simple: ROUND_ROBIN
      port:
        number: 9080
Example output for the Envoy filter:
...
spec:
  configPatches:
  - applyTo: CLUSTER
    match:
      cluster:
        portNumber: 9080
        service: reviews.bookinfo.svc.cluster.local
    patch:
      operation: MERGE
      value:
        commonLbConfig:
          updateMergeWindow: 50s
Send multiple requests to the reviews app. In your CLI output, make sure that you get back a response from each of the reviews versions.
- HTTP:
curl -vik --resolve www.example.com:80:${INGRESS_GW_ADDRESS} http://www.example.com:80/reviews/1
- HTTPS:
curl -vik --resolve www.example.com:443:${INGRESS_GW_ADDRESS} https://www.example.com:443/reviews/1
Example output:
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
...
<
* Connection #0 to host www.example.com left intact
{"id": "1","podname": "reviews-v2-cdd8fb88b-p74k5","clustername": "null","reviews": [{ "reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "black"}},{ "reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "black"}}]}
{"id": "1","podname": "reviews-v1-777df99c6d-xhwjg","clustername": "null","reviews": [{ "reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!"},{ "reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare."}]}
{"id": "1","podname": "reviews-v3-58b6479b-p476q","clustername": "null","reviews": [{ "reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "red"}},{ "reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "red"}}]}
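To tally which versions answered, you could pipe the response bodies through a small helper such as the following sketch. The tally_versions function is hypothetical; it extracts the podname field from each response body and counts the responses per reviews version.

```shell
# tally_versions: reads response bodies on stdin and prints "<version> <count>"
# for each reviews version found in a "podname" field.
tally_versions() {
  grep -o '"podname": *"[^"]*"' |
    sed 's/.*"reviews-\(v[0-9]*\)-.*/\1/' |
    sort | uniq -c | awk '{print $2, $1}'
}

# Example usage against the gateway from this guide (not run here):
# for i in $(seq 1 10); do
#   curl -sk --resolve www.example.com:80:${INGRESS_GW_ADDRESS} http://www.example.com:80/reviews/1
#   echo
# done | tally_versions
```

With round robin load balancing, each version is expected to show up with a roughly equal count.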
Modify your load balancer policy to set up consistent hashing based on an HTTP header.
kubectl apply -f- <<EOF
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: loadbalancer-policy
  namespace: bookinfo
spec:
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: reviews
  config:
    consistentHash:
      httpHeaderName: x-user
EOF
Send a few more requests to the app. This time, you provide the x-user header as part of your request. The header value, such as me in the following commands, can be any string as long as it stays the same across requests. Note that you get back a response from the same versioned pod, such as v2.
- HTTP:
curl -vik --resolve www.example.com:80:${INGRESS_GW_ADDRESS} http://www.example.com:80/reviews/1 -H "x-user: me"
- HTTPS:
curl -vik --resolve www.example.com:443:${INGRESS_GW_ADDRESS} https://www.example.com:443/reviews/1 -H "x-user: me"
Example output:
* Connection #0 to host www.example.com left intact
{"id": "1","podname": "reviews-v2-cdd8fb88b-p74k5","clustername": "null","reviews": [{ "reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "black"}},{ "reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "black"}}]}
Remove the x-user header from your requests and verify that you now get back responses from all the versioned pods again in your workload cluster.
- HTTP:
curl -vik --resolve www.example.com:80:${INGRESS_GW_ADDRESS} http://www.example.com:80/reviews/1
- HTTPS:
curl -vik --resolve www.example.com:443:${INGRESS_GW_ADDRESS} https://www.example.com:443/reviews/1
Example output:
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
...
<
* Connection #0 to host www.example.com left intact
{"id": "1","podname": "reviews-v2-cdd8fb88b-p74k5","clustername": "null","reviews": [{ "reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "black"}},{ "reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "black"}}]}
{"id": "1","podname": "reviews-v1-777df99c6d-xhwjg","clustername": "null","reviews": [{ "reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!"},{ "reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare."}]}
{"id": "1","podname": "reviews-v3-58b6479b-p476q","clustername": "null","reviews": [{ "reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "red"}},{ "reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "red"}}]}
Cleanup
You can optionally remove the resources that you set up as part of this guide.
kubectl delete loadbalancerpolicy loadbalancer-policy -n bookinfo