Load balancer and consistent hash

Specify how you want the ingress gateway to select an upstream service to serve an incoming client request.

For more information, see the following sections.

About simple load balancing

Gloo Gateway supports multiple load balancing algorithms for selecting upstream services to forward incoming requests to. By default, the gateway forwards incoming requests to the instance with the least requests. You can change this behavior and instead use a round robin algorithm to forward the request to an upstream. For more information about available load balancing options, see Configure load balancer policies.

To configure simple load balancing for incoming requests, you use the spec.config.simple setting in the load balancer policy. To learn more about this setting, see the Istio Destination Rule documentation.
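
For reference, the relevant excerpt of a load balancer policy spec looks like the following. Full examples are shown in Configure load balancer policies later in this guide.

spec:
  config:
    # Simple load balancing mode, such as ROUND_ROBIN, RANDOM, LEAST_REQUEST, or PASSTHROUGH
    simple: ROUND_ROBIN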

About session affinity and consistent hashing

Session affinity, also referred to as sticky sessions, allows you to route requests for a particular session to the same upstream service instance that served the initial request. This setup is particularly useful if you have an upstream service that performs expensive operations and caches the output or data for subsequent requests. With session affinity, you make sure that the expensive operation is performed only once and that subsequent requests can be served from the upstream's cache, which can significantly reduce operational costs and improve response times for your clients.

The load balancer policy allows you to set up soft session affinity between a client and an upstream service by using a consistent hashing algorithm that is based on HTTP headers, cookies, or other properties, such as the source IP address or a query parameter. Ring hash and Maglev hashing algorithms are also supported. For example, if you have 3 upstream hosts that can serve the request and you use consistent hashing based on headers or cookies, the header or cookie that the client provides is hashed, and the hash value determines the upstream host that serves the request. If a subsequent request uses the same header or cookie, the hash value is the same and the request is forwarded to the same upstream host that served the initial request. To configure consistent hashing, you use the spec.config.consistentHash setting in the load balancer policy.
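
The examples later in this guide show cookie-based and header-based hashing. As a sketch of another variant, the following excerpt hashes on the client source IP address. The useSourceIp field name is taken from the Istio ConsistentHashLB API and is an assumption here; verify it against your load balancer policy reference before you use it.

spec:
  config:
    consistentHash:
      # Assumption: the policy mirrors the Istio ConsistentHashLB field for source IP hashing
      useSourceIp: true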

Consistent hashing is less reliable than a common sticky session implementation, in which the upstream service is encoded in a cookie and affinity can be maintained for as long as the upstream service is available. With consistent hashing, affinity might be lost when an upstream service is added or removed.

If you configured locality-based routing, such as with a failover and outlier detection policy, you can use consistent hashing only if all endpoints are in the same locality. If your services are spread across localities, consistent hashing might not work, because session affinity cannot be established from or to endpoints that are outside the locality.

To learn more about this setting, see the Istio Destination Rule documentation.

Other load balancing settings

Learn about other load balancing options that you can set in the load balancer policy.

All settings in this section can be set only in conjunction with a simple load balancing mode or consistent hash algorithm.

Healthy panic threshold

By default, the gateway considers only upstream services that are healthy and available when it load balances incoming requests. If the number of healthy upstream services becomes too low, you can instruct your gateway to disregard the upstream health status and either load balance requests among all or no hosts by using the healthy_panic_threshold setting (healthyPanicThreshold in the load balancer policy). If not set, the threshold defaults to 50%. To disable panic mode, set this field to 0.
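
For example, the following excerpt disables panic mode by setting the threshold to 0. A full policy example is shown in Configure load balancer policies later in this guide.

spec:
  config:
    # 0 disables panic mode; if not set, the threshold defaults to 50%
    healthyPanicThreshold: 0
    simple: ROUND_ROBIN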

To learn more about this setting and when to use it, see the Envoy documentation.

Update merge window

Deployments with frequent health check, weight, or metadata updates can cause the gateway to use a lot of CPU and memory, because each update must be processed individually. In such cases, you can use the update_merge_window setting (updateMergeWindow in the load balancer policy) so that the gateway collects all updates within a specific timeframe and applies them as one update at the end of the window. For more information about this setting, see the Envoy documentation. If not set, the update merge window defaults to 1000ms. To disable the update merge window, set this field to 0s.
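
For example, the following excerpt sets a 50s merge window, which matches the full policy example later in this guide. To disable update merging, set the field to 0s instead.

spec:
  config:
    # Collect and merge updates for 50 seconds before applying them as one update
    updateMergeWindow: 50s
    simple: ROUND_ROBIN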

Warm up duration

If you have new upstream services that need time to get ready to receive traffic, use the warmupDurationSecs setting. This way, the gateway gradually increases the amount of traffic that it sends to the service during the warm-up period. This setting is effective in scaling events, such as when new replicas are added to handle increased load. However, if all endpoints start at the same time, such as in a new deployment, the setting is less effective because all endpoints receive roughly the same amount of requests.

Note that the warmupDurationSecs field can only be set if the load balancing mode (spec.config.simple) is set to ROUND_ROBIN or LEAST_REQUEST.
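
For example, the following excerpt combines a 10-second warm-up duration with the required ROUND_ROBIN mode. A full policy example is shown in Configure load balancer policies later in this guide.

spec:
  config:
    # Gradually increase traffic to new endpoints over 10 seconds
    warmupDurationSecs: 10s
    # warmupDurationSecs requires ROUND_ROBIN or LEAST_REQUEST
    simple: ROUND_ROBIN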

To learn more about this setting, see the Istio Destination Rule documentation.

Before you begin

This guide assumes that you use the same names for components like clusters, workspaces, and namespaces as in the getting started guide, and that your Kubernetes context is set to the cluster where you store your Gloo configuration (typically the management cluster). If you use different names, make sure to update the sample configuration files in this guide.
Follow the getting started instructions to:

  1. Set up Gloo Gateway in a single cluster.
  2. Deploy sample apps.
  3. Configure an HTTP listener on your gateway and set up basic routing for the sample apps.
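
The verification steps later in this guide send requests to the ingress gateway by using the ${INGRESS_GW_IP} environment variable. If the variable is not set in your environment, you can look it up from your ingress gateway service. The following commands are a sketch that assumes a default gateway service named istio-ingressgateway in the gloo-mesh-gateways namespace; adjust the name and namespace to your setup.

# Assumption: default ingress gateway service name and namespace; adjust as needed
export INGRESS_GW_IP=$(kubectl get svc istio-ingressgateway -n gloo-mesh-gateways \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $INGRESS_GW_IP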

Configure load balancer policies

You can apply a load balancer policy at the destination level. For more information, see Applying policies.

The following load balancer policy randomly selects one of the available instances of the reviews service to serve an incoming request.

apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: simple-loadbalancer-policy
  namespace: bookinfo
spec:
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: reviews
  config:
    simple: RANDOM

Review the following table to understand this configuration.

Setting Description
spec.config.simple Specify how the ingress gateway selects an upstream service to serve an incoming request. Choose between the following options:
  • UNSPECIFIED: You do not specify a load balancing algorithm, and the gateway uses the default Istio load balancing algorithm, LEAST_REQUEST.
  • RANDOM: The gateway randomly selects a healthy upstream service instance. This setting is typically faster than ROUND_ROBIN if you did not configure a health checking policy.
  • PASSTHROUGH: The gateway does not load balance. Instead, it directly forwards the request to the upstream service that the client requested, such as by providing the IP address of the target destination in the request.
  • ROUND_ROBIN: The gateway forwards the request to each upstream service in turn. Note that this setting does not take into account the current load of an instance and can therefore lead to upstream services that must handle an excessive amount of incoming requests while others remain underutilized. To make sure that your hosts stay balanced, use the LEAST_REQUEST option instead.
  • LEAST_REQUEST: The gateway forwards requests to the upstream service with the least outstanding requests. This is the default setting in Gloo Gateway, and is typically faster than ROUND_ROBIN.

The following example sets up session affinity between a client and an upstream service by using a cookie.

apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: sticky-loadbalancer-policy
  namespace: bookinfo
spec:
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: reviews
  config:
    consistentHash:
      httpCookie:
        name: chocolate-chip
        ttl: 10s

Review the following table to understand this configuration.

Setting Description
spec.config.consistentHash.httpCookie Specify the cookie to use to create the hash value for the session between a client and an upstream service, which ensures session affinity in subsequent requests. In this example, the cookie is named chocolate-chip. The cookie is valid for 10 seconds (ttl). After it expires, a new host is selected that serves the incoming requests.
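
After you apply this policy, you can test the cookie-based affinity with curl by saving and replaying the cookie. The following commands are a sketch that reuses the www.example.com host and ${INGRESS_GW_IP} variable from the verification steps later in this guide; because a ttl is set, the gateway is expected to return the chocolate-chip cookie on the first response.

# First request: save the chocolate-chip cookie that is returned in the response
curl -vik -c cookies.txt --resolve www.example.com:80:${INGRESS_GW_IP} http://www.example.com:80/reviews/1

# Subsequent requests: replay the cookie so that the same reviews pod serves the request
curl -vik -b cookies.txt --resolve www.example.com:80:${INGRESS_GW_IP} http://www.example.com:80/reviews/1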

The following example sets up session affinity between a client and an upstream service by using HTTP headers.

apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: sticky-loadbalancer-policy
  namespace: bookinfo
spec:
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: reviews
  config:
    consistentHash:
      httpHeaderName: x-user

Review the following table to understand this configuration.

Setting Description
spec.config.consistentHash.httpHeaderName Specify the HTTP header to use to create the hash value for the session between a client and an upstream service, which ensures session affinity in subsequent requests. In this example, the x-user HTTP header is used.

The following example disables the healthy panic threshold for the reviews upstream service.

apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: simple-loadbalancer-policy
  namespace: bookinfo
spec:
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: reviews
  config:
    healthyPanicThreshold: 0
    simple: ROUND_ROBIN

Review the following table to understand this configuration.

Setting Description
spec.config.healthyPanicThreshold Specify the healthy panic threshold as a percentage. If the percentage of healthy upstream hosts falls below this threshold, the gateway disregards the health status and load balances requests among all or no hosts. If not set, this threshold defaults to 50%. To disable the healthy panic threshold, set this field to 0.

The following example collects and merges all health check updates during a 50s timeframe.

apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: simple-loadbalancer-policy
  namespace: bookinfo
spec:
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: reviews
  config:
    updateMergeWindow: 50s
    simple: ROUND_ROBIN

Review the following table to understand this configuration.

Setting Description
spec.config.updateMergeWindow Specify the duration that you want the gateway to collect and merge health check, weight, or metadata updates for an upstream service. When the duration ends, all updates that occurred during this timeframe are delivered as one update. If not set, the update merge window defaults to 1000ms. To disable the update merge window, set this field to 0s.

The following example configures the gateway to gradually increase traffic to newly created reviews instances over a 10-second warm-up period.

apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: LoadBalancerPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: simple-loadbalancer-policy
  namespace: bookinfo
spec:
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: reviews
  config:
    warmupDurationSecs: 10s
    simple: ROUND_ROBIN

Review the following table to understand this configuration.

Setting Description
spec.config.warmupDurationSecs Specify the warm-up duration for newly created upstream service instances. During this period, the gateway progressively increases the amount of traffic that it sends to the new instances. Note that the warmupDurationSecs field can only be set if the load balancing mode (spec.config.simple) is set to ROUND_ROBIN or LEAST_REQUEST.
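
To check that the warm-up setting is applied, you can review the Istio destination rule that is created for the policy, similar to the verification steps in the next section. The following output is a sketch of what the load balancer section might look like; the exact structure can differ by version.

kubectl get destinationrule -n bookinfo -o yaml

...
trafficPolicy:
  portLevelSettings:
  - loadBalancer:
      simple: ROUND_ROBIN
      warmupDurationSecs: 10s
    port:
      number: 9080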

Verify load balancer policies

  1. Create a simple load balancer policy that uses round robin to select the upstream reviews service.

    kubectl apply -f- <<EOF
    apiVersion: trafficcontrol.policy.gloo.solo.io/v2
    kind: LoadBalancerPolicy
    metadata:
      annotations:
        cluster.solo.io/cluster: ""
      name: loadbalancer-policy
      namespace: bookinfo
    spec:
      applyToDestinations:
      - port:
          number: 9080
        selector:
          labels:
            app: reviews
      config:
        simple: ROUND_ROBIN
        updateMergeWindow: 50s
    EOF
    
  2. Get the Istio destination rule and Envoy filter that were created for you.

    kubectl get destinationrule -n bookinfo -o yaml
    kubectl get envoyfilter -n bookinfo -o yaml
    
  3. Verify that you can see the round robin load balancing algorithm in the Istio destination rule and the update merge window setting in the Envoy filter.

    Example output for the Istio destination rule:

    ...
    spec:
      exportTo:
      - .
      host: reviews.bookinfo.svc.cluster.local
      trafficPolicy:
        portLevelSettings:
        - loadBalancer:
            simple: ROUND_ROBIN
          port:
            number: 9080
    

    Example output for the Envoy filter:

    ...
    spec:
      configPatches:
      - applyTo: CLUSTER
        match:
          cluster:
            portNumber: 9080
            service: reviews.bookinfo.svc.cluster.local
        patch:
          operation: MERGE
          value:
            commonLbConfig:
              updateMergeWindow: 50s
    
  4. Send multiple requests to the reviews app. Depending on whether you set up an HTTP or HTTPS listener, use the corresponding command. In your CLI output, make sure that you get back a response from each of the reviews versions.

    curl -vik --resolve www.example.com:80:${INGRESS_GW_IP} http://www.example.com:80/reviews/1
    
    curl -vik --resolve www.example.com:443:${INGRESS_GW_IP} https://www.example.com:443/reviews/1
    

    Example output:

    * Mark bundle as not supporting multiuse
    < HTTP/1.1 200 OK
    HTTP/1.1 200 OK
    ...
    
    < 
    * Connection #0 to host www.example.com left intact
    {"id": "1","podname": "reviews-v2-cdd8fb88b-p74k5","clustername": "null","reviews": [{  "reviewer": "Reviewer1",  "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "black"}},{  "reviewer": "Reviewer2",  "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "black"}}]}%
    
    {"id": "1","podname": "reviews-v1-777df99c6d-xhwjg","clustername": "null","reviews": [{  "reviewer": "Reviewer1",  "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!"},{  "reviewer": "Reviewer2",  "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare."}]}
    
    {"id": "1","podname": "reviews-v3-58b6479b-p476q","clustername": "null","reviews": [{  "reviewer": "Reviewer1",  "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "red"}},{  "reviewer": "Reviewer2",  "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "red"}}]}
    
  5. Modify your load balancer policy to set up consistent hashing based on an HTTP header.

    kubectl apply -f- <<EOF
    apiVersion: trafficcontrol.policy.gloo.solo.io/v2
    kind: LoadBalancerPolicy
    metadata:
      annotations:
        cluster.solo.io/cluster: ""
      name: loadbalancer-policy
      namespace: bookinfo
    spec:
      applyToDestinations:
      - port:
          number: 9080
        selector:
          labels:
            app: reviews
      config:
        consistentHash:
          httpHeaderName: x-user
    EOF
    
  6. Send a few requests to the reviews app with the x-user header that you specified in the load balancer policy. Note that this time, you get back a response from the same reviews pod, such as reviews-v2.

    curl -vik -H "x-user: me" --resolve www.example.com:80:${INGRESS_GW_IP} http://www.example.com:80/reviews/1
    
    curl -vik -H "x-user: me" --resolve www.example.com:443:${INGRESS_GW_IP} https://www.example.com:443/reviews/1
    

    Example output:

    * Connection #0 to host www.example.com left intact
    {"id": "1","podname": "reviews-v2-cdd8fb88b-p74k5","clustername": "null","reviews": [{  "reviewer": "Reviewer1",  "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "black"}},{  "reviewer": "Reviewer2",  "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "black"}}]}%       
    
  7. Remove the x-user header and verify that you now get back responses from all reviews pods again.

    curl -vik --resolve www.example.com:80:${INGRESS_GW_IP} http://www.example.com:80/reviews/1
    
    curl -vik --resolve www.example.com:443:${INGRESS_GW_IP} https://www.example.com:443/reviews/1
    

    Example output:

    * Connection #0 to host www.example.com left intact
    {"id": "1","podname": "reviews-v1-777df99c6d-tmds7","clustername": "null","reviews": [{  "reviewer": "Reviewer1",  "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!"},{  "reviewer": "Reviewer2",  "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare."}]}%    
    
    {"id": "1","podname": "reviews-v2-cdd8fb88b-p74k5","clustername": "null","reviews": [{  "reviewer": "Reviewer1",  "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "black"}},{  "reviewer": "Reviewer2",  "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "black"}}]}%
    
    {"id": "1","podname": "reviews-v3-58b6479b-p476q","clustername": "null","reviews": [{  "reviewer": "Reviewer1",  "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "red"}},{  "reviewer": "Reviewer2",  "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "red"}}]}
    

Cleanup

You can optionally remove the resources that you set up as part of this guide.
kubectl delete loadbalancerpolicy loadbalancer-policy -n bookinfo