Specify how many times, and for how long, the gateway tries to connect to an unresponsive backend service. Retries are commonly used alongside Timeouts so that requests can still succeed when an app is temporarily unavailable.

About request retries

A request retry setting defines how many times the gateway retries a failed request before the request is considered failed. This setting helps keep calls to your apps from failing when the apps are only temporarily unavailable. Retries can enhance your app’s availability by making sure that calls don’t fail permanently because of transient problems, such as a temporarily overloaded service or network.
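
The retry behavior described above can be illustrated with a small, purely conceptual client-side loop. In this guide the gateway performs the retries for you, so this sketch is not how Gloo Gateway implements them; the function names `retry_call` and `flaky_backend` are hypothetical and exist only for this example.

```shell
#!/usr/bin/env bash
# Conceptual sketch only: mirrors the retry idea in plain bash.
# All names in this block are hypothetical.

# retry_call MAX_RETRIES CMD [ARGS...]
# Runs CMD once, then retries it up to MAX_RETRIES more times while it fails.
# Returns 0 on the first success, or 1 if every attempt failed.
retry_call() {
  local max_retries=$1
  shift
  local retries=0
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$retries" -ge "$max_retries" ]; then
      return 1   # retry limit reached: the call is considered failed
    fi
    retries=$((retries + 1))
    # A real implementation would usually pause (back off) here before retrying.
  done
}

# flaky_backend simulates a transient problem: it fails twice, then succeeds.
calls=0
flaky_backend() {
  calls=$((calls + 1))
  [ "$calls" -ge 3 ]
}

retry_call 3 flaky_backend && echo "succeeded after $calls attempts"
```

With a limit of 3 retries, the simulated backend recovers on the third attempt, so the call succeeds instead of failing on the first transient error.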

Before you begin

  1. Follow the Get started guide to install Gloo Gateway.

  2. Follow the Sample app guide to create a gateway proxy with an HTTP listener and deploy the httpbin sample app.

  3. Get the external address of the gateway and save it in an environment variable.

  4. Important: Install the experimental channel of the Kubernetes Gateway API to use this feature.

      kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/experimental-install.yaml
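
For step 3 in the list above, the following is a minimal sketch of how you might save the gateway address. It assumes that the gateway proxy is exposed by a LoadBalancer Service named `http` in the `gloo-system` namespace, which matches the names used later in this guide, and the variable name `INGRESS_GW_ADDRESS` is an assumption chosen for illustration.

```shell
# Assumption: the gateway Service is named "http" in the "gloo-system" namespace.
# INGRESS_GW_ADDRESS is a hypothetical variable name.
export INGRESS_GW_ADDRESS=$(kubectl get svc -n gloo-system http \
  -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")

# Print the value to confirm that an address was assigned.
echo "$INGRESS_GW_ADDRESS"
```

If the output is empty, your cluster might not have assigned an external address to the Service yet.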
      

Step 1: Set up your environment for retries

To use retries, you need to install the experimental channel of the Kubernetes Gateway API. You can also set up two things that help you test retries: a sample app that can simulate a failure and an access log policy that tracks whether the request was retried.

  1. Install the experimental Kubernetes Gateway API CRDs.

      kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/experimental-install.yaml
      
  2. Install a sample app for which you can simulate a failure. The following example deploys the Bookinfo reviews app, which you later put to sleep to simulate the failure.

      kubectl apply -f- <<EOF
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: reviews
      namespace: default
      labels:
        app: reviews
        service: reviews
    spec:
      ports:
      - port: 9080
        name: http
      selector:
        app: reviews
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: bookinfo-reviews
      namespace: default
      labels:
        account: reviews
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: reviews-v1
      namespace: default
      labels:
        app: reviews
        version: v1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: reviews
          version: v1
      template:
        metadata:
          labels:
            app: reviews
            version: v1
        spec:
          serviceAccountName: bookinfo-reviews
          containers:
          - name: reviews
            image: docker.io/istio/examples-bookinfo-reviews-v1:1.20.3
            imagePullPolicy: IfNotPresent
            env:
            - name: LOG_DIR
              value: "/tmp/logs"
            ports:
            - containerPort: 9080
            volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: wlp-output
              mountPath: /opt/ibm/wlp/output
          volumes:
          - name: wlp-output
            emptyDir: {}
          - name: tmp
            emptyDir: {}   
    EOF
      
  3. Apply an access log policy to the gateway that tracks the number of retries. The key field in the following example is response_flags, which is used to verify that the request was retried. For more information, see the Access logging guide and the Envoy access logs response flags docs.

      kubectl apply -f- <<EOF
    apiVersion: gateway.kgateway.dev/v1alpha1
    kind: HTTPListenerPolicy
    metadata:
      name: access-logs
      namespace: gloo-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: Gateway
        name: http
      accessLog:
      - fileSink:
          path: /dev/stdout
          jsonFormat:
            start_time: "%START_TIME%"
            method: "%REQ(:METHOD)%"
            path: "%REQ(:PATH)%"
            response_code: "%RESPONSE_CODE%"
            response_flags: "%RESPONSE_FLAGS%"
            upstream_host: "%UPSTREAM_HOST%"
            upstream_cluster: "%UPSTREAM_CLUSTER%"
    EOF
      

Step 2: Set up request retries

Set up retries to the reviews app.
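
As a reference for the HTTPRoute in the first step, the following is a hedged sketch that uses the experimental `retry` field of the Kubernetes Gateway API. The resource name, namespace, hostname, backend, retry count, and request timeout are inferred from the example output and cleanup commands in this guide. The exact manifest that produced that output might differ, for example by also setting `codes` or `backoff`, so treat this as an assumption rather than the guide's original configuration.

```shell
kubectl apply -f- <<EOF
# Sketch only: values are inferred from the example output in this guide.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: retry
  namespace: default
spec:
  hostnames:
  - retry.example
  parentRefs:
  # Assumption: the Gateway is named "http" in the "gloo-system" namespace.
  - name: http
    namespace: gloo-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: reviews
      port: 9080
    # Experimental Gateway API field: retry a failed request up to 3 times.
    retry:
      attempts: 3
    # Overall time limit for the request, including any retries.
    timeouts:
      request: 20s
EOF
```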

  1. Create an HTTPRoute resource to specify your retry rules. You can apply the retry policy on an HTTPRoute, HTTPRoute rule, or Gateway listener.

  2. Verify that the gateway proxy is configured to retry the request.

    1. Port-forward the gateway proxy on port 19000.

        kubectl port-forward deployment/http -n gloo-system 19000
        
    2. Get the configuration of your gateway proxy as a config dump.

        curl -X POST 127.0.0.1:19000/config_dump\?include_eds > gateway-config.json
        
    3. Open the config dump and find the route configuration for the kube_default_reviews_9080 Envoy cluster on the listener~8080~retry_example virtual host. Verify that the retry policy is set as you configured it.

      Example jq command:

        jq '.configs[] | select(."@type" == "type.googleapis.com/envoy.admin.v3.RoutesConfigDump") | .dynamic_route_configs[].route_config.virtual_hosts[] | select(.routes[].route.cluster == "kube_default_reviews_9080")' gateway-config.json
        

      Example output:

        {
        "name": "listener~8080~retry_example",
        "domains": [
          "retry.example"
        ],
        "routes": [
          {
            "match": {
              "prefix": "/"
            },
            "route": {
              "cluster": "kube_default_reviews_9080",
              "timeout": "20s",
              "retry_policy": {
                "retry_on": "gateway-error,connect-failure,reset",
                "num_retries": 3,
                "per_try_timeout": "1s",
                "retriable_status_codes": [
                  404
                ],
                "retry_back_off": {
                  "base_interval": "0.025s"
                }
              },
              "cluster_not_found_response_code": "INTERNAL_SERVER_ERROR"
            },
            "name": "listener~8080~retry_example-route-0-httproute-retry-default-0-0-matcher-0"
          }
        ]
      }
      ...
        
  3. Send a request to the reviews app. Verify that the request succeeds.

    Example output for a successful response:

      HTTP/1.1 200 OK
    ...
    {"id": "1","podname": "reviews-v1-598b896c9d-l7d8l","clustername": "null","reviews": [{  "reviewer": "Reviewer1",  "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!"},{  "reviewer": "Reviewer2",  "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare."}]}
      
  4. Check the gateway’s access logs to verify that the request was not retried.

      kubectl logs -n gloo-system -l gateway.networking.k8s.io/gateway-name=http | tail -1 | jq
      

    Example output: Note that the response_flags field is -, which means that the request was not retried.

      {
      "method": "GET",
      "path": "/reviews/1",
      "response_code": 200,
      "response_flags": "-",
      "start_time": "2025-06-16T17:24:04.268Z",
      "upstream_cluster": "kube_default_reviews_9080",
      "upstream_host": "10.244.0.24:9080"
    }
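
The request in step 3 above can be sent with curl. This is a minimal sketch that assumes the gateway address is stored in a variable named `INGRESS_GW_ADDRESS` (a hypothetical name) and that the listener is reachable on port 8080. The path and host header match the access log and route output shown in this guide.

```shell
# Assumption: INGRESS_GW_ADDRESS holds the external address of the gateway.
# The host header must match the hostname on the route (retry.example).
curl -vi "http://${INGRESS_GW_ADDRESS}:8080/reviews/1" -H "host: retry.example"
```

You can reuse the same command later to send the request that is expected to fail.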
      

Step 3: Trigger a retry

Simulate a failure for the reviews app so that you can verify that the request is retried.

  1. Put the reviews app to sleep to simulate an app failure.

      kubectl -n default patch deploy reviews-v1 --patch '{"spec":{"template":{"spec":{"containers":[{"name":"reviews","command":["sleep","20h"]}]}}}}'
      
  2. Send another request to the reviews app. This time, the request fails.

    Example output:

      HTTP/1.1 503 Service Unavailable
    ...
    upstream connect error or disconnect/reset before headers. retried and the latest reset reason: remote connection failure, transport failure reason: delayed connect error: Connection refused
      
  3. Check the gateway’s access logs to verify that the request was retried.

      kubectl logs -n gloo-system -l gateway.networking.k8s.io/gateway-name=http | tail -1 | jq
      

    Example output: Note that the response_flags field now has values as follows:

    • URX means UpstreamRetryLimitExceeded, which verifies that the request was retried.
    • UF means UpstreamConnectionFailure, which verifies that the request failed.

      {
      "method": "GET",
      "path": "/reviews/1",
      "response_code": 503,
      "response_flags": "URX,UF",
      "start_time": "2025-06-16T17:26:07.287Z",
      "upstream_cluster": "kube_default_reviews_9080",
      "upstream_host": "10.244.0.25:9080"
    }
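
To make the response flag check repeatable, you could wrap it in a small helper. The following sketch is an illustration only: the function names `has_flag` and `describe_flags` are hypothetical, and the helper covers only the flags that appear in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading the response_flags field of an access log entry.

# has_flag FLAGS FLAG: succeeds if the comma-separated FLAGS string contains FLAG.
has_flag() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# describe_flags FLAGS: prints one line for each flag covered in this guide.
describe_flags() {
  local flags=$1
  if [ "$flags" = "-" ]; then
    echo "no response flags set"
    return 0
  fi
  has_flag "$flags" URX && echo "URX: upstream retry limit exceeded"
  has_flag "$flags" UF && echo "UF: upstream connection failure"
  return 0
}

# Against a live cluster, the flags could be read from the latest log entry:
# flags=$(kubectl logs -n gloo-system -l gateway.networking.k8s.io/gateway-name=http | tail -1 | jq -r '.response_flags')
describe_flags "URX,UF"
```

Note that URX appears only when the retry limit is exhausted. A request that succeeds after one or more retries does not carry this flag.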
      

Cleanup

You can remove the resources that you created in this guide.

  1. Delete the HTTPRoute resource.

      kubectl delete httproute retry -n default
      
  2. Delete the reviews app.

      kubectl delete deploy reviews-v1 -n default
    kubectl delete svc reviews -n default
    kubectl delete sa bookinfo-reviews -n default
      
  3. Delete the access log policy.

      kubectl delete httplistenerpolicy access-logs -n gloo-system
      
  4. Delete the GlooTrafficPolicy, if you created one, from the namespace where you applied it.

      kubectl delete GlooTrafficPolicy retry 
    kubectl delete GlooTrafficPolicy retry -n gloo-system