If an upstream service is unavailable, the service is removed from the load balancing pool until health is re-established.

Before you begin

  1. Set up Gloo Mesh Gateway in a single cluster.
  2. Install Bookinfo and other sample apps.
  3. Configure an HTTP listener on your gateway and set up basic routing for the sample apps.

Configure active healthcheck policies

You can apply an active healthcheck policy at the destination level. For more information, see Applying policies.

  apiVersion: resilience.policy.gloo.solo.io/v2
  kind: ActiveHealthCheckPolicy
  metadata:
    annotations:
      cluster.solo.io/cluster: ""
    name: active-health-check-policy-httpbin
    namespace: httpbin
  spec:
    applyToDestinations:
    - port:
        number: 8000
      selector:
        name: httpbin
        namespace: httpbin
    config:
      healthCheck:
        alwaysLogHealthCheckFailures: true
        eventLogPath: /dev/stdout
        healthyThreshold: 1
        httpHealthCheck:
          host: httpbin.httpbin.svc.cluster.local
          path: /anything
        interval: 1s
        noTrafficInterval: 1s
        timeout: 5s
        unhealthyThreshold: 1
      virtualGateways:
      - cluster: $CLUSTER_NAME
        name: istio-ingressgateway
        namespace: bookinfo
  
Review the following table to understand this configuration.

| Setting | Description |
| --- | --- |
| alwaysLogHealthCheckFailures | Whether to log every healthcheck failure event. If set to true, all healthcheck failure events are logged. If set to false, only the initial healthcheck failure event is logged, and subsequent failures are not. The default value is false. |
| eventLogPath | The path of the healthcheck event log. |
| healthyThreshold | The number of successful healthchecks that are required before an upstream server is marked as healthy. |
| httpHealthCheck.host | The value of the host header in the HTTP healthcheck request. If you use this policy to check the health of a Kubernetes service, make sure to set this field to the hostname of your upstream server to avoid unexpected results. If this field is left empty, the hostname is set to the name of the cluster where the healthcheck is performed, which in Gloo Gateway, follows this format: ` |
| httpHealthCheck.path | The path on the upstream server that is used to perform the healthcheck. |
| interval | The time between healthchecks, such as 1s. |
| noTrafficInterval | The interval between healthchecks before any traffic is received in the cluster. Use this setting to keep the health information for your upstreams up to date without sending a large number of unnecessary healthcheck requests. Once traffic is received, healthchecks are performed at the interval setting. The default value is 60s. |
| timeout | The time to wait for a healthcheck response. If the timeout is reached, the healthcheck attempt is considered a failure. |
| unhealthyThreshold | The number of failed healthchecks that are required before an upstream server is marked as unhealthy. Note that if you check the health of an HTTP server, the response code must be in the expected_statuses or retriable_statuses list. If a response code is returned that is not part of this list, the upstream server is marked unhealthy immediately. |
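
The interplay between healthyThreshold and unhealthyThreshold can be pictured as a small state machine. The following shell function is a simplified sketch, not the gateway's actual implementation. It assumes that the thresholds count consecutive results and that the upstream server starts out healthy, and it ignores the special case for unexpected HTTP response codes. The function name simulate_health and the ok/fail inputs are hypothetical and exist only for this illustration.

```shell
# Simplified model of threshold-based health state (illustration only).
# Usage: simulate_health <healthyThreshold> <unhealthyThreshold> <result>...
# Each result is "ok" or "fail". Assumption: the server starts out healthy
# and thresholds count consecutive results.
simulate_health() {
  local healthy_threshold=$1 unhealthy_threshold=$2
  shift 2
  local state=healthy ok_streak=0 fail_streak=0 result
  for result in "$@"; do
    if [ "$result" = ok ]; then
      ok_streak=$((ok_streak + 1))
      fail_streak=0
      # Enough consecutive successes return an unhealthy server to the pool.
      if [ "$state" = unhealthy ] && [ "$ok_streak" -ge "$healthy_threshold" ]; then
        state=healthy
      fi
    else
      fail_streak=$((fail_streak + 1))
      ok_streak=0
      # Enough consecutive failures remove a healthy server from the pool.
      if [ "$state" = healthy ] && [ "$fail_streak" -ge "$unhealthy_threshold" ]; then
        state=unhealthy
      fi
    fi
  done
  echo "$state"
}
```

For example, `simulate_health 1 1 ok fail` prints `unhealthy`. With both thresholds set to 1, as in this guide, a single failed check is enough to mark the server as unhealthy, and a single successful check restores it.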

Verify active healthcheck policies

  1. Deploy an active healthcheck policy to your cluster.

    kubectl apply -f- <<EOF
    apiVersion: resilience.policy.gloo.solo.io/v2
    kind: ActiveHealthCheckPolicy
    metadata:
      annotations:
        cluster.solo.io/cluster: ""
      name: active-health-check-policy-httpbin
      namespace: httpbin
    spec:
      applyToDestinations:
      - port:
          number: 8000
        selector:
          name: httpbin
          namespace: httpbin
      config:
        healthCheck:
          alwaysLogHealthCheckFailures: true
          eventLogPath: /dev/stdout
          healthyThreshold: 1
          httpHealthCheck:
            host: httpbin.httpbin.svc.cluster.local
            path: /anything
          interval: 1s
          noTrafficInterval: 1s
          timeout: 5s
          unhealthyThreshold: 1
        virtualGateways:
        - cluster: $CLUSTER_NAME
          name: istio-ingressgateway
          namespace: bookinfo
    EOF
      
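  2. Wait a few seconds. Then, get the logs of the ingress gateway pod to see the successful healthcheck. The gateway proxy performs the healthchecks and, because eventLogPath is set to /dev/stdout, writes the healthcheck events to its own logs.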

      kubectl logs $(kubectl get pod -l app=istio-ingressgateway -A -o jsonpath='{.items[0].metadata.name}') -n gloo-mesh-gateways
      

    Example output:

      {"health_checker_type":"HTTP","host":{"socket_address":{"protocol":"TCP","address":"10.16.0.20","resolver_name":"","ipv4_compat":false,"port_value":80}},"cluster_name":"outbound|8000||httpbin.httpbin.svc.cluster.local","add_healthy_event":{"first_check":true},"timestamp":"2023-05-23T18:29:37.285Z"}
      
  3. Modify the active healthcheck policy to use the /status/500 path to perform the healthcheck on the httpbin app. Because this endpoint returns a 500 HTTP response code, the healthcheck fails and the upstream server is marked as unhealthy.

    kubectl apply -f- <<EOF
    apiVersion: resilience.policy.gloo.solo.io/v2
    kind: ActiveHealthCheckPolicy
    metadata:
      annotations:
        cluster.solo.io/cluster: ""
      name: active-health-check-policy-httpbin
      namespace: httpbin
    spec:
      applyToDestinations:
      - port:
          number: 8000
        selector:
          name: httpbin
          namespace: httpbin
      config:
        healthCheck:
          alwaysLogHealthCheckFailures: true
          eventLogPath: /dev/stdout
          healthyThreshold: 1
          httpHealthCheck:
            host: httpbin.httpbin.svc.cluster.local
            path: /status/500
          interval: 1s
          noTrafficInterval: 1s
          timeout: 5s
          unhealthyThreshold: 1
        virtualGateways:
        - cluster: $CLUSTER_NAME
          name: istio-ingressgateway
          namespace: bookinfo
    EOF
      
  4. Wait a few seconds. Then, get the logs again and verify that you see the health_check_failure_event.

      kubectl logs $(kubectl get pod -l app=istio-ingressgateway -A -o jsonpath='{.items[0].metadata.name}') -n gloo-mesh-gateways
      

    Example output:

      {"health_checker_type":"HTTP","host":{"socket_address":{"protocol":"TCP","address":"10.16.0.20","resolver_name":"","ipv4_compat":false,"port_value":80}},"cluster_name":"outbound|8000||httpbin.httpbin.svc.cluster.local","health_check_failure_event":{"failure_type":"ACTIVE","first_check":false},"timestamp":"2023-05-23T18:41:00.410Z"}
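
When the gateway logs are busy, it can help to filter them for healthcheck events. The following shell helpers are a minimal sketch that assumes the event log lines look like the example output in the previous steps, with one JSON object per line. The function names count_health_events and health_event_clusters are hypothetical, and plain grep and sed are used instead of a JSON parser to keep the sketch free of extra dependencies.

```shell
# Count the log lines on stdin that contain a given healthcheck event key,
# such as add_healthy_event or health_check_failure_event.
# Usage: <log command> | count_health_events <event_key>
count_health_events() {
  # grep -c prints the number of matching lines. It exits non-zero when
  # nothing matches, so "|| true" keeps the function's exit status at 0.
  grep -cF "\"$1\":" || true
}

# List the distinct cluster_name values found in the log lines on stdin.
# Usage: <log command> | health_event_clusters
health_event_clusters() {
  sed -n 's/.*"cluster_name":"\([^"]*\)".*/\1/p' | sort -u
}

# Example (requires cluster access, so it is left commented out):
#   kubectl logs <gateway-pod> -n gloo-mesh-gateways | count_health_events health_check_failure_event
```

A failure count that keeps growing after you apply the /status/500 policy suggests that the healthcheck is still failing, because alwaysLogHealthCheckFailures is set to true in this guide.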
      

Cleanup

You can optionally remove the resources that you set up as part of this guide.

  kubectl delete activehealthcheckpolicy active-health-check-policy-httpbin -n httpbin