Active healthcheck
Periodically check the health of an upstream service.
If an upstream service is unavailable, the service is removed from the load balancing pool until health is re-established.
Before you begin
This guide assumes that you use the same names for components like clusters, workspaces, and namespaces as in the getting started guide. If you use different names, make sure to update the sample configuration files in this guide accordingly.
- Set up Gloo Mesh Gateway in a single cluster.
- Install Bookinfo and other sample apps.
- Configure an HTTP listener on your gateway and set up basic routing for the sample apps.
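Before you apply a healthcheck policy, you can confirm that the sample apps are up and running. For example, assuming that httpbin was installed into the httpbin namespace as in the setup guides:
kubectl get pods -n httpbin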
Configure active healthcheck policies
You can apply an active healthcheck policy at the destination level. For more information, see Applying policies.
apiVersion: resilience.policy.gloo.solo.io/v2
kind: ActiveHealthCheckPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: active-health-check-policy-httpbin
  namespace: httpbin
spec:
  applyToDestinations:
  - port:
      number: 8000
    selector:
      name: httpbin
      namespace: httpbin
  config:
    healthCheck:
      alwaysLogHealthCheckFailures: true
      eventLogPath: /dev/stdout
      healthyThreshold: 1
      httpHealthCheck:
        host: httpbin.httpbin.svc.cluster.local
        path: /anything
      interval: 1s
      noTrafficInterval: 1s
      timeout: 5s
      unhealthyThreshold: 1
    virtualGateways:
    - cluster: $CLUSTER_NAME
      name: istio-ingressgateway
      namespace: bookinfo
| Setting | Description |
|---|---|
| `alwaysLogHealthCheckFailures` | Log healthcheck failure events. If set to `true`, all healthcheck failure events are logged. If set to `false`, only the initial healthcheck failure event is logged; subsequent failure events are not logged. The default value is `false`. |
| `eventLogPath` | The path of the healthcheck event log. |
| `healthyThreshold` | The number of successful healthchecks that are required before an upstream server is marked as healthy. |
| `httpHealthCheck.host` | The value of the host header in the HTTP healthcheck request. If you use this policy to check the health of a Kubernetes service, make sure to set this field to the hostname of your upstream server to avoid unexpected results. If this field is left empty, the hostname is set to the name of the cluster where the healthcheck is performed, which in Gloo Gateway follows the format `outbound|<port>||<hostname>`, as in the example log output later in this guide. |
| `httpHealthCheck.path` | The path on the upstream server that is used to perform the healthcheck. |
| `interval` | The time between healthchecks. |
| `noTrafficInterval` | The interval between healthchecks before any traffic is received in the cluster. This setting can be useful if you want to keep the health information for your upstreams up to date while avoiding a large number of unnecessary active healthcheck requests. After traffic is received, healthchecks are performed by using the `interval` setting. The default value is `60s`. |
| `timeout` | The time to wait for a healthcheck response. If the timeout is reached, the healthcheck attempt is considered a failure. |
| `unhealthyThreshold` | The number of failed healthchecks that are required before an upstream server is marked as unhealthy. Note that if you check the health of an HTTP server, the response code must be in the `expected_statuses` or `retriable_statuses` list. If a response code is returned that is not part of these lists, the upstream server is marked as unhealthy immediately. |
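The 1-second intervals and thresholds of 1 that are used in this guide make state changes easy to observe in a demo, but are aggressive for real workloads. The following healthCheck fragment is a sketch only, with illustrative values rather than recommendations:
config:
  healthCheck:
    eventLogPath: /dev/stdout
    healthyThreshold: 2          # require 2 consecutive successes before marking healthy
    httpHealthCheck:
      host: httpbin.httpbin.svc.cluster.local
      path: /anything
    interval: 10s                # probe every 10 seconds once traffic flows
    noTrafficInterval: 60s       # the default; probe less often before traffic arrives
    timeout: 3s
    unhealthyThreshold: 3        # tolerate transient failures before marking unhealthy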
Verify active healthcheck policies
Deploy an active healthcheck policy to your cluster.
kubectl apply -f- <<EOF
apiVersion: resilience.policy.gloo.solo.io/v2
kind: ActiveHealthCheckPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: active-health-check-policy-httpbin
  namespace: httpbin
spec:
  applyToDestinations:
  - port:
      number: 8000
    selector:
      name: httpbin
      namespace: httpbin
  config:
    healthCheck:
      alwaysLogHealthCheckFailures: true
      eventLogPath: /dev/stdout
      healthyThreshold: 1
      httpHealthCheck:
        host: httpbin.httpbin.svc.cluster.local
        path: /anything
      interval: 1s
      noTrafficInterval: 1s
      timeout: 5s
      unhealthyThreshold: 1
    virtualGateways:
    - cluster: $CLUSTER_NAME
      name: istio-ingressgateway
      namespace: bookinfo
EOF
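Optionally, verify that the policy resource was created in your cluster. The exact columns in the output can vary by Gloo version:
kubectl get activehealthcheckpolicy active-health-check-policy-httpbin -n httpbin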
Wait a few seconds. Then, get the logs of the gateway to verify that you see a successful healthcheck event.
kubectl logs $(kubectl get pod -l app=istio-ingressgateway -A -o jsonpath='{.items[0].metadata.name}') -n gloo-mesh-gateways
Example output:
{"health_checker_type":"HTTP","host":{"socket_address":{"protocol":"TCP","address":"10.16.0.20","resolver_name":"","ipv4_compat":false,"port_value":80}},"cluster_name":"outbound|8000||httpbin.httpbin.svc.cluster.local","add_healthy_event":{"first_check":true},"timestamp":"2023-05-23T18:29:37.285Z"}
Modify the active healthcheck policy to use the /status/500 path to perform the healthcheck on the httpbin app. Because this endpoint returns a 500 HTTP response code, the healthcheck fails and the upstream server is marked as unhealthy.
kubectl apply -f- <<EOF
apiVersion: resilience.policy.gloo.solo.io/v2
kind: ActiveHealthCheckPolicy
metadata:
  annotations:
    cluster.solo.io/cluster: ""
  name: active-health-check-policy-httpbin
  namespace: httpbin
spec:
  applyToDestinations:
  - port:
      number: 8000
    selector:
      name: httpbin
      namespace: httpbin
  config:
    healthCheck:
      alwaysLogHealthCheckFailures: true
      eventLogPath: /dev/stdout
      healthyThreshold: 1
      httpHealthCheck:
        host: httpbin.httpbin.svc.cluster.local
        path: /status/500
      interval: 1s
      noTrafficInterval: 1s
      timeout: 5s
      unhealthyThreshold: 1
    virtualGateways:
    - cluster: $CLUSTER_NAME
      name: istio-ingressgateway
      namespace: bookinfo
EOF
Wait a few seconds. Then, get the logs again and verify that you see the health_check_failure_event.
kubectl logs $(kubectl get pod -l app=istio-ingressgateway -A -o jsonpath='{.items[0].metadata.name}') -n gloo-mesh-gateways
Example output:
{"health_checker_type":"HTTP","host":{"socket_address":{"protocol":"TCP","address":"10.16.0.20","resolver_name":"","ipv4_compat":false,"port_value":80}},"cluster_name":"outbound|8000||httpbin.httpbin.svc.cluster.local","health_check_failure_event":{"failure_type":"ACTIVE","first_check":false},"timestamp":"2023-05-23T18:41:00.410Z"}
Cleanup
You can optionally remove the resources that you set up as part of this guide.
kubectl delete activehealthcheckpolicy active-health-check-policy-httpbin -n httpbin