As a platform administrator, you can set up the Gloo Mesh rate limit server while registering your workload clusters. By default, one rate limit server is deployed per cluster. If you have more specific rate limiting requirements, you can set up more servers, such as one per workspace, or create the rate limit server in a different namespace.

Then, you might delegate the configuration and settings for the rate limit server to the app owner or workspace administrator.

For more information about how the rate limit server resources work together, see About rate limiting.

Rate limit server setup

During Gloo Mesh registration, set up your workload clusters to use the Envoy Go/gRPC rate limit service. The server must be set up, configured, and healthy for rate limit policies to work. Platform admins typically install the server, because they have access to modify the Gloo Mesh Enterprise agent installation on your workload clusters.

To set up the rate limiter, see the workload cluster setup guide.

During the initial setup or a later upgrade, you can optionally update the rate limiter settings. For more information, see the override settings information.

  • Number of servers: By default, one rate limit server is deployed per cluster. If you have more specific rate limiting requirements, you can set up more servers, such as one per workspace.
  • Number of replicas: You can increase the number of replicas that the rate limiter deployment creates. Each replica stores the rate limit counters in a shared backing Redis database that you set up. For example, you can use the built-in Redis instance or bring your own.
  • Other deployment settings: You might want to update other settings, such as the config maps, volumes, or resource limits for CPU and memory.
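For example, a Helm values snippet that raises the replica count and points the rate limiter at your own Redis might look like the following sketch. The key names here are illustrative assumptions; confirm them against the gloo-platform Helm chart reference for your version before applying them.

```yaml
# Illustrative gloo-platform Helm values; verify key names for your chart version.
rateLimiter:
  enabled: true
  # Assumption: replica count for the rate limiter deployment.
  # All replicas share the same counters in the backing Redis database.
  deployment:
    replicas: 3
  # Assumption: connection details for your own Redis instead of the built-in instance.
  redis:
    address: my-redis.my-namespace.svc.cluster.local:6379
```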

Rate limit server traces

You can optionally enable OpenTelemetry trace span exports for the rate limit server component for enhanced observability and distributed tracing in your Gloo setup. You can configure the tracing settings in the rateLimiter.rateLimiter.tracing Helm values of the gloo-platform Helm chart, such as when you install the rate limiter during Gloo installation.

  1. When you set up or upgrade the rate limiter, such as by following the workload cluster setup guide, enable the rateLimiter.tracing settings. For example, to try out the feature, you might use the following values.

      
    rateLimiter:
      ...
      tracing:
        enabled: true
        exporterProtocol: grpc
        otlpEndpoint: grpc://gloo-jaeger-collector.gloo-mesh:4317
        samplingRate: 1 # To ensure that you see all traces created during testing
        serviceName: "RateLimit"
      
  2. Set up your rate limit policy. For example, deploy a sample app like Bookinfo, create a rate limit client config for the bookinfo namespace, and configure a rate limit policy for Bookinfo.

  3. Enable Istio tracing in the cluster. The Istio tracing guide includes steps to run example curl commands against the Bookinfo productpage service to generate traces, and steps to open the Gloo UI to view the traces in the Jaeger UI. For example, you can now see rate limit traces such as the following.

Figure: Traces for the rate limiter in the Gloo UI.
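The tracing steps above assume that a rate limit policy is in place for Bookinfo. As an illustrative sketch, such a policy might look like the following; the selector and reference names are assumptions for this example, so adapt them to your setup and check the rate limit policy API docs for the exact fields.

```yaml
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy
metadata:
  name: rl-policy
  namespace: bookinfo
spec:
  # Example selector: apply the policy to the Bookinfo productpage destination.
  applyToDestinations:
  - port:
      number: 9080
    selector:
      labels:
        app: productpage
  config:
    # Reference the server config, client config, and optional server settings.
    ratelimitServerConfig:
      name: rl-server-config
      namespace: gloo-mesh
    ratelimitClientConfig:
      name: rl-client-config
      namespace: bookinfo
    serverSettings:
      name: rl-server
      namespace: bookinfo
```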

Dynamic metadata from the rate limiter

You can add dynamic metadata, such as the descriptor status and overall code, to the response message after a request is filtered by the rate limiter. This way, the metadata is available for use in downstream filters, most commonly for access logs. The metadata might be used for cases such as reporting on API usage and limits.

Available dynamic metadata from the rate limiter:

  • overallCode: The rate limit decision, such as OVER_LIMIT.
  • descriptorStatus: Details for the decision of each descriptor, such as OK.

The ability to add dynamic metadata from the rate limiter is disabled by default, because it can increase response times. Use the following steps to enable dynamic metadata.

  1. Update the rate limiter settings in your Helm values file. The following example shows the rateLimiter.rateLimiter.setDynamicMetadata setting to add. For complete upgrade steps, see the Upgrade guide.

    rateLimiter:
      enabled: true
      rateLimiter:
        setDynamicMetadata: true
  2. Update the Istio access logs to include the dynamic metadata. The following example shows how to add the overallCode and descriptorStatus dynamic metadata settings to a ConfigMap that the Gloo Operator uses to configure Istio for you. For complete steps, see the Istio access logs guide.

    kubectl apply -f- <<EOF
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: gloo-extensions-config
      namespace: gloo-mesh
    data:
      values.istiod: |
        meshConfig:
          # Enable access logging to /dev/stdout
          accessLogFile: /dev/stdout
          # Encoding for the access log (TEXT or JSON). Default value is TEXT.
          accessLogEncoding: TEXT      
          accessLogFormat: "Overall Code is: %DYNAMIC_METADATA(envoy.filters.http.ratelimit:overallCode)%. Descriptor Status is: %DYNAMIC_METADATA(envoy.filters.http.ratelimit:descriptorStatus)%."
    EOF
      

    Review the following settings to understand this configuration. For more options, see the Istio access logs guide and the Istio docs.

      • metadata: For the Gloo Operator to pick up the ConfigMap, the name must be gloo-extensions-config and the namespace must be the same as the Gloo Operator, such as gloo-mesh.
      • accessLogFile: Set the access log file to /dev/stdout to log to the console.
      • accessLogEncoding: Set the access log encoding to TEXT to log in plain text, which matches the access log format that you set in the next field.
      • accessLogFormat: Set the access log format to include the dynamic metadata for %DYNAMIC_METADATA(envoy.filters.http.ratelimit:overallCode)% and %DYNAMIC_METADATA(envoy.filters.http.ratelimit:descriptorStatus)%.
  3. Set up rate limiting, such as with the Basic rate limit policy example.

  4. Send requests to trigger the rate limit. The following example creates a temporary curl pod in the bookinfo namespace so that you can send a request to the reviews service. This method works in any Kubernetes version, although in Kubernetes 1.23 or later an ephemeral container might be simpler.

    1. Create the curl pod.

        kubectl run -it -n bookinfo --context ${REMOTE_CONTEXT1} curl --image=curlimages/curl:7.73.0 --rm  -- sh
        
    2. Send a request to the reviews app from within the curl pod to test east-west rate limiting.

        curl http://reviews:9080/reviews/1 -v
        
  5. Check the logs of the rate limited service’s Istio proxy. The following example shows the logs for the sidecar of the reviews service.

      kubectl logs -l app=reviews -c istio-proxy -n bookinfo --context ${REMOTE_CONTEXT1}
      
  6. Verify that the dynamic metadata is included in the logs.

      Overall Code is: OVER_LIMIT. Descriptor Status is: -.
      
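If you prefer structured access logs, you can set the encoding to JSON and map the dynamic metadata to named fields. The following ConfigMap is a sketch of that variant of the earlier configuration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: gloo-extensions-config
  namespace: gloo-mesh
data:
  values.istiod: |
    meshConfig:
      accessLogFile: /dev/stdout
      # JSON encoding lets log pipelines parse the fields directly.
      accessLogEncoding: JSON
      accessLogFormat: |
        {
          "overall_code": "%DYNAMIC_METADATA(envoy.filters.http.ratelimit:overallCode)%",
          "descriptor_status": "%DYNAMIC_METADATA(envoy.filters.http.ratelimit:descriptorStatus)%"
        }
```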

Rate limit server config

Configure the descriptors with the rate limiting rules for the server to accept. You can reuse the same config for multiple servers. To rate limit a request, the action in the client config must match one of the descriptors in the server config. You must create a rate limit server config before you can use rate limit policies. The platform admin, app owner, or workspace admin might configure the rate limit server.

Review the following example YAML file for a rate limit server config. For more information, see the API docs.

apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig
metadata:
  name: rl-server-config
  namespace: gloo-mesh
spec:
  destinationServers:
  - port:
      number: 8083
    ref:
      cluster: cluster-1
      name: rate-limiter
      namespace: gloo-mesh
  raw:
    descriptors:
    - key: generic_key
      rateLimit:
        requestsPerUnit: 1
        unit: DAY
      value: counter


Review the following settings to understand this configuration.

  • spec.destinationServers: This example uses the default rate-limiter Kubernetes service in the gloo-mesh namespace on port 8083. In multicluster setups, you can also set kind: VIRTUAL_DESTINATION to select a virtual destination for the rate limiter instead.
  • spec.raw.descriptors: Set up a raw configuration for the rate limit server to enforce for your policies. Make sure that any rate limit client config that you create does not conflict with this server config. In this example, one rate limit descriptor is set up for requests that match the key: value label of generic_key: counter. These requests are rate limited to 1 per day. For more information, see the descriptors API reference.
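Descriptors in the raw config follow the Envoy rate limit service format and can nest. As an illustrative sketch, the following spec.raw section limits requests per user ID, with a stricter nested limit for a specific path. The descriptor keys here are example assumptions; they must match the actions that your client config sends.

```yaml
raw:
  descriptors:
  # Limit each unique user ID to 100 requests per minute.
  - key: user_id
    rateLimit:
      requestsPerUnit: 100
      unit: MINUTE
    descriptors:
    # Nested descriptor: a stricter per-user limit for a specific path.
    - key: path
      value: /reviews
      rateLimit:
        requestsPerUnit: 10
        unit: MINUTE
```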

Rate limit server settings

Optionally set up how a client, such as a sidecar or gateway proxy, connects to the rate limit server, including settings such as a request timeout. Rate limit server settings are optional, unless you have multiple servers per cluster or the rate limit server has a non-default name or namespace.

If you don’t create rate limit server settings, you must select the server to use in the rate limit policy or the rate limit client config. The app owner or app developer might create the rate limit server settings.

Review the following example YAML file for rate limit server settings. For more information, see the API docs.

apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerSettings
metadata:
  name: rl-server
  namespace: bookinfo
spec:
  destinationServer:
    port:
      number: 8083
    ref:
      cluster: cluster-1
      name: rate-limiter
      namespace: gloo-mesh


Review the following settings to understand this configuration.

  • spec.destinationServer: This example connects to the default rate-limiter Kubernetes service in the gloo-mesh namespace on port 8083. No special connection settings such as timeouts or denials are set. In multicluster setups, you can also set kind: VIRTUAL_DESTINATION to select a virtual destination for the rate limiter instead.
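To tune how the client connects, you can add connection settings such as a request timeout or fail-closed behavior. The following sketch shows where such settings go; the requestTimeout and denyOnFailure field names are assumptions for this example, so verify the exact names in the API docs.

```yaml
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerSettings
metadata:
  name: rl-server
  namespace: bookinfo
spec:
  destinationServer:
    port:
      number: 8083
    ref:
      cluster: cluster-1
      name: rate-limiter
      namespace: gloo-mesh
  # Assumed fields for illustration; verify the exact names in the API docs.
  requestTimeout: 200ms   # Fail fast if the rate limit server is slow to respond.
  denyOnFailure: true     # Deny requests when the server is unreachable (fail closed).
```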

Rate limit client config

Configure the actions for the Envoy client to take by matching each action to a descriptor in the server config. You can reuse the same client config for multiple destinations or routes. You must create a rate limit client config before you can use rate limit policies. The operator or app owner might configure the rate limit client.

Review the following example YAML file for a rate limit client config. For more information, see the API docs.

apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig
metadata:
  name: rl-client-config
  namespace: bookinfo
spec:
  raw:
    rateLimits:
    - actions:
      - genericKey:
          descriptorValue: counter
      limit:
        dynamicMetadata:
          metadataKey:
            key: envoy.filters.http.ext_authz
            path:
            - key: opa_auth
            - key: rateLimit


Review the following settings to understand this configuration.

  • spec.raw: Set up a raw-style configuration for the rate limit client (the Envoy proxy) to enforce for your policies. Make sure that this rate limit client config does not conflict with the server config. In this example, the action generic_key: counter matches the expected descriptor in the server config. The limit section overrides the server's limit with a value that is read from dynamic metadata at the given key path, such as metadata that an external auth policy sets. For other possible rate limiting actions such as on request headers, see the API docs.
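For comparison, the following client config sketch rate limits on a request header instead of a generic key. The requestHeaders action sends the header value as a descriptor entry, so the matching server config needs a descriptor with the same key. The header and descriptor key names are example assumptions.

```yaml
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig
metadata:
  name: rl-client-config-headers
  namespace: bookinfo
spec:
  raw:
    rateLimits:
    - actions:
      # Send the x-user-id header value as the "user_id" descriptor entry.
      - requestHeaders:
          headerName: x-user-id
          descriptorKey: user_id
```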