More rate limit policy examples

Review more examples for Envoy raw-style and set-style rate limiting APIs.

Control the rate of requests to destinations within the service mesh. The following examples show different types of rate limiting, such as limits based on request headers. For a simple example based on a generic key, see Basic rate limit policy.

If you import or export resources across workspaces, your policies might not apply. For more information, see Import and export policies.

You cannot apply this policy to a route that already has a redirect, rewrite, or direct response action. Keep in mind that these actions might not be explicitly defined in the route configuration. For example, invalid routes are automatically replaced with a direct response action, such as when the backing destination is wrong. First, verify that your route configuration is correct. Then, decide whether to apply the policy. To apply the policy, remove any redirect, rewrite, or direct response actions. To keep the actions and not apply the policy, change the route labels of either the policy or the route.

Before you begin

This guide assumes that you use the same names for components like clusters, workspaces, and namespaces as in the getting started guide. If you have different names, make sure to update the sample configuration files in this guide.

Set up Gloo Mesh Gateway in a single cluster.
Install Bookinfo and other sample apps.
Configure an HTTP listener on your gateway and set up basic routing for the sample apps.
Make sure that the rate limiting service is installed and running. If not, install the rate limiting service.
```
kubectl get pods -A -l app=rate-limiter
```

Create the Gloo resources for this policy in the management and workload clusters. For more information about the rate limit server and client configuration resources, see Rate limit server setup.

The following files are examples only for testing purposes. Your actual setup might vary. You can use the files as a reference for creating your own tests.

Download the following Gloo resources:

Apply the files to your management cluster.

```
kubectl apply -f kubernetes-cluster_gloo-mesh_cluster-1.yaml --context ${MGMT_CONTEXT}
kubectl apply -f kubernetes-cluster_gloo-mesh_cluster-2.yaml --context ${MGMT_CONTEXT}
kubectl apply -f workspace_gloo-mesh_anything.yaml --context ${MGMT_CONTEXT}
```

Download the following Gloo resources:
- Rate limit server:
  - Rate limit server config
  - Rate limit server settings
- Rate limit client config
- Resources to test policies that apply to routes:
  - Route table
  - Virtual gateway
- Workspace settings

Apply the files to your workload cluster.

```
kubectl apply -f rate-limit-server-config_gloo-mesh-addons_rl-server-config.yaml --context ${REMOTE_CONTEXT1}
kubectl apply -f rate-limit-server-settings_bookinfo_rl-server.yaml --context ${REMOTE_CONTEXT1}
kubectl apply -f rate-limit-client-config_bookinfo_rl-client-config.yaml --context ${REMOTE_CONTEXT1}
kubectl apply -f route-table_bookinfo_www-example-com.yaml --context ${REMOTE_CONTEXT1}
kubectl apply -f virtual-gateway_bookinfo_istio-ingressgateway.yaml --context ${REMOTE_CONTEXT1}
kubectl apply -f workspace-settings_bookinfo_anything.yaml --context ${REMOTE_CONTEXT1}
```

Set-style request header example, in-line policy

This policy currently does not support selecting VirtualDestinations as a destination.

The following example rate limits requests based on headers, in the set-style API. The setActions are configured within the policy, instead of in the reusable client config.

cat << EOF | kubectl apply -f -
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig # defines ratelimit rules enforced by the rl server
metadata:
  name: rl-server-config
  namespace: bookinfo
spec:
  destinationServers:
    - ref:
        name: rate-limiter
        namespace: gloo-mesh
      port:
        number: 8083
  raw:
    setDescriptors:
      - simpleDescriptors:
          - key: type
            value: a
          - key: number
            value: one
        rateLimit:
          requestsPerUnit: 5
          unit: MINUTE
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy # applies rules to route/destination
metadata:
  name: rl-policy
  namespace: bookinfo
spec:
  applyToDestinations:
    - selector:
        labels:
          app: reviews
      port:
        number: 9080
  config:
    ratelimitServerConfig: # matching ratelimit server config required
      namespace: bookinfo
      name: rl-server-config
    raw:
      rateLimits:
      - setActions:
          - requestHeaders:
              descriptorKey: number
              headerName: x-number
          - requestHeaders:
              descriptorKey: type
              headerName: x-type
EOF
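
To check the policy, you can send a few requests that carry both headers and watch for the response code to change. The following is a minimal sketch and not part of the original setup: the send_requests helper and the URL that you pass to it are hypothetical, and it assumes that your gateway routes the www.example.com host to the reviews app.

```shell
# Hypothetical helper (not part of Gloo): send COUNT requests to URL with the
# x-type and x-number headers that the setActions read, and print one HTTP
# status code per line.
send_requests() {
  count="$1"
  url="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' \
      -H "host: www.example.com" \
      -H "x-type: a" \
      -H "x-number: one" \
      "$url"
    i=$((i + 1))
  done
}
```

For example, calling send_requests with a count of 6 and the URL of a route to the reviews app sends six requests. With the limit of 5 requests per minute in the server config, the sixth request within the same minute would be expected to return a 429 response.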

Set-style request header example, in the client config

The following example rate limits requests based on headers, in the set-style API. The setActions are configured within a client config, so that you can reuse the configuration across other policies.

cat << EOF | kubectl apply -f -
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig # defines ratelimit rules enforced by the rl server
metadata:
  name: rl-server-config
  namespace: bookinfo
spec:
  destinationServers:
    - ref:
        name: rate-limiter
        namespace: gloo-mesh
      port:
        number: 8083
  raw:
    setDescriptors:
      - simpleDescriptors:
          - key: type
            value: a
          - key: number
            value: one
        rateLimit:
          requestsPerUnit: 5
          unit: MINUTE
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig # defines client side rules that need to match config enforced by the server
metadata:
  name: rl-client-config
  namespace: bookinfo
spec:
  raw:
    rateLimits:
    - setActions:
        - requestHeaders:
            descriptorKey: number
            headerName: x-number
        - requestHeaders:
            descriptorKey: type
            headerName: x-type
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy # applies rules to route/destination
metadata:
  name: rl-policy
  namespace: bookinfo
spec:
  applyToDestinations:
    - selector:
        labels:
          app: reviews
      port:
        number: 9080
  config:
    ratelimitServerConfig: # matching ratelimit server config required
      namespace: bookinfo
      name: rl-server-config
    ratelimitClientConfig:
      namespace: bookinfo
      name: rl-client-config
EOF
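
Because the actions live in the client config, another policy can refer to the same resource. As a hedged illustration, the following policy applies the same server and client configs to a second destination. The policy name rl-policy-ratings is made up, and the sketch assumes that the Bookinfo ratings app is labeled app: ratings and listens on port 9080. You can apply it with kubectl in the same way as the other resources.

```yaml
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy
metadata:
  name: rl-policy-ratings # hypothetical name
  namespace: bookinfo
spec:
  applyToDestinations:
    - selector:
        labels:
          app: ratings # assumption: label of the Bookinfo ratings app
      port:
        number: 9080
  config:
    ratelimitServerConfig:
      namespace: bookinfo
      name: rl-server-config
    ratelimitClientConfig: # same client config that rl-policy references
      namespace: bookinfo
      name: rl-client-config
```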

Raw-style tuples in request headers

The following example nests descriptors in the raw style in the server config, to express rules based on tuples instead of a single value. This rule enforces a limit of 1 request per minute for any unique combination of type and number values in the request header.

The client config defines the actions to match with the server descriptors.

If a request has both the x-type and x-number headers, it counts towards the limit. If the request is missing one or both headers, no rate limit is enforced.

Because this example uses the raw style, the order of the actions must match the order of nesting in the descriptors. If the actions were reversed, with the number action before the type action, the request would not match and therefore would not count towards the rate limit.

cat << EOF | kubectl apply -f -
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig # defines ratelimit rules enforced by the rl server
metadata:
  name: rl-server-config
  namespace: bookinfo
spec:
  destinationServers:
    - ref:
        name: rate-limiter
        namespace: gloo-mesh
      port:
        number: 8083
  raw:
    descriptors:
      - key: type
        descriptors:
          - key: number
            rateLimit:
              requestsPerUnit: 1
              unit: MINUTE
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig # defines client side rules that need to match config enforced by the server
metadata:
  name: rl-client-config
  namespace: bookinfo
spec:
  raw:
    rateLimits:
    - actions:
      - requestHeaders:
          descriptorKey: type
          headerName: x-type
      - requestHeaders:
          descriptorKey: number
          headerName: x-number
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy # applies rules to route/destination
metadata:
  name: rl-policy
  namespace: bookinfo
spec:
  applyToDestinations:
    - selector:
        labels:
          app: reviews
      port:
        number: 9080
  config:
    ratelimitServerConfig: # matching ratelimit server config required
      namespace: bookinfo
      name: rl-server-config
    ratelimitClientConfig:
      namespace: bookinfo
      name: rl-client-config
EOF
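
One way to observe the per-tuple counters is to repeat the same request and tally the response codes. This sketch is an illustration only: tally_statuses is a hypothetical helper, and the gateway URL and the www.example.com host are assumptions about your routing setup.

```shell
# Hypothetical helper: send COUNT requests to URL and print how many times
# each HTTP status code was returned. Any extra arguments (for example
# -H "x-type: a") are passed through to curl.
tally_statuses() {
  url="$1"
  count="$2"
  shift 2
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' -H "host: www.example.com" "$@" "$url"
    i=$((i + 1))
  done | sort | uniq -c
}
```

Based on the rule described above, three requests with the headers x-type: a and x-number: one would be expected to produce one successful response and two 429 responses within a minute, and switching to x-number: two would start a separate counter. Leaving out either header should produce no 429 responses.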

Raw-style nested descriptors

Building off the previous raw-style example, you might want to enforce a limit if the type is provided but the number is not.

In the server config, you can nest the number descriptor within the type descriptor.

In the client config, define actions for two separate rate limits:

- One to increment the counter for the type limit.
- One to increment the counter for the type and number pair, when both are present.

The request results in a 429 rate limit error response if either limit is reached.

Matching is attempted against the key and value pair before matching against only the key.

cat << EOF | kubectl apply -f -
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig # defines ratelimit rules enforced by the rl server
metadata:
  name: rl-server-config
  namespace: bookinfo
spec:
  destinationServers:
    - ref:
        name: rate-limiter
        namespace: gloo-mesh
      port:
        number: 8083
  raw:
    descriptors:
      - key: type
        rateLimit:
          requestsPerUnit: 3
          unit: MINUTE
        descriptors:
          - key: number
            rateLimit:
              requestsPerUnit: 1
              unit: MINUTE
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig # defines client side rules that need to match config enforced by the server
metadata:
  name: rl-client-config
  namespace: bookinfo
spec:
  raw:
    rateLimits:
    - actions:
      - requestHeaders:
          descriptorKey: type
          headerName: x-type
    - actions:
      - requestHeaders:
          descriptorKey: type
          headerName: x-type
      - requestHeaders:
          descriptorKey: number
          headerName: x-number
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy # applies rules to route/destination
metadata:
  name: rl-policy
  namespace: bookinfo
spec:
  applyToDestinations:
    - selector:
        labels:
          app: reviews
      port:
        number: 9080
  config:
    ratelimitServerConfig: # matching ratelimit server config required
      namespace: bookinfo
      name: rl-server-config
    ratelimitClientConfig:
      namespace: bookinfo
      name: rl-client-config
EOF

Note that in the rate limit configuration “tree,” only the leaf values serve as wildcards that set up a unique limit. The nested, non-leaf descriptors that do not have values serve as a catch-all.

If you use nested descriptors and the descriptor has no value, the cache key does not append the value for the nested, non-leaf configuration. In the nested descriptors example, no value is set for type or number. In this case, the same limit is used regardless of the x-type header value that is sent. However, the x-number header value has a different limit per value, because this field is the leaf node in the descriptor tree.

Priority and weights

You can specify weights on descriptors. For a particular request that has multiple sets of matching actions, the server evaluates each and then increments only the matching rules with the highest weight. By default, the weight is 0.

The following example adds a weight: 1 field to the server config. When a request has both the x-type and x-number headers, then the server evaluates both limits: the limit on type alone, and the limit on the combination of type and number.

Because the number has a higher weight, the server increments only that counter. In this setup, requests with a unique type and number are allowed 10 requests per minute, but requests that have only a type are limited to 1 per minute.

To make sure a rule is always applied, you can add the alwaysApply option to the descriptor.

cat << EOF | kubectl apply -f -
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig # defines ratelimit rules enforced by the rl server
metadata:
  name: rl-server-config
  namespace: bookinfo
spec:
  destinationServers:
    - ref:
        name: rate-limiter
        namespace: gloo-mesh
      port:
        number: 8083
  raw:
    descriptors:
      - key: type
        rateLimit:
          requestsPerUnit: 1
          unit: MINUTE
        descriptors:
          - key: number
            weight: 1
            rateLimit:
              requestsPerUnit: 10
              unit: MINUTE
EOF

The following variation sets alwaysApply: true on the type descriptor, so that the type limit is applied even when the higher-weight number rule also matches.

cat << EOF | kubectl apply -f -
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig # defines ratelimit rules enforced by the rl server
metadata:
  name: rl-server-config
  namespace: bookinfo
spec:
  destinationServers:
    - ref:
        name: rate-limiter
        namespace: gloo-mesh
      port:
        number: 8083
  raw:
    descriptors:
      - key: type
        alwaysApply: true
        rateLimit:
          requestsPerUnit: 1
          unit: MINUTE
        descriptors:
          - key: number
            weight: 1
            rateLimit:
              requestsPerUnit: 10
              unit: MINUTE
EOF
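
The selection rule can be easier to follow as a small simulation. The following sketch is not the rate limit server's implementation; it only mirrors the behavior described above. The select_rules function and its input format (one matching rule per line: name, weight, alwaysApply) are made up for illustration.

```shell
# Hypothetical simulation of the rule selection described above.
# Input: one matching rule per line as "NAME WEIGHT ALWAYS_APPLY".
# Output: the names of the rules whose counters would be incremented, that is,
# the rules with the highest weight plus any rule marked alwaysApply.
select_rules() {
  awk '
    { name[NR] = $1; weight[NR] = $2 + 0; always[NR] = $3
      if (NR == 1 || weight[NR] > max) max = weight[NR] }
    END {
      for (i = 1; i <= NR; i++)
        if (weight[i] == max || always[i] == "true") print name[i]
    }'
}

# A request with both headers matches both rules from the first config.
printf 'type 0 false\ntype+number 1 false\n' | select_rules
# prints: type+number

# Same request against the second config, where type sets alwaysApply.
printf 'type 0 true\ntype+number 1 false\n' | select_rules
# prints: type, then type+number
```

In the first case only the counter for the type and number pair is incremented. In the second case both counters are incremented, so the limit of 1 request per minute on type also applies.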

Multiple limits per remote address

You can use the remote_address descriptor to rate limit based on the downstream client address. In practice, you might want to express multiple rules, such as a per-second and a per-minute limit.

To do so, you can make remote_address a nested descriptor, with distinct generic keys.

cat << EOF | kubectl apply -f -
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig # defines ratelimit rules enforced by the rl server
metadata:
  name: rl-server-config
  namespace: bookinfo
spec:
  destinationServers:
    - ref:
        name: rate-limiter
        namespace: gloo-mesh
      port:
        number: 8083
  raw:
    descriptors:
      - key: generic_key
        value: "per-minute"
        descriptors:
          - key: remote_address
            rateLimit:
              requestsPerUnit: 20
              unit: MINUTE
      - key: generic_key
        value: "per-second"
        descriptors:
          - key: remote_address
            rateLimit:
              requestsPerUnit: 2
              unit: SECOND
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig # defines client side rules that need to match config enforced by the server
metadata:
  name: rl-client-config
  namespace: bookinfo
spec:
  raw:
    rateLimits:
    - actions:
      - genericKey:
          descriptorValue: "per-minute"
      - remoteAddress: {}
    - actions:
      - genericKey:
          descriptorValue: "per-second"
      - remoteAddress: {}
EOF
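
As in the earlier examples, a policy ties the server and client configs to a destination. This example does not show one, so the following sketch assumes the same rl-policy resource and reviews destination that the previous examples use. You can apply it with kubectl in the same way as the other resources.

```yaml
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy
metadata:
  name: rl-policy
  namespace: bookinfo
spec:
  applyToDestinations:
    - selector:
        labels:
          app: reviews # assumption: same destination as the earlier examples
      port:
        number: 9080
  config:
    ratelimitServerConfig:
      namespace: bookinfo
      name: rl-server-config
    ratelimitClientConfig:
      namespace: bookinfo
      name: rl-client-config
```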

Prioritized traffic based on HTTP method

A useful tactic for building resilient, distributed systems is to implement different rate limits for different “priorities” or “classes” of traffic. This practice is related to the concept of load shedding.

Suppose you have exposed an API that supports both GET and POST methods for listing data and creating resources. Although both functions are important, ultimately the POST action is more important to your business. Therefore, you want to protect the availability of the POST function at the expense of the less important GET function.

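- In the server config, each unique remote address is limited to 5 requests per minute, and GET requests from that address are further limited to 2 per minute.
- In the client config, the actions extract the method from the :method pseudo-header of the request.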

cat << EOF | kubectl apply -f -
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig # defines ratelimit rules enforced by the rl server
metadata:
  name: rl-server-config
  namespace: bookinfo
spec:
  destinationServers:
    - ref:
        name: rate-limiter
        namespace: gloo-mesh
      port:
        number: 8083
  raw:
    descriptors:
      # allow 5 calls per minute for any unique host
      - key: remote_address
        rateLimit:
          requestsPerUnit: 5
          unit: MINUTE
      # specifically limit GET requests from unique hosts to 2 per min
      - key: method
        value: GET
        descriptors:
          - key: remote_address
            rateLimit:
              requestsPerUnit: 2
              unit: MINUTE
EOF

cat << EOF | kubectl apply -f -
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig # defines client side rules that need to match config enforced by the server
metadata:
  name: rl-client-config
  namespace: bookinfo
spec:
  raw:
    rateLimits:
    - actions:
      - remoteAddress: {}
    - actions:
      - requestHeaders:
          descriptorKey: method
          headerName: :method
      - remoteAddress: {}
EOF
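
To compare how the two classes of traffic are treated, you can send the same request with different methods. This is a hedged sketch: probe_method is a hypothetical helper, and the URL and the www.example.com host are assumptions about your routing setup.

```shell
# Hypothetical helper: send COUNT requests with the given HTTP METHOD to URL
# and print "METHOD STATUS" for each response.
probe_method() {
  method="$1"
  count="$2"
  url="$3"
  i=0
  while [ "$i" -lt "$count" ]; do
    code="$(curl -s -o /dev/null -w '%{http_code}' -X "$method" \
      -H "host: www.example.com" "$url")"
    echo "$method $code"
    i=$((i + 1))
  done
}
```

With the configs above, the third GET request within a minute would be expected to return 429, while requests with other methods are subject only to the limit of 5 requests per minute per remote address. The sample app might not accept POST requests, so what matters in the output is whether the status code is 429.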
