RateLimitConfigs (Enterprise)

Rate limit configuration via RateLimitConfig resources was introduced with Gloo Enterprise, release v1.5.0-beta3. If you are using an earlier version, this feature will not be available.

As we saw in the Envoy API guide, Gloo Enterprise exposes a fine-grained API that allows you to configure a vast number of rate limiting use cases. The two main objects that make up the API are:

  1. the descriptors, which configure the rate limit server and are defined on the global Settings resource, and
  2. the actions that determine how Envoy composes the descriptors that are sent to the server to check whether a request should be rate-limited; actions are defined either on the Route or on the VirtualHost options.

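Although powerful, this API has some drawbacks: a single rate limit policy is split between the global Settings resource and the Route or VirtualHost options, and because all descriptors live in one global resource, rules written for different use cases are not isolated from each other.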

To address these shortcomings, we introduced a new custom resource.

RateLimitConfig resources

Starting with Gloo Enterprise v1.5.0-beta3, you can define your rate limits by creating RateLimitConfig resources. A RateLimitConfig resource represents a self-contained rate limit policy; this means that Gloo will use the resource to configure both the Envoy proxies and the Gloo Enterprise rate limit server they communicate with. Gloo guarantees that rate limit rules defined on different RateLimitConfig resources are completely independent of each other.

Here is a simple example of a RateLimitConfig resource:

apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: my-rate-limit-policy
  namespace: gloo-system
spec:
  raw:
    descriptors:
    - key: generic_key
      value: counter
      rateLimit:
        requestsPerUnit: 10
        unit: MINUTE
    rateLimits:
    - actions:
      - genericKey:
          descriptorValue: counter

Once a RateLimitConfig is created, it can be used to enforce rate limits on your VirtualServices (or on the lower-level Proxy resources). You can do that by referencing the resource at two different levels: on a VirtualHost or on an individual Route.

The configuration format is the same in both cases. It must be specified under the relevant options attribute (on Virtual Hosts or Routes). This snippet shows an example configuration that uses the above RateLimitConfig:

options:
  rateLimitConfigs:
    refs:
    - name: my-rate-limit-policy
      namespace: gloo-system

RateLimitConfigs defined on a VirtualHost are inherited by all the Routes that belong to that VirtualHost, unless a Route references its own RateLimitConfigs.
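
As a sketch of this inheritance rule, the fragment below attaches one policy at the VirtualHost level and overrides it on a single route. The route paths, the upstream name my-upstream, and the second policy my-other-policy are hypothetical placeholders, not resources created in this guide.

```yaml
virtualHost:
  domains:
  - '*'
  options:
    rateLimitConfigs:
      refs:
      # Applies to every route that does not reference its own RateLimitConfigs.
      - name: my-rate-limit-policy
        namespace: gloo-system
  routes:
  - matchers:
    - prefix: /inherits            # hypothetical path: uses my-rate-limit-policy
    routeAction:
      single:
        upstream:
          name: my-upstream        # hypothetical upstream
          namespace: gloo-system
  - matchers:
    - prefix: /overrides           # hypothetical path
    routeAction:
      single:
        upstream:
          name: my-upstream        # hypothetical upstream
          namespace: gloo-system
    options:
      rateLimitConfigs:
        refs:
        # Hypothetical second policy; this route no longer inherits
        # the VirtualHost-level reference.
        - name: my-other-policy
          namespace: gloo-system
```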

Configuration format

Each RateLimitConfig is an instance of one specific configuration type. Currently, only the raw configuration type is implemented, but we are planning on adding more high-level configuration formats to support specific use cases (e.g. limiting requests based on the presence and value of a header, or on a per-upstream, per-client basis, etc.).

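The raw configuration allows you to specify rate limit policies using the raw configuration formats used by the server and the client (Envoy). It consists of two elements: descriptors, which configure the rate limit server, and rateLimits, which define the actions Envoy uses to compose the descriptors it sends to the server.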

These two objects have exactly the same format as the descriptors and ratelimits that are explained in detail in the Envoy API guide.
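
To illustrate how the server-side and client-side parts of a raw configuration pair up, here is a sketch of a per-client policy. It assumes that the remoteAddress action and its remote_address descriptor key behave as described in the Envoy API guide, in the same way as the destinationCluster action used later in this guide; the resource name per-client-limit and the limit value are made up for illustration.

```yaml
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: per-client-limit       # hypothetical name
  namespace: gloo-system
spec:
  raw:
    # Server side: one counter per distinct remote_address value.
    descriptors:
    - key: remote_address
      rateLimit:
        requestsPerUnit: 5     # illustrative value
        unit: MINUTE
    # Client side: Envoy sends the client's address as the descriptor value.
    rateLimits:
    - actions:
      - remoteAddress: {}
```

Please check the field names against the Envoy API guide for your Gloo Enterprise version before relying on this fragment.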

Example

Let’s run through an example that uses RateLimitConfig resources to enforce rate limit policies on your Virtual Services. As mentioned earlier, all the examples that are listed in the Envoy API guide apply to RateLimitConfigs as well, so please be sure to check them out.

Initial setup

First, we need to install Gloo Enterprise (minimum version 1.5.0-beta3). Please refer to the corresponding installation guide for details.

Let’s also deploy a simple application which returns “Hello World” when receiving HTTP requests:

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: http-echo
  name: http-echo
  namespace: default
spec:
  selector:
    matchLabels:
      app: http-echo
  replicas: 1
  template:
    metadata:
      labels:
        app: http-echo
    spec:
      containers:
      - image: hashicorp/http-echo:latest
        name: http-echo
        args: ["-text=Hello World!"]
        ports:
        - containerPort: 5678
          name: http
---
apiVersion: v1
kind: Service
metadata:
  name: http-echo
  namespace: default
  labels:
    service: http-echo
spec:
  ports:
  - port: 5678
    protocol: TCP
  selector:
    app: http-echo
EOF

For the purpose of this example, let’s create two different upstreams that point to the same service. You’ll soon see why we do this.

kubectl apply -f - <<EOF
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: echo-1
  namespace: gloo-system
spec:
  static:
    hosts:
      - addr: http-echo.default.svc.cluster.local
        port: 5678
---
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: echo-2
  namespace: gloo-system
spec:
  static:
    hosts:
      - addr: http-echo.default.svc.cluster.local
        port: 5678
EOF

Now let’s create a Virtual Service with two different routes. Requests with the /echo-1 and /echo-2 path prefixes will be routed to the http-echo service through the echo-1 and echo-2 upstreams, respectively.

kubectl apply -f - << EOF
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: echo
  namespace: gloo-system
spec:
  displayName: echo
  virtualHost:
    domains:
      - '*'
    routes:
      - matchers:
        - prefix: /echo-1
        routeAction:
          single:
            upstream:
              name: echo-1
              namespace: gloo-system
      - matchers:
        - prefix: /echo-2
        routeAction:
          single:
            upstream:
              name: echo-2
              namespace: gloo-system
EOF

To verify that the Virtual Service works, let’s send a request to each of the two routes:

curl $(glooctl proxy url)/echo-1
curl $(glooctl proxy url)/echo-2

Both should return the expected response:

Hello World!

Apply rate limit policies

Now let’s create two RateLimitConfig resources.

kubectl apply -f - << EOF
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: global-limit
  namespace: gloo-system
spec:
  raw:
    descriptors:
    - key: generic_key
      value: count
      rateLimit:
        requestsPerUnit: 4
        unit: MINUTE
    rateLimits:
    - actions:
      - genericKey:
          descriptorValue: count
---
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: per-upstream-counter
  namespace: gloo-system
spec:
  raw:
    descriptors:
    - key: destination_cluster
      rateLimit:
        requestsPerUnit: 3
        unit: MINUTE
    rateLimits:
    - actions:
      - destinationCluster: {}
EOF

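Let’s see what each of these resources represents:

  1. global-limit attaches the same generic key (count) to every request, so it limits the total number of requests to 4 per minute, regardless of where they are routed;
  2. per-upstream-counter uses the destination cluster as the descriptor, so it limits requests to 3 per minute for each Upstream.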

Now let’s apply these policies to our VirtualService:

kubectl apply -f - << EOF
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: echo
  namespace: gloo-system
spec:
  displayName: echo
  virtualHost:
    domains:
      - '*'
    options:
      rateLimitConfigs:
        refs:
        - name: global-limit
          namespace: gloo-system
        - name: per-upstream-counter
          namespace: gloo-system
    routes:
      - matchers:
        - prefix: /echo-1
        routeAction:
          single:
            upstream:
              name: echo-1
              namespace: gloo-system
      - matchers:
        - prefix: /echo-2
        routeAction:
          single:
            upstream:
              name: echo-2
              namespace: gloo-system
EOF

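We have applied these two policies to the VirtualHost, so they apply to both of the routes that belong to the VirtualHost. This will cause requests to be rate-limited when either:

  1. more than 4 requests per minute are sent in total, across both routes, or
  2. more than 3 requests per minute are sent to the same Upstream.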

You can verify that Gloo has been correctly configured by port-forwarding the rate limit server and requesting a config dump. First run:

kubectl port-forward -n gloo-system deploy/rate-limit 9091

Then - from a separate shell - run:

curl http://localhost:9091/rlconfig/

You should get the following response:

solo.io.generic_key_gloo-system.global-limit.generic_key_count: unit=MINUTE requests_per_unit=4 weight=0 always_apply=false
solo.io.generic_key_gloo-system.per-upstream-counter.destination_cluster: unit=MINUTE requests_per_unit=3 weight=0 always_apply=false

Test our configuration

Let’s verify that our rate limit policies are correctly enforced.

First, let’s try sending some requests to the echo-1 upstream. Submit the following command multiple times in rapid succession:

curl -v $(glooctl proxy url)/echo-1
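
Submitting the command by hand works, but a small loop makes the result easier to read. The following is a minimal sketch: send_requests is a hypothetical helper (it is not part of glooctl) that prints only the HTTP status code of each response.

```shell
# send_requests URL COUNT
# Sends COUNT GET requests to URL and prints one HTTP status code per line.
send_requests() {
  url=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    # -s: no progress output, -o /dev/null: discard the body,
    # -w: print only the status code followed by a newline
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
    i=$((i + 1))
  done
}
```

For example, send_requests "$(glooctl proxy url)/echo-1" 4 sends four requests in a row; with the policies above, the last line printed would be expected to be 429.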

On the fourth attempt you should receive the following response:

< HTTP/1.1 429 Too Many Requests
< x-envoy-ratelimited: true
< date: Tue, 14 Jul 2020 23:13:18 GMT
< server: envoy
< content-length: 0

This demonstrates that the per-upstream rate limit is enforced. Now let’s wait a minute for the counter to reset and then submit the same command again, but this time only twice:

curl -v $(glooctl proxy url)/echo-1

You should get two successful Hello World! responses. After the second attempt, let’s start sending requests to the echo-2 upstream:

curl -v $(glooctl proxy url)/echo-2

The third attempt should return the 429 Too Many Requests response:

< HTTP/1.1 429 Too Many Requests
< x-envoy-ratelimited: true
< date: Tue, 14 Jul 2020 23:13:18 GMT
< server: envoy
< content-length: 0

This is because, although each Upstream allows 3 requests per minute, the two requests to echo-1 and the first two requests to echo-2 have already used up the global-limit of 4 requests per minute across both Upstreams.
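
The arithmetic behind this result can be reproduced with a small model. This is a simplified sketch, not the rate limit server's implementation: it assumes a single one-minute window, counts every request against both policies, and rejects a request when either counter exceeds its limit. The function names request and simulate are made up for this illustration.

```shell
#!/bin/sh
# Simplified model of one rate limit window; NOT the real rate limit server.
GLOBAL_LIMIT=4        # requestsPerUnit of global-limit
PER_UPSTREAM_LIMIT=3  # requestsPerUnit of per-upstream-counter

# request UPSTREAM: count one request against both policies and print
# the status code a client would see in this model.
request() {
  global=$((global + 1))
  case "$1" in
    echo-1) echo1=$((echo1 + 1)); upstream_count=$echo1 ;;
    echo-2) echo2=$((echo2 + 1)); upstream_count=$echo2 ;;
    *) echo "unknown upstream: $1" >&2; return 1 ;;
  esac
  if [ "$global" -gt "$GLOBAL_LIMIT" ] || [ "$upstream_count" -gt "$PER_UPSTREAM_LIMIT" ]; then
    echo "$1 429"
  else
    echo "$1 200"
  fi
}

# simulate UPSTREAM...: start a fresh window and replay the given requests.
simulate() {
  global=0; echo1=0; echo2=0
  for upstream in "$@"; do
    request "$upstream"
  done
}

# Two requests to echo-1, then three to echo-2, as in the test above.
simulate echo-1 echo-1 echo-2 echo-2 echo-2
```

In this model the first four requests are accepted and the fifth, which is only the third request to echo-2, is rejected because the global counter has reached 5.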