Rate limit

Control the rate of requests to a destination or route.

About rate limiting

API gateways act as a control point for the outside world to access the various application services that run in your environment, whether monoliths, microservices, or serverless functions. In a microservices or hybrid application architecture, these workloads accept an increasingly large number of requests.

Requests might come from external clients or end users. This type of traffic is often called north-south and passes through the ingress gateway, such as Gloo Gateway or Envoy.

Protecting backend services and globally enforcing business limits can become incredibly complex for your developers to handle at the application level. Instead, you can use rate limiting policies to limit requests that pass through Gloo Gateway.

With rate limiting, you set a limit on the number of incoming requests that an API accepts over a specific time interval, such as per second, minute, hour, or day. For example, you might decide that your website can handle 1,000 requests per second.
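As an illustration, a limit of 1,000 requests per second might be expressed as a rate limit descriptor like the following sketch. The descriptor key and value here are hypothetical placeholders; the field names follow the Envoy rate limit service configuration style.

```yaml
# Sketch: a rule that allows 1,000 requests per second.
# "generic_key" is a standard Envoy descriptor key; the value
# "website" is an illustrative placeholder.
descriptors:
- key: generic_key
  value: website
  rateLimit:
    requestsPerUnit: 1000
    unit: SECOND
```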

Key benefits

Gloo provides a set of custom resources to make it even easier to set up rate limiting for all of the microservices in your environment.

Scalable: Gloo gives you a set of reusable rate limiting resources that use Kubernetes selectors to automatically scale as your policies and workloads grow.

Reusable for cluster ingress and service mesh traffic: You can use the same Gloo resources to apply policy to both traffic into your cluster (ingress or “north-south”) and across the services in your mesh (“east-west”). Such reuse lets you use Gloo Gateway together with Gloo Mesh easily.

Persona-driven: As a platform administrator, you can set up the Gloo rate limit server while registering your workload clusters. Then, you might delegate the configuration and settings for the rate limit server to the lead developer or workspace administrator. Your operators create the rate limit policies that your developers can use across their services simply through Kubernetes labels. For more information, review the following figure and description.

Rate limiting APIs

You can configure rate limiting in Gloo resources by using two primary rate limiting APIs: the Envoy raw-style API and the set-style API.

Both APIs make use of descriptors and actions.

Envoy API (raw style)

Gloo exposes the Envoy Go/gRPC rate limit service. You can configure a vast number of rate limiting use cases by defining rate limiting actions that specify an ordered tuple of descriptor keys to attach to a request. When this ordered tuple matches a descriptor on the server, the associated rate limit is applied to the request.
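For example, raw-style descriptors on the server and the matching actions on the client might look like the following sketch. The header names are illustrative; the important point is that the ordered tuple of descriptor keys that the actions produce must match the descriptor tree exactly.

```yaml
# Server side: descriptors form an ordered tree of keys.
descriptors:
- key: type              # first key in the tuple
  descriptors:
  - key: number          # nested: this limit applies only when both
    rateLimit:           # keys are present, in this order
      requestsPerUnit: 10
      unit: MINUTE

# Client side: actions build the ordered tuple (type, number)
# from request headers.
rateLimits:
- actions:
  - requestHeaders:
      headerName: x-type
      descriptorKey: type
  - requestHeaders:
      headerName: x-number
      descriptorKey: number
```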

Set-style API

Although powerful, the Envoy API has some drawbacks. It limits only requests whose ordered descriptors match a rule exactly. For example, you might have a use case in which you want to limit both requests that have a certain x-type header value, and requests that have both a certain x-type and a certain x-number header value.

In Envoy raw-style, you must configure two sets of actions for each request: one that gets only the value of x-type, and another that gets the values of both x-type and x-number. At scale, this approach can become complex to configure and difficult to troubleshoot, and can impact performance.

In contrast, set-style descriptors are treated as an unordered set so that a policy applies if all the specified descriptors match. The order or value of other descriptors does not matter. Using the previous example, you can have the policy apply to requests with an x-type header and an x-number value of any (like a wildcard) or none.

In the Gloo API, these set-style settings have set appended to the name. For example, the raw-style descriptors and actions fields become setDescriptors and setActions in the set-style API.
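The following sketch continues the earlier x-type and x-number example in set-style. The header names are illustrative, and exact field names can vary by Gloo version. Because set-style matching is unordered and ignores extra descriptors, a rule that names only the type key applies to any request with an x-type header, regardless of whether x-number is present.

```yaml
# Server side: set-style descriptors. Omitting a value acts like
# a wildcard, so this rule matches any x-type header value.
setDescriptors:
- simpleDescriptors:
  - key: type
  rateLimit:
    requestsPerUnit: 10
    unit: MINUTE

# Client side: setActions attach both descriptors to the request.
# The rule above still matches even though "number" is also present.
rateLimits:
- setActions:
  - requestHeaders:
      headerName: x-type
      descriptorKey: type
  - requestHeaders:
      headerName: x-number
      descriptorKey: number
```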

Descriptors and actions

Descriptors describe the rules for which requests to rate limit. These rules are expressed as an ordered tuple of required key, optional value, and optional rate_limit fields. For more information, see the Envoy rate limiting configuration docs.

Descriptors are set in the Gloo RateLimitServerConfig resource.
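A RateLimitServerConfig might look like the following sketch. The resource names are hypothetical, and field names can vary across Gloo versions, so check the API reference for your release.

```yaml
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig
metadata:
  name: rl-server-config        # hypothetical name
  namespace: gloo-mesh
spec:
  # The rate limit server (or servers) that accepts these rules.
  destinationServers:
  - ref:
      name: rate-limiter
      namespace: gloo-mesh
    port:
      name: grpc
  # Raw (Envoy-style) descriptors: allow 1 request per minute for
  # requests tagged with the generic key value "counter".
  raw:
    descriptors:
    - key: generic_key
      value: counter
      rateLimit:
        requestsPerUnit: 1
        unit: MINUTE
```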

In the Envoy raw-style API, the complete set of descriptors must match exactly. In the set-style API, descriptors are matched regardless of other descriptors, so requests can have more or fewer descriptors than the rule specifies. The descriptors in the server config correspond to the actions in the client config.

Actions describe the action for the Envoy client to take, by matching the action to the descriptors on the server. If you specify more than one rate limit action, the request is throttled if any of the associated rate limits is exceeded. Actions can set up counters for requests based on values such as the source or destination cluster, request headers, the remote address, or a generic key.

Actions are set in the Gloo RateLimitClientConfig resource.
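A RateLimitClientConfig whose action matches the generic_key descriptor from the server config example might look like this sketch. The resource name is hypothetical, and field names can vary across Gloo versions.

```yaml
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig
metadata:
  name: rl-client-config        # hypothetical name
  namespace: gloo-mesh
spec:
  raw:
    rateLimits:
    - actions:
      # Attaches the descriptor (generic_key, counter) to each request,
      # which matches the descriptor rule defined on the server.
      - genericKey:
          descriptorValue: counter
```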

Gloo rate limit architecture

Review the following figure and description to understand the architecture for rate limiting in Gloo Mesh.

Figure: Gloo rate limit resources.
1. Rate limit server (platform admin): The platform admin sets up the rate limit server as part of the Gloo agent installation. By default, one server is created in the gloo-mesh namespace, but you can set up multiple servers in dedicated namespaces, such as gloo-mesh-addons, depending on your use case. The server stores configuration data in a Redis instance. By default, the Redis instance is created for you, but you can choose to bring your own.
2. Rate limit server config (app owner): Configure the descriptors with the rate limiting rules for the server to accept. You can reuse the same config for multiple servers. To rate limit a request, the action from the client config must match one of the descriptors in the server config. You must set up a rate limit server config for each rate limit server before you can apply a policy that uses the server. This config is translated to an internal resource, the RatelimitConfig, which is used to build the Envoy config. Typically, you don't need to worry about the internal RatelimitConfig, but if the server rejects your configuration, you can check the RatelimitConfig to start debugging.
3. Rate limit server settings (app owner): Optional, unless you have multiple servers per cluster or the rate limit server has a non-default name or namespace. Set up how a client, such as a sidecar or gateway proxy, connects to the rate limit server, such as adding a request timeout.
4. Rate limit client config (operator): Set up the rate limit actions for the routes or destinations that you want to apply the policy to. This way, you can reuse the same client config across multiple policies. Don't need to reuse client configs? You can add the client config in each rate limit policy instead of creating a separate resource. By default, the client uses the rate-limiter server in the gloo-mesh namespace of the cluster that the policy is created in.
5. Rate limit policy (operator): The rate limit policy selects the routes and destinations for the rate limit to apply to, along with the server config, client config, and server settings to use. The policy can even be reused across workspaces, if it is exported.
6. Route table (operator): Create a route table with routes that match up to your developer's apps. The route table can serve routes from other workspaces if the import and export settings are set up appropriately. Policies are applied to routes based on matching Kubernetes labels.
7. Destinations and apps (developer): Create the destinations that a rate limit policy applies to. These destinations are backed by the virtual destination, external service, or Kubernetes service for the apps that you develop. Policies are automatically applied based on matching Kubernetes labels. The same policy can apply to multiple destinations across workspaces.
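Putting the pieces together, a rate limit policy selects labeled routes or destinations and references the server config, client config, and server settings by name. The following sketch uses hypothetical resource names and labels; exact field names depend on your Gloo version, so check the API reference.

```yaml
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy
metadata:
  name: rl-policy               # hypothetical name
  namespace: gloo-mesh
spec:
  # Select routes in route tables by matching Kubernetes labels.
  applyToRoutes:
  - route:
      labels:
        ratelimited: "true"     # illustrative label
  config:
    # References to the resources described in the figure.
    serverSettings:
      name: rl-server-settings
      namespace: gloo-mesh
    ratelimitServerConfig:
      name: rl-server-config
      namespace: gloo-mesh
    ratelimitClientConfig:
      name: rl-client-config
      namespace: gloo-mesh
```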

Gloo rate limit API reference

For more information, see the API docs for each rate limit server and policy resource.

Rate limit guides

Server setup

Set up and configure the rate limit server.

Basic rate limit policy

Apply a basic rate limiting policy.

More rate limit policy examples

Review more examples for Envoy, set-style, and other rate limiting APIs.