Rate limit
Control the rate of requests to a destination or route.
- About rate limiting
- Rate limiting APIs
- Gloo rate limit architecture
- Gloo rate limit API reference
- Rate limit guides
About rate limiting
API gateways act as a control point for the outside world to access the various application services that run in your environment, whether monoliths, microservices, or serverless functions. In microservices or hybrid application architecture, these workloads accept an increasingly large number of requests.
Requests might come from external clients or end users. This type of traffic is often called north-south and passes through the ingress gateway, such as Gloo Gateway or Envoy.
Protecting backend services and globally enforcing business limits can become incredibly complex for your developers to handle at the application level. Instead, you can use rate limiting policies to limit requests that pass through the Gloo Gateway.
With rate limiting, you set a limit for the number of incoming traffic requests that an API accepts over a specific time interval, such as per second, minute, hour, or day. For example, you might say that your website can handle 1,000 requests per second.
Gloo provides a set of custom resources to make it even easier to set up rate limiting for all of the microservices in your environment.
- Scalable: Gloo gives you a set of reusable rate limiting resources that use Kubernetes selectors to automatically scale as your policies and workloads grow.
- Reusable for cluster ingress and service mesh traffic: You can use the same Gloo resources to apply policy both to traffic entering your cluster (ingress, or “north-south”) and to traffic across the services in your mesh (“east-west”). This reuse makes it easy to use Gloo Gateway together with Gloo Mesh.
- Persona-driven: As a platform administrator, you can set up the Gloo rate limit server while registering your workload clusters. Then, you might delegate the configuration and settings for the rate limit server to the lead developer or workspace administrator. Your operators create the rate limit policies that your developers can use across their services simply through Kubernetes labels. For more information, review the following figure and description.
Rate limiting APIs
You can configure rate limiting in Gloo resources by using two primary rate limiting APIs:
- Envoy rate limit API, which matches rules against an ordered tuple of descriptors
- Set-style rate limit API, for use cases where you want a rule to apply whenever all of its descriptors match, regardless of the order or of other descriptors in the request
Both APIs make use of descriptors and actions.
Envoy API (raw style)
Gloo exposes the Envoy Go/gRPC rate limit service. You can configure a vast number of rate limiting use cases by defining rate limiting actions that specify an ordered tuple of descriptor keys to attach to a request. The descriptors match the ordered tuple of descriptor keys in the actions, and then apply the associated rate limit to the request.
Although powerful, the Envoy API has some drawbacks. It limits only requests whose ordered descriptors match a rule exactly. For example, you might have a use case in which you want to limit:
- Requests with an `x-type` header
- Requests with both an `x-type` header and an `x-number` header

In Envoy raw style, you must configure two sets of actions: one that gets only the value of `x-type`, and another that gets the values of both `x-type` and `x-number`. At scale, this approach can become complex to configure, difficult to troubleshoot, and can impact performance.
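For illustration, the two raw-style rules might be sketched as the following actions. The header names come from the example above; the `requestHeaders` and `descriptorKey` fields follow the Envoy rate limit actions API, and the exact nesting inside a Gloo resource can differ by version.

```yaml
# Raw-style: each rule is an ordered tuple of actions.
rateLimits:
- actions:                  # rule 1: matches on x-type only
  - requestHeaders:
      headerName: x-type
      descriptorKey: type
- actions:                  # rule 2: matches on x-type AND x-number, in order
  - requestHeaders:
      headerName: x-type
      descriptorKey: type
  - requestHeaders:
      headerName: x-number
      descriptorKey: number
```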
In contrast, set-style descriptors are treated as an unordered set: a policy applies if all of its specified descriptors match, regardless of the order or value of any other descriptors. Using the previous example, you can have the policy apply to requests with an `x-type` header and an `x-number` value of any (like a wildcard) or none.
In the Gloo API, these set-style settings have `set` appended to the name. For example:
- Envoy raw-style: `descriptors` and `actions`
- Set-style: `setDescriptors` and `setActions`
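As a sketch of the set-style equivalent (assuming the `setActions` field name), a single rule can cover both cases, because other descriptors on the request are ignored:

```yaml
# Set-style: unordered match; a request with only x-type, or with
# x-type plus x-number (or any other headers), hits this rule.
rateLimits:
- setActions:
  - requestHeaders:
      headerName: x-type
      descriptorKey: type
```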
Descriptors and actions
Descriptors describe the rules for which requests to rate limit. These rules are expressed as an ordered tuple of required `key`, optional `value`, and optional `rate_limit` fields. For more information, see the Envoy rate limiting configuration docs.
- `key`: The key for the rule to use to match a request. You must provide a key for each descriptor.
- `value`: An optional value for the key, to further scope the rule matching.
- `rate_limit`: An optional setting that applies a rate limit to any request that matches the descriptor's key and value.
- `weight`: An optional value to set priority among rules.
Descriptors are set in the Gloo RateLimitServerConfig resource.
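A minimal `RateLimitServerConfig` might look like the following sketch. The `apiVersion`, resource names, and field nesting are assumptions that can vary across Gloo versions; the descriptor fields follow the Envoy format described above.

```yaml
# Hypothetical example; apiVersion and nesting may differ by Gloo version.
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig
metadata:
  name: rl-server-config
  namespace: gloo-mesh
spec:
  raw:
    descriptors:
    - key: type              # matches an action with descriptorKey: type
      value: a               # optional: scope the rule to x-type: a
      rateLimit:
        requestsPerUnit: 1000
        unit: SECOND
```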
In the Envoy raw-style API, the complete, ordered set of descriptors must match exactly. In the set-style API, descriptors are matched regardless of other descriptors, so requests can carry more or fewer descriptors than the rule specifies. The descriptors in the server config correspond to the actions in the client config.
Actions describe the action for the Envoy client to take, by matching the action to the descriptors on the server. If you specify more than one rate limit action, the request is throttled if any of those rate limiting actions is met. Actions can set up counters for requests related to:
- Source cluster
- Destination cluster
- Request headers
- Remote address
- Generic key
- Header value match
Actions are set in the Gloo RateLimitClientConfig resource.
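A matching `RateLimitClientConfig` could then define actions whose descriptor keys line up with the server config. Again, the `apiVersion` and nesting are assumptions; the action fields follow the Envoy API.

```yaml
# Hypothetical example; apiVersion and nesting may differ by Gloo version.
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig
metadata:
  name: rl-client-config
  namespace: gloo-mesh
spec:
  raw:
    rateLimits:
    - actions:
      - requestHeaders:
          headerName: x-type   # emits the descriptor key "type"
          descriptorKey: type
```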
Gloo rate limit architecture
Review the following figure and description to understand the architecture for rate limiting in Gloo Mesh.
| Step | Persona | Resource | Description |
|------|---------|----------|-------------|
| 1 | Platform admin | Rate limit server | The platform admin sets up the rate limit server as part of the Gloo agent installation. By default, one server is created per cluster during installation. |
| 2 | App owner | Rate limit server config | Configure the descriptors with the rate limiting rules for the server to accept. You can reuse the same config for multiple servers. To rate limit a request, the action from the client config must match one of the descriptors in the server config. You must set up a rate limit server config for each rate limit server before you can apply a policy that uses the server. This config is translated into an internal resource, the RatelimitConfig, which is used to build the Envoy config. Typically, you don't need to worry about the internal RatelimitConfig, but if the server rejects your configuration, you can check the RatelimitConfig to start debugging. |
| 3 | App owner | Rate limit server settings | Optional, unless you have multiple servers per cluster or the rate limit server has a non-default name or namespace. Set up how a client, such as a sidecar or gateway proxy, connects to the rate limit server, such as adding a request timeout. |
| 4 | Operator | Rate limit client config | Set up the rate limit actions for the routes or destinations that you want to apply the policy to. This way, you can reuse the same client config across multiple policies. Don't need to reuse client configs? You can add the client config in each rate limit policy instead of creating a separate resource. |
| 5 | Operator | Rate limit policy | The rate limit policy selects the routes and destinations for the rate limit to apply to, along with the server config, client config, and server settings to use. The policy can even be reused across workspaces, if it is exported. |
| 6 | Operator | Route table | Create a route table for routes that match up to your developer's apps. The route table can serve routes from other workspaces if the import and export settings are set up appropriately. Policies are applied to routes based on matching Kubernetes labels. |
| 7 | Developer | Destinations and apps | Create the destinations that a rate limit policy applies to. These destinations are backed by the virtual destination, external service, or Kubernetes service for the apps that you develop. Policies are automatically applied based on matching Kubernetes labels. The same policy can apply to multiple destinations across workspaces. |
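Putting the pieces together, a rate limit policy might reference the server config, client config, and server settings, and select routes by label, roughly as in the following sketch. The resource names and field paths are illustrative assumptions, not an authoritative schema.

```yaml
# Hypothetical example; field paths may differ by Gloo version.
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy
metadata:
  name: rl-policy
  namespace: gloo-mesh
spec:
  applyToRoutes:
  - route:
      labels:
        ratelimited: "true"    # matches labeled routes in the route table
  config:
    serverSettings:
      name: rl-server-settings
      namespace: gloo-mesh
    ratelimitClientConfig:
      name: rl-client-config
      namespace: gloo-mesh
    ratelimitServerConfig:
      name: rl-server-config
      namespace: gloo-mesh
```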
Gloo rate limit API reference
For more information, see the API docs for each rate limit server and policy resource.