Rate limit
Control the rate of requests to a destination or route.
- About rate limiting
- Rate limiting APIs
- Gloo rate limit architecture
- Gloo rate limit API reference
- Rate limit guides
About rate limiting
API gateways act as a control point for the outside world to access the various application services that run in your environment, whether monoliths, microservices, or serverless functions. In microservices or hybrid application architecture, these workloads accept an increasingly large number of requests.
Requests might come from external clients or end users. This type of traffic is often called north-south and passes through the ingress gateway, such as Gloo Gateway or Envoy.
Protecting backend services and globally enforcing business limits can become incredibly complex for your developers to handle at the application level. Instead, you can use rate limiting policies to limit requests that pass through the Gloo Gateway.
With rate limiting, you set a limit for the number of incoming traffic requests that an API accepts over a specific time interval, such as per second, minute, hour, or day. For example, you might say that your website can handle 1,000 requests per second.
Gloo provides a set of custom resources to make it even easier to set up rate limiting for all of the microservices in your environment.
- Scalable: Gloo gives you a set of reusable rate limiting resources that use Kubernetes selectors to automatically scale as your policies and workloads grow.
- Reusable for cluster ingress and service mesh traffic: You can use the same Gloo resources to apply policy both to traffic entering your cluster (ingress, or “north-south”) and to traffic across the services in your mesh (“east-west”). This reuse makes it easy to use Gloo Gateway together with Gloo Mesh.
- Persona-driven: As a platform administrator, you can set up the Gloo rate limit server while registering your workload clusters. Then, you might delegate the configuration and settings for the rate limit server to the lead developer or workspace administrator. Your operators create the rate limit policies that your developers can use across their services simply through Kubernetes labels. For more information, review the following figure and description.
Rate limiting APIs
You can configure rate limiting in Gloo resources by using two primary rate limiting APIs:
- Envoy rate limit API, which matches rules against an ordered tuple of descriptors
- Set-style rate limit API, for use cases where you want a rule to apply whenever all of its descriptors match, regardless of the order or of other descriptors in the request
Both APIs make use of descriptors and actions.
Envoy API (raw style)
Gloo exposes the Envoy Go/gRPC rate limit service. You can configure a vast number of rate limiting use cases by defining rate limiting actions that specify an ordered tuple of descriptor keys to attach to a request. The descriptors match the ordered tuple of descriptor keys in the actions, and then apply the associated rate limit to the request.
Although powerful, the Envoy API has some drawbacks. It limits only requests whose ordered descriptors match a rule exactly. For example, you might have a use case in which you want to limit:
- Requests with an `x-type` header
- Requests with both an `x-type` header and an `x-number` header

In Envoy raw style, you must configure two sets of actions: one that gets only the value of `x-type`, and another that gets the values of both `x-type` and `x-number`. At scale, this approach can become complex to configure, difficult to troubleshoot, and can impact performance.
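For illustration, the two raw-style rules might be sketched as the following actions. The header names come from the example above; the `requestHeaders` and `descriptorKey` fields follow the Envoy rate limit actions API, and the exact nesting inside a Gloo resource can differ by version.

```yaml
# Raw-style: each rule is an ordered tuple of actions.
rateLimits:
- actions:                  # rule 1: matches on x-type only
  - requestHeaders:
      headerName: x-type
      descriptorKey: type
- actions:                  # rule 2: matches on x-type AND x-number, in order
  - requestHeaders:
      headerName: x-type
      descriptorKey: type
  - requestHeaders:
      headerName: x-number
      descriptorKey: number
```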
In contrast, set-style descriptors are treated as an unordered set: a policy applies if all of its specified descriptors match, regardless of the order or value of any other descriptors. Using the previous example, you can have the policy apply to requests with an `x-type` header and an `x-number` value of any (like a wildcard) or none.
In the Gloo API, these set-style settings have `set` appended to the name. For example:
- Envoy raw-style: `descriptors` and `actions`
- Set-style: `setDescriptors` and `setActions`
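As a sketch of the set-style equivalent (assuming the `setActions` field name), a single rule can cover both cases, because other descriptors on the request are ignored:

```yaml
# Set-style: unordered match; a request with only x-type, or with
# x-type plus x-number (or any other headers), hits this rule.
rateLimits:
- setActions:
  - requestHeaders:
      headerName: x-type
      descriptorKey: type
```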
Descriptors and actions
Descriptors describe the rules for which requests to rate limit. These rules are expressed as an ordered tuple of required `key`, optional `value`, and optional `rate_limit` fields. For more information, see the Envoy rate limiting configuration docs.
- `key`: The key for the rule to use to match a request. You must provide a key for each descriptor.
- `value`: An optional value for the key, to further scope the rule matching.
- `rate_limit`: An optional setting that applies a rate limit to any request that matches the descriptor's key and value.
- `weight`: An optional value to set priority among rules.
Descriptors are set in the Gloo RateLimitServerConfig resource.
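A minimal `RateLimitServerConfig` might look like the following sketch. The `apiVersion`, resource names, and field nesting are assumptions that can vary across Gloo versions; the descriptor fields follow the Envoy format described above.

```yaml
# Hypothetical example; apiVersion and nesting may differ by Gloo version.
apiVersion: admin.gloo.solo.io/v2
kind: RateLimitServerConfig
metadata:
  name: rl-server-config
  namespace: gloo-mesh
spec:
  raw:
    descriptors:
    - key: type              # matches an action with descriptorKey: type
      value: a               # optional: scope the rule to x-type: a
      rateLimit:
        requestsPerUnit: 1000
        unit: SECOND
```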
In the Envoy raw-style API, the complete, ordered set of descriptors must match exactly. In the set-style API, descriptors are matched regardless of other descriptors, so requests can carry more or fewer descriptors than the rule specifies. The descriptors in the server config correspond to the actions in the client config.
Actions describe the action for the Envoy client to take, by matching the action to the descriptors on the server. If you specify more than one rate limit action, the request is throttled if any of those rate limiting actions is met. Actions can set up counters for requests related to:
- Source cluster
- Destination cluster
- Request headers
- Remote address
- Generic key
- Header value match
Actions are set in the Gloo RateLimitClientConfig resource.
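A matching `RateLimitClientConfig` could then define actions whose descriptor keys line up with the server config. Again, the `apiVersion` and nesting are assumptions; the action fields follow the Envoy API.

```yaml
# Hypothetical example; apiVersion and nesting may differ by Gloo version.
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitClientConfig
metadata:
  name: rl-client-config
  namespace: gloo-mesh
spec:
  raw:
    rateLimits:
    - actions:
      - requestHeaders:
          headerName: x-type   # emits the descriptor key "type"
          descriptorKey: type
```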
Gloo rate limit architecture
Review the following figure and description to understand the architecture for rate limiting in Gloo Mesh.
| Step | Persona | Resource | Description |
|------|---------|----------|-------------|
| 1 | Platform admin | Rate limit server | The platform admin sets up the rate limit server as part of the Gloo agent installation. By default, one server is created per cluster during installation. |
| 2 | App owner | Rate limit server config | Configure the descriptors with the rate limiting rules for the server to accept. You can reuse the same config for multiple servers. To rate limit a request, the action from the client config must match one of the descriptors in the server config. You must set up a rate limit server config for each rate limit server before you can apply a policy that uses the server. This config is translated into an internal resource, the RatelimitConfig, which is used to build the Envoy config. Typically, you don't need to worry about the internal RatelimitConfig, but if the server rejects your configuration, you can check the RatelimitConfig to start debugging. |
| 3 | App owner | Rate limit server settings | Optional, unless you have multiple servers per cluster or the rate limit server has a non-default name or namespace. Set up how a client, such as a sidecar or gateway proxy, connects to the rate limit server, such as adding a request timeout. |
| 4 | Operator | Rate limit client config | Set up the rate limit actions for the routes or destinations that you want to apply the policy to. This way, you can reuse the same client config across multiple policies. Don't need to reuse client configs? You can add the client config in each rate limit policy instead of creating a separate resource. |
| 5 | Operator | Rate limit policy | The rate limit policy selects the routes and destinations for the rate limit to apply to, along with the server config, client config, and server settings to use. The policy can even be reused across workspaces, if it is exported. |
| 6 | Operator | Route table | Create a route table for routes that match up to your developer's apps. The route table can serve routes from other workspaces if the import and export settings are set up appropriately. Policies are applied to routes based on matching Kubernetes labels. |
| 7 | Developer | Destinations and apps | Create the destinations that a rate limit policy applies to. These destinations are backed by the virtual destination, external service, or Kubernetes service for the apps that you develop. Policies are automatically applied based on matching Kubernetes labels. The same policy can apply to multiple destinations across workspaces. |
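Putting the pieces together, a rate limit policy might reference the server config, client config, and server settings, and select routes by label, roughly as in the following sketch. The resource names and field paths are illustrative assumptions, not an authoritative schema.

```yaml
# Hypothetical example; field paths may differ by Gloo version.
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: RateLimitPolicy
metadata:
  name: rl-policy
  namespace: gloo-mesh
spec:
  applyToRoutes:
  - route:
      labels:
        ratelimited: "true"    # matches labeled routes in the route table
  config:
    serverSettings:
      name: rl-server-settings
      namespace: gloo-mesh
    ratelimitClientConfig:
      name: rl-client-config
      namespace: gloo-mesh
    ratelimitServerConfig:
      name: rl-server-config
      namespace: gloo-mesh
```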
Gloo rate limit API reference
For more information, see the API docs for each rate limit server and policy resource.