You can configure throttling to implement precise control over traffic to cope with issues such as traffic bursts, service overload, resource exhaustion, and malicious attacks. This protects the stability of backend services, reduces costs, and improves user experience. This topic describes the concepts of throttling, throttling modes, and how local throttling and global throttling work.
Concepts of throttling
Throttling is a mechanism that limits the number of requests sent to a service. It specifies the maximum number of requests that clients can send to a service in a given period of time, such as 300 requests per minute or 10 requests per second. The aim of throttling is to prevent a service from being overloaded by excessive requests, whether they come from a single client IP address or from all clients combined.
For example, if you limit the number of requests sent to a service to 300 per minute, the 301st request within the same minute is denied and the HTTP 429 (Too Many Requests) status code is returned.
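The 300-requests-per-minute example can be illustrated with a minimal sketch. It assumes a simple fixed-window counter, which is only one possible way to count requests and is not necessarily how Envoy counts them. The class name `FixedWindowLimiter` and its methods are hypothetical.

```python
class FixedWindowLimiter:
    """Allow at most `limit` requests in each window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = 0.0  # start time of the current window
        self.count = 0           # requests admitted in the current window

    def check(self, now: float) -> int:
        """Return the HTTP status code for a request arriving at time `now`."""
        # Start a new window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return 200
        return 429  # Too Many Requests


limiter = FixedWindowLimiter(limit=300, window_seconds=60)
# 301 requests arrive within the same minute.
statuses = [limiter.check(now=1.0) for _ in range(301)]
```

In this sketch the first 300 entries of `statuses` are 200 and the last one is 429. A request that arrives after the window has elapsed is admitted again.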
Throttling modes
Envoy proxies implement throttling in two modes: local throttling and global throttling. Local throttling limits the request rate of each service instance. Global throttling uses a global gRPC rate limit service to provide throttling for the entire Service Mesh (ASM) instance. Local throttling can be used together with global throttling to provide different levels of throttling.
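Mode | Description | References
Local throttling | Limits the request rate of each service instance. |
Global or distributed throttling | Uses a global gRPC rate limit service to provide throttling for the entire ASM instance. |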
How local throttling works
An Envoy proxy uses the token bucket algorithm to implement local throttling. The token bucket algorithm limits the number of requests sent to a service based on the number of tokens available in a bucket. Tokens are added to the bucket at a constant rate. Each request that is sent to the service removes one token from the bucket. When the bucket is empty, requests are denied. Generally, you need to specify the following parameters:
The interval at which the bucket is filled
The number of tokens added to the bucket at each interval
By default, when a request is denied, an Envoy proxy returns the HTTP 429 status code and sets the x-envoy-ratelimited response header. You can customize the HTTP status code and response header.
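The token bucket behavior described above can be sketched as follows. This is an illustrative model and not Envoy's implementation. The class `TokenBucket` and the method `handle` are hypothetical, and the bucket capacity `max_tokens` is an extra parameter assumed here so that the bucket cannot grow without bound. The header value `"true"` is also an assumption.

```python
from typing import Dict, Tuple


class TokenBucket:
    """Illustrative token bucket: refilled at fixed intervals, one token per request."""

    def __init__(self, max_tokens: int, tokens_per_fill: int, fill_interval: float) -> None:
        self.max_tokens = max_tokens            # assumed bucket capacity
        self.tokens_per_fill = tokens_per_fill  # tokens added at each interval
        self.fill_interval = fill_interval      # seconds between refills
        self.tokens = max_tokens                # the bucket starts full
        self.last_fill = 0.0

    def _refill(self, now: float) -> None:
        # Add tokens for every whole interval that has elapsed since the last refill.
        intervals = int((now - self.last_fill) // self.fill_interval)
        if intervals > 0:
            self.tokens = min(self.max_tokens, self.tokens + intervals * self.tokens_per_fill)
            self.last_fill += intervals * self.fill_interval

    def handle(self, now: float) -> Tuple[int, Dict[str, str]]:
        """Return (status code, extra response headers) for a request at time `now`."""
        self._refill(now)
        if self.tokens > 0:
            self.tokens -= 1
            return 200, {}
        # Empty bucket: deny with the default status code and header.
        return 429, {"x-envoy-ratelimited": "true"}


bucket = TokenBucket(max_tokens=2, tokens_per_fill=1, fill_interval=1.0)
responses = [bucket.handle(now=t) for t in (0.0, 0.1, 0.2, 1.0, 1.5)]
```

With a capacity of two tokens and one token added per second, the first two requests pass, the third is denied, the request at 1.0 seconds passes after a refill, and the request at 1.5 seconds is denied again.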
Take note of the following concepts when you use the throttling feature:
http_filter_enabled: indicates the percentage of requests for which the local rate limit is checked but not enforced.
http_filter_enforcing: indicates the percentage of requests on which the local rate limit is applied or enforced.
Set the values to percentages. For example, you can set http_filter_enabled to 10% of requests and http_filter_enforcing to 5% of requests. This way, you can test the effect of throttling before it is applied to all the requests.
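The effect of the two percentages can be sketched as a per-request decision. The sketch assumes that enforcement is evaluated only for requests that fall into the enabled fraction, which matches the author's understanding of Envoy but should be verified against the Envoy documentation. The function `decide` and its return labels are hypothetical.

```python
import random
from collections import Counter


def decide(over_limit: bool, enabled_pct: float, enforcing_pct: float,
           rng: random.Random) -> str:
    """Classify one request given whether the token bucket is currently empty."""
    # Requests outside the enabled fraction bypass the rate limit check.
    if not rng.random() < enabled_pct / 100:
        return "skipped"
    if not over_limit:
        return "allowed"
    # The limit was exceeded: deny only for the enforced fraction.
    if rng.random() < enforcing_pct / 100:
        return "denied"
    # Checked but not enforced: the request still passes.
    return "would_deny"


# Simulate 10,000 over-limit requests with 10% enabled and 5% enforcing.
rng = random.Random(42)
outcomes = Counter(decide(True, 10, 5, rng) for _ in range(10_000))
```

In such a simulation most requests are skipped, a small share is only observed ("would_deny"), and an even smaller share is actually denied. This is what makes a gradual rollout possible.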
How global throttling works
Global throttling of Envoy is a mechanism used to control the request rates in an ASM instance. It is implemented based on the rate limit service of Envoy. The rate limit service centrally processes traffic of the entire ASM instance and limits the rates of requests based on predefined rules and quotas.
The configuration of global throttling involves two parts: the Envoy rate_limits filter and the configuration of the rate limit service.
The rate_limits filter contains a list of actions. An Envoy proxy attempts to match each request against each action in the rate_limits filter. A descriptor is generated for each action. A descriptor consists of a set of descriptor entries that correspond to an action. Each descriptor entry is a key-value pair, such as "descriptor-key-1": "descriptor-value-1" and "descriptor-key-2": "descriptor-value-2". For more information, see config-http-filters-rate-limit.
The configuration of the rate limit service is matched against the descriptor entries generated for each request. The configuration of the rate limit service specifies the rate limit for a specific set of descriptor entries. The rate limit service interacts with the Redis cache to determine whether to limit the rates of requests and sends the throttling decision to the Envoy proxy.
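Descriptor generation can be sketched as follows. The sketch assumes that each matched action contributes one key-value entry and that the entries together form the descriptor sent to the rate limit service. Only two action types are modeled, as plain dictionaries. The function `build_descriptor`, the dictionary layout, and the `x-user-id` header are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Entry = Tuple[str, str]


def build_descriptor(actions: List[Dict[str, str]],
                     headers: Dict[str, str]) -> Optional[List[Entry]]:
    """Turn a list of actions into descriptor entries for one request."""
    entries: List[Entry] = []
    for action in actions:
        if action["type"] == "generic_key":
            # A fixed value that does not depend on the request.
            entries.append((action["descriptor_key"], action["descriptor_value"]))
        elif action["type"] == "request_headers":
            value = headers.get(action["header_name"])
            if value is None:
                # Assumption: if an action cannot produce an entry,
                # no descriptor is generated for this request.
                return None
            entries.append((action["descriptor_key"], value))
        else:
            raise ValueError(f"unsupported action type: {action['type']}")
    return entries


actions = [
    {"type": "generic_key", "descriptor_key": "descriptor-key-1",
     "descriptor_value": "descriptor-value-1"},
    {"type": "request_headers", "header_name": "x-user-id",
     "descriptor_key": "descriptor-key-2"},
]
descriptor = build_descriptor(actions, {"x-user-id": "descriptor-value-2"})
missing = build_descriptor(actions, {})
```

Here `descriptor` holds the two key-value pairs used as examples in the text, and `missing` is `None` because the request lacks the header that the second action needs.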
Global throttling is implemented by combining the rate_limits filter with the configuration of the rate limit service. The rate_limits filter generates a descriptor based on the configured actions and sends the descriptor to the rate limit service. The rate limit service determines the applicable limit based on the information in the descriptor and returns a throttling response. This mechanism allows you to control request rates across the entire ASM instance and protect backend services against request bursts.
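The service side of this flow can be sketched as a lookup from descriptor to limit plus a shared counter. The sketch is a simplified model and not the actual rate limit service: an in-memory dictionary stands in for the Redis cache, counting uses fixed time windows, and the class `RateLimitService`, its method, and the `"OK"` and `"OVER_LIMIT"` labels are illustrative assumptions.

```python
from typing import Dict, Tuple

Descriptor = Tuple[Tuple[str, str], ...]


class RateLimitService:
    """Toy rate limit service: maps descriptors to limits and counts requests."""

    def __init__(self, rules: Dict[Descriptor, Tuple[int, int]]) -> None:
        # rules: descriptor -> (allowed requests, window length in seconds)
        self.rules = rules
        # Stand-in for the Redis cache: (descriptor, window index) -> request count.
        self.counters: Dict[Tuple[Descriptor, int], int] = {}

    def should_rate_limit(self, descriptor: Descriptor, now: float) -> str:
        rule = self.rules.get(descriptor)
        if rule is None:
            return "OK"  # no limit configured for this descriptor
        limit, window = rule
        key = (descriptor, int(now // window))
        self.counters[key] = self.counters.get(key, 0) + 1
        return "OK" if self.counters[key] <= limit else "OVER_LIMIT"


descriptor = (("descriptor-key-1", "descriptor-value-1"),)
service = RateLimitService({descriptor: (2, 60)})  # 2 requests per 60 seconds
decisions = [service.should_rate_limit(descriptor, now=t) for t in (0, 1, 2, 60)]
```

Because the counter lives in the service and not in each proxy, every Envoy proxy that sends the same descriptor draws from the same quota.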