
Alibaba Cloud Service Mesh:Configure global rate limiting for sidecar inbound traffic

Last Updated: Mar 11, 2026

When multiple replicas of a service run behind a load balancer, per-instance rate limits cannot enforce a total request cap across the service. Global rate limiting solves this by using a centralized service to coordinate request counts across all Envoy sidecars in the mesh.

Service Mesh (ASM) V1.18.0.131 and later provides the ASMGlobalRateLimiter custom resource (CR) to configure global rate limiting for inbound traffic on services with injected sidecar proxies. This CR provides a stable, declarative API that abstracts away low-level Envoy filter configuration.

This topic covers two scenarios:

  • Port-level rate limiting: Limit all requests to a specific service port.

  • Path-level rate limiting: Limit requests to a specific URL path on a service port.

Note

For per-instance rate limits, see Configure local rate limiting in Traffic Management Center.

Global vs. local rate limiting

  • Global rate limiting: A centralized gRPC service (backed by Redis) tracks request counts across all instances. Best for enforcing a hard cap across the entire service, regardless of replica count.

  • Local rate limiting: Each Envoy sidecar enforces limits independently, with no external dependencies. Best for protecting individual instances from overload.

How it works

Global rate limiting relies on three components:

  1. Redis -- Stores request counters shared across all Envoy sidecars.

  2. Rate limit service -- A gRPC service that Envoy sidecars query before forwarding requests. It checks counters in Redis and returns allow or deny decisions.

  3. ASMGlobalRateLimiter CR -- Declarative configuration applied to the ASM control plane. ASM translates it into Envoy filter configurations and generates the rate limit service config in the CR's status field.

Client request --> Envoy sidecar --> Rate limit service (gRPC) --> Redis
                       |
              Allow or deny (HTTP 429)
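Conceptually, each sidecar asks the rate limit service whether the current time window still has quota; the service increments a shared counter in Redis and compares it against the configured limit. The following Python sketch illustrates this fixed-window check with a dict standing in for Redis (an illustration only; the real envoyproxy/ratelimit service implements this in Go against actual Redis counters):

```python
import time

class FixedWindowLimiter:
    """Illustrative fixed-window counter; a dict stands in for the
    Redis counters used by the real rate limit service."""

    def __init__(self, quota: int, unit_seconds: int):
        self.quota = quota            # e.g. quota: 1
        self.unit = unit_seconds      # e.g. 60 for unit: MINUTE
        self.counters = {}            # (descriptor, window index) -> count

    def should_allow(self, descriptor: str, now=None) -> bool:
        """Increment the counter for the current window, then decide."""
        now = time.time() if now is None else now
        window = int(now // self.unit)          # fixed window index
        key = (descriptor, window)
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.quota

limiter = FixedWindowLimiter(quota=1, unit_seconds=60)
print(limiter.should_allow("generic_key", now=0))   # True: first request in window
print(limiter.should_allow("generic_key", now=10))  # False: over quota -> HTTP 429
print(limiter.should_allow("generic_key", now=65))  # True: a new window has started
```

Because the counter is shared, every sidecar replica sees the same count, which is what makes the cap global rather than per-instance.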

Prerequisites

  • An ACK cluster is added to an ASM instance of V1.18.0.131 or later.

  • The HTTPBin service is deployed with an injected sidecar proxy, and a sleep pod is available for sending test requests.

Deploy the rate limit service

Deploy Redis and the rate limit service to your ACK cluster before configuring rate limiting rules.

  1. Create a file named ratelimit-svc.yaml with the following content:

    ratelimit-svc.yaml

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: redis
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
      labels:
        app: redis
    spec:
      ports:
      - name: redis
        port: 6379
      selector:
        app: redis
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: redis
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: redis
      template:
        metadata:
          labels:
            app: redis
            sidecar.istio.io/inject: "false"
        spec:
          containers:
          - image: redis:alpine
            imagePullPolicy: Always
            name: redis
            ports:
            - name: redis
              containerPort: 6379
          restartPolicy: Always
          serviceAccountName: redis
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ratelimit-config
    data:
      config.yaml: |
        {}
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: ratelimit
      labels:
        app: ratelimit
    spec:
      ports:
      - name: http-port
        port: 8080
        targetPort: 8080
        protocol: TCP
      - name: grpc-port
        port: 8081
        targetPort: 8081
        protocol: TCP
      - name: http-debug
        port: 6070
        targetPort: 6070
        protocol: TCP
      selector:
        app: ratelimit
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: ratelimit
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: ratelimit
      strategy:
        type: Recreate
      template:
        metadata:
          labels:
            app: ratelimit
            sidecar.istio.io/inject: "false"
        spec:
          containers:
            # Latest image from https://hub.docker.com/r/envoyproxy/ratelimit/tags
          - image: envoyproxy/ratelimit:e059638d
            imagePullPolicy: Always
            name: ratelimit
            command: ["/bin/ratelimit"]
            env:
            - name: LOG_LEVEL
              value: debug
            - name: REDIS_SOCKET_TYPE
              value: tcp
            - name: REDIS_URL
              value: redis:6379
            - name: USE_STATSD
              value: "false"
            - name: RUNTIME_ROOT
              value: /data
            - name: RUNTIME_SUBDIRECTORY
              value: ratelimit
            - name: RUNTIME_WATCH_ROOT
              value: "false"
            - name: RUNTIME_IGNOREDOTFILES
              value: "true"
            ports:
            - containerPort: 8080
            - containerPort: 8081
            - containerPort: 6070
            volumeMounts:
            - name: config-volume
              # $RUNTIME_ROOT/$RUNTIME_SUBDIRECTORY/$RUNTIME_APPDIRECTORY/config.yaml
              mountPath: /data/ratelimit/config
          volumes:
          - name: config-volume
            configMap:
              name: ratelimit-config
  2. Run the following command in the ACK cluster to deploy the services:

    kubectl apply -f ratelimit-svc.yaml

Scenario 1: Rate limit all requests on a service port

This scenario limits all requests to port 8000 of the HTTPBin service to 1 request per minute.

The workflow has three steps: create the ASMGlobalRateLimiter CR on the ASM instance, sync the generated config to the rate limit service in the ACK cluster, then verify.

Step 1: Create the ASMGlobalRateLimiter CR

  1. Create a file named global-ratelimit-svc.yaml:

    global-ratelimit-svc.yaml

    apiVersion: istio.alibabacloud.com/v1beta1
    kind: ASMGlobalRateLimiter
    metadata:
      name: global-svc-test
      namespace: default
    spec:
      workloadSelector:
        labels:
          app: httpbin
      rateLimitService:
        host: ratelimit.default.svc.cluster.local
        port: 8081
        timeout:
          seconds: 5
      isGateway: false
      configs:
      - name: httpbin
        limit:
          unit: MINUTE
          quota: 1
        match:
          vhost:
            name: '*'
            port: 8000

    The following list explains the key fields. For a full field reference, see ASMGlobalRateLimiter field descriptions.

    • workloadSelector: Selects the target workload. Set app: httpbin to apply rate limiting to the HTTPBin service.

    • isGateway: Set to false because the rule targets a sidecar workload, not an ingress gateway.

    • rateLimitService: Connection settings for the rate limit service: hostname, gRPC port (8081), and a 5-second timeout.

    • limit: The rate limiting threshold. unit: MINUTE and quota: 1 means 1 request per minute on the matched route.

    • vhost: Matches the virtual host. name: '*' with port: 8000 applies the rule to all requests on HTTPBin port 8000.

  2. Run the following command in the ASM instance to apply the CR:

    kubectl apply -f global-ratelimit-svc.yaml

Step 2: Sync the rate limit config to the data plane

After ASM processes the CR, it generates rate limit service configuration in the status.config.yaml field. You must copy this config into the rate limit service's ConfigMap in the ACK cluster. This manual sync is required because the ASM control plane and ACK data plane run in separate clusters.

  1. Retrieve the generated config from the ASM instance:

    kubectl get asmglobalratelimiter global-svc-test -o yaml

    In the output, locate the status section:

    Expected output

    status:
      config.yaml: |
        descriptors:
        - key: generic_key
          rate_limit:
            requests_per_unit: 1
            unit: MINUTE
          value: RateLimit[global-svc-test.default]-Id[3833670472]
        domain: ratelimit.default.svc.cluster.local
      message: ok
      status: successful
  2. Create a ratelimit-config.yaml file. Copy the config.yaml content from the status section into the data.config.yaml field of the ConfigMap exactly as shown:

    Important

    Copy the config.yaml value from the status section without modification. Any changes cause the rate limit service to reject the configuration.

    ratelimit-config.yaml

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ratelimit-config
    data:
      config.yaml: |
        descriptors:
        - key: generic_key
          rate_limit:
            requests_per_unit: 1
            unit: MINUTE
          value: RateLimit[global-svc-test.default]-Id[3833670472]
        domain: ratelimit.default.svc.cluster.local
  3. Apply the ConfigMap to the ACK cluster:

    kubectl apply -f ratelimit-config.yaml
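The copy step can also be scripted. The following is a minimal sketch, assuming you have already extracted the status.config.yaml text from the CR; build_ratelimit_configmap is an illustrative helper name, not part of any ASM tooling:

```python
import textwrap

def build_ratelimit_configmap(status_config_yaml: str) -> str:
    """Wrap the config.yaml text taken verbatim from the
    ASMGlobalRateLimiter status into a ratelimit-config ConfigMap."""
    header = (
        "apiVersion: v1\n"
        "kind: ConfigMap\n"
        "metadata:\n"
        "  name: ratelimit-config\n"
        "data:\n"
        "  config.yaml: |\n"
    )
    # Indent the config under the literal block scalar, unmodified,
    # so the rate limit service receives exactly what ASM generated.
    return header + textwrap.indent(status_config_yaml, "    ")

status_config = (
    "descriptors:\n"
    "- key: generic_key\n"
    "  rate_limit:\n"
    "    requests_per_unit: 1\n"
    "    unit: MINUTE\n"
    "  value: RateLimit[global-svc-test.default]-Id[3833670472]\n"
    "domain: ratelimit.default.svc.cluster.local\n"
)
print(build_ratelimit_configmap(status_config))
```

Write the result to ratelimit-config.yaml and apply it with kubectl as in step 3.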

Verify

Send two requests to HTTPBin port 8000 from the sleep pod:

kubectl exec -it deploy/sleep -- sh

Then run:

curl httpbin:8000/get -v
curl httpbin:8000/get -v

Expected output for the second request:

< HTTP/1.1 429 Too Many Requests
< x-envoy-ratelimited: true
< x-ratelimit-limit: 1, 1;w=60
< x-ratelimit-remaining: 0
< x-ratelimit-reset: 5
< date: Thu, 26 Oct 2023 04:23:54 GMT
< server: envoy
< content-length: 0
< x-envoy-upstream-service-time: 2
<
* Connection #0 to host httpbin left intact

The second request returns HTTP 429, confirming that global rate limiting is active on inbound traffic to the sidecar-injected service: only one request per minute can reach the HTTPBin service, regardless of which replica serves it.
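The x-ratelimit-limit value in the throttled response ("1, 1;w=60") appears to follow the draft IETF RateLimit header convention: the active limit, then one or more quota policies in which w= gives the window in seconds, so 1;w=60 reads as 1 request per 60-second window. A small parser as an illustration (the header format comes from Envoy, not from the ASM API; parse_ratelimit_limit is a hypothetical helper):

```python
def parse_ratelimit_limit(value: str):
    """Parse an x-ratelimit-limit header such as "1, 1;w=60" into
    (limit, [(quota, window_seconds), ...])."""
    parts = [p.strip() for p in value.split(",")]
    limit = int(parts[0])                  # the active limit
    policies = []
    for p in parts[1:]:                    # quota policies, e.g. "1;w=60"
        items = p.split(";")
        quota = int(items[0])
        window = None
        for param in items[1:]:
            k, _, v = param.partition("=")
            if k.strip() == "w":           # w = window length in seconds
                window = int(v)
        policies.append((quota, window))
    return limit, policies

print(parse_ratelimit_limit("1, 1;w=60"))  # (1, [(1, 60)])
```

Together with x-ratelimit-remaining and x-ratelimit-reset, clients can use these headers to back off until the window resets.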

Scenario 2: Rate limit requests to a specific path

This scenario limits requests to the /headers path on HTTPBin port 8000 to 1 request per minute, while allowing unlimited access to other paths like /get.

The configuration differs depending on your ASM version.

Step 1: Create the ASMGlobalRateLimiter CR

Choose the YAML that matches your ASM version:

ASM earlier than V1.19.0

apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMGlobalRateLimiter
metadata:
  name: global-svc-test
  namespace: default
spec:
  workloadSelector:
    labels:
      app: httpbin
  rateLimitService:
    host: ratelimit.default.svc.cluster.local
    port: 8081
    timeout:
      seconds: 5
  isGateway: false
  configs:
  - name: httpbin
    limit:
      unit: MINUTE
      quota: 1
    match:
      vhost:
        name: '*'
        port: 8000
        route:
          header_match:
          - name: ":path"
            prefix_match: "/headers"

In versions earlier than V1.19.0, path matching is configured inside match.vhost.route.header_match. The :path pseudo-header matches request paths by prefix.

ASM V1.19.0 or later (recommended)

apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMGlobalRateLimiter
metadata:
  name: global-svc-test
  namespace: default
spec:
  workloadSelector:
    labels:
      app: httpbin
  rateLimitService:
    host: ratelimit.default.svc.cluster.local
    port: 8081
    timeout:
      seconds: 5
  isGateway: false
  configs:
  - name: httpbin
    limit:
      unit: SECOND
      quota: 100000
    match:
      vhost:
        name: '*'
        port: 8000
    limit_overrides:
    - request_match:
        header_match:
        - name: ":path"
          prefix_match: "/headers"
      limit:
        unit: MINUTE
        quota: 1

In V1.19.0 and later, use the limit_overrides field for path-based matching. The base limit is set to a high value (100,000 requests/second) so that only requests matching the override are rate limited.
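Because limit_overrides is a list, one CR can apply different thresholds to different routes. The following fragment sketches a second, hypothetical override for a /status path alongside the /headers rule (illustrative only; adjust the paths and quotas to your service):

```yaml
    limit_overrides:
    - request_match:
        header_match:
        - name: ":path"
          prefix_match: "/headers"
      limit:
        unit: MINUTE
        quota: 1
    - request_match:          # hypothetical second override
        header_match:
        - name: ":path"
          prefix_match: "/status"
      limit:
        unit: MINUTE
        quota: 10
```

Requests that match no override fall back to the effectively unlimited base rule.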

The following list explains the additional fields used in this scenario:

  • limit: The base rate limiting threshold. unit is the time unit for counting and quota is the number of requests allowed per unit. In V1.19.0 and later, the base rule is set to unit: SECOND and quota: 100000 (100,000 requests per second), which effectively disables throttling at the base level; the actual limits are defined in limit_overrides.

  • limit_overrides: (V1.19.0+) Overrides the base threshold for requests matching specific criteria. Each override specifies its own request_match and limit.

  • vhost: The domain name and route on which throttling takes effect. In versions earlier than V1.19.0, configure header matching rules in the route section; in V1.19.0 and later, configure them in limit_overrides.

  • route.header_match: (Pre-V1.19.0) Matches requests by HTTP header values within the vhost route. Use the :path pseudo-header for URL path matching.

Step 2: Sync the rate limit config to the data plane

Follow the same process as Scenario 1:

  1. Retrieve the generated config from the ASM instance:

    kubectl get asmglobalratelimiter global-svc-test -o yaml

    Expected status output (V1.19.0+)

    status:
      config.yaml: |
        descriptors:
        - descriptors:
          - key: header_match
            rate_limit:
              requests_per_unit: 1
              unit: MINUTE
            value: RateLimit[global-svc-test.default]-Id[2613586978]
          key: generic_key
          rate_limit:
            requests_per_unit: 100000
            unit: SECOND
          value: RateLimit[global-svc-test.default]-Id[2613586978]
        domain: ratelimit.default.svc.cluster.local
      message: ok
      status: successful
  2. Copy the config.yaml from the status section into a ConfigMap and apply it to the ACK cluster:

    ratelimit-config.yaml

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ratelimit-config
    data:
      config.yaml: |
        descriptors:
        - descriptors:
          - key: header_match
            rate_limit:
              requests_per_unit: 1
              unit: MINUTE
            value: RateLimit[global-svc-test.default]-Id[2613586978]
          key: generic_key
          rate_limit:
            requests_per_unit: 100000
            unit: SECOND
          value: RateLimit[global-svc-test.default]-Id[2613586978]
        domain: ratelimit.default.svc.cluster.local

    kubectl apply -f ratelimit-config.yaml

Verify

  1. Send two requests to the /headers path from the sleep pod:

    kubectl exec -it deploy/sleep -- sh

    Then run:

    curl httpbin:8000/headers -v
    curl httpbin:8000/headers -v

    Expected output for the second request:

    < HTTP/1.1 429 Too Many Requests
    < x-envoy-ratelimited: true
    < x-ratelimit-limit: 1, 1;w=60
    < x-ratelimit-remaining: 0
    < x-ratelimit-reset: 5
    < date: Thu, 26 Oct 2023 04:23:54 GMT
    < server: envoy
    < content-length: 0
    < x-envoy-upstream-service-time: 2
    <
    * Connection #0 to host httpbin left intact

    The second request to /headers is rate limited. Only one request is allowed to access the /headers path of the HTTPBin service within one minute.

  2. Confirm that other paths are unaffected:

    curl httpbin:8000/get -v

    Expected output

    *   Trying 192.168.243.21:8000...
    * Connected to httpbin (192.168.243.21) port 8000 (#0)
    > GET /get HTTP/1.1
    > Host: httpbin:8000
    > User-Agent: curl/8.1.2
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < server: envoy
    < date: Thu, 11 Jan 2024 06:25:09 GMT
    < content-type: application/json
    < content-length: 431
    < access-control-allow-origin: *
    < access-control-allow-credentials: true
    < x-envoy-upstream-service-time: 7
    <
    {
      "args": {},
      "headers": {
        "Accept": "*/*",
        "Host": "httpbin:8000",
        "User-Agent": "curl/8.1.2",
        "X-Envoy-Attempt-Count": "1",
        "X-Forwarded-Client-Cert": "By=spiffe://cluster.local/ns/default/sa/httpbin;Hash=be10819991ba1a354a89e68b3bed1553c12a4fba8b65fbe0f16299d552680b29;Subject=\"\";URI=spiffe://cluster.local/ns/default/sa/sleep"
      },
      "origin": "127.0.0.6",
      "url": "http://httpbin:8000/get"
    }
    * Connection #0 to host httpbin left intact

    Requests to /get succeed because only the /headers path is subject to rate limiting.

Monitor global rate limiting metrics

Envoy sidecars expose the following counters for global rate limiting:

  • envoy_cluster_ratelimit_ok (Counter): The total number of requests allowed by the global rate limit service.

  • envoy_cluster_ratelimit_over_limit (Counter): The total number of requests denied for exceeding the global limit.

  • envoy_cluster_ratelimit_error (Counter): The total number of requests for which the call to the rate limit service failed.

Enable metric reporting

  1. Configure proxyStatsMatcher for the sidecar proxy. Select Regular Expression Match and set the value to .*ratelimit.*. For more information, see the "proxyStatsMatcher" section of Configure sidecar proxies.

  2. Redeploy the HTTPBin service to pick up the new proxy configuration. For more information, see the "(Optional) Redeploy workloads" section of Configure sidecar proxies.

  3. Configure global throttling and perform request tests. For more information, see Scenario 1 or Configure local rate limiting in Traffic Management Center.

  4. Run the following command to view the global throttling metrics of the HTTPBin service:

    kubectl exec -it deploy/httpbin -c istio-proxy -- curl localhost:15090/stats/prometheus | grep envoy_cluster_ratelimit

    Example output:

    # TYPE envoy_cluster_ratelimit_ok counter
    envoy_cluster_ratelimit_ok{cluster_name="inbound|80||"} 904
    # TYPE envoy_cluster_ratelimit_over_limit counter
    envoy_cluster_ratelimit_over_limit{cluster_name="inbound|80||"} 3223
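If you post-process the scraped stats outside of Prometheus, the counters are plain text in the Prometheus exposition format. A minimal Python sketch that sums the ratelimit counters per metric name (illustrative; parse_ratelimit_counters is a hypothetical helper):

```python
def parse_ratelimit_counters(prometheus_text: str) -> dict:
    """Extract envoy_cluster_ratelimit_* counters from Prometheus
    text exposition output into {metric_name: total_value}."""
    counters = {}
    for line in prometheus_text.splitlines():
        line = line.strip()
        # Skip comments ("# TYPE ..." / "# HELP ...") and other metrics.
        if not line.startswith("envoy_cluster_ratelimit"):
            continue
        name_part, _, value = line.rpartition(" ")
        metric = name_part.split("{", 1)[0]   # drop the label set
        counters[metric] = counters.get(metric, 0) + float(value)
    return counters

sample = '''# TYPE envoy_cluster_ratelimit_ok counter
envoy_cluster_ratelimit_ok{cluster_name="inbound|80||"} 904
# TYPE envoy_cluster_ratelimit_over_limit counter
envoy_cluster_ratelimit_over_limit{cluster_name="inbound|80||"} 3223'''

print(parse_ratelimit_counters(sample))
# {'envoy_cluster_ratelimit_ok': 904.0, 'envoy_cluster_ratelimit_over_limit': 3223.0}
```

Summing across label sets gives per-metric totals; keep the labels instead if you need per-cluster breakdowns.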

Set up Prometheus alerts

Use Managed Service for Prometheus to collect rate limiting metrics and trigger alerts when rate limiting occurs.

  1. Connect the ACK cluster to the Alibaba Cloud ASM component in Managed Service for Prometheus, or upgrade the component to the latest version. For more information, see Manage components.

    Note

    If you already use a self-managed Prometheus instance to collect ASM metrics, skip this step. For more information, see Monitor ASM instances by using a self-managed Prometheus instance.

  2. Create a custom alert rule with the following PromQL and message template. For detailed instructions, see Create an alert rule with a custom PromQL statement. This rule fires when any service has at least one rate-limited request in a rolling one-minute window, grouped by namespace and service name.

    • PromQL statement: (sum by(namespace, service_istio_io_canonical_name) (increase(envoy_cluster_ratelimit_over_limit[1m]))) > 0

    • Alert message: Global rate limiting triggered. Namespace: {{$labels.namespace}}. Service: {{$labels.service_istio_io_canonical_name}}. Throttled requests in the last minute: {{ $value }}
