
Alibaba Cloud Service Mesh: Use AverageLatencySchedulingPolicy to implement priority-based request scheduling

Last Updated: Aug 01, 2024

The traffic scheduling suite of Service Mesh (ASM) supports priority-based request scheduling policies. When the system is overloaded, high-priority requests can be processed preferentially. This topic describes how to use AverageLatencySchedulingPolicy provided by the traffic scheduling suite to implement priority-based request scheduling.

Background information

A priority-based request scheduling policy compares the real-time latency with the historical average latency to determine whether traffic overload occurs. If traffic overload occurs, requests are scheduled based on a token bucket mechanism and request priorities. The priority-based request scheduling policy works in the following way:

  1. Overload detection: The policy compares the average latency over a recent time window with the current latency to determine whether the system is overloaded.

  2. Adjustment of the token issuance rate: If an overload occurs, the monitoring data obtained in the previous step is sent to a dedicated controller, which adjusts the fill rate of the token bucket accordingly.

  3. Request scheduling: Different requests have different priorities. When an overload occurs, requests with higher priorities have a greater chance of obtaining tokens.

This policy can be used to queue requests when services are congested by highly concurrent requests and response latency keeps increasing. Unlike commonly used throttling policies, this policy does not directly reject requests but puts them in a priority queue. The request rate is limited by using the token bucket mechanism, and the request processing order is adjusted according to request priorities.
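The following minimal sketch (in Python, not ASM's actual implementation) illustrates the scheduling idea described above: when the token bucket cannot immediately serve every queued request, a workload's priority weight determines how likely its requests are to obtain the next token. The queue names and priority values mirror the guest and subscriber workloads configured later in this topic.

import random

def pick_next_request(queues, priorities):
    """Pick the next queued request to receive a token.

    queues: dict mapping workload name -> list of pending requests.
    priorities: dict mapping workload name -> priority weight.
    """
    # Only workloads that currently have queued requests compete for the token.
    candidates = [name for name, q in queues.items() if q]
    if not candidates:
        return None
    weights = [priorities[name] for name in candidates]
    # A higher priority weight means a proportionally greater chance of
    # winning the token; lower-priority requests still get through, just
    # less often, instead of being rejected outright.
    chosen = random.choices(candidates, weights=weights, k=1)[0]
    return queues[chosen].pop(0)

# With guest=50 and subscriber=200, a queued subscriber request is about
# four times as likely as a queued guest request to obtain each token.
queues = {"guest": ["g1", "g2"], "subscriber": ["s1", "s2"]}
priorities = {"guest": 50.0, "subscriber": 200.0}
print(pick_next_request(queues, priorities))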

Prerequisites

  • The traffic scheduling suite is enabled for the ASM instance.

  • An ASM gateway is created, and the httpbin demo application is deployed in the cluster and can be accessed through the ASM gateway.

Step 1: Create AverageLatencySchedulingPolicy

  1. Use kubectl to connect to the ASM instance. For more information, see Use kubectl on the control plane to access Istio resources.

  2. Create an AverageLatencySchedulingPolicy.yaml file that contains the following content:

    apiVersion: istio.alibabacloud.com/v1
    kind: AverageLatencySchedulingPolicy
    metadata:
      name: workload-prioritization
      namespace: istio-system
    spec:
      load_scheduling_core:
        aimd_load_scheduler:
          load_scheduler:
            workload_latency_based_tokens: true
            selectors:
              - service: httpbin.default.svc.cluster.local
            scheduler:
              workloads:
                - label_matcher:
                    match_labels:
                      http.request.header.user_type: "guest"
                  parameters:
                    priority: 50.0
                  name: "guest"
                - label_matcher:
                    match_labels:
                      http.request.header.user_type: "subscriber"
                  parameters:
                    priority: 200.0
                  name: "subscriber"
                  

    The following list describes some of the fields.

    workload_latency_based_tokens: Indicates whether to dynamically adjust the number of tokens based on the average latency of the workload. The average latency is estimated over a time window of the last 30 minutes.

    service: The service on which the scheduling policy takes effect.

    workloads: Two types of requests are defined based on the user_type request header: guest and subscriber. A request of the guest type has a priority of 50, and a request of the subscriber type has a priority of 200.

    For more information about the fields supported by AverageLatencySchedulingPolicy, see Description of AverageLatencySchedulingPolicy fields.
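  3. Run the following command to create the policy:

    kubectl apply -f AverageLatencySchedulingPolicy.yaml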

Step 2: Perform tests

In this example, the load testing tool fortio is used. For more information, see Install fortio.

First, simulate normal service requests so that the policy can learn the average request latency. In the following command, -c 20 sets 20 concurrent connections, -qps 100000 sets a target QPS high enough to be effectively unlimited, and -t 60m keeps the load running for 60 minutes:

fortio load -c 20 -qps 100000 -t 60m http://${IP address of the ASM gateway}/status/200

Three minutes after you run the preceding command, open two new terminals to send test requests. Throughout the test, make sure that the preceding fortio process keeps running and do not close its terminal.

Run the following two load testing commands, one in each terminal. Start the two tests as close to simultaneously as possible:

fortio load -c 40 -qps 100000  -H "user_type:guest" -t 3m http://${IP address of the ASM gateway}/status/201
fortio load -c 40 -qps 100000  -H "user_type:subscriber" -t 3m http://${IP address of the ASM gateway}/status/202

The two commands set the user_type request header to match the workloads defined in the policy, and they use different request paths so that the test results of the two users can be distinguished in the access logs.

Step 3: Analyze the test results

The following code blocks show the last few lines of the output of each command in the current test environment. The first block is for the guest user and the second is for the subscriber user:

Code 201 : 26852 (97.8 %)
Code 503 : 601 (2.2 %)
Response Header Sizes : count 27453 avg 242.91564 +/- 36.35 min 0 max 249 sum 6668763
Response Body/Total Sizes : count 27453 avg 246.17754 +/- 14.56 min 149 max 249 sum 6758312
All done 27453 calls (plus 40 warmup) 262.318 ms avg, 152.4 qps

Code 202 : 52765 (100.0 %)
Response Header Sizes : count 52765 avg 248.86358 +/- 0.5951 min 248 max 250 sum 13131287
Response Body/Total Sizes : count 52765 avg 248.86358 +/- 0.5951 min 248 max 250 sum 13131287
All done 52765 calls (plus 40 warmup) 136.472 ms avg, 292.9 qps

The outputs indicate the following:

  • All requests from the subscriber user succeeded, and no HTTP 503 status code was returned. In contrast, an HTTP 503 status code was returned for 2.2% of the requests from the guest user.

  • The average latency of requests from the subscriber user is about 136 ms, while that of requests from the guest user is about 262 ms. In addition, the queries per second (QPS) achieved by the two users (292.9 versus 152.4) differs significantly.

It can be inferred that AverageLatencySchedulingPolicy preferentially processes the requests of the subscriber user when the request latency increases.

Important

The test results in this topic are example values. The actual data depends on your operating environment, and load test results vary across test environments. This example analyzes only the relative relationship between the values.

(Optional) Step 4: Analyze the test results based on the access logs of the ASM gateway

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Gateways > Ingress Gateway.

  3. Click the name of the ASM gateway to go to the Gateway overview page. Then, click Gateway Logs to view the access logs of the ASM gateway.

  4. The guest and subscriber users access different request paths. You can use the request paths to filter the access logs and retrieve the access results of the two users separately, as in the sketch after these steps.

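If you prefer to check the logs from the command line instead of the console log page, a minimal sketch follows. It assumes that the ingress gateway runs as the istio-ingressgateway deployment in the istio-system namespace of the data-plane cluster; adjust the namespace and deployment name to your environment:

# Retrieve the gateway access logs and filter by the request path of each user.
kubectl -n istio-system logs deploy/istio-ingressgateway --tail=10000 | grep '/status/201'
kubectl -n istio-system logs deploy/istio-ingressgateway --tail=10000 | grep '/status/202'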