Use CPU Burst to improve the performance of latency-sensitive applications - Container Service for Kubernetes

The CPU usage of a container cannot exceed the CPU limit of the container. When the CPU usage reaches the limit, CPU scheduling for the container may be throttled by the kernel, which downgrades the service quality of the application. The CPU Burst feature automatically detects CPU throttling events and automatically adjusts the CPU limit to a proper value. After this feature is enabled, the CPU usage of a container can burst above the CPU limit when the CPU demand spikes. This eliminates CPU bottlenecks and improves the service quality of latency-sensitive applications.

Note

Before you use the CPU Burst feature, we recommend that you learn about the relevant terms in CFS Scheduler and Control CPU Management Policies on the Node.

Why CPU Burst is developed

In Kubernetes clusters, the CPU limit of a container specifies the maximum amount of CPU resources that the container can use. This ensures fair CPU allocation among containers and prevents performance degradation when CPU resources are exhausted by individual containers.

CPU resources are time-shared: multiple processes or containers take turns running on the same CPU in time slices. After you specify the CPU limit of a container, the OS kernel uses the Completely Fair Scheduler (CFS) to cap the CPU time (cpu.cfs_quota_us) that the container can use within each scheduling period (cpu.cfs_period_us). For example, if you set the CPU limit of a container to 4, the kernel limits the container to 400 milliseconds of CPU time within each period, which is usually 100 milliseconds.
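The quota arithmetic in this example can be sketched in shell. The helper name cfs_quota_us is hypothetical, and the default period of 100,000 microseconds is an assumption that matches the usual 100-millisecond period mentioned above.

```shell
# cfs_quota_us LIMIT_MILLICORES [PERIOD_US]
# Hypothetical helper: converts a CPU limit in millicores into the CFS quota
# in microseconds for one scheduling period.
cfs_quota_us() {
  # Assumption: the period defaults to 100000 us (100 ms).
  echo $(( $1 * ${2:-100000} / 1000 ))
}

cfs_quota_us 4000   # CPU limit of 4 -> prints 400000 (400 ms per period)
cfs_quota_us 500    # CPU limit of 500m -> prints 50000 (50 ms per period)
```

A container that uses up this quota before the period ends is throttled until the next period starts, even if its average utilization over a full second is low.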

Benefits

CPU utilization is a key metric that is used to evaluate the performance of a container. In most cases, the CPU limit is specified based on CPU utilization. CPU utilization on a per-millisecond basis shows more spikes than on a per-second basis. The following figure shows the CPU utilization of a container. The CPU utilization on a per-second basis (purple curve) is less than 4 vCores. However, the CPU utilization on a per-millisecond basis (green curve) exceeds 4 vCores within some time periods. In this case, if you set the CPU limit to 4 vCores, CPU throttling is enforced by the OS kernel and threads in the container are suspended.

How it works

The following figures show how CPU resources are allocated to the threads of a web application container after the container receives requests. The container runs on a node with four vCores and the CPU limit of the container is set to 2. The figure on the left shows the CPU allocation details when CPU Burst is disabled. The figure on the right shows the CPU allocation details when CPU Burst is enabled.

When CPU Burst is disabled, the overall CPU utilization within the last second is low. However, Thread 2 cannot be resumed to process Request 2 until the third 100-millisecond period starts because CPU throttling is enforced during the second 100-millisecond period. This increases the response time (RT) and causes long-tail latency in the container.

After you enable CPU Burst, the container can accumulate CPU time slices when the container is idle. The container can use the accumulated CPU time slices to burst above the CPU limit when CPU demand spikes. This improves performance and reduces the RT of the container.

In addition to the preceding scenarios, CPU Burst is also suitable for handling CPU usage spikes. For example, when traffic spikes occur, ack-koordinator can eliminate CPU bottlenecks within a few seconds, while ensuring a proper number of workloads on the node.

Note

ack-koordinator achieves this by modifying the value of the CFS quota in the cgroup parameters instead of modifying the value of the CPU limit in the pod specifications.
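One way to observe this difference is to print the CPU limit from the pod specification next to the CFS quota that the container currently sees. The following is a minimal sketch, not an official tool: the helper name compare_limit_and_quota is hypothetical, and the cgroup path assumes cgroup v1 with the cpu controller visible inside the container at /sys/fs/cgroup/cpu.

```shell
# compare_limit_and_quota POD [NAMESPACE]
# Hypothetical helper: prints the CPU limit of the first container from the
# pod spec and the CFS quota read from inside that container.
compare_limit_and_quota() (
  pod=$1
  ns=${2:-default}
  limit=$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.spec.containers[0].resources.limits.cpu}')
  # Assumption: cgroup v1 layout; the path differs on cgroup v2 nodes.
  quota=$(kubectl exec "$pod" -n "$ns" -- cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)
  echo "spec limit: $limit"
  echo "cpu.cfs_quota_us: $quota"
)

# Example call (the pod name is a placeholder):
# compare_limit_and_quota my-pod
```

If the quota has been raised, the second value is larger than the pod's CPU limit multiplied by the period, while the first value stays unchanged.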

Scenarios

CPU Burst is suitable in the following scenarios:

- CPU throttling is triggered for an application even though the CPU usage of the application is usually less than its CPU limit. As a result, the performance of the application is degraded. CPU Burst allows a container to use the CPU time slices that the container accumulates when it is idle. You can enable CPU Burst to resolve CPU throttling and improve the performance of the application.
- The CPU usage of an application during startup is higher than its CPU usage after startup is complete. You can enable CPU Burst to meet the CPU requirements during startup. This way, you do not need to specify an excessively high CPU limit for the application.

Billing

No fee is charged when you install or use the ack-koordinator component. However, fees may be charged in the following scenarios:

- ack-koordinator is a non-managed component that occupies worker node resources after it is installed. You can specify the amount of resources requested by each module when you install the component.
- By default, ack-koordinator exposes the monitoring metrics of features such as resource profiling and fine-grained scheduling as Prometheus metrics. If you enable Prometheus metrics for ack-koordinator and use Managed Service for Prometheus, these metrics are considered custom metrics and fees are charged for these metrics. The fee depends on factors such as the size of your cluster and the number of applications. Before you enable Prometheus metrics, we recommend that you read the Billing topic of Managed Service for Prometheus to learn about the free quota and billing rules of custom metrics. For more information about how to monitor and manage resource usage, see Query the amount of observable data and bills.

Prerequisites

- An ACK Pro cluster is created and the Kubernetes version of the cluster is 1.18 or later. For more information, see Create an ACK managed cluster and Manually upgrade ACK clusters.
  Note: We recommend that you select Alibaba Cloud Linux for the cluster. For more information, see Does CPU Burst support only Alibaba Cloud Linux?
- ack-koordinator 0.8.0 or later is installed in the cluster. For more information, see ack-koordinator.

Usage notes

You can use an annotation to enable CPU Burst for a pod. You can also use a ConfigMap to enable CPU Burst for pods in a namespace or cluster.

Use an annotation to enable CPU Burst for a pod

To enable CPU Burst for a pod, add an annotation in the metadata section of the YAML file of the pod.

Note

To apply the configuration to a workload, such as a Deployment, add the annotation to the pod template in the template.metadata field.

annotations:
  # Set the policy to auto to enable CPU Burst for the pod.
  koordinator.sh/cpuBurst: '{"policy": "auto"}'
  # To disable CPU Burst for the pod, set the policy to none instead.
  # koordinator.sh/cpuBurst: '{"policy": "none"}'

Use a ConfigMap to enable CPU Burst for pods in a cluster

By default, the CPU Burst feature configured by using a ConfigMap takes effect on the entire cluster.

Create a file named configmap.yaml and copy the following content to the file:

apiVersion: v1
data:
  cpu-burst-config: '{"clusterStrategy": {"policy": "auto"}}'
  #cpu-burst-config: '{"clusterStrategy": {"policy": "cpuBurstOnly"}}'
  #cpu-burst-config: '{"clusterStrategy": {"policy": "none"}}'
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system

Check whether the ack-slo-config ConfigMap exists in the kube-system namespace.
- If the ack-slo-config ConfigMap exists, we recommend that you run the kubectl patch command to update the ConfigMap. This avoids changing other settings in the ConfigMap.
```
kubectl patch cm -n kube-system ack-slo-config --patch "$(cat configmap.yaml)"
```
- If the ack-slo-config ConfigMap does not exist, run the following command to create a ConfigMap named ack-slo-config:
```
kubectl apply -f configmap.yaml
```
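The check and the two branches above can be combined into one helper. This is a sketch under stated assumptions: the function name apply_or_patch_configmap is hypothetical, and it treats any failure of kubectl get (including permission errors) as "the ConfigMap does not exist", which may not suit every environment.

```shell
# apply_or_patch_configmap NAMESPACE NAME FILE
# Hypothetical helper: patches the ConfigMap if it already exists so that
# other settings in it are preserved, and creates it otherwise.
apply_or_patch_configmap() (
  ns=$1
  name=$2
  file=$3
  if kubectl get cm -n "$ns" "$name" >/dev/null 2>&1; then
    kubectl patch cm -n "$ns" "$name" --patch "$(cat "$file")"
  else
    kubectl apply -f "$file"
  fi
)

# Example call for the cluster-level configuration:
# apply_or_patch_configmap kube-system ack-slo-config configmap.yaml
```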

Use a ConfigMap to enable CPU Burst for pods in a namespace

You can specify a namespace in a ConfigMap to enable the CPU Burst feature for pods in the namespace.

Create a file named configmap.yaml and copy the following content to the file:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-pod-config
  namespace: koordinator-system # If this is the first time you use CPU Burst, you must first create a namespace named koordinator-system. 
data:
  # Enable or disable CPU Burst in the specified namespace. 
  cpu-burst: |
    {
      "enabledNamespaces": ["allowed-ns"], 
      "disabledNamespaces": ["blocked-ns"]
    }
  # This setting enables CPU Burst for pods in the allowed-ns namespace, which is equivalent to policy: auto. 
  # This setting disables CPU Burst for pods in the blocked-ns namespace, which is equivalent to policy: none.

Check whether the ack-slo-pod-config ConfigMap exists in the koordinator-system namespace.
- If the ack-slo-pod-config ConfigMap exists, we recommend that you run the kubectl patch command to update the ConfigMap. This avoids changing other settings in the ConfigMap.
```
kubectl patch cm -n koordinator-system ack-slo-pod-config --patch "$(cat configmap.yaml)"
```
- If the ack-slo-pod-config ConfigMap does not exist, run the following command to create it:
```
kubectl apply -f configmap.yaml
```

Procedure

The following example uses a web application to show how CPU Burst reduces access latency and improves application performance.

Verification steps

Create a file named apache-demo.yaml and copy the following content to the file.

To enable CPU Burst for the pod, add a specific annotation to the metadata section of the pod configurations.

apiVersion: v1
kind: Pod
metadata:
  name: apache-demo
  annotations:
    koordinator.sh/cpuBurst: '{"policy": "auto"}'   # Add this annotation to enable CPU Burst. 
spec:
  containers:
  - command:
    - httpd
    - -D
    - FOREGROUND
    image: registry.cn-zhangjiakou.aliyuncs.com/acs/apache-2-4-51-for-slo-test:v0.1
    imagePullPolicy: Always
    name: apache
    resources:
      limits:
        cpu: "4"
        memory: 10Gi
      requests:
        cpu: "4"
        memory: 10Gi
  nodeName: $nodeName # Replace the value with the actual node name. 
  hostNetwork: false
  restartPolicy: Never
  schedulerName: default-scheduler

Run the following command to create an application by using Apache HTTP Server:
```
kubectl apply -f apache-demo.yaml
```

Use the wrk2 tool to perform stress tests.

# Download, decompress, and then install the wrk2 package. For more information, visit https://github.com/giltene/wrk2. 
# Gzip compression is enabled in the Apache image to simulate the request processing logic of the server. 
# Run the following command to send requests. Replace the IP address in the command with the IP address of the application. 
./wrk -H "Accept-Encoding: deflate, gzip" -t 2 -c 12 -d 120 --latency --timeout 2s -R 24 http://$target_ip_address:8010/static/file.1m.test

Note

Replace the IP address in the command with the pod IP address of the Apache application.
You can modify the -R parameter to change the request rate, in requests per second, that the sender generates.
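The pod IP address that the wrk command needs can be read from the pod status. The following sketch uses a hypothetical helper named pod_ip and assumes that the pod runs in the default namespace unless a namespace is passed.

```shell
# pod_ip POD [NAMESPACE]
# Hypothetical helper: prints the IP address that Kubernetes assigned to the pod.
pod_ip() (
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.status.podIP}'
)

# Example: store the address of the demo pod for use in the wrk command.
# target_ip_address=$(pod_ip apache-demo)
```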

Analyze the result

The following tables show metrics before and after CPU Burst is enabled for Alibaba Cloud Linux and CentOS.

The Disabled column shows the metrics when the CPU Burst policy is set to none.
The Enabled column shows the metrics when the CPU Burst policy is set to auto.

Important

The following metrics are for reference only. Actual values vary based on your operating environment.

Alibaba Cloud Linux	Disabled	Enabled
apache RT-p99	107.37 ms	67.18 ms (-37.4%)
CPU Throttled Ratio	33.3%	0%
Average pod CPU utilization	31.8%	32.6%

CentOS	Disabled	Enabled
apache RT-p99	111.69 ms	71.30 ms (-36.2%)
CPU Throttled Ratio	33%	0%
Average pod CPU utilization	32.5%	33.8%

The preceding metrics indicate the following information:

- After CPU Burst is enabled, the P99 latency drops by about 36% to 37% on both operating systems.
- After CPU Burst is enabled, the CPU throttled ratio drops from about 33% to 0%, while the average pod CPU utilization stays at approximately the same level.
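To check throttling in your own environment, you can read the counters that the cgroup v1 cpu controller exposes in cpu.stat (nr_periods and nr_throttled). The following sketch computes the share of periods in which the container was throttled. The helper name throttled_ratio is hypothetical, and the assumption that this share corresponds to the CPU Throttled Ratio in the tables above is not confirmed by this topic.

```shell
# throttled_ratio CPU_STAT_FILE
# Hypothetical helper: prints nr_throttled / nr_periods as a percentage
# with one decimal place, or 0.0 if no period has elapsed yet.
throttled_ratio() {
  awk '$1 == "nr_periods"   { p = $2 }
       $1 == "nr_throttled" { t = $2 }
       END { if (p > 0) printf "%.1f\n", 100 * t / p; else print "0.0" }' "$1"
}

# Demonstration with sample data in the cpu.stat format.
sample=$(mktemp)
printf 'nr_periods 300\nnr_throttled 100\nthrottled_time 5000000000\n' > "$sample"
throttled_ratio "$sample"   # prints 33.3 for this sample
rm -f "$sample"
```

Inside a container on a cgroup v1 node, the file is typically available at /sys/fs/cgroup/cpu/cpu.stat. The counters are cumulative, so compare two readings taken before and after a test run.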

Advanced configurations

You can specify advanced configurations in the ConfigMap or by adding annotations to the pod configurations. Some parameters can be configured by using both annotations and the ConfigMap. In this case, the annotations take precedence over the ConfigMap. If no annotation is added for such a parameter, ack-koordinator uses the corresponding parameter in the namespace-level ConfigMap. If the parameter is not specified in the namespace-level ConfigMap, ack-koordinator uses the parameter in the cluster-level ConfigMap.

The following code block is an example:

# Example of the ack-slo-config ConfigMap. 
data:
  cpu-burst-config: |
    {
      "clusterStrategy": {
        "policy": "auto",
        "cpuBurstPercent": 1000,
        "cfsQuotaBurstPercent": 300,
        "sharePoolThresholdPercent": 50,
        "cfsQuotaBurstPeriodSeconds": -1
      }
    }

# Example of pod annotations. 
  koordinator.sh/cpuBurst: '{"policy": "auto", "cpuBurstPercent": 1000, "cfsQuotaBurstPercent": 300, "cfsQuotaBurstPeriodSeconds": -1}'

The following table describes the advanced parameters of CPU Burst.

Note

The Annotation and ConfigMap columns indicate whether you can configure the parameter by using annotations and the ConfigMap.

Parameter	Type	Description	Annotation	ConfigMap
`policy`	string	`none` (default): disables CPU Burst and resets the related parameters to their original values. `cpuBurstOnly`: enables only the kernel-level CPU Burst feature provided by Alibaba Cloud Linux. `cfsQuotaBurstOnly`: enables only the automatic adjustment of CFS quotas. All kernel versions are supported. `auto`: enables both the kernel-level CPU Burst feature of Alibaba Cloud Linux and the automatic adjustment of CFS quotas.	Supported	Supported
`cpuBurstPercent`	int	Default value: `1000`. Unit: %. This parameter configures the kernel-level CPU Burst feature provided by Alibaba Cloud Linux. It specifies the percentage of the CPU limit to which CPU usage can burst and corresponds to the `cpu.cfs_burst_us` parameter in cgroup. For more information, see Enable the CPU Burst feature for cgroup v1. For example, if the CPU limit is set to 1, the `cpu.cfs_quota_us` value is 100,000 and the `cpu.cfs_burst_us` value is 10 times that value, which is 1,000,000.	Supported	Supported
`cfsQuotaBurstPercent`	int	Default value: `300`. Unit: %. This parameter specifies the maximum percentage to which the value of the `cpu.cfs_quota_us` cgroup parameter can be increased when the automatic adjustment of CFS quotas is enabled. By default, the value of `cpu.cfs_quota_us` can be increased to three times its original value.	Supported	Supported
`cfsQuotaBurstPeriodSeconds`	int	Default value: `-1`, which indicates that the time period is unlimited. Unit: seconds. This parameter specifies the time period during which the container can run with an increased CFS quota. The increased quota cannot exceed the upper limit specified by `cfsQuotaBurstPercent`. After the time period elapses, the value of the `cpu.cfs_quota_us` cgroup parameter is reset to the original value. Other pods are not affected.	Supported	Supported
`sharePoolThresholdPercent`	int	Default value: `50`. Unit: %. This parameter specifies the CPU utilization threshold of the node when the automatic adjustment of CFS quotas is enabled. If the CPU utilization of the node exceeds the threshold, the increased value of the `cpu.cfs_quota_us` cgroup parameter of each pod is reset to the original value.	Not supported	Supported
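The percentages in the table translate into cgroup values as follows. This is a minimal sketch of the arithmetic only: the helper names cfs_burst_us and max_cfs_quota_us are hypothetical, and the defaults mirror the default values listed in the table.

```shell
# cfs_burst_us QUOTA_US [CPU_BURST_PERCENT]
# Hypothetical helper: cpu.cfs_burst_us derived from the original quota.
cfs_burst_us() { echo $(( $1 * ${2:-1000} / 100 )); }

# max_cfs_quota_us QUOTA_US [CFS_QUOTA_BURST_PERCENT]
# Hypothetical helper: upper bound for an automatically adjusted cpu.cfs_quota_us.
max_cfs_quota_us() { echo $(( $1 * ${2:-300} / 100 )); }

# CPU limit of 1 -> original cpu.cfs_quota_us of 100000.
cfs_burst_us 100000       # prints 1000000
max_cfs_quota_us 100000   # prints 300000
```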

Note

After you enable the automatic adjustment of CFS quotas by setting policy to cfsQuotaBurstOnly or auto, the cpu.cfs_quota_us value that enforces the CPU limit of the pod on the node is automatically adjusted based on the status of CPU throttling.
When you perform stress tests on a container, we recommend that you record the CPU utilization of the container throughout the test or disable the automatic adjustment of CFS quotas by setting policy to cpuBurstOnly or none. This ensures higher resource elasticity in a production environment.

FAQ

Does the CPU Burst feature that I enabled based on an earlier version of the ack-slo-manager protocol still work after I upgrade ack-slo-manager to ack-koordinator?

Earlier pod protocol versions require you to add the alibabacloud.com/cpuBurst annotation. ack-koordinator is fully compatible with the earlier protocol versions. You can seamlessly upgrade from ack-slo-manager to ack-koordinator.

Note

ack-koordinator is compatible with protocol versions no later than July 30, 2023. We recommend that you replace the resource parameters used in an earlier protocol version with those used in the latest version.

The following table describes the compatibility between ack-koordinator and different types of protocols.

ack-koordinator version	alibabacloud.com protocol	koordinator.sh protocol
≥ 0.2.0	Supported	Not supported
≥ 0.8.0	Supported	Supported

Why do pods still generate CPU throttling events after CPU Burst is enabled?

You can modify the configuration based on the following possible causes:

- The CPU Burst feature does not take effect because the configuration format is incorrect. For more information, see Advanced configurations.
- When CPU utilization reaches the upper limit specified by cfsQuotaBurstPercent, CPU throttling events are still generated due to insufficient CPU resources.
  We recommend that you adjust the resource requests and limits based on the actual resource demand of the application.
- CPU Burst can adjust the following cgroup parameters for pods: cpu.cfs_quota_us and cpu.cfs_burst_us. For more information, see Advanced configurations. cpu.cfs_quota_us is modified only after ack-koordinator detects CPU throttling events. Therefore, the adjustment is performed with a delay. cpu.cfs_burst_us is set directly based on the existing configuration, which takes effect more quickly.
  For best results, we recommend that you use CPU Burst with Alibaba Cloud Linux.
- CPU Burst has a protection mechanism for modifying the value of cpu.cfs_quota_us. You can configure the CPU utilization threshold of the node by using the sharePoolThresholdPercent parameter. When the CPU utilization of the node reaches the threshold, the value of cpu.cfs_quota_us is reset to the original value to prevent the current pod from affecting other pods.
  We recommend that you set the CPU utilization threshold based on the actual status of your application to prevent high utilization from affecting application performance.

Does CPU Burst support only Alibaba Cloud Linux?

The CPU Burst feature provided by ack-koordinator supports Alibaba Cloud Linux and open source CentOS versions. We recommend that you choose Alibaba Cloud Linux. The kernel features of Alibaba Cloud Linux allow ack-koordinator to provide fine-grained and elastic CPU resource management. For more information, see Enable the CPU Burst feature for cgroup v1.