CPU Burst is a Service Level Objective (SLO)-aware resource scheduling feature provided by Container Service for Kubernetes (ACK). You can use CPU Burst to improve the performance of latency-sensitive applications. The kernel may throttle the CPU usage of a container because of the CPU limit, which degrades application performance. The ack-koordinator component automatically detects CPU throttling events and adjusts the CPU limit to a proper value. This greatly improves the service quality of latency-sensitive applications.
Prerequisites
An ACK Pro cluster is created, and the Kubernetes version of the cluster is 1.18 or later. For more information, see Create an ACK Pro cluster.
ack-koordinator 0.8.0 or later is installed. For more information, see ack-koordinator (FKA ack-slo-manager).
Billing rules
No fee is charged when you install and use the ack-koordinator component. However, fees may be charged in the following scenarios:
ack-koordinator is a non-managed component that occupies worker node resources after it is installed. You can specify the amount of resources requested by each module when you install the component.
By default, ack-koordinator exposes the monitoring metrics of features such as resource profiling and fine-grained scheduling as Prometheus metrics. If you enable Prometheus metrics for ack-koordinator and use Managed Service for Prometheus, these metrics are considered custom metrics and fees are charged for them. The fee depends on factors such as the size of your cluster and the number of applications. Before you enable Prometheus metrics, we recommend that you read the Billing topic of Managed Service for Prometheus to learn the free quota and billing rules of custom metrics. For more information about how to monitor and manage resource usage, see Resource usage and bills.
Use scenarios
CPU Burst is suitable in the following scenarios:
CPU throttling is triggered for an application even though the CPU usage of the application is less than the CPU limit of the application. As a result, the performance of the application is degraded. You can enable CPU Burst to resolve this issue and improve the performance of the application.
The CPU usage during application startup is higher than the CPU usage after the application is started. You can enable CPU Burst to meet the CPU requirements during application startup. This way, you do not need to specify an excessively high CPU request for the application, which reduces resource waste.
How it works
Kubernetes allows you to specify resource limits to manage resource allocation. For time-sharing resources such as CPUs, if you specify a CPU limit for a container, the OS limits the amount of CPU resources that the container can use within a specific time period. For example, if you set the CPU limit of a container to 2, the OS kernel limits the CPU time slices that the container can use to 200 milliseconds within each 100-millisecond period.
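As a rough illustration, the CPU limit maps to the CFS quota and period parameters of the container's cgroup on the node. The following is a minimal sketch for a cgroup v1 node; the cgroup path is an assumption and varies with the cgroup version, driver, and pod QoS class.
# Sketch (cgroup v1): inspect the CFS parameters that implement a CPU limit of 2.
# The path below is illustrative; replace <pod-uid> and <container-id> with real values.
cat /sys/fs/cgroup/cpu/kubepods/pod<pod-uid>/<container-id>/cpu.cfs_period_us
# 100000 -> each scheduling period is 100 ms
cat /sys/fs/cgroup/cpu/kubepods/pod<pod-uid>/<container-id>/cpu.cfs_quota_us
# 200000 -> at most 200 ms of CPU time per period, which corresponds to a CPU limit of 2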
CPU utilization is a key metric that is used to evaluate the performance of a container. In most cases, the CPU limit is specified based on CPU utilization. CPU utilization on a per-millisecond basis shows more spikes than on a per-second basis. If the CPU utilization of a container reaches the limit within a 100-millisecond period, CPU throttling is enforced by the OS kernel and threads in the container are suspended for the rest of the time period, as shown in the following figure.
The following figure shows the thread allocation of a web application container that runs on a node with four vCores. The CPU limit of the container is set to 2. The overall CPU utilization within the last second is low. However, Thread 2 cannot be resumed until the third 100-millisecond period starts because CPU throttling is enforced somewhere in the second 100-millisecond period. This increases the response time (RT) and causes long-tail latency problems in containers.
Alibaba Cloud Linux provides the CPU Burst feature. CPU Burst allows a container to accumulate CPU time slices when the container is idle. The container can use the accumulated CPU time slices to burst above the CPU limit when CPU demand spikes. This improves performance and reduces the response time of the container. For more information, see Enable the CPU burst feature for cgroup v1.
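As a minimal sketch, on Alibaba Cloud Linux with cgroup v1 the kernel-level feature is exposed through the cpu.cfs_burst_us interface of the container's cgroup; the path and value below are illustrative only.
# Sketch (Alibaba Cloud Linux, cgroup v1): allow the container to accumulate and burst
# up to an extra 400 ms of CPU time. Path and value are examples only.
echo 400000 > /sys/fs/cgroup/cpu/kubepods/pod<pod-uid>/<container-id>/cpu.cfs_burst_us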
For kernel versions that do not support CPU Burst, ack-koordinator detects CPU throttling events and dynamically adjusts the CPU limit to achieve the same effect as CPU Burst. ack-koordinator achieves this by modifying the value of the CFS quota in the cgroup parameters instead of modifying the value of the CPU limit in the pod specifications.
The preceding Completely Fair Scheduler (CFS) quota adjustment policy can be used to handle CPU usage spikes. For example, when traffic spikes occur, ack-koordinator can eliminate CPU bottlenecks within a few seconds while ensuring a proper number of workloads on the node.
The kernel-level CPU Burst feature provided by Alibaba Cloud Linux responds to bursts more quickly than the CFS quota adjustment policy. We recommend that you enable CPU Burst for latency-sensitive applications.
How to use CPU Burst
Use pod annotations to enable CPU Burst
You can use pod annotations to enable CPU Burst for the specified pod.
To enable CPU Burst for a pod, add the annotation to the metadata section of the pod configuration. To enable CPU Burst for a Deployment, add the annotation to the template.metadata section of the Deployment configuration, as shown in the Deployment sketch after the following snippet.
annotations:
# Set the value to auto to enable CPU Burst for a pod.
koordinator.sh/cpuBurst: '{"policy": "auto"}'
# Set the value to none to disable CPU Burst for a pod.
#koordinator.sh/cpuBurst: '{"policy": "none"}'
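The following is a minimal Deployment sketch that shows where the annotation goes; the Deployment name, labels, and image are placeholders and not part of the original example.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app              # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        koordinator.sh/cpuBurst: '{"policy": "auto"}'   # Enable CPU Burst for the pods of this Deployment.
    spec:
      containers:
      - name: demo
        image: nginx:1.25     # placeholder image
        resources:
          limits:
            cpu: "2"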
Use a ConfigMap to enable CPU Burst for all pods in a cluster
By default, the CPU Burst feature configured by using a ConfigMap takes effect on the entire cluster.
Create a file named configmap.yaml that contains the following content:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
data:
  cpu-burst-config: '{"clusterStrategy": {"policy": "auto"}}'
  #cpu-burst-config: '{"clusterStrategy": {"policy": "cpuBurstOnly"}}'
  #cpu-burst-config: '{"clusterStrategy": {"policy": "none"}}'
Check whether the ConfigMap named ack-slo-config exists in the kube-system namespace.
If the ConfigMap ack-slo-config already exists, use the PATCH method to update the ConfigMap so that the other settings in the ConfigMap are not modified.
kubectl patch cm -n kube-system ack-slo-config --patch "$(cat configmap.yaml)"
If the ConfigMap ack-slo-config does not exist, run the following command to create the ConfigMap:
kubectl apply -f configmap.yaml
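You can then confirm that the configuration is in place, for example:
# Optional check: view the ConfigMap and verify the cpu-burst-config policy.
kubectl get configmap ack-slo-config -n kube-system -o yaml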
Enable CPU Burst for pods in the specified namespace
You can specify a namespace to enable the CPU Burst feature for all pods in the namespace.
apiVersion: v1
kind: ConfigMap
metadata:
name: ack-slo-pod-config
namespace: koordinator-system # If this is the first time you use CPU Burst, you must first create a namespace named koordinator-system.
data:
# Enable or disable CPU Burst for all pods in the specified namespace.
cpu-burst: |
{
"enabledNamespaces": ["white-ns"],
"disabledNamespaces": ["black-ns"]
}
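A minimal sequence of commands to apply this configuration might look like the following; the file name ack-slo-pod-config.yaml is an assumption.
# Create the koordinator-system namespace if it does not exist yet.
kubectl create namespace koordinator-system
# Apply the namespace-scoped CPU Burst configuration (file name is an example).
kubectl apply -f ack-slo-pod-config.yaml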
Advanced configurations of CPU Burst
You can specify advanced configurations in the ConfigMap or by adding annotations to the metadata section of the pod configuration. Some parameters can be configured both by adding annotations and by modifying the ConfigMap. In this case, the annotations take precedence over the ConfigMap. If no annotation is added for such a parameter, ack-koordinator uses the corresponding value in the ConfigMap.
The following sample code shows an example:
# Example of the ack-slo-config ConfigMap.
data:
cpu-burst-config: |
{
"clusterStrategy": {
"policy": "auto",
"cpuBurstPercent": 1000,
"cfsQuotaBurstPercent": 300,
"sharePoolThresholdPercent": 50,
"cfsQuotaBurstPeriodSeconds": -1
}
}
# Example of pod annotations.
koordinator.sh/cpuBurst: '{"policy": "auto", "cpuBurstPercent": 1000, "cfsQuotaBurstPercent": 300, "cfsQuotaBurstPeriodSeconds": -1}'
The following table describes the advanced parameters of CPU Burst.
The Annotation and ConfigMap columns indicate whether the parameter can be configured by using pod annotations or by using the ack-slo-config ConfigMap.
Parameter | Type | Description | Annotation | ConfigMap |
policy | string | The CPU Burst policy. Default value: none. Valid values: none (disables CPU Burst), cpuBurstOnly (enables only the kernel-level CPU Burst feature provided by Alibaba Cloud Linux), cfsQuotaBurstOnly (enables only the automatic adjustment of CFS quotas), and auto (enables both). | Supported | Supported |
cpuBurstPercent | int | Default value: 1000. Unit: %. This parameter is used to configure the kernel-level CPU Burst feature provided by Alibaba Cloud Linux. This parameter specifies the percentage to which the CPU limit can be increased by CPU Burst and corresponds to the cpu.cfs_burst_us cgroup parameter. For example, if this parameter is set to 1000, the CPU limit can be increased to 10 times the original value. | Supported | Supported |
cfsQuotaBurstPercent | int | Default value: 300. Unit: %. This parameter specifies the maximum percentage to which the value of the CFS quota (cpu.cfs_quota_us) in the cgroup parameters can be increased relative to the original CPU limit. | Supported | Supported |
cfsQuotaBurstPeriodSeconds | int | Default value: -1. Unit: seconds. This parameter specifies the time period during which the container can run with an increased CFS quota, which cannot exceed the upper limit specified by cfsQuotaBurstPercent. The value -1 indicates that the duration is not limited. | Supported | Supported |
sharePoolThresholdPercent | int | Default value: 50. Unit: %. This parameter specifies the CPU utilization threshold of the node when the automatic adjustment of CFS quotas is enabled. If the CPU utilization of the node exceeds the threshold, the increased value of the CFS quota is reset to the original value to prevent the pod from affecting other pods on the node. | Not supported | Supported |
After you enable automatic adjustment of CFS quotas by setting policy to cfsQuotaBurstOnly or auto, the pod CPU limit parameter cpu.cfs_quota_us on the node is automatically adjusted based on the status of CPU throttling.
When you perform stress tests on a container, we recommend that you record the CPU utilization of the container throughout the test, or disable the automatic adjustment of CFS quotas by setting policy to cpuBurstOnly or none. This ensures higher resource elasticity in a production environment.
Verify the effect of CPU Burst
Verification steps
Use the following YAML template to create an apache-demo.yaml file:
To enable CPU Burst for the pod, the koordinator.sh/cpuBurst annotation is added to the metadata section of the pod configuration.
apiVersion: v1
kind: Pod
metadata:
  name: apache-demo
  annotations:
    koordinator.sh/cpuBurst: '{"policy": "auto"}' # The annotation is used to enable or disable CPU Burst.
spec:
  containers:
  - command:
    - httpd
    - -D
    - FOREGROUND
    image: registry.cn-zhangjiakou.aliyuncs.com/acs/apache-2-4-51-for-slo-test:v0.1
    imagePullPolicy: Always
    name: apache
    resources:
      limits:
        cpu: "4"
        memory: 10Gi
      requests:
        cpu: "4"
        memory: 10Gi
  nodeName: $nodeName # Replace nodeName with the actual node name.
  hostNetwork: False
  restartPolicy: Never
  schedulerName: default-scheduler
Run the following command to create an application by using Apache HTTP Server:
kubectl apply -f apache-demo.yaml
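Before you run the stress test, you can obtain the pod IP address that the test command requires, for example:
# Wait until the pod is running and note its IP address for the stress test.
kubectl get pod apache-demo -o wide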
Use the wrk2 tool to perform stress tests.
# Download, decompress, and then install the wrk2 package. For more information, visit https://github.com/giltene/wrk2.
# Gzip compression is enabled in the Apache image to simulate the request processing logic of the server.
# Run the following command to send requests. Replace the IP address in the command with the IP address of the application.
./wrk -H "Accept-Encoding: deflate, gzip" -t 2 -c 12 -d 120 --latency --timeout 2s -R 24 http://$target_ip_address:8010/static/file.1m.test
Note: Replace the IP address in the command with the pod IP address of the Apache application. You can modify the -R parameter to change the request rate (requests per second) of the sender.
Analyze the result
The following tables show metrics before and after CPU Burst is enabled for Alibaba Cloud Linux and CentOS.
The Disabled column shows the metrics when the CPU Burst policy is set to none. The Enabled column shows the metrics when the CPU Burst policy is set to auto.
The following metrics are for reference only. Actual values vary based on your operating environment.
Alibaba Cloud Linux | Disabled | Enabled |
apache RT-p99 | 107.37 ms | 67.18 ms (-37.4%) |
CPU Throttled Ratio | 33.3% | 0% |
Average pod CPU utilization | 31.8% | 32.6% |
CentOS | Disabled | Enabled |
apache RT-p99 | 111.69 ms | 71.30 ms (-36.2%) |
CPU Throttled Ratio | 33% | 0% |
Average pod CPU utilization | 32.5% | 33.8% |
The preceding metrics indicate the following information:
After CPU Burst is enabled, the P99 latency is greatly reduced.
After CPU Burst is enabled, CPU throttling no longer occurs and the average pod CPU utilization remains approximately the same.
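If you want to observe CPU throttling yourself, a minimal sketch is to read the CFS statistics from the container's cgroup on the node; the cgroup v1 path below is an assumption. The CPU Throttled Ratio in the preceding tables corresponds to nr_throttled divided by nr_periods.
# Sketch (cgroup v1): read the CFS throttling statistics of the Apache container.
# Replace <pod-uid> and <container-id> with the actual values on the node.
cat /sys/fs/cgroup/cpu/kubepods/pod<pod-uid>/<container-id>/cpu.stat
# nr_periods     <number of elapsed CFS periods>
# nr_throttled   <number of periods in which the container was throttled>
# throttled_time <total throttled time in nanoseconds>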
FAQ
Does the CPU Burst feature that I enabled based on an earlier version of the ack-slo-manager protocol still work after I upgrade ack-slo-manager to ack-koordinator?
Earlier pod protocol versions require you to add the alibabacloud.com/cpuBurst annotation. ack-koordinator is fully compatible with the earlier protocol versions, so you can seamlessly upgrade from ack-slo-manager to ack-koordinator. A sketch that compares the two annotation keys follows the table below.
ack-koordinator remains compatible with the earlier protocol versions until July 30, 2023. We recommend that you replace the resource parameters used in an earlier protocol version with those used in the latest version.
The following table describes the compatibility between ack-koordinator and different types of protocols.
ack-koordinator version | alibabacloud.com protocol | koordinator.sh protocol |
≥ 0.2.0 | Supported | Not supported |
≥ 0.8.0 | Supported | Supported |
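For reference, the following annotation sketch contrasts the earlier key with the current one; the assumption that the earlier key accepts the same JSON value format is not confirmed by this topic.
annotations:
  # Earlier protocol (value format assumed to match the current one).
  alibabacloud.com/cpuBurst: '{"policy": "auto"}'
  # Current protocol, recommended.
  koordinator.sh/cpuBurst: '{"policy": "auto"}'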
Why do pods still generate CPU throttling events after CPU Burst is enabled?
You can modify the configuration based on the following possible causes:
The CPU Burst feature does not take effect because the configuration format is incorrect. For more information, see Advanced configurations of CPU Burst.
When CPU utilization reaches the upper limit specified by cfsQuotaBurstPercent, CPU throttling events are still generated because CPU resources are insufficient. We recommend that you adjust the resource requests and limits based on the actual resource demand of the application.
CPU Burst can adjust the following cgroup parameters for pods: cpu.cfs_quota_us and cpu.cfs_burst_us. For more information, see Advanced configurations of CPU Burst. cpu.cfs_quota_us is modified after ack-koordinator detects CPU throttling events, so the adjustment is performed with a delay. cpu.cfs_burst_us is modified directly based on the existing configuration, which is more efficient. For best results, we recommend that you use CPU Burst together with Alibaba Cloud Linux.
CPU Burst has a protection mechanism when it modifies the value of cpu.cfs_quota_us. You can configure the CPU utilization threshold of the node by using the sharePoolThresholdPercent parameter. When the CPU utilization of the node reaches the threshold, the value of cpu.cfs_quota_us is reset to the original value to prevent the current pod from affecting other pods. We recommend that you set the CPU utilization threshold based on the actual status of your application to prevent high utilization from affecting application performance.