The CPU usage of a container cannot exceed the CPU limit of the container. When the CPU usage reaches the limit, CPU scheduling for the container may be throttled by the kernel, which downgrades the service quality of the application. The CPU Burst feature automatically detects CPU throttling events and automatically adjusts the CPU limit to a proper value. After this feature is enabled, the CPU usage of a container can burst above the CPU limit when the CPU demand spikes. This eliminates CPU bottlenecks and improves the service quality of latency-sensitive applications.
Before you use the CPU Burst feature, we recommend that you learn about the relevant terms in CFS Scheduler and Control CPU Management Policies on the Node.
Why CPU Burst is developed
In Kubernetes clusters, the CPU limit of a container specifies the maximum amount of CPU resources that the container can use. This ensures fair CPU allocation among containers and prevents performance degradation when CPU resources are exhausted by individual containers.
CPU resources are time-shared: multiple processes or containers can share the time slices of one CPU. After you specify the CPU limit of a container, the OS kernel uses the Completely Fair Scheduler (CFS) to cap the amount of CPU time (cpu.cfs_quota_us) that the container can use within each time period (cpu.cfs_period_us). For example, if you set the CPU limit of a container to 4, the OS kernel limits the CPU time that the container can use to 400 milliseconds within each period (usually 100 milliseconds).
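The relationship between the CPU limit and the CFS quota can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the usual 100-millisecond period; the function name cfs_quota_us is chosen here for illustration and is not part of any Kubernetes or ack-koordinator API.

```python
def cfs_quota_us(cpu_limit: float, period_us: int = 100_000) -> int:
    """Return the CPU time budget per CFS period, in microseconds.

    Assumption: the quota is simply cpu_limit multiplied by the period length.
    """
    return int(round(cpu_limit * period_us))


quota = cfs_quota_us(4)
print(quota)         # 400000 microseconds
print(quota / 1000)  # 400.0 milliseconds per 100-millisecond period
```

A quota larger than the period is expected for multi-core limits: the budget is summed across all vCores that the container's threads run on.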
Benefits
CPU utilization is a key metric that is used to evaluate the performance of a container. In most cases, the CPU limit is specified based on CPU utilization. CPU utilization on a per-millisecond basis shows more spikes than on a per-second basis. The following figure shows the CPU utilization of a container. The CPU utilization on a per-second basis (purple curve) is less than 4 vCores. However, the CPU utilization on a per-millisecond basis (green curve) exceeds 4 vCores within some time periods. In this case, if you set the CPU limit to 4 vCores, CPU throttling is enforced by the OS kernel and threads in the container are suspended.
The following figures show how CPU resources are allocated to the threads of a web application container after the container receives requests. The container runs on a node with four vCores and the CPU limit of the container is set to 2. The figure on the left shows the CPU allocation details when CPU Burst is disabled. The figure on the right shows the CPU allocation details when CPU Burst is enabled.
CPU Burst disabled: The overall CPU utilization within the last second is low. However, Thread 2 cannot be resumed to process Request 2 until the third 100-millisecond period starts because CPU throttling is enforced during the second 100-millisecond period. This increases the response time (RT) and causes long-tail latency problems in containers.
CPU Burst enabled: After you enable CPU Burst, the container can accumulate CPU time slices when the container is idle. The container can use the accumulated CPU time slices to burst above the CPU limit when CPU demand spikes. This improves performance and reduces the RT of the container.
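The difference between the two cases can be illustrated with a toy model. The following sketch is a simplification, not the kernel's actual accounting: it assumes that each period grants a fixed quota, that unfinished work rolls over to the next period, and that unused quota can be saved up to a cap when bursting is allowed. The function name simulate and the demand trace are hypothetical.

```python
def simulate(demand_ms, quota_ms, burst_cap_ms=0):
    """Return (throttled_periods, leftover_backlog_ms) for a demand trace.

    demand_ms    -- CPU time requested in each period (milliseconds)
    quota_ms     -- CPU time granted per period
    burst_cap_ms -- maximum unused quota that may be carried over (0 = no burst)
    """
    backlog = 0    # work that could not run yet
    saved = 0      # unused quota carried over from idle periods
    throttled = 0
    for demand in demand_ms:
        want = backlog + demand
        budget = quota_ms + saved
        ran = min(want, budget)
        backlog = want - ran
        if backlog > 0:
            throttled += 1  # the container hit its budget in this period
        saved = min(burst_cap_ms, budget - ran)
    return throttled, backlog


# CPU limit of 2 -> 200 ms of CPU time per 100 ms period.
trace = [50, 300, 50, 50]
print(simulate(trace, quota_ms=200))                    # (1, 0)
print(simulate(trace, quota_ms=200, burst_cap_ms=200))  # (0, 0)
```

In this trace the average demand is well below the limit, yet the spike in the second period is throttled unless the quota saved during the first period can be spent.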
In addition to the preceding scenarios, CPU Burst is also suitable for handling CPU usage spikes. For example, when traffic spikes occur, ack-koordinator can eliminate CPU bottlenecks within a few seconds, while ensuring a proper number of workloads on the node.
ack-koordinator achieves this by modifying the value of the CFS quota in the cgroup parameters instead of modifying the value of the CPU limit in the pod specifications.
Scenarios
CPU Burst is suitable in the following scenarios:
CPU throttling is triggered for an application even though the CPU usage of the application is usually lower than its CPU limit. As a result, the performance of the application is degraded. CPU Burst allows a container to use the CPU time slices that the container accumulates when it is idle. You can enable CPU Burst to resolve CPU throttling and improve the performance of the application.
The CPU usage of an application during the startup process is higher than the CPU usage after the application is started. You can enable CPU Burst to meet the CPU requirements during the startup process. This way, you do not need to specify an excessively high CPU limit for the application.
Billing
No fee is charged when you install or use the ack-koordinator component. However, fees may be charged in the following scenarios:
ack-koordinator is a non-managed component that occupies worker node resources after it is installed. You can specify the amount of resources requested by each module when you install the component.
By default, ack-koordinator exposes the monitoring metrics of features such as resource profiling and fine-grained scheduling as Prometheus metrics. If you enable Prometheus metrics for ack-koordinator and use Managed Service for Prometheus, these metrics are considered custom metrics and fees are charged for these metrics. The fee depends on factors such as the size of your cluster and the number of applications. Before you enable Prometheus metrics, we recommend that you read the Billing topic of Managed Service for Prometheus to learn about the free quota and billing rules of custom metrics. For more information about how to monitor and manage resource usage, see Query the amount of observable data and bills.
Prerequisites
An ACK Pro cluster is created and the Kubernetes version of the cluster is 1.18 or later. For more information, see Create an ACK managed cluster and Manually upgrade ACK clusters.
Note: We recommend that you select Alibaba Cloud Linux for the cluster. For more information, see Does CPU Burst support only Alibaba Cloud Linux?
ack-koordinator 0.8.0 or later is installed in the cluster. For more information, see ack-koordinator.
Usage notes
You can use an annotation to enable CPU Burst for a pod. You can also use a ConfigMap to enable CPU Burst for pods in a namespace or cluster.
Use an annotation to enable CPU Burst for a pod
To enable CPU Burst for a pod, add an annotation in the metadata section of the YAML file of the pod.
To apply the configuration to a workload, such as a Deployment, set the annotation for the pod in the template.metadata field.
annotations:
  # Set the value to auto to enable CPU Burst for a pod.
  koordinator.sh/cpuBurst: '{"policy": "auto"}'
  # Set the value to none to disable CPU Burst for a pod.
  koordinator.sh/cpuBurst: '{"policy": "none"}'
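The annotation value is a JSON string, so it is easy to get the quoting wrong when you generate manifests programmatically. The following is a minimal sketch of a helper that builds the annotation; the helper name cpu_burst_annotation is hypothetical, and the set of accepted policy values is limited to the ones that appear in this topic.

```python
import json

ANNOTATION_KEY = "koordinator.sh/cpuBurst"
# Policy values mentioned in this topic.
KNOWN_POLICIES = {"auto", "none", "cpuBurstOnly", "cfsQuotaBurstOnly"}


def cpu_burst_annotation(policy: str) -> dict:
    """Return the pod annotation that sets the CPU Burst policy."""
    if policy not in KNOWN_POLICIES:
        raise ValueError(f"unknown CPU Burst policy: {policy!r}")
    # The annotation value must itself be a JSON document encoded as a string.
    return {ANNOTATION_KEY: json.dumps({"policy": policy})}


print(cpu_burst_annotation("auto"))
# {'koordinator.sh/cpuBurst': '{"policy": "auto"}'}
```

The returned dictionary can be merged into metadata.annotations of a pod, or into template.metadata.annotations of a workload.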
Use a ConfigMap to enable CPU Burst for pods in a cluster
By default, the CPU Burst feature configured by using a ConfigMap takes effect on the entire cluster.
Create a file named configmap.yaml and copy the following content to the file:
apiVersion: v1
data:
  cpu-burst-config: '{"clusterStrategy": {"policy": "auto"}}'
  #cpu-burst-config: '{"clusterStrategy": {"policy": "cpuBurstOnly"}}'
  #cpu-burst-config: '{"clusterStrategy": {"policy": "none"}}'
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
Check whether the ack-slo-config ConfigMap exists in the kube-system namespace.
If the ack-slo-config ConfigMap exists, we recommend that you run the kubectl patch command to update the ConfigMap. This avoids changing other settings in the ConfigMap.
kubectl patch cm -n kube-system ack-slo-config --patch "$(cat configmap.yaml)"
If the ack-slo-config ConfigMap does not exist, run the following command to create a ConfigMap named ack-slo-config:
kubectl apply -f configmap.yaml
Use a ConfigMap to enable CPU Burst for pods in a namespace
You can specify a namespace in a ConfigMap to enable the CPU Burst feature for pods in the namespace.
Create a file named configmap.yaml and copy the following content to the file:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-pod-config
  namespace: koordinator-system # If this is the first time you use CPU Burst, you must first create a namespace named koordinator-system.
data:
  # Enable or disable CPU Burst in the specified namespace.
  cpu-burst: |
    {
      "enabledNamespaces": ["allowed-ns"],
      "disabledNamespaces": ["blocked-ns"]
    }
  # This setting enables CPU Burst for pods in the allowed-ns namespace, which is equivalent to policy: auto.
  # This setting disables CPU Burst for pods in the blocked-ns namespace, which is equivalent to policy: none.
Check whether the ack-slo-pod-config ConfigMap exists in the koordinator-system namespace.
If the ack-slo-pod-config ConfigMap exists, we recommend that you run the kubectl patch command to update the ConfigMap. This avoids changing other settings in the ConfigMap.
kubectl patch cm -n koordinator-system ack-slo-pod-config --patch "$(cat configmap.yaml)"
If the ack-slo-pod-config ConfigMap does not exist, run the following command to create a ConfigMap named ack-slo-pod-config:
kubectl apply -f configmap.yaml
Procedure
The following example uses a web application to show how CPU Burst reduces access latency and improves application performance.
Verification steps
Create a file named apache-demo.yaml and copy the following content to the file.
To enable CPU Burst for the pod, add a specific annotation to the metadata section of the pod configurations.
apiVersion: v1
kind: Pod
metadata:
  name: apache-demo
  annotations:
    koordinator.sh/cpuBurst: '{"policy": "auto"}' # Add this annotation to enable CPU Burst.
spec:
  containers:
  - command:
    - httpd
    - -D
    - FOREGROUND
    image: registry.cn-zhangjiakou.aliyuncs.com/acs/apache-2-4-51-for-slo-test:v0.1
    imagePullPolicy: Always
    name: apache
    resources:
      limits:
        cpu: "4"
        memory: 10Gi
      requests:
        cpu: "4"
        memory: 10Gi
  nodeName: $nodeName # Replace the value with the actual node name.
  hostNetwork: False
  restartPolicy: Never
  schedulerName: default-scheduler
Run the following command to create an application by using Apache HTTP Server:
kubectl apply -f apache-demo.yaml
Use the wrk2 tool to perform stress tests.
# Download, decompress, and then install the wrk2 package. For more information, visit https://github.com/giltene/wrk2.
# Gzip compression is enabled in the Apache image to simulate the request processing logic of the server.
# Run the following command to send requests. Replace the IP address in the command with the IP address of the application.
./wrk -H "Accept-Encoding: deflate, gzip" -t 2 -c 12 -d 120 --latency --timeout 2s -R 24 http://$target_ip_address:8010/static/file.1m.test
Note: Replace the IP address in the command with the pod IP address of the Apache application.
You can modify the -R parameter to change the number of requests per second that the sender issues.
Analyze the result
The following tables show metrics before and after CPU Burst is enabled for Alibaba Cloud Linux and CentOS.
The Disabled column shows the metrics when the CPU Burst policy is set to none. The Enabled column shows the metrics when the CPU Burst policy is set to auto.
The following metrics are theoretical values. Actual values vary based on your operating environment.
Alibaba Cloud Linux | Disabled | Enabled |
apache RT-p99 | 107.37 ms | 67.18 ms (-37.4%) |
CPU Throttled Ratio | 33.3% | 0% |
Average pod CPU utilization | 31.8% | 32.6% |
CentOS | Disabled | Enabled |
apache RT-p99 | 111.69 ms | 71.30 ms (-36.2%) |
CPU Throttled Ratio | 33% | 0% |
Average pod CPU utilization | 32.5% | 33.8% |
The preceding metrics indicate the following information:
After CPU Burst is enabled, the P99 latency is greatly reduced.
After CPU Burst is enabled, CPU throttling events are greatly reduced and the average pod CPU utilization remains approximately at the same value.
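The relative reductions shown in the tables can be reproduced from the raw P99 values. The following sketch only performs that arithmetic; the function name reduction_percent is chosen here for illustration.

```python
def reduction_percent(before_ms: float, after_ms: float) -> float:
    """Return the relative reduction from before_ms to after_ms, in percent."""
    return (before_ms - after_ms) / before_ms * 100


# P99 response times from the tables above (Disabled -> Enabled).
print(f"Alibaba Cloud Linux: -{reduction_percent(107.37, 67.18):.1f}%")  # -37.4%
print(f"CentOS: -{reduction_percent(111.69, 71.30):.1f}%")               # -36.2%
```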
Advanced configurations
You can specify advanced configurations in the ConfigMap or by adding annotations to the pod configurations. Some parameters can be configured in both places. In this case, the annotations take precedence over the ConfigMap. If no annotation is added for such a parameter, ack-koordinator uses the corresponding parameter in the namespace-level ConfigMap. If the parameter is not specified in the namespace-level ConfigMap, ack-koordinator uses the parameter in the cluster-level ConfigMap.
The following code block is an example:
# Example of the ack-slo-config ConfigMap.
data:
  cpu-burst-config: |
    {
      "clusterStrategy": {
        "policy": "auto",
        "cpuBurstPercent": 1000,
        "cfsQuotaBurstPercent": 300,
        "sharePoolThresholdPercent": 50,
        "cfsQuotaBurstPeriodSeconds": -1
      }
    }
# Example of pod annotations.
koordinator.sh/cpuBurst: '{"policy": "auto", "cpuBurstPercent": 1000, "cfsQuotaBurstPercent": 300, "cfsQuotaBurstPeriodSeconds": -1}'
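The precedence order described at the start of this section (pod annotation first, then the namespace-level ConfigMap, then the cluster-level ConfigMap) can be sketched as a simple lookup chain. This is an illustration of the rule only, not the ack-koordinator implementation; the function name resolve and the dictionary layout are assumptions.

```python
def resolve(param, annotation=None, namespace_cfg=None, cluster_cfg=None):
    """Return the effective value of param, or None if it is set nowhere.

    Sources are checked from highest to lowest precedence.
    """
    for source in (annotation, namespace_cfg, cluster_cfg):
        if source and param in source:
            return source[param]
    return None


cluster = {"policy": "auto", "cfsQuotaBurstPercent": 300}
pod_annotation = {"policy": "none"}

print(resolve("policy", pod_annotation, None, cluster))                # none
print(resolve("cfsQuotaBurstPercent", pod_annotation, None, cluster))  # 300
```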
The following table describes the advanced parameters of CPU Burst.
The Annotation and ConfigMap columns indicate whether you can configure the parameter by using annotations and the ConfigMap.
Parameter | Type | Description | Annotation | ConfigMap |
policy | string | | | |
cpuBurstPercent | int | Default value: … This parameter is used to configure the kernel-level CPU Burst feature provided by Alibaba Cloud Linux. This parameter specifies the percentage to which the CPU limit can be increased by CPU Burst. This parameter corresponds to the cpu.cfs_burst_us cgroup parameter. For example, if the … | | |
cfsQuotaBurstPercent | int | Default value: … This parameter specifies the maximum percentage to which the value of the cpu.cfs_quota_us cgroup parameter can be increased. … | | |
cfsQuotaBurstPeriodSeconds | int | Default value: … This parameter specifies the time period during which the container can run with an increased CFS quota, which cannot exceed the upper limit specified by cfsQuotaBurstPercent. … | | |
sharePoolThresholdPercent | int | Default value: … This parameter specifies the CPU utilization threshold of the node when the automatic adjustment of CFS quotas is enabled. If the CPU utilization of the node exceeds the threshold, the increased value of cpu.cfs_quota_us is reset to the original value. | | |
After you enable automatic adjustment of CFS quotas by setting policy to cfsQuotaBurstOnly or auto, the pod CPU limit parameter cpu.cfs_quota_us of the node is automatically adjusted based on the status of CPU throttling.
When you perform stress tests on a container, we recommend that you record the CPU utilization of the container throughout the test or disable the automatic adjustment of CFS quotas by setting policy to cpuBurstOnly or none. This ensures higher resource elasticity in a production environment.
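To get a feel for how far the CFS quota can be raised, the following sketch computes the ceiling implied by cfsQuotaBurstPercent. It assumes that the percentage is relative to the original quota, so that a value of 300 allows up to three times the original quota, and that the period is 100 milliseconds. The function name max_burst_quota_us is hypothetical.

```python
def max_burst_quota_us(cpu_limit: float,
                       cfs_quota_burst_percent: int,
                       period_us: int = 100_000) -> int:
    """Return the assumed upper bound of cpu.cfs_quota_us, in microseconds."""
    base_quota_us = int(round(cpu_limit * period_us))
    # Assumption: the percentage scales the original quota (300 -> 3x).
    return base_quota_us * cfs_quota_burst_percent // 100


# CPU limit of 4 with cfsQuotaBurstPercent set to 300, as in the example above.
print(max_burst_quota_us(4, 300))  # 1200000
```

When you size a stress test, compare the recorded CPU utilization against this ceiling rather than against the configured CPU limit alone.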
FAQ
Does the CPU Burst feature that I enabled based on an earlier version of the ack-slo-manager protocol still work after I upgrade ack-slo-manager to ack-koordinator?
Earlier pod protocol versions require you to add the alibabacloud.com/cpuBurst annotation. ack-koordinator is fully compatible with the earlier protocol versions. You can seamlessly upgrade from ack-slo-manager to ack-koordinator.
ack-koordinator is compatible with protocol versions no later than July 30, 2023. We recommend that you replace the resource parameters used in an earlier protocol version with those used in the latest version.
The following table describes the compatibility between ack-koordinator and different types of protocols.
ack-koordinator version | alibabacloud.com protocol | koordinator.sh protocol |
≥ 0.2.0 | Supported | Not supported |
≥ 0.8.0 | Supported | Supported |
Why do pods still generate CPU throttling events after CPU Burst is enabled?
You can modify the configuration based on the following possible causes:
The CPU Burst feature does not take effect because the configuration format is incorrect. For more information, see Advanced configurations.
When CPU utilization reaches the upper limit specified by cfsQuotaBurstPercent, CPU throttling events are still generated due to insufficient CPU resources.
We recommend that you adjust the resource requests and limits based on the actual resource demand of the application.
CPU Burst can adjust the following cgroup parameters for pods: cpu.cfs_quota_us and cpu.cfs_burst_us. For more information, see Advanced configurations. cpu.cfs_quota_us is modified only after ack-koordinator detects CPU throttling events, so the adjustment is performed with a delay. cpu.cfs_burst_us is modified directly based on the existing configuration, which is more efficient.
For best results, we recommend that you use CPU Burst with Alibaba Cloud Linux.
CPU Burst has a protection mechanism when modifying the value of cpu.cfs_quota_us. You can configure the CPU utilization threshold of the node by using the sharePoolThresholdPercent parameter. When the CPU utilization of the node reaches the threshold, the value of cpu.cfs_quota_us is reset to the original value to prevent the current pod from affecting other pods.
We recommend that you set the CPU utilization threshold based on the actual status of your application to prevent high utilization from affecting application performance.
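The protection mechanism can be summarized as a single decision. The following is a minimal sketch under the assumption that the check is a simple comparison of node CPU utilization against sharePoolThresholdPercent; the function name effective_quota_us and its parameters are hypothetical and do not reflect the actual ack-koordinator code.

```python
def effective_quota_us(base_quota_us: int,
                       burst_quota_us: int,
                       node_cpu_util_percent: float,
                       share_pool_threshold_percent: float) -> int:
    """Return the quota to apply, given the current node CPU utilization."""
    if node_cpu_util_percent >= share_pool_threshold_percent:
        # Node is busy: fall back to the original quota to protect other pods.
        return base_quota_us
    # Node has headroom: keep the increased quota.
    return burst_quota_us


print(effective_quota_us(400_000, 1_200_000, 30, 50))  # 1200000
print(effective_quota_us(400_000, 1_200_000, 60, 50))  # 400000
```

This also explains why throttling can reappear on a busy node even though CPU Burst is enabled: once the threshold is reached, the pod is back on its original quota.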
Does CPU Burst support only Alibaba Cloud Linux?
The CPU Burst feature provided by ack-koordinator supports Alibaba Cloud Linux and open source CentOS versions. We recommend that you choose Alibaba Cloud Linux. The kernel features of Alibaba Cloud Linux allow ack-koordinator to provide fine-grained and elastic CPU resource management. For more information, see Enable the CPU Burst feature for cgroup v1.