In dynamic resource overcommitment scenarios, you can overcommit the unused resources of high-priority Guaranteed or Burstable applications to low-priority BestEffort tasks. To ensure that the CPU usage of BestEffort (BE) pods remains within a reasonable range, Container Service for Kubernetes (ACK) provides the CPU Suppress feature, which allows you to prioritize the stable operation of latency-sensitive (LS) pods on the node.
To help you better understand and use this feature, we recommend that you first read the following topics in the official Kubernetes documentation: Pod QoS classes and Assign memory resources to containers and pods.
This feature must be used together with the dynamic resource overcommitment feature. For more information, see Enable dynamic resource overcommitment.
Why enable CPU Suppress
To improve cluster resource utilization, the dynamic resource overcommitment model reserves a resource buffer for high-priority LS tasks so that they can absorb load fluctuations in upstream and downstream links, and allocates the remaining overcommitted resources to low-priority BE tasks. To ensure sufficient CPU resources for LS pods on a node, you can use ack-koordinator to limit the CPU usage of the BE pods on the node. The CPU Suppress feature keeps the overall CPU utilization of a node below the specified threshold by limiting the amount of CPU resources that BE pods can use. This ensures that the containers on the node have sufficient resources to run stably.
The following list describes the terms used in the figure:
CPU Threshold: the CPU usage threshold of a node.
Pod (LS).Usage: the CPU usage of LS pods.
CPU Restriction for BE: the upper limit on the CPU usage of BE pods.
The amount of CPU resources that can be used by BE pods is adjusted based on the fluctuation of the LS pods' CPU usage. We recommend that you use the same value for CPU Threshold and the reserved CPU watermark in the dynamic resource overcommitment model. This ensures a consistent level of CPU resource utilization.
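The relationship between the threshold, LS usage, and the BE limit can be sketched as a small calculation. This is only an illustration: the formula (node capacity multiplied by the threshold, minus LS and system usage) is an assumption modeled on how the open source Koordinator project describes this kind of suppression, and ack-koordinator may compute the limit differently. The function and parameter names are hypothetical.

```python
def be_cpu_budget_millicores(node_total_milli, threshold_percent,
                             ls_used_milli, system_used_milli=0):
    """Estimate the CPU (in millicores) left for BE pods.

    Assumed formula, not confirmed by this topic:
        budget = node_total * threshold% - LS usage - system usage
    The result is clamped at zero so that it never goes negative.
    """
    if not 0 <= threshold_percent <= 100:
        raise ValueError("threshold_percent must be in [0, 100]")
    allowed = node_total_milli * threshold_percent // 100
    budget = allowed - ls_used_milli - system_used_milli
    return max(budget, 0)


# A 32-core node with a 65% threshold allows 20800 millicores in total.
# When LS usage is low, BE pods get most of the remaining room.
low_ls = be_cpu_budget_millicores(32000, 65, ls_used_milli=8000,
                                  system_used_milli=1000)
# When LS usage spikes, the BE budget shrinks to zero.
high_ls = be_cpu_budget_millicores(32000, 65, ls_used_milli=20000,
                                   system_used_milli=1000)
print(low_ls, high_ls)
```

In this sketch, the BE budget falls as LS usage rises, which matches the behavior described above: BE pods only consume what the LS pods leave unused below the threshold.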
Prerequisites
An ACK Pro cluster is created. For more information, see Create an ACK Pro cluster.
ack-koordinator 0.4.0 or later is installed. For more information about how to install ack-koordinator, see ack-koordinator (ack-slo-manager).
Billing
No fee is charged when you install or use the ack-koordinator component. However, fees may be charged in the following scenarios:
ack-koordinator is a non-managed component that occupies worker node resources after it is installed. You can specify the amount of resources requested by each module when you install the component.
By default, ack-koordinator exposes the monitoring metrics of features such as resource profiling and fine-grained scheduling as Prometheus metrics. If you enable Prometheus metrics for ack-koordinator and use Managed Service for Prometheus, these metrics are considered custom metrics and fees are charged for these metrics. The fee depends on factors such as the size of your cluster and the number of applications. Before you enable Prometheus metrics, we recommend that you read the Billing topic of Managed Service for Prometheus to learn about the free quota and billing rules of custom metrics. For more information about how to monitor and manage resource usage, see Query the amount of observable data and bills.
Procedure
You can enable the CPU Suppress feature at the cluster level through a ConfigMap. Additionally, you can configure related parameters in the ConfigMap, such as the CPU utilization threshold of the node (cpuSuppressThresholdPercent) with CPU Suppress enabled, to achieve fine-grained resource management.
Create a file named configmap.yaml based on the following ConfigMap content:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
data:
  # Enable CPU Suppress.
  resource-threshold-config: |
    {
      "clusterStrategy": {
        "enable": true
      }
    }
Check whether the ack-slo-config ConfigMap exists in the kube-system namespace.
If the ack-slo-config ConfigMap exists, we recommend that you use the PATCH method to update the ConfigMap. This method does not change other settings in the ConfigMap.
kubectl patch cm -n kube-system ack-slo-config --patch "$(cat configmap.yaml)"
If the ack-slo-config ConfigMap does not exist, run the following command to create a ConfigMap named ack-slo-config:
kubectl apply -f configmap.yaml
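The check-then-update decision above can be scripted. The following is a minimal sketch, assuming kubectl is installed and configured for the cluster; the helper names are hypothetical. It treats any non-zero exit code of kubectl get as "does not exist", which is a simplification, because connectivity or permission errors also return a non-zero code.

```python
import subprocess


def configmap_exists(name="ack-slo-config", namespace="kube-system"):
    """Return True if `kubectl get configmap` succeeds for the ConfigMap.

    Simplification: every non-zero exit code is treated as "not found".
    """
    result = subprocess.run(
        ["kubectl", "get", "configmap", name, "-n", namespace],
        capture_output=True,
    )
    return result.returncode == 0


def build_update_command(exists, manifest_path="configmap.yaml",
                         manifest_text=""):
    """Pick the kubectl command from the procedure above.

    If the ConfigMap exists, patch it so other settings are preserved.
    Otherwise, create it from the manifest file.
    """
    if exists:
        return ["kubectl", "patch", "cm", "-n", "kube-system",
                "ack-slo-config", "--patch", manifest_text]
    return ["kubectl", "apply", "-f", manifest_path]


# Show both outcomes without contacting a cluster.
print(build_update_command(False))
print(build_update_command(True, manifest_text="<contents of configmap.yaml>"))
```

To run this against a cluster, pass the result of configmap_exists() and the contents of configmap.yaml to build_update_command(), then run the returned command with subprocess.run.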
Run the following command to query the CPU cores that are allocated to the BE pods on the node:
cat /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-besteffort.slice/cpuset.cpus
Expected output:
10-25,35-51,62-77,87-103
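The value in cpuset.cpus is a comma-separated list of CPU IDs and inclusive ranges. To see how many cores this represents, you can expand the list. The following sketch does this for the sample output; parse_cpuset is a hypothetical helper, not part of ack-koordinator.

```python
def parse_cpuset(text):
    """Expand a cpuset list such as "0-2,5" into [0, 1, 2, 5]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # Skip empty fields, for example from an empty file.
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))  # Ranges are inclusive.
        else:
            cpus.append(int(part))
    return cpus


cores = parse_cpuset("10-25,35-51,62-77,87-103")
print(len(cores))  # 66
```

For the sample output, 66 CPU cores are available to BE pods. Comparing this count before and after a change in LS load is one way to observe the suppression taking effect.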
The expected output shows that the CPU cores 10-25,35-51,62-77,87-103 are allocated to the BE pods on the node. This indicates that, with CPU Suppress enabled, the CPU resources available to BE pods are restricted based on the current resource usage.
Optional: Configure advanced parameters based on the following ConfigMap content.
CPU Suppress allows you to further configure the CPU utilization threshold.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
data:
  resource-threshold-config: |
    {
      "clusterStrategy": {
        "enable": true,
        "cpuSuppressThresholdPercent": 65
      }
    }
The following table describes the key parameters:
| Parameter | Type | Value range | Description |
| --- | --- | --- | --- |
| enable | Boolean | true, false | true: enables CPU Suppress. false: disables CPU Suppress. |
| cpuSuppressThresholdPercent | Integer | [0, 100] | The CPU utilization threshold of the node. Default value: 65. Unit: %. |
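Before you apply a changed threshold, it can help to validate the values against the ranges in the table. The following sketch generates the JSON for the resource-threshold-config key and rejects out-of-range input. The function name is hypothetical, and the validation reflects only the two parameters described above.

```python
import json


def build_resource_threshold_config(enable=True,
                                    cpu_suppress_threshold_percent=65):
    """Return the JSON text for the resource-threshold-config key."""
    if not isinstance(enable, bool):
        raise TypeError("enable must be a Boolean")
    # bool is a subclass of int in Python, so reject it explicitly.
    if (isinstance(cpu_suppress_threshold_percent, bool)
            or not isinstance(cpu_suppress_threshold_percent, int)):
        raise TypeError("cpuSuppressThresholdPercent must be an integer")
    if not 0 <= cpu_suppress_threshold_percent <= 100:
        raise ValueError("cpuSuppressThresholdPercent must be in [0, 100]")
    config = {
        "clusterStrategy": {
            "enable": enable,
            "cpuSuppressThresholdPercent": cpu_suppress_threshold_percent,
        }
    }
    return json.dumps(config, indent=2)


print(build_resource_threshold_config(True, 65))
```

The printed JSON can be pasted under the resource-threshold-config key of the ConfigMap shown earlier.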