This topic describes how to use ack-koordinator to build a colocation environment for latency-sensitive (LS) and best-effort (BE) workloads, and how to deploy applications in that environment.
Prerequisites
A Container Service for Kubernetes (ACK) Pro cluster is created. Only ACK Pro clusters support colocation of LS and BE workloads. For more information, see Create an ACK Pro cluster.
ack-koordinator (FKA ack-slo-manager) 0.8.0 or later is installed. For more information, see ack-koordinator (FKA ack-slo-manager).
To optimize the performance of applications that are deployed in colocation mode, we recommend that you use ECS bare metal instances and the Alibaba Cloud Linux operating system.
Resource priorities and QoS
Resource priority and Quality of Service (QoS) class are key concepts of the service-level objective (SLO)-aware colocation feature provided by ACK.
Resource priorities are used to limit different types of resources on a node. They resolve the issue where resource utilization remains low even though most of the resources on a node have been allocated.
The amount of low-priority resources depends on the amount of high-priority resources requested and used by pods. For example, Product resources that have been allocated but are not in use are downgraded to Batch resources and then re-allocated.
The way you set resource priorities affects the amount of cluster resources that can be overcommitted and the resource availability of the node.
Resources for overcommitment are described and updated as standard extended resources in the node metadata.
The following table describes the resource priorities that are used in ACK.
| Resource priority | Resource quantity calculation | Resource name |
| --- | --- | --- |
| Product | Typically, the amount of Product resources equals the amount of physical resources provided by the node. | The standard allocatable resources of the node, including CPU and memory. |
| Batch | The amount of Batch resources equals the amount of overcommitted resources, which is dynamically calculated based on the resource utilization of the node by using the following formula: Amount of resources for overcommitment = Total amount of physical resources on the node - Amount of Product resources that are used. For more information, see Dynamic resource overcommitment. | Batch resources are described and updated as extended resources in the node metadata by using the `kubernetes.io/batch-cpu` and `kubernetes.io/batch-memory` resource names. |
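As a worked example of the preceding formula, consider a node with 100 physical CPU cores on which Product pods currently use 30 cores: roughly 70 cores can be offered as Batch resources. The following snippet is an illustrative sketch of how such Batch resources might appear in the node metadata; the values are hypothetical, and the actual amounts on your nodes are calculated dynamically by ack-koordinator.

```yaml
# Illustrative excerpt of node metadata (hypothetical values).
# Batch CPU is reported in millicores, so 70 cores appear as "70k".
status:
  allocatable:
    cpu: '100'
    memory: 400Gi
    kubernetes.io/batch-cpu: '70k'
    kubernetes.io/batch-memory: 300Gi
```

You can inspect the actual values on a node by running `kubectl get node <node-name> -o yaml`.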
QoS classes describe the resource sensitivity of applications. Pods that are assigned different QoS classes run at different performance levels to meet different SLOs, and each QoS class corresponds to a different set of resource isolation parameters. When the resources on a node are insufficient, resources are preferentially allocated to pods with higher-priority QoS classes. The following table describes the QoS classes that are used in ACK.
| QoS class | Applicable workload | Description |
| --- | --- | --- |
| LS (Latency Sensitive) | Online services (LS workloads) | LS workloads are prioritized in CPU time slice scheduling and in the allocation of memory resources, including L3 cache (last level cache) and memory bandwidth. When memory reclaim is triggered, the system preferentially reclaims memory from BE workloads and reserves memory for LS workloads. |
| BE (Best Effort) | Resource-intensive jobs (BE workloads) | BE workloads have a lower priority than LS workloads in CPU time slice scheduling, and the L3 cache and memory bandwidth that they can use are limited. When memory reclaim is triggered, the system preferentially reclaims memory from BE workloads rather than LS workloads. |
Resource priorities and QoS classes are independent of each other and can be combined. However, due to the constraints of the colocation model and typical business requirements, only the following combinations are used (a minimal sketch of both combinations follows the list):
Product + LS: This combination is suitable for online applications that require low latency and must be prioritized during resource allocation, such as web applications and latency-sensitive stream computing jobs.
Batch + BE: This combination is suitable for offline applications that have a lower priority than online applications in resource allocation, such as batch Spark jobs, batch MapReduce jobs, and AI training jobs.
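The following fragments are a minimal sketch of how the two combinations map to pod configurations. They are not complete manifests; full, runnable examples are provided in the Deploy applications section of this topic.

```yaml
# Product + LS: standard resource names plus the LS QoS label.
metadata:
  labels:
    koordinator.sh/qosClass: LS
spec:
  containers:
    - name: web
      resources:
        requests:
          cpu: '4'
          memory: 8Gi
---
# Batch + BE: Batch (overcommitted) resource names plus the BE QoS label.
metadata:
  labels:
    koordinator.sh/qosClass: BE
spec:
  containers:
    - name: job
      resources:
        requests:
          kubernetes.io/batch-cpu: '4k'  # Unit: millicores.
          kubernetes.io/batch-memory: 8Gi
```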
Manage colocation policies
ACK provides a ConfigMap that you can use to manage the colocation policies of ack-koordinator. To enable all colocation policies of ack-koordinator, perform the following steps:
Create a file named configmap.yaml based on the following ConfigMap content:
```yaml
# Example of the ack-slo-config ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
data:
  colocation-config: |-
    {
      "enable": true
    }
  resource-qos-config: |-
    {
      "clusterStrategy": {
        "lsClass": {
          "cpuQOS": {
            "enable": true
          },
          "memoryQOS": {
            "enable": true
          },
          "resctrlQOS": {
            "enable": true
          }
        },
        "beClass": {
          "cpuQOS": {
            "enable": true
          },
          "memoryQOS": {
            "enable": true
          },
          "resctrlQOS": {
            "enable": true
          }
        }
      }
    }
  resource-threshold-config: |-
    {
      "clusterStrategy": {
        "enable": true
      }
    }
```
The following table describes the parameters that specify the colocation policies included in the preceding example.
| Parameter | Description |
| --- | --- |
| colocation-config | When this policy is enabled, ack-koordinator collects real-time monitoring data about the loads of the node and analyzes the data to identify resources that can be overcommitted. Resources that are allocated to pods but are not in use can be overcommitted. For more information, see Dynamic resource overcommitment. |
| resource-qos-config | When this policy is enabled, ack-koordinator manages different types of resources in a fine-grained manner and ensures that resources are preferentially allocated to pods with higher-priority QoS classes. For more information, see CPU QoS, Memory QoS for containers, and Resource isolation based on the L3 cache and MBA. |
| resource-threshold-config | When this policy is enabled, ack-koordinator dynamically limits the resources that can be used by pods with lower-priority QoS classes based on the resource utilization watermark of the node. For more information, see Elastic resource limit. |
Run the following command to create the ConfigMap:
```bash
kubectl apply -f configmap.yaml
```
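As an optional sanity check, you can confirm that the ConfigMap exists and contains the expected policies:

```bash
# Print the colocation policy configuration that was just applied.
kubectl get configmap ack-slo-config -n kube-system -o yaml
```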
Deploy applications
Create a file named nginx-ls-pod.yaml and copy the following content to the file:
Set the QoS class of the latency-sensitive application to LS. In this example, `koordinator.sh/qosClass: LS` is specified in the `labels` section in the configurations of the pod that is created for an NGINX application.

```yaml
# Example of the nginx-ls-pod.yaml file.
apiVersion: v1
kind: Pod
metadata:
  labels:
    koordinator.sh/qosClass: LS
    app: nginx
  name: nginx
spec:
  containers:
    - image: 'koordinatorsh/nginx:v1.18-koord-exmaple'
      imagePullPolicy: IfNotPresent
      name: nginx
      ports:
        - containerPort: 8000
          hostPort: 8000 # The port that is used to perform stress tests.
          protocol: TCP
      resources:
        limits:
          cpu: '8'
          memory: 1Gi
        requests:
          cpu: '8'
          memory: 1Gi
      volumeMounts:
        - mountPath: /apps/nginx/conf
          name: config
  hostNetwork: true
  restartPolicy: Never
  volumes:
    - configMap:
        items:
          - key: config
            path: nginx.conf
        name: nginx-conf
      name: config
```
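Note that the pod mounts a ConfigMap named nginx-conf that supplies the NGINX configuration file; that ConfigMap is not shown in this topic. The following is a minimal hypothetical version for illustration only; adjust the NGINX directives to match your workload.

```yaml
# Hypothetical nginx-conf ConfigMap (illustrative; not part of the original example).
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-conf
data:
  config: |
    worker_processes  8;
    events {
      worker_connections  1024;
    }
    http {
      server {
        listen  8000;
      }
    }
```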
Create a file named ffmpeg-be-pod.yaml and copy the following content to the file:
Set the QoS class of the resource-intensive application to BE and configure resource overcommitment by specifying the `kubernetes.io/batch-cpu` and `kubernetes.io/batch-memory` parameters. In this example, `koordinator.sh/qosClass: BE` is specified in the `labels` section in the configurations of the pod that is created for a transcoding application.

```yaml
# Example of the ffmpeg-be-pod.yaml file.
apiVersion: v1
kind: Pod
metadata:
  labels:
    koordinator.sh/qosClass: BE
  name: be-ffmpeg
spec:
  containers:
    - command:
        - start-ffmpeg.sh
        - '30'
        - '2'
        - /apps/ffmpeg/input/HD2-h264.ts
        - /apps/ffmpeg/
      image: 'registry.cn-zhangjiakou.aliyuncs.com/acs/ffmpeg-4-4-1-for-slo-test:v0.1'
      imagePullPolicy: Always
      name: ffmpeg
      resources:
        limits:
          # Unit: millicores.
          kubernetes.io/batch-cpu: "70k"
          kubernetes.io/batch-memory: "22Gi"
        requests:
          # Unit: millicores.
          kubernetes.io/batch-cpu: "70k"
          kubernetes.io/batch-memory: "22Gi"
```
Run the following command to deploy the pods for the latency-sensitive application and the resource-intensive application:
```bash
kubectl apply -f nginx-ls-pod.yaml
kubectl apply -f ffmpeg-be-pod.yaml
```
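After the pods are deployed, you can verify that they are running and carry the expected QoS class labels. The following commands are a quick check; the `-L` flag prints the value of the specified label as an extra column:

```bash
# Confirm that both pods are running and show their QoS class labels.
kubectl get pod nginx be-ffmpeg -L koordinator.sh/qosClass

# Inspect the Batch resource requests and limits of the BE pod.
kubectl describe pod be-ffmpeg | grep -i batch
```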
What to do next
After the applications are deployed, you can use the colocation features provided by ACK. For more information, see Colocate online services and video transcoding applications.
For more information about the colocation features, see the following topics:
- Dynamic resource overcommitment
- CPU QoS
- Memory QoS for containers
- Resource isolation based on the L3 cache and MBA
- Elastic resource limit