This topic describes how to use ack-koordinator to build a colocation environment for latency-sensitive (LS) and best-effort (BE) workloads, and how to deploy applications in that environment.
Prerequisites
A Container Service for Kubernetes (ACK) Pro cluster is created. Only ACK Pro clusters support colocation of LS and BE workloads. For more information, see Create an ACK Pro cluster.
ack-koordinator (FKA ack-slo-manager) 0.8.0 or later is installed. For more information, see ack-koordinator (FKA ack-slo-manager).
To optimize the performance of applications that are deployed in colocation mode, we recommend that you use ECS bare metal instances and the Alibaba Cloud Linux operating system.
Resource priorities and QoS
Resource priority and Quality of Service (QoS) class are key concepts of the service-level objective (SLO)-aware colocation feature provided by ACK.
Resource priorities are used to limit different types of resources on a node. They resolve the issue where resource utilization remains low even though most of the resources on a node have been allocated.
The amount of low-priority resources depends on the amount of high-priority resources requested and used by pods. For example, Product resources that have been allocated but are not in use are downgraded to Batch resources and then re-allocated.
The way you set resource priorities affects the amount of cluster resources that can be overcommitted and the resource availability of the node.
Resources for overcommitment are described and updated as standard extended resources in the node metadata.
The following table describes the resource priorities that are used in ACK.
| Resource priority | Resource quantity calculation | Resource name |
| --- | --- | --- |
| Product | Typically, the amount of Product resources equals the amount of physical resources provided by the node. | The standard allocatable resources of the node, including CPU and memory. |
| Batch | The amount of Batch resources equals the amount of overcommitted resources, which is dynamically calculated based on the resource utilization of the node by using the following formula: Amount of resources for overcommitment = Total amount of physical resources on the node - Amount of Product resources that are used. For more information, see Dynamic resource overcommitment. | Batch resources are described and updated as extended resources in the node metadata by using the `kubernetes.io/batch-cpu` and `kubernetes.io/batch-memory` resource names. |
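As a worked example of the preceding formula, consider a node with 100 physical CPU cores on which Product pods currently use 30 cores: roughly 70 cores can be offered as Batch resources. The following snippet is an illustrative sketch of how such Batch resources might appear in the node metadata; the values are hypothetical, and the actual amounts on your nodes are calculated dynamically by ack-koordinator.

```yaml
# Illustrative excerpt of node metadata (hypothetical values).
# Batch CPU is reported in millicores, so 70 cores appear as "70k".
status:
  allocatable:
    cpu: '100'
    memory: 400Gi
    kubernetes.io/batch-cpu: '70k'
    kubernetes.io/batch-memory: 300Gi
```

You can inspect the actual values on a node by running `kubectl get node <node-name> -o yaml`.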
QoS classes describe the resource sensitivity of applications. Pods that are assigned different QoS classes run at different performance levels to meet different SLOs, and each QoS class corresponds to a different set of resource isolation parameters. When the resources on a node are insufficient, resources are preferentially allocated to pods with higher-priority QoS classes. The following table describes the QoS classes that are used in ACK.
| QoS class | Applicable workload | Description |
| --- | --- | --- |
| LS (Latency Sensitive) | Online services (LS workloads) | LS workloads are prioritized in CPU time slice scheduling and in the allocation of memory resources, including L3 cache (last level cache) and memory bandwidth. When memory reclaim is triggered, the system preferentially reclaims memory from BE workloads and reserves memory for LS workloads. |
| BE (Best Effort) | Resource-intensive jobs (BE workloads) | BE workloads have a lower priority than LS workloads in CPU time slice scheduling, and the L3 cache and memory bandwidth that they can use are limited. When memory reclaim is triggered, the system preferentially reclaims memory from BE workloads rather than LS workloads. |
Resource priorities and QoS classes are independent of each other and can be combined. However, due to the constraints of the colocation model and typical business requirements, only the following combinations are used (a minimal sketch of both combinations follows the list):
Product + LS: This combination is suitable for online applications that require low latency and must be prioritized during resource allocation, such as web applications and latency-sensitive stream computing jobs.
Batch + BE: This combination is suitable for offline applications that have a lower priority than online applications in resource allocation, such as batch Spark jobs, batch MapReduce jobs, and AI training jobs.
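The following fragments are a minimal sketch of how the two combinations map to pod configurations. They are not complete manifests; full, runnable examples are provided in the Deploy applications section of this topic.

```yaml
# Product + LS: standard resource names plus the LS QoS label.
metadata:
  labels:
    koordinator.sh/qosClass: LS
spec:
  containers:
    - name: web
      resources:
        requests:
          cpu: '4'
          memory: 8Gi
---
# Batch + BE: Batch (overcommitted) resource names plus the BE QoS label.
metadata:
  labels:
    koordinator.sh/qosClass: BE
spec:
  containers:
    - name: job
      resources:
        requests:
          kubernetes.io/batch-cpu: '4k'  # Unit: millicores.
          kubernetes.io/batch-memory: 8Gi
```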
Manage colocation policies
ACK provides a ConfigMap that you can use to manage the colocation policies of ack-koordinator. To enable all colocation policies of ack-koordinator, perform the following steps:
Create a file named configmap.yaml based on the following ConfigMap content:
```yaml
# Example of the ack-slo-config ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
data:
  colocation-config: |-
    {
      "enable": true
    }
  resource-qos-config: |-
    {
      "clusterStrategy": {
        "lsClass": {
          "cpuQOS": {
            "enable": true
          },
          "memoryQOS": {
            "enable": true
          },
          "resctrlQOS": {
            "enable": true
          }
        },
        "beClass": {
          "cpuQOS": {
            "enable": true
          },
          "memoryQOS": {
            "enable": true
          },
          "resctrlQOS": {
            "enable": true
          }
        }
      }
    }
  resource-threshold-config: |-
    {
      "clusterStrategy": {
        "enable": true
      }
    }
```
The following table describes the parameters that specify the colocation policies included in the preceding example.
| Parameter | Description |
| --- | --- |
| colocation-config | When this policy is enabled, ack-koordinator collects real-time monitoring data about the loads of the node and analyzes the data to identify resources that can be overcommitted. Resources that are allocated to pods but are not in use can be overcommitted. For more information, see Dynamic resource overcommitment. |
| resource-qos-config | When this policy is enabled, ack-koordinator manages different types of resources in a fine-grained manner and ensures that resources are preferentially allocated to pods with higher-priority QoS classes. For more information, see CPU QoS, Memory QoS for containers, and Resource isolation based on the L3 cache and MBA. |
| resource-threshold-config | When this policy is enabled, ack-koordinator dynamically limits the resources that can be used by pods with lower-priority QoS classes based on the resource utilization watermark of the node. For more information, see Elastic resource limit. |
Run the following command to create the ConfigMap:
```bash
kubectl apply -f configmap.yaml
```
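As an optional sanity check, you can confirm that the ConfigMap exists and contains the expected policies:

```bash
# Print the colocation policy configuration that was just applied.
kubectl get configmap ack-slo-config -n kube-system -o yaml
```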
Deploy applications
Create a file named nginx-ls-pod.yaml and copy the following content to the file:
Set the QoS class of the latency-sensitive application to LS. In this example, `koordinator.sh/qosClass: LS` is specified in the `labels` section in the configurations of the pod that is created for an NGINX application.

```yaml
# Example of the nginx-ls-pod.yaml file.
apiVersion: v1
kind: Pod
metadata:
  labels:
    koordinator.sh/qosClass: LS
    app: nginx
  name: nginx
spec:
  containers:
    - image: 'koordinatorsh/nginx:v1.18-koord-exmaple'
      imagePullPolicy: IfNotPresent
      name: nginx
      ports:
        - containerPort: 8000
          hostPort: 8000 # The port that is used to perform stress tests.
          protocol: TCP
      resources:
        limits:
          cpu: '8'
          memory: 1Gi
        requests:
          cpu: '8'
          memory: 1Gi
      volumeMounts:
        - mountPath: /apps/nginx/conf
          name: config
  hostNetwork: true
  restartPolicy: Never
  volumes:
    - configMap:
        items:
          - key: config
            path: nginx.conf
        name: nginx-conf
      name: config
```
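Note that the pod mounts a ConfigMap named nginx-conf that supplies the NGINX configuration file; that ConfigMap is not shown in this topic. The following is a minimal hypothetical version for illustration only; adjust the NGINX directives to match your workload.

```yaml
# Hypothetical nginx-conf ConfigMap (illustrative; not part of the original example).
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-conf
data:
  config: |
    worker_processes  8;
    events {
      worker_connections  1024;
    }
    http {
      server {
        listen  8000;
      }
    }
```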
Create a file named ffmpeg-be-pod.yaml and copy the following content to the file:
Set the QoS class of the resource-intensive application to BE and configure resource overcommitment by specifying the `kubernetes.io/batch-cpu` and `kubernetes.io/batch-memory` parameters. In this example, `koordinator.sh/qosClass: BE` is specified in the `labels` section in the configurations of the pod that is created for a transcoding application.

```yaml
# Example of the ffmpeg-be-pod.yaml file.
apiVersion: v1
kind: Pod
metadata:
  labels:
    koordinator.sh/qosClass: BE
  name: be-ffmpeg
spec:
  containers:
    - command:
        - start-ffmpeg.sh
        - '30'
        - '2'
        - /apps/ffmpeg/input/HD2-h264.ts
        - /apps/ffmpeg/
      image: 'registry.cn-zhangjiakou.aliyuncs.com/acs/ffmpeg-4-4-1-for-slo-test:v0.1'
      imagePullPolicy: Always
      name: ffmpeg
      resources:
        limits:
          # Unit: millicores.
          kubernetes.io/batch-cpu: "70k"
          kubernetes.io/batch-memory: "22Gi"
        requests:
          # Unit: millicores.
          kubernetes.io/batch-cpu: "70k"
          kubernetes.io/batch-memory: "22Gi"
```
Run the following command to deploy the pods for the latency-sensitive application and the resource-intensive application:
```bash
kubectl apply -f nginx-ls-pod.yaml
kubectl apply -f ffmpeg-be-pod.yaml
```
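After the pods are deployed, you can verify that they are running and carry the expected QoS class labels. The following commands are a quick check; the `-L` flag prints the value of the specified label as an extra column:

```bash
# Confirm that both pods are running and show their QoS class labels.
kubectl get pod nginx be-ffmpeg -L koordinator.sh/qosClass

# Inspect the Batch resource requests and limits of the BE pod.
kubectl describe pod be-ffmpeg | grep -i batch
```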
What to do next
After the applications are deployed, you can use the colocation features provided by ACK. For more information, see Colocate online services and video transcoding applications.
For more information about the colocation features, see the following topics:
- Dynamic resource overcommitment
- CPU QoS
- Memory QoS for containers
- Resource isolation based on the L3 cache and MBA
- Elastic resource limit