
Container Service for Kubernetes: Getting started

Last Updated: Apr 01, 2024

This topic describes how to use ack-koordinator to build a colocation environment for latency-sensitive (LS) and best-effort (BE) workloads. This topic also describes how to colocate your applications.

Prerequisites

ack-koordinator is installed in the cluster.

Resource priorities and QoS

Resource priority and Quality of Service (QoS) class are key concepts of the service-level objective (SLO)-aware colocation feature provided by ACK.

Resource priorities are used to classify and limit the different types of resources on a node. They address the issue that node utilization remains low even when most of the node's resources have been allocated, because resources that are allocated but not in use can be overcommitted. Resource priorities have the following characteristics:

  • The amount of low-priority resources depends on the amount of high-priority resources requested and used by pods. For example, Product resources that have been allocated but are not in use are downgraded to Batch resources and then re-allocated.

  • The way you set resource priorities affects the amount of cluster resources that can be overcommitted and the resource availability of the node.

  • Resources for overcommitment are described and updated as standard extended resources in the node metadata.

The following describes the resource priorities that are used in ACK.

Product
  • Resource quantity calculation: Typically, the amount of Product resources equals the amount of physical resources provided by the node.
  • Resource name: The allocatable resources on the node, including CPU and memory resources.

Batch
  • Resource quantity calculation: The amount of Batch resources equals the amount of overcommitted resources, which is dynamically calculated based on the resource utilization of the node: Amount of resources for overcommitment = Total amount of physical resources on the node - Amount of Product resources that are in use. For more information, see Dynamic resource overcommitment.
  • Resource name: Batch resources are described and updated as extended resources in the node metadata by using the kubernetes.io/batch-cpu and kubernetes.io/batch-memory parameters.
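
For example, on a node with 100 CPU cores where Product pods are currently using 30 cores, about 70 cores (reported as 70k, because the unit is millicores) can be advertised as Batch resources. You can inspect the Batch resources advertised on a node by querying the extended resources in the node metadata. A minimal sketch, assuming $node_name is the name of a node in your cluster:

    # Query the Batch extended resources advertised on the node.
    kubectl get node $node_name -o yaml | grep batch

    # Illustrative output (actual values depend on node utilization):
    #   kubernetes.io/batch-cpu: "70000"
    #   kubernetes.io/batch-memory: "120Gi"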

QoS classes describe the resource sensitivity of applications. Pods that are assigned different QoS classes run at different performance levels to meet different SLOs. Different QoS classes correspond to different resource isolation parameters. When the resources on a node are insufficient, resources are preferentially allocated to pods with high-priority QoS classes. The following describes the QoS classes that are used in ACK.

LS (Latency Sensitive)
  • Applicable workload: Online services (LS workloads).
  • Description: LS workloads are prioritized in CPU time slice scheduling and in the allocation of memory resources, including L3 cache (last level cache) and memory bandwidth. When memory reclaim is triggered, the system preferentially reclaims memory from BE workloads and reserves memory for LS workloads.

BE (Best Effort)
  • Applicable workload: Resource-intensive jobs (BE workloads).
  • Description: BE workloads have a lower priority than LS workloads in CPU time slice scheduling, and the amount of L3 cache and memory bandwidth that they can use is limited. When memory reclaim is triggered, the system preferentially reclaims memory from BE workloads rather than from LS workloads.

Resource priorities and QoS classes are independent of each other and can be combined. However, due to the limits of the colocation model and business requirements, only the following combinations are used:

  • Product + LS: This combination is suitable for online applications that require low latency and must be prioritized during resource allocation, such as web applications and latency-sensitive stream computing jobs.

  • Batch + BE: This combination is suitable for offline applications that have a lower priority than online applications in resource allocation, such as batch Spark jobs, batch MapReduce jobs, and AI training jobs.
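
The following fragments sketch how these combinations map to pod configurations: the QoS class is set by using the koordinator.sh/qosClass label, and the resource priority is determined by the resource names that are used in the requests and limits. The resource quantities shown here are placeholders; complete examples are provided in the Deploy applications section.

    # Product + LS: regular node resources plus the LS label.
    metadata:
      labels:
        koordinator.sh/qosClass: LS
    spec:
      containers:
        - resources:
            requests:
              cpu: '4'
              memory: 1Gi

    # Batch + BE: overcommitted Batch resources plus the BE label.
    metadata:
      labels:
        koordinator.sh/qosClass: BE
    spec:
      containers:
        - resources:
            requests:
              kubernetes.io/batch-cpu: "4k"      # Unit: millicores.
              kubernetes.io/batch-memory: "4Gi"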

Manage colocation policies

ACK provides a ConfigMap that you can use to manage the colocation policies of ack-koordinator. To enable all colocation policies of ack-koordinator, perform the following steps:

  1. Create a file named configmap.yaml based on the following ConfigMap content:

    # Example of the ack-slo-config ConfigMap. 
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-slo-config
      namespace: kube-system
    data:
      colocation-config: |-
        {
          "enable": true
        }
      resource-qos-config: |-
        {
          "clusterStrategy": {
            "lsClass": {
              "cpuQOS": {
                "enable": true
              },
              "memoryQOS": {
                "enable": true
              },
              "resctrlQOS": {
                "enable": true
              }
            },
            "beClass": {
              "cpuQOS": {
                "enable": true
              },
              "memoryQOS": {
                "enable": true
              },
              "resctrlQOS": {
                "enable": true
              }
            }
          }
        }
      resource-threshold-config: |-
        {
          "clusterStrategy": {
            "enable": true
          }
        }

    The following describes the colocation policies that are specified in the preceding example.

    colocation-config
    When this policy is enabled, ack-koordinator collects real-time monitoring data about the loads of the node and analyzes the data to identify resources that can be overcommitted. Resources that are allocated to pods but are not in use can be overcommitted. For more information, see Dynamic resource overcommitment.

    resource-qos-config
    When this policy is enabled, ack-koordinator manages different types of resources in a fine-grained manner and ensures that resources are preferentially allocated to pods with high-priority QoS classes. For more information, see CPU QoS, Memory QoS for containers, and Resource isolation based on the L3 cache and MBA.

    resource-threshold-config
    When this policy is enabled, ack-koordinator dynamically limits the resources that can be used by pods with low-priority QoS classes based on the resource utilization watermark of the node. For more information, see Elastic resource limit.

  2. Run the following command to create the ConfigMap:

    kubectl apply -f configmap.yaml
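
After the ConfigMap is created, you can confirm that it exists and contains the expected policies:

    # Confirm that the colocation policies are in place.
    kubectl get configmap ack-slo-config -n kube-system -o yaml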

Deploy applications

  1. Create a file named nginx-ls-pod.yaml and copy the following content to the file:

    Set the QoS class of the latency-sensitive application to LS. In this example, koordinator.sh/qosClass: LS is specified in the labels section in the configurations of the pod that is created for an NGINX application. The pod also mounts a ConfigMap named nginx-conf, which must exist before the pod can start; a sample definition is provided after this procedure.

    # Example of the nginx-ls-pod.yaml file. 
    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        koordinator.sh/qosClass: LS
        app: nginx
      name: nginx
    spec:
      containers:
        - image: 'koordinatorsh/nginx:v1.18-koord-exmaple'
          imagePullPolicy: IfNotPresent
          name: nginx
          ports:
            - containerPort: 8000
              hostPort: 8000 # The port that is used to perform stress tests. 
              protocol: TCP
          resources:
            limits:
              cpu: '8'
              memory: 1Gi
            requests:
              cpu: '8'
              memory: 1Gi
          volumeMounts:
            - mountPath: /apps/nginx/conf
              name: config
      hostNetwork: true
      restartPolicy: Never
      volumes:
        - configMap:
            items:
              - key: config
                path: nginx.conf
            name: nginx-conf
          name: config
  2. Create a file named ffmpeg-be-pod.yaml and copy the following content to the file:

    Set the QoS class of the resource-intensive application to BE and request the overcommitted Batch resources by specifying the kubernetes.io/batch-cpu and kubernetes.io/batch-memory resource names. In this example, koordinator.sh/qosClass: BE is specified in the labels section in the configurations of the pod that is created for a transcoding application.

    # Example of the ffmpeg-be-pod.yaml file. 
    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        koordinator.sh/qosClass: BE
      name: be-ffmpeg
    spec:
      containers:
        - command:
            - start-ffmpeg.sh
            - '30'
            - '2'
            - /apps/ffmpeg/input/HD2-h264.ts
            - /apps/ffmpeg/
          image: 'registry.cn-zhangjiakou.aliyuncs.com/acs/ffmpeg-4-4-1-for-slo-test:v0.1'
          imagePullPolicy: Always
          name: ffmpeg
          resources:
            limits:
              # Unit: millicores. 
              kubernetes.io/batch-cpu: "70k"
              kubernetes.io/batch-memory: "22Gi"
            requests:
              # Unit: millicores. 
              kubernetes.io/batch-cpu: "70k"
              kubernetes.io/batch-memory: "22Gi"
  3. Run the following command to deploy the pods for the latency-sensitive application and the resource-intensive application:

    kubectl apply -f nginx-ls-pod.yaml
    kubectl apply -f ffmpeg-be-pod.yaml
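
The nginx pod mounts a ConfigMap named nginx-conf, which must exist in the same namespace before the pod can start. If your cluster does not already contain it, the following is a minimal, illustrative sketch; the nginx.conf content shown here is an assumption for testing, not the configuration shipped with the example image.

    # Illustrative nginx-conf ConfigMap. The nginx.conf content is assumed.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nginx-conf
    data:
      config: |
        worker_processes  auto;
        events {
            worker_connections  1024;
        }
        http {
            server {
                listen  8000;   # Matches the hostPort used for stress tests.
                location / {
                    return 200;
                }
            }
        }

After both pods are deployed, verify that they reach the Running state:

    kubectl get pod nginx be-ffmpeg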

What to do next

After the applications are deployed, you can use the colocation features provided by ACK. For more information, see Colocate online services and video transcoding applications.