Container Service for Kubernetes: Basic monitoring capabilities

Last Updated: May 09, 2024

The ack-koordinator component of Container Service for Kubernetes (ACK) supports service level objective (SLO)-aware resource scheduling. This topic describes how to use ack-koordinator to enable basic monitoring in scenarios in which latency-sensitive (LS) workloads and best-effort (BE) workloads are colocated.

Prerequisites

View the basic monitoring data of colocation

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage. In the left-side navigation pane, choose Operations > Prometheus Monitoring.

  3. On the Prometheus Monitoring page, choose Cost Analysis/Resource Optimization > Hybrid Deployment of Online Workloads and Offline Workloads.

    The following figure shows the basic monitoring dashboard.

    [Figure: Colocation of online and offline workloads]

Description of the basic monitoring dashboard

You can view the following information on the basic monitoring dashboard:

  • Resource benefits: provides panels that display statistics about resource benefits in colocation scenarios.

  • Observability: provides insights into the resources and workloads in colocation scenarios. You can view the resource statistics by cluster, node pool, node, and pod.

Resource benefits overview

The Resource Benefits Overview section displays information about resource benefits and resource usage trends.

Information about the total amount of colocated resources and the amount of colocated resources allocated

[Figure: Colocation benefits of the cluster]

Term

Description

Non-colocated resources

All allocatable physical resources on the node. The total amount of non-colocated resources depends on the node specification and remains unchanged after you enable colocation.

Colocated resources

SLO-aware workload scheduling of ACK uses the dynamic resource overcommitment feature to utilize idle resources in the cluster. The total amount of colocated resources varies based on the resource utilization of the node and dynamically changes with the idle resources of the node. Colocated resources are schedulable resources used in colocation scenarios. The amount of colocated resources is a key metric used to evaluate resource benefits in colocation scenarios.

Total amount of colocated resources

Colocated resources include vCPUs and memory resources that can be used for colocation in the cluster. The preceding figure shows that the cluster has 118 vCPUs and 487 GiB of memory available for colocation of LS workloads and BE workloads. If the amount of colocated resources increases, the cluster can provide more schedulable idle resources. In this case, more applications can be deployed and more resource benefits can be generated in colocation scenarios.

Total amount of allocated colocated resources

Allocated colocated resources include vCPUs and memory resources that are allocated for colocation. The preceding figure shows that 2 vCPUs and 1 GiB of memory are allocated in colocation scenarios. An increase in the amount of allocated colocated resources means that workloads have requested more of the idle resources that colocation makes available. In this case, more applications are deployed and more resource benefits are generated in colocation scenarios.

Ratio of allocated colocated resources

The ratio of allocated colocated resources is calculated based on vCPUs and memory for colocation, respectively. The ratio is calculated by using the following formula: Ratio of allocated resources = Amount of allocated resources/Total amount of resources. The preceding figure shows that the ratio of vCPUs allocated for colocation is 1.70% and the ratio of memory allocated for colocation is 0.21%. A higher ratio indicates that more resource benefits can be generated in colocation scenarios.
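The formula above can be checked with a few lines of Python. This is a minimal sketch: the function name allocation_ratio is illustrative and is not part of ack-koordinator or the dashboard, and the inputs are the rounded figures quoted in this topic.

```python
def allocation_ratio(allocated: float, total: float) -> float:
    """Return allocated / total as a percentage, or 0.0 when total is not positive."""
    if total <= 0:
        return 0.0
    return allocated / total * 100


# Rounded figures quoted in this topic: 2 of 118 vCPUs and 1 of 487 GiB of memory.
cpu_ratio = allocation_ratio(2, 118)
memory_ratio = allocation_ratio(1, 487)

print(f"vCPU ratio: {cpu_ratio:.2f}%")
print(f"Memory ratio: {memory_ratio:.2f}%")
```

With these rounded inputs, the vCPU ratio comes out at about 1.69%. The 1.70% shown on the dashboard presumably reflects the unrounded underlying values.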

Usage trend of colocated resources

[Figure: Usage trend of colocated resources]

Term

Description

Number of colocated pods

The number of colocated pods includes the number of pods that use non-colocated resources and the number of pods that use colocated resources. The ratio of pods that use non-colocated resources and the ratio of pods that use colocated resources are also displayed.

Ratio of colocated resources

The ratio of vCPUs for colocation and the ratio of memory for colocation are displayed. The ratios compare the amount of colocated resources with the amount of non-colocated resources. If the cluster has a large amount of idle resources and the ratio of colocated resources is high, a large amount of resources can be used for colocation.

Colocated resources overview

The Cluster Dimension, Node Dimension, and Pod Dimension sections display the resource usage and resource requests by cluster, node, and pod.

Cluster Dimension

[Figures: Cluster resource view 1 and 2]

Term

Description

Resource usage in the cluster

Information about the consumed CPU and memory resources is displayed. The information includes the total vCPUs and total memory capacity of the cluster, the vCPUs and memory used by non-colocated pods, the vCPUs and memory used by colocated pods, and the vCPUs and memory used by system components. If the vCPUs or memory used by non-colocated pods, colocated pods, and system components is much lower than the total vCPUs or memory capacity of the cluster, the CPU utilization and the memory utilization of the cluster are low and a large amount of CPU or memory resources is idle in the cluster.

Requests for colocated resources in the cluster

Information about the requests for colocated resources is displayed. The information includes the total allocatable vCPUs and memory for colocation, and the requested vCPUs and memory for colocation. If the amount of requested colocated resources is close to the total amount of colocated resources, the ratio of allocated colocated resources is high.

Requests for non-colocated resources in the cluster

Information about the requests for non-colocated resources is displayed. The information includes allocatable vCPUs and allocatable memory for non-colocation scenarios, and the requested vCPUs and memory for non-colocation scenarios. If the amount of requested non-colocated resources is close to the total amount of non-colocated resources, the ratio of allocated non-colocated resources is high.
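The relationship between capacity and usage that the Resource usage in the cluster panel describes can be sketched as follows. The numbers are made up for illustration, and the names ResourceUsage, used, idle, and utilization are hypothetical. They are not dashboard fields or ack-koordinator APIs.

```python
from dataclasses import dataclass


@dataclass
class ResourceUsage:
    """Usage of one resource (vCPUs or GiB of memory). Hypothetical type."""

    capacity: float       # total capacity of the cluster
    non_colocated: float  # used by non-colocated pods
    colocated: float      # used by colocated pods
    system: float         # used by system components

    @property
    def used(self) -> float:
        # Sum of the three usage series that the panel displays.
        return self.non_colocated + self.colocated + self.system

    @property
    def idle(self) -> float:
        # Capacity that no pod or system component is using.
        return max(self.capacity - self.used, 0.0)

    @property
    def utilization(self) -> float:
        # Used share of capacity, as a percentage.
        return self.used / self.capacity * 100 if self.capacity > 0 else 0.0


# Made-up example values, not taken from the dashboard.
cpu = ResourceUsage(capacity=128.0, non_colocated=20.0, colocated=4.0, system=8.0)
print(f"used={cpu.used} idle={cpu.idle} utilization={cpu.utilization:.1f}%")
```

A low utilization value together with a large idle value corresponds to the case described above, in which a large amount of CPU or memory resources is idle in the cluster.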

You can change the value of the node_label parameter to view the panels of a specific node pool.

[Figure: Node pool view]

The following table describes the filters that are provided on the page.

Filter

Description

node_label_value

Default value: All. When the default value is used, the Resource Benefits Overview and Cluster Dimension sections display statistics about all cluster nodes.

You can select a node pool ID from the drop-down list to display information about the node pool in the Resource Benefits Overview and Cluster Dimension sections.

node_label

You can specify a node label to select a node. For more information, see the Notes section on the Hybrid Deployment of Online Workloads and Offline Workloads tab.

Node Dimension

On the Hybrid Deployment of Online Workloads and Offline Workloads tab, you can select a node to view the resource information about the node.

[Figures: Node resource view 1, 2, and 3]

Term

Description

Ratio of colocated resources on the node

Information about the ratio of colocated resources on the node is displayed. The information includes the ratio of vCPUs for non-colocation scenarios, the ratio of memory for non-colocation scenarios, the ratio of vCPUs for colocation, and the ratio of memory for colocation. The total amount of non-colocated resources is stacked on the total amount of colocated resources so you can compare the statistics.

Resource usage on the node

Information about the CPU usage, memory usage (including cache), and memory usage (excluding cache) on the node is displayed. The information includes the total vCPUs and total memory of the node, the vCPUs and memory used by non-colocated pods, the vCPUs and memory used by colocated pods, and the vCPUs and memory used by system components. If the amount of vCPUs or memory resources used by non-colocated pods, colocated pods, and system components is much lower than the total vCPUs or memory capacity of the node, the CPU utilization or memory utilization of the node is low and a large amount of CPU or memory resources is idle on the node.

Requested colocated resources on the node

Information about the requested vCPUs and memory for colocation is displayed. The information includes the total vCPUs, requested vCPUs, total memory, and requested memory for colocation on the node. If the amount of requested colocated resources is close to the total amount of colocated resources, the ratio of allocated colocated resources is high.

Requests for colocated resources by each pod

Information about the requests for vCPUs and memory for colocation by each pod on the node is displayed.

Colocated resource utilization of each pod

Information about the utilization of vCPUs for colocation and utilization of memory for colocation of each pod on the node is displayed.

Pod Dimension

On the Hybrid Deployment of Online Workloads and Offline Workloads tab, you can change the values of the pod_namespace and pod_name parameters to view the panels of each pod.

[Figures: Pod resource view 1, 2, and 3]

Term

Description

Amount of colocated resources used by the pod

Information about the amount of CPU and memory resources for colocation used by the pod is displayed. The information includes the CPU limit and memory limit, CPU request and memory request, and the actual CPU usage and memory usage.

Colocated resource utilization of the pod

Information about the utilization of vCPUs for colocation and utilization of memory for colocation of the pod on the node is displayed.

Amount of colocated resources used by each container

Information about the amount of CPU and memory resources for colocation used by each container in the pod is displayed. The information includes the CPU limit and memory limit, CPU request and memory request, and the actual CPU usage and memory usage.

FAQ

How do I resolve the issue that no data is displayed in the Resource Benefits Overview section on the Hybrid Deployment of Online Workloads and Offline Workloads tab?

  1. Check whether ack-koordinator is installed.

    1. Log on to the ACK console and click Clusters in the left-side navigation pane.

    2. On the Clusters page, click the name of the cluster that you want to manage and choose Applications > Helm in the left-side navigation pane.

    3. On the Helm page, check whether the ack-koordinator component exists.

  2. Check whether data is displayed in the Resource Benefits Overview section.

    If no data is displayed in this section, perform the following steps:

    1. Log on to the ARMS console.
    2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.

    3. In the upper-left corner of the page that appears, select the region in which the Prometheus instance that you want to manage is deployed and click the Prometheus instance. In the left-side navigation pane of the details page of the Prometheus instance, click Service Discovery.

    4. On the Metrics tab of the Service Discovery page, enter kube_node_labels in the search box. Find kube_node_labels and click Enable in the Actions column. In the Note message, click OK.
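After you enable the metric, you may want to confirm that kube_node_labels returns data. The following is a minimal sketch that assumes your Prometheus instance exposes the standard Prometheus HTTP API. The endpoint http://prometheus.example.com is a placeholder, the helper names build_query_url and has_series are hypothetical, and the sample response is hand-written rather than taken from a real cluster.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, query: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


def has_series(response_body: str) -> bool:
    """Return True if a Prometheus query response contains at least one series."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False
    return len(payload.get("data", {}).get("result", [])) > 0


# Placeholder endpoint: replace it with the HTTP API URL of your own instance.
url = build_query_url("http://prometheus.example.com", "kube_node_labels")
print(url)

# Hand-written sample in the Prometheus instant-query response format.
sample = json.dumps({
    "status": "success",
    "data": {"resultType": "vector", "result": [
        {"metric": {"__name__": "kube_node_labels", "node": "node-1"},
         "value": [1715212800, "1"]},
    ]},
})
print(has_series(sample))
```

The sketch only builds the request URL and parses a response body. Sending the request, and any authentication that your instance requires, is left out on purpose because it depends on how the instance is exposed.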

How do I resolve the issue that the view of the dashboard is different from the view that is provided in this topic?

Update the Hybrid Deployment of Online Workloads and Offline Workloads dashboard to the latest version. For more information about how to update a dashboard, see View Grafana dashboards.