Use metrics and dashboards of kube-scheduler - Container Service for Kubernetes

kube-scheduler is the default scheduler of Kubernetes clusters. This component is responsible for scheduling pods to run on appropriate cluster nodes. This topic describes the metrics of kube-scheduler. This topic also describes how to use the dashboards of kube-scheduler and provides suggestions on how to troubleshoot common metric anomalies.

Usage notes

Dashboard access

For information about how to access the dashboards, see View control plane component dashboards in ACK Pro clusters.

Metrics

Metrics can indicate the status and parameter settings of a component. The following table describes the metrics supported by kube-scheduler.

Metric	Type	Description
scheduler_scheduler_cache_size	Gauge	The numbers of nodes, pods, and assumed pods in the scheduler cache.
scheduler_pending_pods	Gauge	The number of pending pods. Pending pods consist of the following types: unschedulable: unschedulable pods. backoff: backoff queue pods, which are the pods that fail to be scheduled due to specific reasons. active: active queue pods, which are the pods ready to be scheduled.
scheduler_pod_scheduling_attempts_bucket	Histogram	The number of times that the scheduler attempts to schedule pods. The bucket thresholds are defined as the set `{1, 2, 4, 8, 16}`.
memory_utilization_byte	Gauge	The memory usage. Unit: bytes.
cpu_utilization_core	Gauge	The used CPU capacity. Unit: core.
rest_client_requests_total	Counter	The number of HTTP requests, broken down by status code, method, and host.
rest_client_request_duration_seconds_bucket	Histogram	The latency of HTTP requests, broken down by verb and URL.
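As an illustration of how these metrics can be read outside a dashboard, the following minimal sketch builds an instant-query URL for the Prometheus HTTP API and sums scheduler_pending_pods per queue type from a response body. The endpoint address, the `queue` label name, and the sample values are assumptions made for the example; check the labels that your own Prometheus instance exposes.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: replace it with the Prometheus instance that monitors your cluster.
PROMETHEUS_URL = "http://prometheus.example.internal:9090"


def build_query_url(base_url, promql):
    """Return the instant-query URL of the Prometheus HTTP API for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def pending_pods_by_queue(response_body):
    """Sum scheduler_pending_pods samples per queue type from an instant-query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed")
    totals = {}
    for series in payload["data"]["result"]:
        # Assumption: the queue type is carried in a label named "queue".
        queue = series["metric"].get("queue", "unknown")
        _timestamp, value = series["value"]  # sample values are encoded as strings
        totals[queue] = totals.get(queue, 0.0) + float(value)
    return totals


# Illustrative response body; the numbers are made up.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"queue": "active"}, "value": [1700000000, "3"]},
            {"metric": {"queue": "backoff"}, "value": [1700000000, "1"]},
            {"metric": {"queue": "unschedulable"}, "value": [1700000000, "5"]},
        ],
    },
})

print(build_query_url(PROMETHEUS_URL, 'scheduler_pending_pods{job="ack-scheduler"}'))
print(pending_pods_by_queue(sample_body))
```

The sketch parses a canned response so that it runs without network access. In practice, the URL returned by `build_query_url` would be fetched with an HTTP client and the response body passed to `pending_pods_by_queue`.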

Note

The following resource utilization metrics are deprecated. Remove any alerts and monitoring data that depend on these metrics at the earliest opportunity:

cpu_utilization_ratio: CPU utilization.
memory_utilization_ratio: Memory utilization.

Usage notes for dashboards

Dashboards are generated based on metrics and Prometheus Query Language (PromQL). The following sections describe the observability and features of the dashboards of kube-scheduler.

If the metrics of kube-scheduler become abnormal, check whether the anomalies match any of the cases described in the following sections. If an anomaly occurs that is not described in the following sections, submit a ticket.

Overview

Observability

Features

Metric	PromQL	Description
Scheduler Pending Pods	scheduler_pending_pods{job="ack-scheduler"}	The number of pending pods. Pending pods consist of the following types: unschedulable: unschedulable pods. backoff: backoff queue pods, which are the pods that fail to be scheduled due to specific reasons. active: active queue pods, which are the pods ready to be scheduled.
Number of Scheduler attempts to successfully schedule a pod	histogram_quantile($quantile, sum(rate(scheduler_pod_scheduling_attempts_bucket{job="ack-scheduler"}[$interval])) by (pod, le))	The number of times that kube-scheduler attempts to schedule pods. The bucket thresholds are defined as the set `{1, 2, 4, 8, 16}`.
Scheduler cache Data Statistics	scheduler_scheduler_cache_size{job="ack-scheduler",type="nodes"} scheduler_scheduler_cache_size{job="ack-scheduler",type="pods"} scheduler_scheduler_cache_size{job="ack-scheduler",type="assumed_pods"}	The numbers of nodes, pods, and assumed pods in the scheduler cache.
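The scheduling-attempts panel relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by linear interpolation inside the bucket that contains the target rank. The following simplified sketch mirrors that idea for the `{1, 2, 4, 8, 16}` thresholds. It is not the Prometheus implementation, and the bucket counts are made-up illustration data.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs sorted by
    upper_bound, where the last upper_bound is math.inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in the range (0, 1]")
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if index == len(buckets) - 1:
        # The rank falls into the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    upper, count_at_upper = buckets[index]
    lower, count_below = buckets[index - 1] if index > 0 else (0.0, 0.0)
    # Linear interpolation between the lower and upper bound of the bucket.
    return lower + (upper - lower) * (rank - count_below) / (count_at_upper - count_below)


# Illustrative cumulative counts of scheduling attempts per pod.
attempt_buckets = [(1, 60), (2, 80), (4, 90), (8, 96), (16, 99), (math.inf, 100)]

print(histogram_quantile(0.5, attempt_buckets))
print(histogram_quantile(0.9, attempt_buckets))
```

With this sample data, the median lands inside the first bucket (one attempt or fewer), and the 0.9 quantile lands at about 4 attempts. Because the result is interpolated, it is an estimate whose precision depends on the bucket thresholds.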

Resources

Observability

Features

Metric	PromQL	Description
Memory Usage	memory_utilization_byte{container="kube-scheduler"}	The memory usage. Unit: bytes.
CPU Usage	cpu_utilization_core{container="kube-scheduler"}*1000	The used CPU capacity. Unit: millicore.

Kube API

Observability

Features

Metric	PromQL	Description
Kube API Request QPS	sum(rate(rest_client_requests_total{job="ack-scheduler",code=~"2.."}[$interval])) by (method,code) sum(rate(rest_client_requests_total{job="ack-scheduler",code=~"3.."}[$interval])) by (method,code) sum(rate(rest_client_requests_total{job="ack-scheduler",code=~"4.."}[$interval])) by (method,code) sum(rate(rest_client_requests_total{job="ack-scheduler",code=~"5.."}[$interval])) by (method,code)	The number of HTTP requests sent from kube-scheduler to kube-apiserver per second. The queries per second (QPS) value is calculated based on status codes, methods, and hosts.
Kube API Request Latency	histogram_quantile($quantile, sum(rate(rest_client_request_duration_seconds_bucket{job="ack-scheduler"}[$interval])) by (verb,url,le))	The time between when kube-scheduler sends a request and when kube-apiserver returns the response. The latency is broken down by verb and URL.
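The QPS panel applies `rate` to a counter. As a rough sketch of what that means, the following function computes the average per-second increase of a counter over a window and treats any drop in value as a counter reset. It is a simplification: the actual `rate` function in Prometheus also extrapolates to the window boundaries, which is omitted here. The sample values are made up.

```python
def simple_rate(samples):
    """Average per-second increase of a counter.

    samples: list of (timestamp_seconds, counter_value) pairs, oldest first.
    Returns None if the rate cannot be computed.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        # A drop means that the counter was reset (for example, after a
        # restart), so the new value is counted from zero.
        increase += current - previous if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Illustrative samples of rest_client_requests_total with one reset at t=45.
request_samples = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]

print(simple_rate(request_samples))
```

In this sample, the increases are 30, 30, 10, and 30 requests over 60 seconds, so the sketch reports about 1.67 requests per second even though the raw counter value ends lower than it started.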

Common metric anomalies

Scheduler Pods

Normal case	Anomaly	Anomaly description	Suggestion
The number of scheduler pods is equal to or greater than 1.	The number of scheduler pods is 0.	No scheduler pods exist in the cluster.	Check whether the Deployment or StatefulSet that corresponds to kube-scheduler exists. Check whether the scheduler pods were manually terminated by other users.

Scheduler Pending Pods

Normal case	Anomaly	Anomaly description	Suggestion
Pending pods are scheduled steadily, and the number of unschedulable pods does not keep increasing.	The number of unschedulable pods continuously increases and does not decrease after other pods are terminated.	The resource requests of the pods in the cluster are not properly configured or the cluster does not have sufficient nodes.	Check whether the cluster has sufficient nodes for pod scheduling. Check whether pod affinity or other scheduling attributes are improperly configured.
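One possible way to turn "the number of unschedulable pods continuously increases" into an automated check is sketched below. The function name, the window size of five samples, and the sample values are assumptions made for illustration, not thresholds recommended by the product.

```python
def is_continuously_increasing(values, min_points=5):
    """Return True if the last min_points samples never decrease and end higher than they start."""
    if len(values) < min_points:
        return False  # not enough data to judge a trend
    window = values[-min_points:]
    non_decreasing = all(later >= earlier for earlier, later in zip(window, window[1:]))
    return non_decreasing and window[-1] > window[0]


# Illustrative samples of the unschedulable pending-pod count over time.
growing = [2, 3, 3, 5, 8, 9]
recovering = [2, 3, 1, 5, 8]

print(is_continuously_increasing(growing))
print(is_continuously_increasing(recovering))
```

A flat series is deliberately not flagged, because a stable number of unschedulable pods is a different condition from a number that keeps growing.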

Number of Scheduler attempts to successfully schedule a pod

Normal case	Anomaly	Anomaly description	Suggestion
A pod can be scheduled to a node after several attempts.	A pod fails to be scheduled after several attempts.	The resource requests of the pods in the cluster are not properly configured or the cluster does not have sufficient nodes.	Check whether the cluster has sufficient nodes for pod scheduling. Check whether improper pod affinity and attributes are configured.

References

For more information about the metrics, dashboards, and troubleshooting suggestions for other control plane components, see the following topics: Metrics of kube-apiserver, Metrics of etcd, Metrics of controller-manager, and Metrics of cloud-controller-manager.