Container Service for Kubernetes: Use metrics and dashboards of kube-scheduler

Last Updated: Sep 02, 2024

kube-scheduler is the default scheduler of Kubernetes clusters. It is responsible for scheduling pods to run on appropriate cluster nodes. This topic describes the metrics of kube-scheduler, explains how to use the kube-scheduler dashboards, and provides suggestions on how to troubleshoot common metric anomalies.

Usage notes

Dashboard access

For more information, see View control plane component dashboards in ACK Pro clusters.

Metrics

Metrics can indicate the status and parameter settings of a component. The following table describes the metrics supported by kube-scheduler.

| Metric | Type | Description |
| --- | --- | --- |
| scheduler_scheduler_cache_size | Gauge | The numbers of nodes, pods, and assumed pods in the scheduler cache. |
| scheduler_pending_pods | Gauge | The number of pending pods, by type: unschedulable (pods that cannot be scheduled), backoff (pods in the backoff queue, which failed to be scheduled and are waiting to be retried), and active (pods in the active queue, which are ready to be scheduled). |
| scheduler_pod_scheduling_attempts_bucket | Histogram | The number of attempts that the scheduler makes to schedule a pod. The bucket thresholds are {1, 2, 4, 8, 16}. |
| memory_utilization_byte | Gauge | The memory usage. Unit: bytes. |
| cpu_utilization_core | Gauge | The used CPU capacity. Unit: cores. |
| rest_client_requests_total | Counter | The number of HTTP requests, broken down by status code, method, and host. |
| rest_client_request_duration_seconds_bucket | Histogram | The HTTP response latency, broken down by verb and URL. |

Note

The following resource utilization metrics are deprecated. Remove any alerts and monitoring data that depend on these metrics at the earliest opportunity:

  • cpu_utilization_ratio: CPU utilization.

  • memory_utilization_ratio: Memory utilization.

Usage notes for dashboards

Dashboards are generated based on metrics and Prometheus Query Language (PromQL). The following sections describe the observability and features of the dashboards of kube-scheduler.

If the metrics of kube-scheduler become abnormal, check whether the metric anomalies described in the following sections exist. If metric anomalies that are not described in the following sections occur, submit a ticket.

Overview

Observability

[Figure: Overview dashboard]

Features

| Metric | PromQL | Description |
| --- | --- | --- |
| Scheduler Pending Pods | scheduler_pending_pods{job="ack-scheduler"} | The number of pending pods, by type: unschedulable (pods that cannot be scheduled), backoff (pods in the backoff queue, which failed to be scheduled and are waiting to be retried), and active (pods in the active queue, which are ready to be scheduled). |
| Number of Scheduler attempts to successfully schedule a pod | histogram_quantile($quantile, sum(rate(scheduler_pod_scheduling_attempts_bucket{job="ack-scheduler"}[$interval])) by (pod, le)) | The number of attempts that kube-scheduler makes to schedule a pod. The bucket thresholds are {1, 2, 4, 8, 16}. |
| Scheduler cache Data Statistics | scheduler_scheduler_cache_size{job="ack-scheduler",type="nodes"} | The number of nodes in the scheduler cache. |
| Scheduler cache Data Statistics | scheduler_scheduler_cache_size{job="ack-scheduler",type="pods"} | The number of pods in the scheduler cache. |
| Scheduler cache Data Statistics | scheduler_scheduler_cache_size{job="ack-scheduler",type="assumed_pods"} | The number of assumed pods in the scheduler cache. |
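To make the quantile panel easier to read, the following is a minimal Python sketch of the linear interpolation that PromQL's histogram_quantile() applies to cumulative buckets. It uses the bucket thresholds {1, 2, 4, 8, 16} from this topic. The bucket counts are made-up sample data, and the function is a simplified illustration rather than the dashboard's implementation. The dashboard applies the same idea to per-second rates instead of raw counts.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Interpolate linearly between the bucket's lower and upper bound.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return math.nan


# Made-up cumulative counts: 80 pods needed 1 attempt, 95 needed at most 2, and so on.
attempt_buckets = [(1, 80), (2, 95), (4, 99), (8, 100), (16, 100), (math.inf, 100)]

p50 = histogram_quantile(0.5, attempt_buckets)  # falls in the first bucket
p90 = histogram_quantile(0.9, attempt_buckets)  # falls between 1 and 2 attempts
```

With this sample data, the 0.9 quantile lands between 1 and 2 attempts (about 1.67). A quantile that keeps climbing toward the higher thresholds suggests that pods need more and more retries before they are placed.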

Resources

Observability

[Figure: Resources dashboard]

Features

| Metric | PromQL | Description |
| --- | --- | --- |
| Memory Usage | memory_utilization_byte{container="kube-scheduler"} | The memory usage. Unit: bytes. |
| CPU Usage | cpu_utilization_core{container="kube-scheduler"}*1000 | The used CPU capacity. Unit: millicores. |
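The panel expressions can also be evaluated outside the dashboard. The following sketch builds a request URL for the standard Prometheus instant-query endpoint (/api/v1/query) using the CPU Usage expression. It assumes that your Prometheus instance exposes that HTTP API, and the base URL is a hypothetical placeholder. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical Prometheus endpoint; replace it with the address of your own instance.
PROMETHEUS_URL = "http://prometheus.example:9090"

# CPU Usage panel expression: cores multiplied by 1000 gives millicores.
CPU_MILLICORES = 'cpu_utilization_core{container="kube-scheduler"}*1000'


def instant_query_url(base_url, promql):
    """Build a URL for the Prometheus instant-query endpoint (/api/v1/query)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


url = instant_query_url(PROMETHEUS_URL, CPU_MILLICORES)

# Decoding the query string returns the original expression unchanged,
# which confirms that the braces, quotes, and asterisk were escaped correctly.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

URL-encoding matters here because the expression contains braces, quotation marks, and an asterisk, all of which are unsafe to place in a query string as-is.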

Kube API

Observability

[Figure: Kube API dashboard]

Features

| Metric | PromQL | Description |
| --- | --- | --- |
| Kube API Request QPS | sum(rate(rest_client_requests_total{job="ack-scheduler",code=~"2.."}[$interval])) by (method,code) | The number of HTTP requests per second sent from kube-scheduler to kube-apiserver that returned a 2xx status code, grouped by method and status code. |
| Kube API Request QPS | sum(rate(rest_client_requests_total{job="ack-scheduler",code=~"3.."}[$interval])) by (method,code) | The same queries per second (QPS) value for requests that returned a 3xx status code. |
| Kube API Request QPS | sum(rate(rest_client_requests_total{job="ack-scheduler",code=~"4.."}[$interval])) by (method,code) | The same QPS value for requests that returned a 4xx status code. |
| Kube API Request QPS | sum(rate(rest_client_requests_total{job="ack-scheduler",code=~"5.."}[$interval])) by (method,code) | The same QPS value for requests that returned a 5xx status code. |
| Kube API Request Latency | histogram_quantile($quantile, sum(rate(rest_client_request_duration_seconds_bucket{job="ack-scheduler"}[$interval])) by (verb,url,le)) | The time between a request sent by kube-scheduler and the response returned by kube-apiserver, grouped by verb and URL. |
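The QPS expressions combine rate() with sum(...) by (method, code). The following sketch approximates that calculation from two snapshots of the rest_client_requests_total counter. The label values, counter values, and the 60-second window are made-up sample data. The function simplifies what Prometheus does, because the real rate() also extrapolates to the window boundaries.

```python
from collections import defaultdict


def qps_by_method_and_code(earlier, later, window_seconds):
    """Approximate sum(rate(...)) by (method, code) from two counter snapshots.

    `earlier` and `later` map (method, code, host) label tuples to the
    cumulative counter value observed at the start and end of the window.
    """
    qps = defaultdict(float)
    for labels, value in later.items():
        method, code, _host = labels  # the host label is summed away
        increase = value - earlier.get(labels, 0.0)
        if increase < 0:
            # A counter that went down was reset (for example, by a restart),
            # so count from zero.
            increase = value
        qps[(method, code)] += increase / window_seconds
    return dict(qps)


# Hypothetical snapshots taken 60 seconds apart.
earlier = {
    ("GET", "200", "apiserver:6443"): 1000,
    ("POST", "201", "apiserver:6443"): 200,
    ("GET", "404", "apiserver:6443"): 5,
}
later = {
    ("GET", "200", "apiserver:6443"): 1600,
    ("POST", "201", "apiserver:6443"): 260,
    ("GET", "404", "apiserver:6443"): 5,
}

qps = qps_by_method_and_code(earlier, later, window_seconds=60)
```

In this sample, GET requests with status code 200 run at 10 requests per second and POST requests with status code 201 at 1 request per second. A rising share of 4xx or 5xx responses in the real panel is usually the more interesting signal.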

Common metric anomalies

If the metrics of kube-scheduler become abnormal, check whether the metric anomalies described in the following sections exist. If metric anomalies that are not described in the following sections occur, submit a ticket.

Scheduler Pods

| Normal case | Anomaly | Anomaly description | Suggestion |
| --- | --- | --- | --- |
| The number of scheduler pods is equal to or greater than 1. | The number of scheduler pods is 0. | No scheduler pods exist in the cluster. | Check whether the Deployment or StatefulSet corresponding to kube-scheduler exists. Check whether the scheduler pods were manually terminated by other users. |

Scheduler Pending Pods

| Normal case | Anomaly | Anomaly description | Suggestion |
| --- | --- | --- | --- |
| Pod scheduling is consistently slow. | The number of unschedulable pods continuously increases, or the number of unschedulable pods does not decrease after other pods are terminated. | The resource requests of the pods in the cluster are not properly configured, or the cluster does not have sufficient nodes. | Check whether the cluster has sufficient nodes for pod scheduling. Check whether pod affinity rules or other scheduling attributes are misconfigured. |
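The "continuously increases" condition can be checked mechanically on a series of samples. The following sketch does this for values of scheduler_pending_pods for unschedulable pods. The helper name, the three-sample minimum, and the sample values are illustrative assumptions, not thresholds defined by this topic.

```python
def is_continuously_increasing(samples, min_points=3):
    """Return True if every sample is strictly greater than the one before it.

    `samples` is a list of gauge values ordered by time. Fewer than
    `min_points` samples is treated as too little evidence for a trend.
    """
    if len(samples) < min_points:
        return False
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))


# Hypothetical samples of the unschedulable pod count, one per scrape.
growing = [3, 5, 8, 12]
recovering = [3, 5, 5, 4]

growing_is_anomalous = is_continuously_increasing(growing)        # True
recovering_is_anomalous = is_continuously_increasing(recovering)  # False
```

A strict comparison keeps the check conservative: a series that plateaus or drops even once is not flagged. That matches the second anomaly in the table only loosely, so a count that stays flat after other pods are terminated still needs a separate look.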

Number of Scheduler attempts to successfully schedule a pod

| Normal case | Anomaly | Anomaly description | Suggestion |
| --- | --- | --- | --- |
| A pod can be scheduled to a node after several attempts. | A pod fails to be scheduled after several attempts. | The resource requests of the pods in the cluster are not properly configured, or the cluster does not have sufficient nodes. | Check whether the cluster has sufficient nodes for pod scheduling. Check whether pod affinity rules or other scheduling attributes are misconfigured. |

References

For more information about the metrics, dashboard usage notes, and troubleshooting suggestions for other control plane components, see the following topics: Metrics of kube-apiserver, Metrics of etcd, Metrics of controller-manager, and Metrics of cloud-controller-manager.