Container Service for Kubernetes: Descheduling overview

Last Updated: Jul 17, 2024

Descheduling is the process of evicting pods that match eviction rules from a node so that they can be rescheduled to other nodes. When hotspot nodes appear because resources are unevenly used across the cluster, or when pods no longer comply with the predefined scheduling policy because node attributes have changed, you can use descheduling to optimize resource usage and move pods to more suitable nodes. This helps ensure the high availability and efficiency of the workloads in your cluster.

Before you start

Before you use descheduling, we recommend that you read this topic to learn the following information:

  • The use scenarios of descheduling.

  • The workflow of descheduling.

  • Terms related to descheduling, including descheduling templates, the Deschedule and Balance descheduling policies, and the DefaultEvictor and MigrationController evictors.

Why descheduling is important

The Kubernetes scheduler places pods on suitable nodes based on the cluster status at scheduling time. However, the cluster status changes constantly, and in some scenarios you may need to migrate running pods to other nodes. Examples:

  • The workloads in the cluster are not evenly distributed. Some nodes are overloaded. Consequently, the performance of these nodes is compromised.

  • The overall resource utilization of the cluster is low. You want to remove some nodes to save costs. To do this, you must first migrate pods on these nodes to other nodes.

  • The cluster contains a large number of resource fragments. A single node may lack sufficient idle resources even though the total idle resources in the cluster are sufficient. In this scenario, pods that request large amounts of resources fail to be scheduled, which can further increase the cost of cluster resources.

  • Taints or labels are added to or removed from a node. As a result, some pods on the node no longer satisfy their affinity rules or tolerations. These pods must be migrated to nodes that satisfy the rules of the pods.
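The resource fragmentation scenario above can be illustrated with a small worked example. This is a minimal Python sketch with made-up numbers: the node names, the free CPU values, and the helper `fits_on_some_node` are hypothetical and are not part of ack-koordinator.

```python
# Hypothetical free CPU cores per node; the values are illustrative only.
free_cpu = {"node-a": 2.0, "node-b": 2.0, "node-c": 2.0}
pod_request = 4.0  # CPU cores requested by a pending pod


def fits_on_some_node(free: dict, request: float) -> bool:
    """Return True if at least one node alone can satisfy the request."""
    return any(available >= request for available in free.values())


total_free = sum(free_cpu.values())
cluster_has_capacity = total_free >= pod_request      # True: 6.0 >= 4.0
schedulable = fits_on_some_node(free_cpu, pod_request)  # False: no node has 4 free cores

# Migrating pods that use 2 cores from node-a to node-b consolidates the free space.
after = dict(free_cpu)
after["node-a"] += 2.0
after["node-b"] -= 2.0
schedulable_after = fits_on_some_node(after, pod_request)  # True: node-a now has 4 free cores

print(total_free, schedulable, schedulable_after)
```

The total idle capacity is unchanged by the migration. Only its distribution changes, which is why evicting and rescheduling pods can make room for the pending pod without adding nodes.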

For example, the workloads of an application fluctuate over time. Hotspot nodes may appear during peak hours, which compromises the service quality of the application. To resolve this issue, the descheduling feature can evict pods from the hotspot nodes to reduce their loads and ensure service quality.

To meet this goal, ack-koordinator provides a descheduling module named Koordinator Descheduler, which runs as a Deployment on nodes. Koordinator Descheduler is developed based on the Descheduling Framework of Kubernetes Descheduler and is compatible with all descheduling policies of Kubernetes Descheduler. Koordinator Descheduler also optimizes pod descheduling in terms of descheduling policies, pod eviction methods, eviction traffic control, and eviction processes. For more information about the differences between Koordinator Descheduler and Kubernetes Descheduler, see Koordinator Descheduler and Kubernetes Descheduler.

Descheduling workflow

Koordinator Descheduler runs periodically. You can configure multiple descheduling policies to filter, check, and evict pods that meet the eviction conditions. The following figure shows how Koordinator Descheduler works.

[Figure: Koordinator Descheduler workflow]

  1. Traverse descheduling templates and execute the Deschedule policy enabled in each template in sequence.

    1. Obtain information about nodes, workloads, and pods.

    2. Match pods against the Deschedule rules.

    3. Filter, check, and sort pods that meet the eviction rules.

    4. The pod evictor enabled in the descheduling template sends a pod eviction request.

  2. Traverse the descheduling templates again and execute the Balance policy enabled in each template in sequence. The workflow is similar to Step 1.
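The two-pass order of the steps above can be sketched as follows. This is a minimal Python sketch, not the Koordinator Descheduler implementation: `Template`, `run_cycle`, `make_policy`, and the pod dictionaries are hypothetical names introduced for illustration, and sorting by pod name stands in for the real filter, check, and sort logic.

```python
from dataclasses import dataclass, field
from typing import Callable

Pod = dict  # hypothetical pod record, for example {"name": "web-1"}
Policy = Callable[[list], list]  # takes all pods, returns the pods to evict


@dataclass
class Template:
    """Hypothetical descheduling template: policies plus one evictor."""
    name: str
    evict: Callable[[Pod], None]
    deschedule: list = field(default_factory=list)
    balance: list = field(default_factory=list)


def run_cycle(templates: list, list_pods: Callable[[], list]) -> None:
    """Run one cycle: all Deschedule policies first, then all Balance policies."""
    for kind in ("deschedule", "balance"):
        for template in templates:
            for policy in getattr(template, kind):
                pods = list_pods()          # 1. obtain the current pods
                matched = policy(pods)      # 2. match pods against the policy rules
                # 3. Sort the matches (by name here, an arbitrary choice for the sketch).
                for pod in sorted(matched, key=lambda p: p["name"]):
                    template.evict(pod)     # 4. hand each pod to the template's evictor


log = []  # records the order in which policies run


def make_policy(label: str, target: str = "") -> Policy:
    """Build a policy that logs its execution and matches pods named `target`."""
    def policy(pods: list) -> list:
        log.append(label)
        return [p for p in pods if p["name"] == target]
    return policy


cluster = [{"name": "web-1"}, {"name": "web-2"}]
templates = [
    Template("t1", cluster.remove, [make_policy("t1/deschedule", "web-2")], [make_policy("t1/balance")]),
    Template("t2", cluster.remove, [make_policy("t2/deschedule")], [make_policy("t2/balance", "web-1")]),
]
run_cycle(templates, list_pods=lambda: list(cluster))
print(log)  # ['t1/deschedule', 't2/deschedule', 't1/balance', 't2/balance']
```

The log shows that the Deschedule policy of every template runs before any Balance policy, regardless of which template a policy belongs to.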

The following sections describe the terms used in the descheduling workflow.

Descheduling template

You can configure one or more descheduling templates to meet different descheduling requirements. You can specify the descheduling policy and pod evictor to be used in each template, set the applicable scope, and configure advanced parameters in each template for a specific type of pod.

Using multiple descheduling templates allows you to apply different pod eviction policies in different scenarios. For example, to evict expired pods with different TTLs in different namespaces, you can configure multiple descheduling templates and set a threshold for the TTL of pods in each namespace.

During the descheduling process, Koordinator Descheduler traverses all descheduling templates and executes the Deschedule policy in each template. After all Deschedule policies are executed, Koordinator Descheduler traverses the templates again and executes the Balance policy in each template.
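The per-namespace TTL example in this section can be sketched as follows. This is a minimal Python sketch under stated assumptions: `TtlTemplate`, `PodInfo`, `expired_pods`, the namespaces, and the TTL values are hypothetical and do not reflect the actual template schema of Koordinator Descheduler.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TtlTemplate:
    """Hypothetical template: one TTL threshold scoped to one namespace."""
    namespace: str
    max_age: timedelta


@dataclass(frozen=True)
class PodInfo:
    name: str
    namespace: str
    created: datetime


def expired_pods(templates: list, pods: list, now: datetime) -> list:
    """Return names of pods older than the TTL of the template for their namespace."""
    result = []
    for template in templates:  # templates are traversed in order
        for pod in pods:
            in_scope = pod.namespace == template.namespace
            if in_scope and now - pod.created > template.max_age:
                result.append(pod.name)
    return result


now = datetime(2024, 7, 17, 12, 0, tzinfo=timezone.utc)
templates = [
    TtlTemplate("batch", timedelta(hours=1)),     # short-lived jobs
    TtlTemplate("staging", timedelta(hours=24)),  # longer-lived test services
]
pods = [
    PodInfo("job-1", "batch", now - timedelta(hours=2)),
    PodInfo("job-2", "batch", now - timedelta(minutes=30)),
    PodInfo("api-1", "staging", now - timedelta(hours=2)),
    PodInfo("api-2", "staging", now - timedelta(hours=48)),
]
print(expired_pods(templates, pods, now))  # ['job-1', 'api-2']
```

A pod that is two hours old is expired in the `batch` namespace but not in `staging`, which is the kind of per-scope difference that separate templates are meant to express.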

Descheduling policies: Deschedule and Balance

Two descheduling policies are available: Deschedule and Balance. Descheduling policies are used to identify the pods to be evicted. For more information about how to configure descheduling policies and the features supported by each policy, see Configure policy plug-ins.

  • Deschedule: Koordinator Descheduler checks each pod against the current scheduling constraints and evicts, in sequence, each pod that no longer satisfies them. For example, Koordinator Descheduler evicts pods that do not match the node affinity or anti-affinity rules one by one.

  • Balance: Koordinator Descheduler optimizes the distribution of certain pods or all pods in the cluster, and then decides the pods to be evicted. For example, Koordinator Descheduler evicts pods from hotspot nodes based on the resource utilization of nodes.

Descheduling policies are independent of each other. Koordinator Descheduler executes each descheduling policy to match pods against the eviction rules of the policy, and then filters, checks, and sorts the matching pods. Constraints such as cluster capacity, resource utilization, and the number of replicas are taken into consideration when pods are matched against the eviction rules. After Koordinator Descheduler decides which pods to evict, it sends a request to the pod evictor to evict them.
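The difference between the two policy types can be sketched as follows. This is a minimal Python sketch, not the logic of any real plug-in: the functions `deschedule_affinity_violations` and `balance_hotspots`, the pod fields, the 75% threshold, and the choice to pick the largest CPU consumers first are all assumptions made for illustration.

```python
def deschedule_affinity_violations(pods: list, node_labels: dict) -> list:
    """Deschedule-style check: evaluate each pod on its own against its node."""
    violating = []
    for pod in pods:
        labels = node_labels[pod["node"]]
        required = pod["required_labels"]
        if any(labels.get(key) != value for key, value in required.items()):
            violating.append(pod["name"])
    return violating


def balance_hotspots(pods: list, capacity: dict, threshold: float) -> list:
    """Balance-style check: look at node utilization, then pick pods from hot nodes."""
    selected = []
    for node, cores in capacity.items():
        on_node = [p for p in pods if p["node"] == node]
        used = sum(p["cpu"] for p in on_node)
        # Assumed ordering: take the largest consumers first until the node cools down.
        for pod in sorted(on_node, key=lambda p: p["cpu"], reverse=True):
            if used / cores <= threshold:
                break
            selected.append(pod["name"])
            used -= pod["cpu"]
    return selected


node_labels = {"node-a": {"zone": "z1"}, "node-b": {"zone": "z2"}}
capacity = {"node-a": 8.0, "node-b": 8.0}  # CPU cores per node
pods = [
    {"name": "web-1", "node": "node-a", "cpu": 4.0, "required_labels": {"zone": "z1"}},
    {"name": "web-2", "node": "node-a", "cpu": 3.0, "required_labels": {}},
    {"name": "db-1", "node": "node-b", "cpu": 2.0, "required_labels": {"zone": "z1"}},
]

print(deschedule_affinity_violations(pods, node_labels))  # ['db-1']
print(balance_hotspots(pods, capacity, threshold=0.75))   # ['web-1']
```

`db-1` is selected because it individually violates a constraint, even though its node is lightly loaded. `web-1` violates nothing and is selected only because its node is above the utilization threshold.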

Evictors and eviction control

Evictors in Koordinator Descheduler are used to evict pods. The following evictors are available: DefaultEvictor and MigrationController.

  • DefaultEvictor: a basic pod evictor, which allows you to limit the maximum number of pods to be evicted. You can use and configure the DefaultEvictor in the same way you use and configure the DefaultEvictor in Kubernetes Descheduler. For more information, see DefaultEvictor.

  • MigrationController: allows you to evict and migrate pods in a secure way and trace pod eviction. For more information about the parameters, see Configure evictor plug-ins.

For more information about the differences between the two evictors, see Comparison.
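The idea of limiting the maximum number of pods to be evicted can be sketched as follows. This is a minimal Python sketch of a capped evictor, not the DefaultEvictor or MigrationController implementation: the class `CappedEvictor` and its `max_evictions` parameter are hypothetical.

```python
class CappedEvictor:
    """Hypothetical evictor that refuses requests once a cap is reached."""

    def __init__(self, max_evictions: int) -> None:
        self.max_evictions = max_evictions
        self.evicted = []  # names of pods accepted for eviction, kept for tracing

    def evict(self, pod_name: str) -> bool:
        """Accept the eviction request if the cap allows it, otherwise reject it."""
        if len(self.evicted) >= self.max_evictions:
            return False
        self.evicted.append(pod_name)
        return True


evictor = CappedEvictor(max_evictions=2)
results = [evictor.evict(name) for name in ["web-1", "web-2", "web-3"]]
print(results)          # [True, True, False]
print(evictor.evicted)  # ['web-1', 'web-2']
```

A cap of this kind bounds how many pods of a workload can be disrupted at once, so a policy that matches many pods cannot evict all of them in a single pass.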

Enable descheduling

For more information about how to install the ack-koordinator component and enable descheduling, see Enable the descheduling feature.

You can also configure advanced parameters for Koordinator Descheduler, descheduling templates, descheduling policy plug-ins, and evictor plug-ins in a ConfigMap. For more information, see Configure advanced parameters.

References

  • If you use Kubernetes Descheduler, we recommend that you migrate to Koordinator Descheduler. For more information about the differences and how to migrate to Koordinator Descheduler, see Comparison between Koordinator Descheduler and Kubernetes Descheduler.

  • The descheduling feature is developed based on Koordinator Descheduler of the ack-koordinator component. You can install and use the ack-koordinator component free of charge. However, this component may incur fees in certain scenarios. For more information, see Billing rules.