You can use ack-descheduler to optimize the scheduling of pods that cannot be matched with suitable nodes. This avoids resource waste and improves resource utilization in Container Service for Kubernetes (ACK) clusters. This topic describes how to use ack-descheduler to optimize pod scheduling.
ack-descheduler is no longer maintained. We recommend that you migrate to the currently maintained component Koordinator Descheduler. For more information, see [Component Notice] ack-descheduler migration.
Prerequisites
An ACK cluster that runs Kubernetes 1.14 or later is created. For more information, see Create an ACK managed cluster.
A kubectl client is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Helm 3.0 or later is installed in the cluster. For more information, see Update Helm V2 to Helm V3.
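You can check the installed Helm version from the command line. For example:

helm version --short

The output must show a v3.x.x version.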
Install ack-descheduler
Log on to the ACK console.
In the left-side navigation pane of the ACK console, choose Marketplace > Marketplace.
On the Marketplace page, click the App Catalog tab. Find and click ack-descheduler.
On the ack-descheduler page, click Deploy.
In the Deploy wizard, select a cluster and a namespace, and then click Next.
On the Parameters wizard page, configure the parameters and click OK.
After ack-descheduler is installed, a CronJob is automatically created in the kube-system namespace. By default, this CronJob runs every 2 minutes. After ack-descheduler is installed, you are directed to the ack-descheduler-default page. If all the relevant resources are created, the component is installed.
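You can also verify the CronJob from the command line. This is a quick check that assumes the CronJob name contains "descheduler", which may vary with the chart version:

kubectl get cronjob -n kube-system | grep descheduler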
Use ack-descheduler to optimize pod scheduling
Run the following command to check the DeschedulerPolicy setting of the ack-descheduler-default ConfigMap:
kubectl describe cm ack-descheduler-default -n kube-system
Expected output (the exact content depends on the component version and your settings; the following is a representative excerpt with parameter values elided):
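Name:         ack-descheduler-default
Namespace:    kube-system
Labels:       ...
Annotations:  ...

Data
====
policy.yaml:
----
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemoveDuplicates":
    enabled: true
  "RemovePodsViolatingInterPodAntiAffinity":
    enabled: true
  "LowNodeUtilization":
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        ...
  "RemovePodsHavingTooManyRestarts":
    enabled: true
    params:
      ...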
The following table describes the scheduling policies returned in the preceding output. For more information about the policy settings in the strategies section, see Descheduler.

Policy: RemoveDuplicates
Description: This policy removes duplicate pods and ensures that at most one pod that belongs to the same ReplicaSet, ReplicationController, StatefulSet, or Job runs on a node.

Policy: RemovePodsViolatingInterPodAntiAffinity
Description: This policy deletes pods that violate inter-pod anti-affinity rules.

Policy: LowNodeUtilization
Description: This policy finds underutilized nodes and evicts pods from overutilized nodes so that the evicted pods can be recreated on the underutilized nodes. The parameters of this policy are configured in the nodeResourceUtilizationThresholds section (see the sketch after this table).

Policy: RemovePodsHavingTooManyRestarts
Description: This policy deletes pods that have been restarted more than a specified number of times.
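The following sketch shows how the nodeResourceUtilizationThresholds section is structured in the descheduler/v1alpha1 policy format. The percentage values are illustrative, not defaults that you should rely on:

"LowNodeUtilization":
  enabled: true
  params:
    nodeResourceUtilizationThresholds:
      thresholds:        # A node whose usage is below all of these percentages is considered underutilized.
        cpu: 20
        memory: 20
        pods: 20
      targetThresholds:  # A node whose usage is above any of these percentages is a candidate for eviction.
        cpu: 50
        memory: 50
        pods: 50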
Verify pod scheduling before the scheduling policy is modified.
Create a Deployment to test the scheduling.
Create an nginx.yaml file and copy the following content to the file:
apiVersion: apps/v1 # Use apps/v1beta1 for Kubernetes versions earlier than 1.8.0.
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9 # Replace with the image that you want to use. The value must be in the <image_name:tags> format.
        ports:
        - containerPort: 80
Run the following command to create a Deployment with the nginx.yaml file:
kubectl apply -f nginx.yaml
Expected output:
deployment.apps/nginx-deployment-basic created
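Optionally, run the following command to wait until the rollout completes before you check pod placement:

kubectl rollout status deployment/nginx-deployment-basic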
Wait 2 minutes and run the following command to check the nodes to which the pods are scheduled:
kubectl get pod -o wide | grep nginx
Expected output:
NAME                         READY   STATUS    RESTARTS   AGE   IP               NODE                         NOMINATED NODE   READINESS GATES
nginx-deployment-basic-**1   1/1     Running   0          36s   172.25.XXX.XX1   cn-hangzhou.172.16.XXX.XX2   <none>           <none>
nginx-deployment-basic-**2   1/1     Running   0          11s   172.25.XXX.XX2   cn-hangzhou.172.16.XXX.XX3   <none>           <none>
nginx-deployment-basic-**3   1/1     Running   0          36s   172.25.XXX.XX3   cn-hangzhou.172.16.XXX.XX3   <none>           <none>
The output shows that pod nginx-deployment-basic-**2 and pod nginx-deployment-basic-**3 are scheduled to the same node cn-hangzhou.172.16.XXX.XX3.
Note: If you use the default settings of the ack-descheduler-default ConfigMap, the scheduling result varies based on the actual conditions of the cluster.
Modify the scheduling policy.
If you use multiple scheduling policies at the same time, unexpected scheduling results may occur. To prevent this issue, modify the ConfigMap described in Step 1 to retain only the RemoveDuplicates policy.
Note: The RemoveDuplicates policy ensures that pods managed by replication controllers are evenly distributed across different nodes.
In this example, the modified ConfigMap is saved to a file named newPolicy.yaml. The file contains the following content:
apiVersion: v1
kind: ConfigMap
metadata:
  name: descheduler
  namespace: kube-system
  labels:
    app.kubernetes.io/instance: descheduler
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: descheduler
    app.kubernetes.io/version: 0.20.0
    helm.sh/chart: descheduler-0.20.0
  annotations:
    meta.helm.sh/release-name: descheduler
    meta.helm.sh/release-namespace: kube-system
data:
  policy.yaml: |-
    apiVersion: "descheduler/v1alpha1"
    kind: "DeschedulerPolicy"
    strategies:
      "RemoveDuplicates": # Retain only the RemoveDuplicates policy.
        enabled: true
Verify pod scheduling after the scheduling policy is modified.
Run the following command to apply the new scheduling policy:
kubectl apply -f newPolicy.yaml
Expected output:
configmap/descheduler created
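To confirm the content of the applied policy, you can print the policy.yaml key of the ConfigMap:

kubectl get cm descheduler -n kube-system -o jsonpath='{.data.policy\.yaml}'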
Wait 2 minutes and run the following command to check the nodes to which the pods are scheduled:
kubectl get pod -o wide | grep nginx
Expected output:
NAME                         READY   STATUS    RESTARTS   AGE     IP               NODE                         NOMINATED NODE   READINESS GATES
nginx-deployment-basic-**1   1/1     Running   0          8m26s   172.25.XXX.XX1   cn-hangzhou.172.16.XXX.XX2   <none>           <none>
nginx-deployment-basic-**2   1/1     Running   0          8m1s    172.25.XXX.XX2   cn-hangzhou.172.16.XXX.XX1   <none>           <none>
nginx-deployment-basic-**3   1/1     Running   0          8m26s   172.25.XXX.XX3   cn-hangzhou.172.16.XXX.XX3   <none>           <none>
The output shows that pod nginx-deployment-basic-**2 is rescheduled to cn-hangzhou.172.16.XXX.XX1 by ack-descheduler. Each of the three test pods now runs on a different node. This balances pod scheduling among multiple nodes.
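If you want to see which pods ack-descheduler decided to evict, you can inspect the logs of the most recent Job created by the CronJob. The Job name below is a placeholder; list the Jobs first and substitute the latest name:

kubectl get jobs -n kube-system | grep descheduler
kubectl logs -n kube-system -l job-name=<latest-descheduler-job-name>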