By Alwyn Botha, Alibaba Cloud Community Blog author.
This tutorial demonstrates how to define Pods for CPU-intensive workloads that are sensitive to context switches and share the following characteristics.
From https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/
CPU manager might help Kubernetes Pod workloads with the following characteristics:
Sensitive to CPU throttling effects.
Sensitive to context switches.
Sensitive to processor cache misses.
Benefits from sharing processor resources (e.g., data and instruction caches).
Sensitive to cross-socket memory traffic.
Sensitive to or requires hyperthreads from the same physical CPU core.
When your Kubernetes node is under light CPU load, there is no problem: there are enough CPU cores for all Pods to work as if each were the only Pod using the node's CPUs.
When many CPU-intensive Pods run, they compete for CPU cores and must share CPU time. As CPU time becomes available, a workload may be resumed on a different CPU core.
A significant part of CPU time is then spent switching between these workloads. Such a switch is called a context switch.
From https://en.wikipedia.org/wiki/Context_switch:
A context switch is the process of storing the state of a process or of a thread, so that it can be restored and execution resumed from the same point later. This allows multiple processes to share a single CPU.
Context switches are usually computationally intensive.
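On Linux you can see how often a process gets context switched by reading its /proc status file. A minimal illustration, run against the current shell (the counter values shown here are made-up examples):

grep ctxt /proc/$$/status
voluntary_ctxt_switches:        150
nonvoluntary_ctxt_switches:     43

nonvoluntary_ctxt_switches is the interesting counter for CPU-bound workloads: it counts how many times the kernel forcibly preempted the process to run something else.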
Kubernetes allows you to configure its CPU Manager policy so that such workloads can run more efficiently.
The kubelet CPU Manager policy is set with --cpu-manager-policy
Two policies are supported:
none: the default policy; all Pods share the node's CPU cores.
static: certain Pods are granted near-exclusive use of CPU cores. It is only near exclusivity because Kubernetes system processes may still use part of the CPU time, but all other Pods are prevented from using the allocated CPU cores.
Only Guaranteed Pods (Pods with matching integer CPU requests and limits) are granted exclusive access to the CPUs they specify.
From https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/
This static assignment increases CPU affinity and decreases context switches due to throttling for the CPU-bound workload.
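If you want to confirm which policy a node's kubelet is actually running, one option (assuming the kubelet's configz endpoint is reachable through your API server; replace minikube with your node name) is:

kubectl get --raw "/api/v1/nodes/minikube/proxy/configz" | grep -o '"cpuManagerPolicy":"[^"]*"'
"cpuManagerPolicy":"static"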
Initially, all CPUs are available to all Pods. Guaranteed Pods remove CPUs from this shared availability pool.
When using the static policy, --kube-reserved or --system-reserved must be specified.
These settings reserve CPU resources for Kubernetes system daemons.
--kube-reserved CPU resources are allocated first: Kubernetes itself must always be able to run.
Guaranteed Pods then remove their requested integer CPU quantities from the shared CPU pool.
BestEffort and Burstable Pods use the remaining CPU pool. Their workloads may context switch from time to time, but this does not seriously affect their performance. (If it seriously affected their performance, they would have been defined as Guaranteed.)
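As a worked example: on a 4-core node started with --kube-reserved=cpu=500m (as we do below), the static policy rounds the 500m up to one whole core. That core (CPU 0 in the startup logs later in this tutorial) stays in the shared pool but is never handed out for exclusive assignment, so the Kubernetes daemons always have somewhere to run.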
This tutorial uses minikube. To use the static CPU Management Policy, we need to set these kubelet settings:
minikube start --extra-config=kubelet.cpu-manager-policy="static" --extra-config=kubelet.kube-reserved="cpu=500m" --extra-config=kubelet.feature-gates="CPUManager=true"
Minikube will start up as normal. Only when you make a syntax error will it hang.
The message Everything looks great. Please enjoy minikube! means the settings were applied successfully.
Unfortunately this is not mentioned in the documentation.
You cannot simply switch your Kubernetes cluster from CPU policy "none" to CPU policy "static" using just those flags on minikube start.
You will get this error at the bottom of journalctl -u kubelet:
3110 server.go:262] failed to run Kubelet: could not initialize checkpoint manager: could not restore state from checkpoint: configured policy "static" differs from state checkpoint policy "none"
Feb 07 07:08:43 minikube kubelet[3110]: Please drain this node and delete the CPU manager checkpoint file
"/var/lib/kubelet/cpu_manager_state" before restarting Kubelet.
To fix this problem, get a list of your nodes:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
minikube Ready master 42d v1.12.4
Then drain your node:
kubectl drain minikube
Make a backup of /var/lib/kubelet/cpu_manager_state:
mv /var/lib/kubelet/cpu_manager_state /var/lib/kubelet/cpu_manager_state-old
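If you are curious, you can inspect the renamed checkpoint file to see what the kubelet had recorded (run this inside the minikube VM via minikube ssh; the exact fields vary by Kubernetes version, and the checksum value is elided here):

sudo cat /var/lib/kubelet/cpu_manager_state-old
{"policyName":"none","defaultCpuSet":"","checksum":...}

The policyName recorded here is what conflicted with the new "static" setting.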
Allow the node to accept work again:
kubectl uncordon minikube
node/minikube uncordoned
If you now start minikube with the static CPU Management Policy flags, it should work.
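A quick way to verify that the static policy is now active is to search the logs for cpumanager lines:

minikube logs | grep cpumanager

You should see startup lines similar to the ones shown later in this tutorial, including a line about the CPU reserved for the 500m we set aside.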
We need to define a Guaranteed Pod:
nano myguaranteed.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myguaranteed-pod
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 5M --vm-hang 3000 -t 3600']
    resources:
      limits:
        memory: "10Mi"
        cpu: 1
      requests:
        memory: "10Mi"
        cpu: 1
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
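A quick note on the container's command: stress --vm 1 spawns one worker that allocates memory, --vm-bytes 5M sets the allocation to 5 MB, --vm-hang 3000 makes the worker sleep 3000 seconds before freeing it (so it holds the memory instead of churning on malloc/free), and -t 3600 stops the whole run after an hour.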
You can specify CPU requests and limits using whole integers or millicores.
One CPU core comprises 1000m = 1000 millicores.
A 4-core node therefore has a CPU capacity of 4000m:
4 cores * 1000m = 4000m
CPU Management Policies understand both formats. (Specifying 1000m for both CPU values would also work in our example above.)
kubectl create -f myguaranteed.yaml
pod/myguaranteed-pod created
Here is an actual extract from the minikube logs moments after the Pod was created:
Feb 07 07:41:39 minikube kubelet[2866]: I0207 07:41:39.833658 2866 policy_static.go:175] [cpumanager] static policy: AddContainer (pod: myguaranteed-pod, container: myram-container-1, container id: 07ff2dc06a922ffbddeb6bb3894492458764d7413cdc7d0552a26910da6ff13d)
Feb 07 07:41:39 minikube kubelet[2866]: I0207 07:41:39.833693 2866 policy_static.go:205] [cpumanager] allocateCpus: (numCPUs: 1)
Feb 07 07:41:39 minikube kubelet[2866]: I0207 07:41:39.833709 2866 state_mem.go:84] [cpumanager] updated default cpuset: "0,2-3"
Feb 07 07:41:39 minikube kubelet[2866]: I0207 07:41:39.834392 2866 policy_static.go:213] [cpumanager] allocateCPUs: returning "1"
Feb 07 07:41:39 minikube kubelet[2866]: I0207 07:41:39.834412 2866 state_mem.go:76] [cpumanager] updated desired cpuset (container id: 07ff2dc06a922ffbddeb6bb3894492458764d7413cdc7d0552a26910da6ff13d, cpuset: "1")
Logs edited for readability:
07:41:39 policy_static.go:175] [cpumanager] static policy: AddContainer (pod: myguaranteed-pod, container: myram-container-1, container id: 07ff2dc06a922ffbddeb6bb3894492458764d7413cdc7d0552a26910da6ff13d)
07:41:39 policy_static.go:205] [cpumanager] allocateCpus: (numCPUs: 1)
07:41:39 state_mem.go:84] [cpumanager] updated default cpuset: "0,2-3"
07:41:39 policy_static.go:213] [cpumanager] allocateCPUs: returning "1"
07:41:39 state_mem.go:76] [cpumanager] updated desired cpuset (container id: 07ff2dc06a922ffbddeb6bb3894492458764d7413cdc7d0552a26910da6ff13d, cpuset: "1")
kubectl describe pods/myguaranteed-pod | grep QoS
QoS Class: Guaranteed
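You can also confirm the pinning from inside the container by checking which CPUs its main process may run on. In this run CPU 1 was assigned, so the output should look like this:

kubectl exec myguaranteed-pod -- grep Cpus_allowed_list /proc/1/status
Cpus_allowed_list:      1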
kubectl delete -f myguaranteed.yaml
pod "myguaranteed-pod" deleted
The kubelet logs after the delete:
Feb 07 08:18:50 minikube kubelet[2866]: I0207 08:18:50.899414 2866 policy_static.go:195] [cpumanager] static policy: RemoveContainer (container id: b900c86f17b6da1899153af6728b6944318107fbcf78a97942667b600e65f6dd)
Feb 07 08:18:50 minikube kubelet[2866]: I0207 08:18:50.909928 2866 state_mem.go:84] [cpumanager] updated default cpuset: "0-3"
The default cpuset was updated to include all CPUs again.
Here are some interesting log lines showing the CPU manager's startup cpuset assignments.
Raw log lines:
Feb 07 11:55:28 minikube kubelet[2847]: I0207 11:55:28.642182 2847 cpu_manager.go:113] [cpumanager] detected CPU topology: &{4 4 1 map[1:{0 1} 2:{0 2} 3:{0 3} 0:{0 0}]}
Feb 07 11:55:28 minikube kubelet[2847]: I0207 11:55:28.642203 2847 policy_static.go:97] [cpumanager] reserved 1 CPUs ("0") not available for exclusive assignment
Feb 07 11:55:28 minikube kubelet[2847]: I0207 11:55:28.642211 2847 state_mem.go:36] [cpumanager] initializing new in-memory state store
Feb 07 11:55:28 minikube kubelet[2847]: I0207 11:55:28.642486 2847 state_mem.go:84] [cpumanager] updated default cpuset: "0-3"
Feb 07 11:55:28 minikube kubelet[2847]: I0207 11:55:28.642494 2847 state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
Edited for clarity:
11:55:28 [cpumanager] detected CPU topology: &{4 4 1 map[1:{0 1} 2:{0 2} 3:{0 3} 0:{0 0}]}
11:55:28 [cpumanager] reserved 1 CPUs ("0") not available for exclusive assignment
11:55:28 [cpumanager] initializing new in-memory state store
11:55:28 [cpumanager] updated default cpuset: "0-3"
11:55:28 [cpumanager] updated cpuset assignments: "map[]"
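Reading the detected-topology line: in this kubelet version the struct fields are the number of logical CPUs (4), the number of physical cores (4), and the number of sockets (1), followed by a map from each logical CPU to its {socket core} pair. For example, 1:{0 1} means CPU 1 sits on socket 0, core 1.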
If we create a Guaranteed Pod, but with fractional CPU requests and limits, no lines are added to the kubelet logs about changes in the cpuset.
Such Pods share the CPUs in the pool: default cpuset: "0-3"
nano my-not-exclusive-guaranteed.yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-not-exclusive-guaranteed-pod
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 5M --vm-hang 3000 -t 3600']
    resources:
      limits:
        memory: "10Mi"
        cpu: 1.1
      requests:
        memory: "10Mi"
        cpu: 1.1
  restartPolicy: Never
  terminationGracePeriodSeconds: 0
kubectl create -f my-not-exclusive-guaranteed.yaml
pod/my-not-exclusive-guaranteed-pod created
If you now investigate the tail end of the minikube logs, you will not find any mention of the cpuset being changed.
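You can double-check that this Pod is still classed as Guaranteed, yet runs on the shared cpuset rather than a dedicated core (expected output, assuming no other exclusive Pods are running):

kubectl describe pods/my-not-exclusive-guaranteed-pod | grep QoS
QoS Class:       Guaranteed

kubectl exec my-not-exclusive-guaranteed-pod -- grep Cpus_allowed_list /proc/1/status
Cpus_allowed_list:      0-3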
Clean up the remaining Pod (myguaranteed-pod was already deleted earlier):
kubectl delete pod/my-not-exclusive-guaranteed-pod
You can see how the --kube-reserved=cpu=500m setting (which we used right at the start of this tutorial) reserves CPU resources for Kubernetes.
Here is an extract from my 4-CPU Kubernetes node. Note that 500m CPU is reserved:
kubectl describe node minikube
Capacity:
cpu: 4
Allocatable:
cpu: 3500m
cpu: 4 = 4000m total capacity; 4000m - 500m kube-reserved = 3500m allocatable.
This tutorial gave practical experience with the Kubernetes CPU manager. You can find more theoretical information at the official Kubernetes blog: https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/