You can deploy the vertical-pod-autoscaler component in a Container Service for Kubernetes (ACK) cluster to enable vertical auto scaling of pods. vertical-pod-autoscaler is an implementation of the Vertical Pod Autoscaler (VPA). It automatically sets the resource requests of pods based on the actual resource usage of the pods. This way, ACK can schedule each pod to a node that has sufficient resources. vertical-pod-autoscaler also maintains the ratio of the resource limits to the resource requests that you specify in the initial container configurations. This topic describes how to use YAML files to deploy vertical-pod-autoscaler and enable vertical pod auto scaling.
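As a rough illustration of how the ratio between requests and limits is preserved, consider the following sketch. All values are hypothetical. It assumes that a container starts with a CPU request of 100m and a CPU limit of 200m, that vertical-pod-autoscaler controls both requests and limits, and that it recommends a CPU request of 250m.

```yaml
# Hypothetical container resources before a recommendation is applied.
# The CPU limit-to-request ratio is 200m / 100m = 2.
resources:
  requests:
    cpu: 100m
  limits:
    cpu: 200m
---
# Assumed result after a recommended CPU request of 250m is applied.
# The limit is scaled by the same ratio: 250m * 2 = 500m.
resources:
  requests:
    cpu: 250m
  limits:
    cpu: 500m
```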
Prerequisites
Make sure that the following operations are complete:
An ACK cluster is created and its Kubernetes version is later than 1.12. For more information, see Create an ACK managed cluster.
A command-line tool is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Any existing installation of vertical-pod-autoscaler is removed from the cluster. This avoids conflicts when you deploy a new version of vertical-pod-autoscaler.
Background information
Vertical pod auto scaling is in testing. Exercise caution when you use this feature.
You can use vertical-pod-autoscaler to update the resource configurations of running pods, but this capability is experimental. Each update causes the pod to be recreated, which restarts its containers, and the recreated pod may be scheduled to a different node.
vertical-pod-autoscaler does not evict pods that are not managed by a controller, such as a Deployment or a ReplicationController. For these pods, the Auto mode is equivalent to the Initial mode.
vertical-pod-autoscaler cannot be used together with the Horizontal Pod Autoscaler (HPA) if the HPA scales pods based on CPU or memory metrics. If the HPA uses only custom or external metrics other than CPU and memory metrics, you can use vertical-pod-autoscaler in conjunction with the HPA.
vertical-pod-autoscaler uses an admission webhook as its admission controller. If other admission webhooks exist in the cluster, make sure that the admission webhooks do not conflict with the admission webhook of vertical-pod-autoscaler. The execution sequence of admission controllers is defined in the parameters of the API server.
vertical-pod-autoscaler can handle most out of memory (OOM) events, but may fail to handle OOM events in specific scenarios.
The performance of vertical-pod-autoscaler is not tested in large-scale clusters.
The pod resource requests that are modified by vertical-pod-autoscaler may exceed the upper limit of the actual resources, including node resources, idle resources, and resource quotas. In this case, a pod may enter the Pending state and fail to be scheduled. You can use the cluster autoscaler to mitigate the impact of this issue.
If multiple vertical-pod-autoscaler components monitor the resource usage of a pod at the same time, undefined behavior may occur.
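The update modes and resource bounds mentioned in the preceding notes are configured on a VerticalPodAutoscaler object. The following manifest is a minimal sketch, not a recommended configuration: the names demo-app and demo-app-vpa, the namespace, and all resource values are hypothetical. It can be applied only after the CRDs described in the installation steps exist in the cluster. The maxAllowed values cap the recommendations, which may help keep requests within what the nodes can actually provide.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: demo-app-vpa          # hypothetical name
  namespace: default          # hypothetical namespace
spec:
  # The workload whose pods are managed. demo-app is a hypothetical Deployment.
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app
  updatePolicy:
    # Allowed values in the CRD schema: "Off", Initial, Recreate, Auto.
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      # "*" applies the policy to all containers that have no policy of their own.
      - containerName: "*"
        controlledResources: ["cpu", "memory"]
        # Illustrative lower and upper bounds for the recommendations.
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: "2"
          memory: 2Gi
```

If you only want to review recommendations without pods being evicted, setting updateMode to "Off" is the more cautious choice.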
Install vertical-pod-autoscaler
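Create a file named rbac.yaml from the template that follows the command, and then run the following command to create the role-based access control (RBAC) resources that vertical-pod-autoscaler requires: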
kubectl apply -f rbac.yaml
rbac.yaml template
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-reader
rules:
  - apiGroups:
      - "metrics.k8s.io"
    resources:
      - pods
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-actor
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - get
      - list
      - watch
      - create
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
      - patch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-checkpoint-actor
rules:
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalercheckpoints
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - delete
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalercheckpoints
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - delete
  - apiGroups:
      - ""
    resources:
      - namespaces
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:evictioner
rules:
  - apiGroups:
      - "apps"
      - "extensions"
    resources:
      - replicasets
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - pods/eviction
    verbs:
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-reader
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-actor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-actor
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-checkpoint-actor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-checkpoint-actor
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-target-reader
rules:
  - apiGroups:
      - '*'
    resources:
      - '*/scale'
    verbs:
      - get
      - watch
  - apiGroups:
      - ""
    resources:
      - replicationcontrollers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - daemonsets
      - deployments
      - replicasets
      - statefulsets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - batch
    resources:
      - jobs
      - cronjobs
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-target-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-target-reader
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-admission-controller
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-evictioner-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:evictioner
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-admission-controller
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-recommender
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-updater
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-admission-controller
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - configmaps
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "admissionregistration.k8s.io"
    resources:
      - mutatingwebhookconfigurations
    verbs:
      - create
      - delete
      - get
      - list
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "coordination.k8s.io"
    resources:
      - leases
    verbs:
      - create
      - update
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-admission-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-admission-controller
subjects:
  - kind: ServiceAccount
    name: vpa-admission-controller
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-status-reader
rules:
  - apiGroups:
      - "coordination.k8s.io"
    resources:
      - leases
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-status-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-status-reader
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
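Create a file named crd.yaml from the template that matches the Kubernetes version of your cluster, and then run the following command to create the CustomResourceDefinitions (CRDs) for vertical-pod-autoscaler.
CRDs extend the Kubernetes API with custom resource types. vertical-pod-autoscaler uses them to define the VerticalPodAutoscaler and VerticalPodAutoscalerCheckpoint resources. For more information, see Extend the Kubernetes API with CustomResourceDefinitions.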
kubectl apply -f crd.yaml
crd.yaml template for clusters whose Kubernetes versions are earlier than 1.22
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: verticalpodautoscalers.autoscaling.k8s.io
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
spec:
  group: autoscaling.k8s.io
  scope: Namespaced
  names:
    plural: verticalpodautoscalers
    singular: verticalpodautoscaler
    kind: VerticalPodAutoscaler
    shortNames:
      - vpa
  version: v1beta1
  versions:
    - name: v1beta1
      served: false
      storage: false
    - name: v1beta2
      served: true
      storage: true
    - name: v1
      served: true
      storage: false
  validation:
    # openAPIV3Schema is the schema for validating custom objects.
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          required: []
          properties:
            targetRef:
              type: object
            updatePolicy:
              type: object
              properties:
                updateMode:
                  type: string
            resourcePolicy:
              type: object
              properties:
                containerPolicies:
                  type: array
                  items:
                    type: object
                    properties:
                      containerName:
                        type: string
                      controlledValues:
                        type: string
                        enum: ["RequestsAndLimits", "RequestsOnly"]
                      mode:
                        type: string
                        enum: ["Auto", "Off"]
                      minAllowed:
                        type: object
                      maxAllowed:
                        type: object
                      controlledResources:
                        type: array
                        items:
                          type: string
                          enum: ["cpu", "memory"]
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
spec:
  group: autoscaling.k8s.io
  scope: Namespaced
  names:
    plural: verticalpodautoscalercheckpoints
    singular: verticalpodautoscalercheckpoint
    kind: VerticalPodAutoscalerCheckpoint
    shortNames:
      - vpacheckpoint
  version: v1beta1
  versions:
    - name: v1beta1
      served: false
      storage: false
    - name: v1beta2
      served: true
      storage: true
    - name: v1
      served: true
      storage: false
crd.yaml template for clusters whose Kubernetes versions are 1.22 and later
apiVersion: apiextensions.k8s.io/v1 kind: CustomResourceDefinition metadata: annotations: api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797 controller-gen.kubebuilder.io/version: v0.9.2 creationTimestamp: null name: verticalpodautoscalercheckpoints.autoscaling.k8s.io spec: group: autoscaling.k8s.io names: kind: VerticalPodAutoscalerCheckpoint listKind: VerticalPodAutoscalerCheckpointList plural: verticalpodautoscalercheckpoints shortNames: - vpacheckpoint singular: verticalpodautoscalercheckpoint scope: Namespaced versions: - name: v1 schema: openAPIV3Schema: description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal state of VPA that is used for recovery after recommender's restart. properties: apiVersion: description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources' type: string kind: description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds' type: string metadata: type: object spec: description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.' properties: containerName: description: Name of the checkpointed container. type: string vpaObjectName: description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint object. type: string type: object status: description: Data of the checkpoint. properties: cpuHistogram: description: Checkpoint of histogram for consumption of CPU. 
properties: bucketWeights: description: Map from bucket index to bucket weight. type: object x-kubernetes-preserve-unknown-fields: true referenceTimestamp: description: Reference timestamp for samples collected within this histogram. format: date-time nullable: true type: string totalWeight: description: Sum of samples to be used as denominator for weights from BucketWeights. type: number type: object firstSampleStart: description: Timestamp of the fist sample from the histograms. format: date-time nullable: true type: string lastSampleStart: description: Timestamp of the last sample from the histograms. format: date-time nullable: true type: string lastUpdateTime: description: The time when the status was last refreshed. format: date-time nullable: true type: string memoryHistogram: description: Checkpoint of histogram for consumption of memory. properties: bucketWeights: description: Map from bucket index to bucket weight. type: object x-kubernetes-preserve-unknown-fields: true referenceTimestamp: description: Reference timestamp for samples collected within this histogram. format: date-time nullable: true type: string totalWeight: description: Sum of samples to be used as denominator for weights from BucketWeights. type: number type: object totalSamplesCount: description: Total number of samples in the histograms. type: integer version: description: Version of the format of the stored data. type: string type: object type: object served: true storage: true - name: v1beta2 schema: openAPIV3Schema: description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal state of VPA that is used for recovery after recommender's restart. properties: apiVersion: description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. 
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources' type: string kind: description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds' type: string metadata: type: object spec: description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.' properties: containerName: description: Name of the checkpointed container. type: string vpaObjectName: description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint object. type: string type: object status: description: Data of the checkpoint. properties: cpuHistogram: description: Checkpoint of histogram for consumption of CPU. properties: bucketWeights: description: Map from bucket index to bucket weight. type: object x-kubernetes-preserve-unknown-fields: true referenceTimestamp: description: Reference timestamp for samples collected within this histogram. format: date-time nullable: true type: string totalWeight: description: Sum of samples to be used as denominator for weights from BucketWeights. type: number type: object firstSampleStart: description: Timestamp of the fist sample from the histograms. format: date-time nullable: true type: string lastSampleStart: description: Timestamp of the last sample from the histograms. format: date-time nullable: true type: string lastUpdateTime: description: The time when the status was last refreshed. format: date-time nullable: true type: string memoryHistogram: description: Checkpoint of histogram for consumption of memory. properties: bucketWeights: description: Map from bucket index to bucket weight. 
type: object x-kubernetes-preserve-unknown-fields: true referenceTimestamp: description: Reference timestamp for samples collected within this histogram. format: date-time nullable: true type: string totalWeight: description: Sum of samples to be used as denominator for weights from BucketWeights. type: number type: object totalSamplesCount: description: Total number of samples in the histograms. type: integer version: description: Version of the format of the stored data. type: string type: object type: object served: true storage: false --- apiVersion: apiextensions.k8s.io/v1 kind: CustomResourceDefinition metadata: annotations: api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797 controller-gen.kubebuilder.io/version: v0.9.2 creationTimestamp: null name: verticalpodautoscalers.autoscaling.k8s.io spec: group: autoscaling.k8s.io names: kind: VerticalPodAutoscaler listKind: VerticalPodAutoscalerList plural: verticalpodautoscalers shortNames: - vpa singular: verticalpodautoscaler scope: Namespaced versions: - additionalPrinterColumns: - jsonPath: .spec.updatePolicy.updateMode name: Mode type: string - jsonPath: .status.recommendation.containerRecommendations[0].target.cpu name: CPU type: string - jsonPath: .status.recommendation.containerRecommendations[0].target.memory name: Mem type: string - jsonPath: .status.conditions[?(@.type=='RecommendationProvided')].status name: Provided type: string - jsonPath: .metadata.creationTimestamp name: Age type: date name: v1 schema: openAPIV3Schema: description: VerticalPodAutoscaler is the configuration for a vertical pod autoscaler, which automatically manages pod resources based on historical and real time resource utilization. properties: apiVersion: description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. 
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources' type: string kind: description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds' type: string metadata: type: object spec: description: 'Specification of the behavior of the autoscaler. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.' properties: recommenders: description: Recommender responsible for generating recommendation for this object. List should be empty (then the default recommender will generate the recommendation) or contain exactly one recommender. items: description: VerticalPodAutoscalerRecommenderSelector points to a specific Vertical Pod Autoscaler recommender. In the future it might pass parameters to the recommender. properties: name: description: Name of the recommender responsible for generating recommendation for this object. type: string required: - name type: object type: array resourcePolicy: description: Controls how the autoscaler computes recommended resources. The resource policy may be used to set constraints on the recommendations for individual containers. If not specified, the autoscaler computes recommended resources for all containers in the pod, without additional constraints. properties: containerPolicies: description: Per-container resource policies. items: description: ContainerResourcePolicy controls how autoscaler computes the recommended resources for a specific container. properties: containerName: description: Name of the container or DefaultContainerResourcePolicy, in which case the policy is used by the containers that don't have their own policy specified. 
type: string controlledResources: description: Specifies the type of recommendations that will be computed (and possibly applied) by VPA. If not specified, the default of [ResourceCPU, ResourceMemory] will be used. items: description: ResourceName is the name identifying various resources in a ResourceList. type: string type: array controlledValues: description: Specifies which resource values should be controlled. The default is "RequestsAndLimits". enum: - RequestsAndLimits - RequestsOnly type: string maxAllowed: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Specifies the maximum amount of resources that will be recommended for the container. The default is no maximum. type: object minAllowed: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Specifies the minimal amount of resources that will be recommended for the container. The default is no minimum. type: object mode: description: Whether autoscaler is enabled for the container. The default is "Auto". enum: - Auto - "Off" type: string type: object type: array type: object targetRef: description: TargetRef points to the controller managing the set of pods for the autoscaler to control - e.g. Deployment, StatefulSet. VerticalPodAutoscaler can be targeted at controller implementing scale subresource (the pod set is retrieved from the controller's ScaleStatus) or some well known controllers (e.g. for DaemonSet the pod set is read from the controller's spec). If VerticalPodAutoscaler cannot use specified target it will report ConfigUnsupported condition. 
Note that VerticalPodAutoscaler does not require full implementation of scale subresource - it will not use it to modify the replica count. The only thing retrieved is a label selector matching pods grouped by the target resource. properties: apiVersion: description: API version of the referent type: string kind: description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds"' type: string name: description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names' type: string required: - kind - name type: object x-kubernetes-map-type: atomic updatePolicy: description: Describes the rules on how changes are applied to the pods. If not specified, all fields in the `PodUpdatePolicy` are set to their default values. properties: minReplicas: description: Minimal number of replicas which need to be alive for Updater to attempt pod eviction (pending other checks like PDB). Only positive values are allowed. Overrides global '--min-replicas' flag. format: int32 type: integer updateMode: description: Controls when autoscaler applies changes to the pod resources. The default is 'Auto'. enum: - "Off" - Initial - Recreate - Auto type: string type: object required: - targetRef type: object status: description: Current information about the autoscaler. properties: conditions: description: Conditions is the set of conditions required for this autoscaler to scale its target, and indicates whether or not those conditions are met. items: description: VerticalPodAutoscalerCondition describes the state of a VerticalPodAutoscaler at a certain point. 
properties: lastTransitionTime: description: lastTransitionTime is the last time the condition transitioned from one status to another format: date-time type: string message: description: message is a human-readable explanation containing details about the transition type: string reason: description: reason is the reason for the condition's last transition. type: string status: description: status is the status of the condition (True, False, Unknown) type: string type: description: type describes the current condition type: string required: - status - type type: object type: array recommendation: description: The most recently computed amount of resources recommended by the autoscaler for the controlled pods. properties: containerRecommendations: description: Resources recommended by the autoscaler for each container. items: description: RecommendedContainerResources is the recommendation of resources computed by autoscaler for a specific container. Respects the container resource policy if present in the spec. In particular the recommendation is not produced for containers with `ContainerScalingMode` set to 'Off'. properties: containerName: description: Name of the container. type: string lowerBound: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Minimum recommended amount of resources. Observes ContainerResourcePolicy. This amount is not guaranteed to be sufficient for the application to operate in a stable way, however running with less resources is likely to have significant impact on performance/availability. 
type: object target: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Recommended amount of resources. Observes ContainerResourcePolicy. type: object uncappedTarget: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: The most recent recommended resources target computed by the autoscaler for the controlled pods, based only on actual resource usage, not taking into account the ContainerResourcePolicy. May differ from the Recommendation if the actual resource usage causes the target to violate the ContainerResourcePolicy (lower than MinAllowed or higher that MaxAllowed). Used only as status indication, will not affect actual resource assignment. type: object upperBound: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Maximum recommended amount of resources. Observes ContainerResourcePolicy. Any resources allocated beyond this value are likely wasted. This value may be larger than the maximum amount of application is actually capable of consuming. type: object required: - target type: object type: array type: object type: object required: - spec type: object served: true storage: true subresources: {} - deprecated: true deprecationWarning: autoscaling.k8s.io/v1beta2 API is deprecated name: v1beta2 schema: openAPIV3Schema: description: VerticalPodAutoscaler is the configuration for a vertical pod autoscaler, which automatically manages pod resources based on historical and real time resource utilization. 
properties: apiVersion: description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources' type: string kind: description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds' type: string metadata: type: object spec: description: 'Specification of the behavior of the autoscaler. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.' properties: resourcePolicy: description: Controls how the autoscaler computes recommended resources. The resource policy may be used to set constraints on the recommendations for individual containers. If not specified, the autoscaler computes recommended resources for all containers in the pod, without additional constraints. properties: containerPolicies: description: Per-container resource policies. items: description: ContainerResourcePolicy controls how autoscaler computes the recommended resources for a specific container. properties: containerName: description: Name of the container or DefaultContainerResourcePolicy, in which case the policy is used by the containers that don't have their own policy specified. type: string maxAllowed: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Specifies the maximum amount of resources that will be recommended for the container. The default is no maximum. 
type: object minAllowed: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Specifies the minimal amount of resources that will be recommended for the container. The default is no minimum. type: object mode: description: Whether autoscaler is enabled for the container. The default is "Auto". enum: - Auto - "Off" type: string type: object type: array type: object targetRef: description: TargetRef points to the controller managing the set of pods for the autoscaler to control - e.g. Deployment, StatefulSet. VerticalPodAutoscaler can be targeted at controller implementing scale subresource (the pod set is retrieved from the controller's ScaleStatus) or some well known controllers (e.g. for DaemonSet the pod set is read from the controller's spec). If VerticalPodAutoscaler cannot use specified target it will report ConfigUnsupported condition. Note that VerticalPodAutoscaler does not require full implementation of scale subresource - it will not use it to modify the replica count. The only thing retrieved is a label selector matching pods grouped by the target resource. properties: apiVersion: description: API version of the referent type: string kind: description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds"' type: string name: description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names' type: string required: - kind - name type: object x-kubernetes-map-type: atomic updatePolicy: description: Describes the rules on how changes are applied to the pods. If not specified, all fields in the `PodUpdatePolicy` are set to their default values. properties: updateMode: description: Controls when autoscaler applies changes to the pod resources. The default is 'Auto'. 
enum: - "Off" - Initial - Recreate - Auto type: string type: object required: - targetRef type: object status: description: Current information about the autoscaler. properties: conditions: description: Conditions is the set of conditions required for this autoscaler to scale its target, and indicates whether or not those conditions are met. items: description: VerticalPodAutoscalerCondition describes the state of a VerticalPodAutoscaler at a certain point. properties: lastTransitionTime: description: lastTransitionTime is the last time the condition transitioned from one status to another format: date-time type: string message: description: message is a human-readable explanation containing details about the transition type: string reason: description: reason is the reason for the condition's last transition. type: string status: description: status is the status of the condition (True, False, Unknown) type: string type: description: type describes the current condition type: string required: - status - type type: object type: array recommendation: description: The most recently computed amount of resources recommended by the autoscaler for the controlled pods. properties: containerRecommendations: description: Resources recommended by the autoscaler for each container. items: description: RecommendedContainerResources is the recommendation of resources computed by autoscaler for a specific container. Respects the container resource policy if present in the spec. In particular the recommendation is not produced for containers with `ContainerScalingMode` set to 'Off'. properties: containerName: description: Name of the container. type: string lowerBound: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Minimum recommended amount of resources. Observes ContainerResourcePolicy. 
This amount is not guaranteed to be sufficient for the application to operate in a stable way, however running with less resources is likely to have significant impact on performance/availability. type: object target: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Recommended amount of resources. Observes ContainerResourcePolicy. type: object uncappedTarget: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: The most recent recommended resources target computed by the autoscaler for the controlled pods, based only on actual resource usage, not taking into account the ContainerResourcePolicy. May differ from the Recommendation if the actual resource usage causes the target to violate the ContainerResourcePolicy (lower than MinAllowed or higher that MaxAllowed). Used only as status indication, will not affect actual resource assignment. type: object upperBound: additionalProperties: anyOf: - type: integer - type: string pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ x-kubernetes-int-or-string: true description: Maximum recommended amount of resources. Observes ContainerResourcePolicy. Any resources allocated beyond this value are likely wasted. This value may be larger than the maximum amount of application is actually capable of consuming. type: object required: - target type: object type: array type: object type: object required: - spec type: object served: true storage: false
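As an illustration of the schema fields defined above, the following is a minimal sketch of a VPA object that sets a per-container resource policy. The object name, the target Deployment my-app, the container names, and all resource values are hypothetical placeholders. The minAllowed, maxAllowed, and mode fields correspond to the fields described in the CRD.

```yaml
# Hypothetical example: names and values are placeholders, not part of the original template.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"          # only compute recommendations
  resourcePolicy:
    containerPolicies:
      - containerName: app
        minAllowed:            # recommendations are not lower than these values
          cpu: 50m
          memory: 64Mi
        maxAllowed:            # recommendations are not higher than these values
          cpu: "1"
          memory: 1Gi
      - containerName: sidecar
        mode: "Off"            # no recommendation is produced for this container
```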
Install the components of vertical-pod-autoscaler.
vertical-pod-autoscaler contains the following components: admission-controller, recommender, and updater.
Note: Before you install the admission-controller component, you must use a script to generate a certificate for a webhook.
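The admission-controller Deployment templates in this step mount a Secret named vpa-tls-certs at /etc/tls-certs. As a rough sketch of what the certificate script is expected to produce, the Secret could look like the following. The key names are an assumption based on the defaults of the upstream Kubernetes autoscaler project, and the values are placeholders. In practice, the Secret is typically created by the script rather than written by hand.

```yaml
# Sketch only: key names are assumed from upstream defaults; values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: vpa-tls-certs        # referenced by secretName in the admission-controller Deployment
  namespace: kube-system
type: Opaque
data:
  caCert.pem: <base64-encoded CA certificate>
  serverCert.pem: <base64-encoded server certificate>
  serverKey.pem: <base64-encoded server private key>
```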
YAML template for clusters whose Kubernetes versions are earlier than 1.22
Install admission-controller
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      serviceAccountName: admin
      containers:
        - name: admission-controller
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.7.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: tls-certs
              mountPath: "/etc/tls-certs"
              readOnly: true
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          ports:
            - containerPort: 8000
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller
Install recommender
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: admin
      containers:
        - name: recommender
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.7.0
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - containerPort: 8080
Install updater
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      serviceAccountName: admin
      containers:
        - name: updater
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.7.0
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - containerPort: 8080
YAML template for clusters whose Kubernetes versions are 1.22 or later
Install admission-controller
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      serviceAccountName: vpa-admission-controller
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: admission-controller
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.13.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: tls-certs
              mountPath: "/etc/tls-certs"
              readOnly: true
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          ports:
            - containerPort: 8000
            - name: prometheus
              containerPort: 8944
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller
Install recommender
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: vpa-recommender
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: recommender
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.13.0
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - name: prometheus
              containerPort: 8942
Install updater
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      serviceAccountName: vpa-updater
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: updater
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.13.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - name: prometheus
              containerPort: 8943
Verify that vertical-pod-autoscaler is installed
Use the following YAML file to create a Deployment named nginx-deployment-basic and a VPA resource named nginx-deployment-basic-vpa:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-deployment-basic-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment-basic
  updatePolicy:
    updateMode: "Off"
Note: Set updateMode to Off, and leave the requests and limits fields empty in the configuration of the Deployment.
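With updateMode set to Off, vertical-pod-autoscaler only computes recommendations and does not change running pods. For comparison only, a sketch of the same VPA resource in Auto mode is shown below. It is not needed for the verification in this step. As described in the background information, applying updates to running pods restarts and recreates them.

```yaml
# Sketch for comparison only; the verification in this step uses updateMode "Off".
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-deployment-basic-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment-basic
  updatePolicy:
    updateMode: "Auto"   # values allowed by the CRD: "Off", Initial, Recreate, Auto
```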
Run the following command to query the CPU requests and memory requests that vertical-pod-autoscaler recommends for the Deployment.
Note: The recommendation is returned about 2 minutes after you run the command.
kubectl describe vpa nginx-deployment-basic-vpa
The following output shows an example of the recommended resource requests:
Recommendation:
  Container Recommendations:
    Container Name:  nginx
    Lower Bound:
      Cpu:     25m
      Memory:  262144k
    Target:
      Cpu:     25m
      Memory:  262144k
    Uncapped Target:
      Cpu:     25m
      Memory:  262144k
    Upper Bound:
      Cpu:     11601m
      Memory:  12128573170
You can specify resource requests for the Deployment based on the recommendation. vertical-pod-autoscaler continuously monitors the resource usage of the Deployment and provides suggestions on how to improve resource utilization.
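For example, assuming the Target values from the sample output above (25m CPU and 262144k memory), the pod template of the Deployment could be updated roughly as follows. The values are copied from that sample only; use the recommendation that is reported for your own workload.

```yaml
# Fragment of the nginx-deployment-basic Deployment.
# Request values are copied from the sample Target recommendation.
spec:
  template:
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 25m
              memory: 262144k
```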