Vertical pod auto scaling - Container Service for Kubernetes

You can deploy the vertical-pod-autoscaler component in a Container Service for Kubernetes (ACK) cluster. vertical-pod-autoscaler is a Vertical Pod Autoscaler (VPA) that enables vertical auto scaling of pods. vertical-pod-autoscaler automatically sets limits on the resource usage of a cluster based on the resource usage of the pods in the cluster. This way, ACK can schedule pods to nodes that have sufficient resources. vertical-pod-autoscaler also maintains the ratio of the resource requests to the resource limit that you specify in the initial container configurations. This topic describes how to use a YAML file to enable vertical pod auto scaling.

Prerequisites

Make sure that the following operations are complete:

An ACK cluster is created and its Kubernetes version is later than 1.12. For more information, see Create an ACK managed cluster.
A command-line tool is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
vertical-pod-autoscaler is uninstalled from the cluster. This avoids conflicts when you deploy a new version of vertical-pod-autoscaler.

Background information

Important

Vertical pod auto scaling is in testing. Exercise caution when you use this feature.

You can use vertical-pod-autoscaler to update the resource configurations of running pods. This feature is in testing. The configuration updates will lead to pod restart and recreation, and the pods may be scheduled to other nodes.
vertical-pod-autoscaler does not evict the pods that are not managed by replication controllers. For these pods, the Auto mode is equivalent to the Initial mode.
vertical-pod-autoscaler and the Horizontal Pod Autoscaler (HPA) cannot run at the same time. The HPA monitors the CPU and memory metrics. If the HPA monitors only custom or external resource metrics other than CPU and memory metrics, you can use vertical-pod-autoscaler in conjunction with the HPA.
vertical-pod-autoscaler uses an admission webhook as its admission controller. If other admission webhooks exist in the cluster, make sure that the admission webhooks do not conflict with the admission webhook of vertical-pod-autoscaler. The execution sequence of admission controllers is defined in the parameters of the API server.
vertical-pod-autoscaler can handle most out of memory (OOM) events, but may fail to handle OOM events in specific scenarios.
The performance of vertical-pod-autoscaler is not tested in large-scale clusters.
The pod resource requests that are modified by vertical-pod-autoscaler may exceed the upper limit of the actual resources, including node resources, idle resources, and resource quotas. In this case, a pod may enter the Pending state and fail to be scheduled. You can use the cluster autoscaler to mitigate the impact of this issue.
If multiple vertical-pod-autoscaler components monitor the resource usage of a pod at the same time, undefined behavior may occur.

Install vertical-pod-autoscaler

Run the following command to create a role-based access control (RBAC) permission file:

kubectl apply -f rbac.yaml

Click to view details

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-reader
rules:
  - apiGroups:
      - "metrics.k8s.io"
    resources:
      - pods
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-actor
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - get
      - list
      - watch
      - create
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
      - patch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-checkpoint-actor
rules:
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalercheckpoints
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - delete
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalercheckpoints
    verbs:
      - get
      - list
      - watch
      - create
      - patch
      - delete
  - apiGroups:
      - ""
    resources:
      - namespaces
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:evictioner
rules:
  - apiGroups:
      - "apps"
      - "extensions"
    resources:
      - replicasets
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - pods/eviction
    verbs:
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-reader
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-actor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-actor
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-checkpoint-actor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-checkpoint-actor
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-target-reader
rules:
  - apiGroups:
    - '*'
    resources:
    - '*/scale'
    verbs:
    - get
    - watch
  - apiGroups:
      - ""
    resources:
      - replicationcontrollers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - daemonsets
      - deployments
      - replicasets
      - statefulsets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - batch
    resources:
      - jobs
      - cronjobs
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-target-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-target-reader
subjects:
  - kind: ServiceAccount
    name: vpa-recommender
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-admission-controller
    namespace: kube-system
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-evictioner-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:evictioner
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-admission-controller
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-recommender
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-updater
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-admission-controller
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - configmaps
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "admissionregistration.k8s.io"
    resources:
      - mutatingwebhookconfigurations
    verbs:
      - create
      - delete
      - get
      - list
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "coordination.k8s.io"
    resources:
      - leases
    verbs:
      - create
      - update
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-admission-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-admission-controller
subjects:
  - kind: ServiceAccount
    name: vpa-admission-controller
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-status-reader
rules:
  - apiGroups:
      - "coordination.k8s.io"
    resources:
      - leases
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-status-reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-status-reader
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system

Run the following command to create a CustomResourceDefinition (CRD) for vertical-pod-autoscaler.

The CRD improves the scalability of ACK clusters. For more information, see Extend the Kubernetes API with CustomResourceDefinitions.

kubectl apply -f crd.yaml

crd.yaml template for clusters whose Kubernetes versions are earlier than 1.22

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: verticalpodautoscalers.autoscaling.k8s.io
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
spec:
  group: autoscaling.k8s.io
  scope: Namespaced
  names:
    plural: verticalpodautoscalers
    singular: verticalpodautoscaler
    kind: VerticalPodAutoscaler
    shortNames:
      - vpa
  version: v1beta1
  versions:
    - name: v1beta1
      served: false
      storage: false
    - name: v1beta2
      served: true
      storage: true
    - name: v1
      served: true
      storage: false
  validation:
    # openAPIV3Schema is the schema for validating custom objects.
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          required: []
          properties:
            targetRef:
              type: object
            updatePolicy:
              type: object
              properties:
                updateMode:
                  type: string
            resourcePolicy:
              type: object
              properties:
                containerPolicies:
                  type: array
                  items:
                    type: object
                    properties:
                      containerName:
                        type: string
                      controlledValues:
                        type: string
                        enum: ["RequestsAndLimits", "RequestsOnly"]
                      mode:
                        type: string
                        enum: ["Auto", "Off"]
                      minAllowed:
                        type: object
                      maxAllowed:
                        type: object
                      controlledResources:
                        type: array
                        items:
                          type: string
                          enum: ["cpu", "memory"]
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
spec:
  group: autoscaling.k8s.io
  scope: Namespaced
  names:
    plural: verticalpodautoscalercheckpoints
    singular: verticalpodautoscalercheckpoint
    kind: VerticalPodAutoscalerCheckpoint
    shortNames:
      - vpacheckpoint
  version: v1beta1
  versions:
    - name: v1beta1
      served: false
      storage: false
    - name: v1beta2
      served: true
      storage: true
    - name: v1
      served: true
      storage: false

crd.yaml template for clusters whose Kubernetes versions are 1.22 and later

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797
    controller-gen.kubebuilder.io/version: v0.9.2
  creationTimestamp: null
  name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
spec:
  group: autoscaling.k8s.io
  names:
    kind: VerticalPodAutoscalerCheckpoint
    listKind: VerticalPodAutoscalerCheckpointList
    plural: verticalpodautoscalercheckpoints
    shortNames:
    - vpacheckpoint
    singular: verticalpodautoscalercheckpoint
  scope: Namespaced
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal
          state of VPA that is used for recovery after recommender's restart.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              containerName:
                description: Name of the checkpointed container.
                type: string
              vpaObjectName:
                description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint
                  object.
                type: string
            type: object
          status:
            description: Data of the checkpoint.
            properties:
              cpuHistogram:
                description: Checkpoint of histogram for consumption of CPU.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              firstSampleStart:
                description: Timestamp of the fist sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastSampleStart:
                description: Timestamp of the last sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastUpdateTime:
                description: The time when the status was last refreshed.
                format: date-time
                nullable: true
                type: string
              memoryHistogram:
                description: Checkpoint of histogram for consumption of memory.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              totalSamplesCount:
                description: Total number of samples in the histograms.
                type: integer
              version:
                description: Version of the format of the stored data.
                type: string
            type: object
        type: object
    served: true
    storage: true
  - name: v1beta2
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal
          state of VPA that is used for recovery after recommender's restart.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              containerName:
                description: Name of the checkpointed container.
                type: string
              vpaObjectName:
                description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint
                  object.
                type: string
            type: object
          status:
            description: Data of the checkpoint.
            properties:
              cpuHistogram:
                description: Checkpoint of histogram for consumption of CPU.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              firstSampleStart:
                description: Timestamp of the fist sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastSampleStart:
                description: Timestamp of the last sample from the histograms.
                format: date-time
                nullable: true
                type: string
              lastUpdateTime:
                description: The time when the status was last refreshed.
                format: date-time
                nullable: true
                type: string
              memoryHistogram:
                description: Checkpoint of histogram for consumption of memory.
                properties:
                  bucketWeights:
                    description: Map from bucket index to bucket weight.
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
                  referenceTimestamp:
                    description: Reference timestamp for samples collected within
                      this histogram.
                    format: date-time
                    nullable: true
                    type: string
                  totalWeight:
                    description: Sum of samples to be used as denominator for weights
                      from BucketWeights.
                    type: number
                type: object
              totalSamplesCount:
                description: Total number of samples in the histograms.
                type: integer
              version:
                description: Version of the format of the stored data.
                type: string
            type: object
        type: object
    served: true
    storage: false
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797
    controller-gen.kubebuilder.io/version: v0.9.2
  creationTimestamp: null
  name: verticalpodautoscalers.autoscaling.k8s.io
spec:
  group: autoscaling.k8s.io
  names:
    kind: VerticalPodAutoscaler
    listKind: VerticalPodAutoscalerList
    plural: verticalpodautoscalers
    shortNames:
    - vpa
    singular: verticalpodautoscaler
  scope: Namespaced
  versions:
  - additionalPrinterColumns:
    - jsonPath: .spec.updatePolicy.updateMode
      name: Mode
      type: string
    - jsonPath: .status.recommendation.containerRecommendations[0].target.cpu
      name: CPU
      type: string
    - jsonPath: .status.recommendation.containerRecommendations[0].target.memory
      name: Mem
      type: string
    - jsonPath: .status.conditions[?(@.type=='RecommendationProvided')].status
      name: Provided
      type: string
    - jsonPath: .metadata.creationTimestamp
      name: Age
      type: date
    name: v1
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscaler is the configuration for a vertical pod
          autoscaler, which automatically manages pod resources based on historical
          and real time resource utilization.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the behavior of the autoscaler. More info:
              https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              recommenders:
                description: Recommender responsible for generating recommendation
                  for this object. List should be empty (then the default recommender
                  will generate the recommendation) or contain exactly one recommender.
                items:
                  description: VerticalPodAutoscalerRecommenderSelector points to
                    a specific Vertical Pod Autoscaler recommender. In the future
                    it might pass parameters to the recommender.
                  properties:
                    name:
                      description: Name of the recommender responsible for generating
                        recommendation for this object.
                      type: string
                  required:
                  - name
                  type: object
                type: array
              resourcePolicy:
                description: Controls how the autoscaler computes recommended resources.
                  The resource policy may be used to set constraints on the recommendations
                  for individual containers. If not specified, the autoscaler computes
                  recommended resources for all containers in the pod, without additional
                  constraints.
                properties:
                  containerPolicies:
                    description: Per-container resource policies.
                    items:
                      description: ContainerResourcePolicy controls how autoscaler
                        computes the recommended resources for a specific container.
                      properties:
                        containerName:
                          description: Name of the container or DefaultContainerResourcePolicy,
                            in which case the policy is used by the containers that
                            don't have their own policy specified.
                          type: string
                        controlledResources:
                          description: Specifies the type of recommendations that
                            will be computed (and possibly applied) by VPA. If not
                            specified, the default of [ResourceCPU, ResourceMemory]
                            will be used.
                          items:
                            description: ResourceName is the name identifying various
                              resources in a ResourceList.
                            type: string
                          type: array
                        controlledValues:
                          description: Specifies which resource values should be controlled.
                            The default is "RequestsAndLimits".
                          enum:
                          - RequestsAndLimits
                          - RequestsOnly
                          type: string
                        maxAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the maximum amount of resources that
                            will be recommended for the container. The default is
                            no maximum.
                          type: object
                        minAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the minimal amount of resources that
                            will be recommended for the container. The default is
                            no minimum.
                          type: object
                        mode:
                          description: Whether autoscaler is enabled for the container.
                            The default is "Auto".
                          enum:
                          - Auto
                          - "Off"
                          type: string
                      type: object
                    type: array
                type: object
              targetRef:
                description: TargetRef points to the controller managing the set of
                  pods for the autoscaler to control - e.g. Deployment, StatefulSet.
                  VerticalPodAutoscaler can be targeted at controller implementing
                  scale subresource (the pod set is retrieved from the controller's
                  ScaleStatus) or some well known controllers (e.g. for DaemonSet
                  the pod set is read from the controller's spec). If VerticalPodAutoscaler
                  cannot use specified target it will report ConfigUnsupported condition.
                  Note that VerticalPodAutoscaler does not require full implementation
                  of scale subresource - it will not use it to modify the replica
                  count. The only thing retrieved is a label selector matching pods
                  grouped by the target resource.
                properties:
                  apiVersion:
                    description: API version of the referent
                    type: string
                  kind:
                    description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds"'
                    type: string
                  name:
                    description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names'
                    type: string
                required:
                - kind
                - name
                type: object
                x-kubernetes-map-type: atomic
              updatePolicy:
                description: Describes the rules on how changes are applied to the
                  pods. If not specified, all fields in the `PodUpdatePolicy` are
                  set to their default values.
                properties:
                  minReplicas:
                    description: Minimal number of replicas which need to be alive
                      for Updater to attempt pod eviction (pending other checks like
                      PDB). Only positive values are allowed. Overrides global '--min-replicas'
                      flag.
                    format: int32
                    type: integer
                  updateMode:
                    description: Controls when autoscaler applies changes to the pod
                      resources. The default is 'Auto'.
                    enum:
                    - "Off"
                    - Initial
                    - Recreate
                    - Auto
                    type: string
                type: object
            required:
            - targetRef
            type: object
          status:
            description: Current information about the autoscaler.
            properties:
              conditions:
                description: Conditions is the set of conditions required for this
                  autoscaler to scale its target, and indicates whether or not those
                  conditions are met.
                items:
                  description: VerticalPodAutoscalerCondition describes the state
                    of a VerticalPodAutoscaler at a certain point.
                  properties:
                    lastTransitionTime:
                      description: lastTransitionTime is the last time the condition
                        transitioned from one status to another
                      format: date-time
                      type: string
                    message:
                      description: message is a human-readable explanation containing
                        details about the transition
                      type: string
                    reason:
                      description: reason is the reason for the condition's last transition.
                      type: string
                    status:
                      description: status is the status of the condition (True, False,
                        Unknown)
                      type: string
                    type:
                      description: type describes the current condition
                      type: string
                  required:
                  - status
                  - type
                  type: object
                type: array
              recommendation:
                description: The most recently computed amount of resources recommended
                  by the autoscaler for the controlled pods.
                properties:
                  containerRecommendations:
                    description: Resources recommended by the autoscaler for each
                      container.
                    items:
                      description: RecommendedContainerResources is the recommendation
                        of resources computed by autoscaler for a specific container.
                        Respects the container resource policy if present in the spec.
                        In particular the recommendation is not produced for containers
                        with `ContainerScalingMode` set to 'Off'.
                      properties:
                        containerName:
                          description: Name of the container.
                          type: string
                        lowerBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Minimum recommended amount of resources. Observes
                            ContainerResourcePolicy. This amount is not guaranteed
                            to be sufficient for the application to operate in a stable
                            way, however running with less resources is likely to
                            have significant impact on performance/availability.
                          type: object
                        target:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Recommended amount of resources. Observes ContainerResourcePolicy.
                          type: object
                        uncappedTarget:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: The most recent recommended resources target
                            computed by the autoscaler for the controlled pods, based
                            only on actual resource usage, not taking into account
                            the ContainerResourcePolicy. May differ from the Recommendation
                            if the actual resource usage causes the target to violate
                            the ContainerResourcePolicy (lower than MinAllowed or
                            higher that MaxAllowed). Used only as status indication,
                            will not affect actual resource assignment.
                          type: object
                        upperBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Maximum recommended amount of resources. Observes
                            ContainerResourcePolicy. Any resources allocated beyond
                            this value are likely wasted. This value may be larger
                            than the maximum amount of application is actually capable
                            of consuming.
                          type: object
                      required:
                      - target
                      type: object
                    type: array
                type: object
            type: object
        required:
        - spec
        type: object
    served: true
    storage: true
    subresources: {}
  - deprecated: true
    deprecationWarning: autoscaling.k8s.io/v1beta2 API is deprecated
    name: v1beta2
    schema:
      openAPIV3Schema:
        description: VerticalPodAutoscaler is the configuration for a vertical pod
          autoscaler, which automatically manages pod resources based on historical
          and real time resource utilization.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: 'Specification of the behavior of the autoscaler. More info:
              https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
            properties:
              resourcePolicy:
                description: Controls how the autoscaler computes recommended resources.
                  The resource policy may be used to set constraints on the recommendations
                  for individual containers. If not specified, the autoscaler computes
                  recommended resources for all containers in the pod, without additional
                  constraints.
                properties:
                  containerPolicies:
                    description: Per-container resource policies.
                    items:
                      description: ContainerResourcePolicy controls how autoscaler
                        computes the recommended resources for a specific container.
                      properties:
                        containerName:
                          description: Name of the container or DefaultContainerResourcePolicy,
                            in which case the policy is used by the containers that
                            don't have their own policy specified.
                          type: string
                        maxAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the maximum amount of resources that
                            will be recommended for the container. The default is
                            no maximum.
                          type: object
                        minAllowed:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Specifies the minimal amount of resources that
                            will be recommended for the container. The default is
                            no minimum.
                          type: object
                        mode:
                          description: Whether autoscaler is enabled for the container.
                            The default is "Auto".
                          enum:
                          - Auto
                          - "Off"
                          type: string
                      type: object
                    type: array
                type: object
              targetRef:
                description: TargetRef points to the controller managing the set of
                  pods for the autoscaler to control - e.g. Deployment, StatefulSet.
                  VerticalPodAutoscaler can be targeted at controller implementing
                  scale subresource (the pod set is retrieved from the controller's
                  ScaleStatus) or some well known controllers (e.g. for DaemonSet
                  the pod set is read from the controller's spec). If VerticalPodAutoscaler
                  cannot use specified target it will report ConfigUnsupported condition.
                  Note that VerticalPodAutoscaler does not require full implementation
                  of scale subresource - it will not use it to modify the replica
                  count. The only thing retrieved is a label selector matching pods
                  grouped by the target resource.
                properties:
                  apiVersion:
                    description: API version of the referent
                    type: string
                  kind:
                    description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds"'
                    type: string
                  name:
                    description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names'
                    type: string
                required:
                - kind
                - name
                type: object
                x-kubernetes-map-type: atomic
              updatePolicy:
                description: Describes the rules on how changes are applied to the
                  pods. If not specified, all fields in the `PodUpdatePolicy` are
                  set to their default values.
                properties:
                  updateMode:
                    description: Controls when autoscaler applies changes to the pod
                      resources. The default is 'Auto'.
                    enum:
                    - "Off"
                    - Initial
                    - Recreate
                    - Auto
                    type: string
                type: object
            required:
            - targetRef
            type: object
          status:
            description: Current information about the autoscaler.
            properties:
              conditions:
                description: Conditions is the set of conditions required for this
                  autoscaler to scale its target, and indicates whether or not those
                  conditions are met.
                items:
                  description: VerticalPodAutoscalerCondition describes the state
                    of a VerticalPodAutoscaler at a certain point.
                  properties:
                    lastTransitionTime:
                      description: lastTransitionTime is the last time the condition
                        transitioned from one status to another
                      format: date-time
                      type: string
                    message:
                      description: message is a human-readable explanation containing
                        details about the transition
                      type: string
                    reason:
                      description: reason is the reason for the condition's last transition.
                      type: string
                    status:
                      description: status is the status of the condition (True, False,
                        Unknown)
                      type: string
                    type:
                      description: type describes the current condition
                      type: string
                  required:
                  - status
                  - type
                  type: object
                type: array
              recommendation:
                description: The most recently computed amount of resources recommended
                  by the autoscaler for the controlled pods.
                properties:
                  containerRecommendations:
                    description: Resources recommended by the autoscaler for each
                      container.
                    items:
                      description: RecommendedContainerResources is the recommendation
                        of resources computed by autoscaler for a specific container.
                        Respects the container resource policy if present in the spec.
                        In particular the recommendation is not produced for containers
                        with `ContainerScalingMode` set to 'Off'.
                      properties:
                        containerName:
                          description: Name of the container.
                          type: string
                        lowerBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Minimum recommended amount of resources. Observes
                            ContainerResourcePolicy. This amount is not guaranteed
                            to be sufficient for the application to operate in a stable
                            way, however running with less resources is likely to
                            have significant impact on performance/availability.
                          type: object
                        target:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Recommended amount of resources. Observes ContainerResourcePolicy.
                          type: object
                        uncappedTarget:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: The most recent recommended resources target
                            computed by the autoscaler for the controlled pods, based
                            only on actual resource usage, not taking into account
                            the ContainerResourcePolicy. May differ from the Recommendation
                            if the actual resource usage causes the target to violate
                            the ContainerResourcePolicy (lower than MinAllowed or
                            higher that MaxAllowed). Used only as status indication,
                            will not affect actual resource assignment.
                          type: object
                        upperBound:
                          additionalProperties:
                            anyOf:
                            - type: integer
                            - type: string
                            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                            x-kubernetes-int-or-string: true
                          description: Maximum recommended amount of resources. Observes
                            ContainerResourcePolicy. Any resources allocated beyond
                            this value are likely wasted. This value may be larger
                            than the maximum amount of application is actually capable
                            of consuming.
                          type: object
                      required:
                      - target
                      type: object
                    type: array
                type: object
            type: object
        required:
        - spec
        type: object
    served: true
    storage: false

Install the components of vertical-pod-autoscaler.

vertical-pod-autoscaler contains the following components: admission-controller, recommender, and updater.

Note

Before you install the admission-controller component, you must use a script to generate a certificate for a webhook.

YAML template for clusters whose Kubernetes versions are earlier than 1.22

Install admission-controller

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      serviceAccountName: admin
      containers:
        - name: admission-controller
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.7.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: tls-certs
              mountPath: "/etc/tls-certs"
              readOnly: true
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          ports:
            - containerPort: 8000
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller

Install recommender

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: admin
      containers:
      - name: recommender
        image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.7.0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 200m
            memory: 1000Mi
          requests:
            cpu: 50m
            memory: 500Mi
        ports:
        - containerPort: 8080

Install updater

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      serviceAccountName: admin
      containers:
        - name: updater
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.7.0
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - containerPort: 8080

YAML template for clusters whose Kubernetes versions are 1.22 and later

Install admission-controller

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      serviceAccountName: vpa-admission-controller
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: admission-controller
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.13.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - name: tls-certs
              mountPath: "/etc/tls-certs"
              readOnly: true
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          ports:
            - containerPort: 8000
            - name: prometheus
              containerPort: 8944
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller

Install recommender

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      serviceAccountName: vpa-recommender
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
      - name: recommender
        image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.13.0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 200m
            memory: 1000Mi
          requests:
            cpu: 50m
            memory: 500Mi
        ports:
        - name: prometheus
          containerPort: 8942

Install updater

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      serviceAccountName: vpa-updater
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: updater
          image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.13.0
          imagePullPolicy: Always
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          ports:
            - name: prometheus
              containerPort: 8943

Verify that vertical-pod-autoscaler is installed

Use the following YAML file to create a Deployment named nginx-deployment-basic and a VPA resource named nginx-deployment-basic-vpa:

Click to view details

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-deployment-basic-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       Deployment
    name:       nginx-deployment-basic
  updatePolicy:
    updateMode: "Off"

Note

Set updateMode to Off, and leave the requests and limits fields empty in the configuration of the Deployment.

Run the following command to query the CPU requests and memory requests that vertical-pod-autoscaler recommends for the Deployment.
Note
The output is returned 2 minutes after you run the command.
```
kubectl describe vpa nginx-deployment-basic-vpa
```
The following output shows an example of the recommended resource requests:
Click to view details
```
  Recommendation:
    Container Recommendations:
      Container Name:  nginx
      Lower Bound:
        Cpu:     25m
        Memory:  262144k
      Target:
        Cpu:     25m
        Memory:  262144k
      Uncapped Target:
        Cpu:     25m
        Memory:  262144k
      Upper Bound:
        Cpu:     11601m
        Memory:  12128573170
```
You can specify resource requests for the Deployment based on the recommendation. vertical-pod-autoscaler continuously monitors the resource usage of the Deployment and provides suggestions on how to improve resource utilization.