Container Service for Kubernetes: Vertical pod auto scaling

Last Updated: Jul 27, 2023

You can deploy the vertical-pod-autoscaler component in a Container Service for Kubernetes (ACK) cluster to enable vertical auto scaling of pods. vertical-pod-autoscaler is an implementation of the Vertical Pod Autoscaler (VPA). It automatically sets the resource requests of pods based on the observed resource usage of the pods, so that ACK can schedule the pods to nodes that have sufficient resources. It also maintains the ratio of resource requests to resource limits that you specify in the initial container configurations. This topic describes how to use YAML files to enable vertical pod auto scaling.
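
As a rough illustration of how the request-to-limit ratio is preserved, consider the following container resource settings. The container name, the image, and the recommended values in the comments are hypothetical and serve only to show the arithmetic. Actual recommendations depend on the observed usage of your workload.

```yaml
# Hypothetical container spec before vertical-pod-autoscaler acts.
# The request-to-limit ratio is 1:2 for both CPU and memory.
containers:
  - name: app                # hypothetical container name
    image: nginx             # hypothetical image
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 200m            # 2 x the CPU request
        memory: 256Mi        # 2 x the memory request
# Assumption: vertical-pod-autoscaler recommends a CPU request of 250m
# and a memory request of 256Mi. If the 1:2 ratio is preserved, the
# recreated pod would be expected to carry:
#   requests: cpu 250m, memory 256Mi
#   limits:   cpu 500m, memory 512Mi
```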

Prerequisites

Make sure that the following operations are complete:

Background information

Important

Vertical pod auto scaling is in testing. Exercise caution when you use this feature.

  • You can use vertical-pod-autoscaler to update the resource configurations of running pods. The configuration updates cause the pods to be restarted and recreated, and the pods may be scheduled to other nodes.

  • vertical-pod-autoscaler does not evict pods that are not managed by a controller, such as a Deployment or StatefulSet. For these pods, the Auto mode is equivalent to the Initial mode.

  • vertical-pod-autoscaler cannot run at the same time as the Horizontal Pod Autoscaler (HPA) if the HPA monitors CPU or memory metrics. If the HPA monitors only custom or external metrics other than CPU and memory metrics, you can use vertical-pod-autoscaler in conjunction with the HPA.

  • vertical-pod-autoscaler uses an admission webhook as its admission controller. If other admission webhooks exist in the cluster, make sure that the admission webhooks do not conflict with the admission webhook of vertical-pod-autoscaler. The execution sequence of admission controllers is defined in the parameters of the API server.

  • vertical-pod-autoscaler can handle most out-of-memory (OOM) events, but may fail to handle OOM events in specific scenarios.

  • The performance of vertical-pod-autoscaler is not tested in large-scale clusters.

  • The pod resource requests that are modified by vertical-pod-autoscaler may exceed the upper limit of the actual resources, including node resources, idle resources, and resource quotas. In this case, a pod may enter the Pending state and fail to be scheduled. You can use the cluster autoscaler to mitigate the impact of this issue.

  • If multiple vertical-pod-autoscaler components monitor the resource usage of a pod at the same time, undefined behavior may occur.
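
The notes above can be made concrete with a minimal VerticalPodAutoscaler sketch. It assumes that the CRDs installed later in this topic serve the autoscaling.k8s.io/v1 API version and that a Deployment named demo-app exists in the default namespace. The object names and all resource values are hypothetical. The Off update mode only computes recommendations and does not evict pods, which may be a cautious starting point while the feature is in testing. The maxAllowed values are illustrative caps intended to keep the recommended requests within what your nodes and resource quotas can satisfy, so that pods are less likely to enter the Pending state.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: demo-app-vpa            # hypothetical name
  namespace: default
spec:
  # The workload whose pods are observed. Only one VPA object should
  # target a given set of pods.
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app              # hypothetical Deployment
  updatePolicy:
    # "Off" must be quoted so that YAML does not parse it as a boolean.
    # Other modes mentioned in this topic are Initial and Auto.
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      # "*" applies the policy to every container that has no policy
      # of its own.
      - containerName: "*"
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits
        minAllowed:             # illustrative lower bounds
          cpu: 100m
          memory: 128Mi
        maxAllowed:             # illustrative upper bounds
          cpu: "1"
          memory: 1Gi
```

After the recommendations look reasonable for your workload, you could switch updateMode to Initial or Auto. As noted above, the Auto mode causes pods to be evicted and recreated.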

Install vertical-pod-autoscaler

  1. Create a file named rbac.yaml from the following template, and then run the following command to create the role-based access control (RBAC) resources:

    kubectl apply -f rbac.yaml

    rbac.yaml template

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:metrics-reader
    rules:
      - apiGroups:
          - "metrics.k8s.io"
        resources:
          - pods
        verbs:
          - get
          - list
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:vpa-actor
    rules:
      - apiGroups:
          - ""
        resources:
          - pods
          - nodes
          - limitranges
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - ""
        resources:
          - events
        verbs:
          - get
          - list
          - watch
          - create
      - apiGroups:
          - "poc.autoscaling.k8s.io"
        resources:
          - verticalpodautoscalers
        verbs:
          - get
          - list
          - watch
          - patch
      - apiGroups:
          - "autoscaling.k8s.io"
        resources:
          - verticalpodautoscalers
        verbs:
          - get
          - list
          - watch
          - patch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:vpa-checkpoint-actor
    rules:
      - apiGroups:
          - "poc.autoscaling.k8s.io"
        resources:
          - verticalpodautoscalercheckpoints
        verbs:
          - get
          - list
          - watch
          - create
          - patch
          - delete
      - apiGroups:
          - "autoscaling.k8s.io"
        resources:
          - verticalpodautoscalercheckpoints
        verbs:
          - get
          - list
          - watch
          - create
          - patch
          - delete
      - apiGroups:
          - ""
        resources:
          - namespaces
        verbs:
          - get
          - list
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:evictioner
    rules:
      - apiGroups:
          - "apps"
          - "extensions"
        resources:
          - replicasets
        verbs:
          - get
      - apiGroups:
          - ""
        resources:
          - pods/eviction
        verbs:
          - create
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:metrics-reader
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:metrics-reader
    subjects:
      - kind: ServiceAccount
        name: vpa-recommender
        namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:vpa-actor
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:vpa-actor
    subjects:
      - kind: ServiceAccount
        name: vpa-recommender
        namespace: kube-system
      - kind: ServiceAccount
        name: vpa-updater
        namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:vpa-checkpoint-actor
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:vpa-checkpoint-actor
    subjects:
      - kind: ServiceAccount
        name: vpa-recommender
        namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:vpa-target-reader
    rules:
      - apiGroups:
          - '*'
        resources:
          - '*/scale'
        verbs:
          - get
          - watch
      - apiGroups:
          - ""
        resources:
          - replicationcontrollers
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - apps
        resources:
          - daemonsets
          - deployments
          - replicasets
          - statefulsets
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - batch
        resources:
          - jobs
          - cronjobs
        verbs:
          - get
          - list
          - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:vpa-target-reader-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:vpa-target-reader
    subjects:
      - kind: ServiceAccount
        name: vpa-recommender
        namespace: kube-system
      - kind: ServiceAccount
        name: vpa-admission-controller
        namespace: kube-system
      - kind: ServiceAccount
        name: vpa-updater
        namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:vpa-evictioner-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:evictioner
    subjects:
      - kind: ServiceAccount
        name: vpa-updater
        namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: vpa-admission-controller
      namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: vpa-recommender
      namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: vpa-updater
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:vpa-admission-controller
    rules:
      - apiGroups:
          - ""
        resources:
          - pods
          - configmaps
          - nodes
          - limitranges
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - "admissionregistration.k8s.io"
        resources:
          - mutatingwebhookconfigurations
        verbs:
          - create
          - delete
          - get
          - list
      - apiGroups:
          - "poc.autoscaling.k8s.io"
        resources:
          - verticalpodautoscalers
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - "autoscaling.k8s.io"
        resources:
          - verticalpodautoscalers
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - "coordination.k8s.io"
        resources:
          - leases
        verbs:
          - create
          - update
          - get
          - list
          - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:vpa-admission-controller
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:vpa-admission-controller
    subjects:
      - kind: ServiceAccount
        name: vpa-admission-controller
        namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:vpa-status-reader
    rules:
      - apiGroups:
          - "coordination.k8s.io"
        resources:
          - leases
        verbs:
          - get
          - list
          - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:vpa-status-reader-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:vpa-status-reader
    subjects:
      - kind: ServiceAccount
        name: vpa-updater
        namespace: kube-system
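  2. Run the following command to create the CustomResourceDefinitions (CRDs) for vertical-pod-autoscaler. Use the crd.yaml template that matches the Kubernetes version of your cluster.

    CRDs extend the Kubernetes API with the custom resource types that vertical-pod-autoscaler uses. For more information, see Extend the Kubernetes API with CustomResourceDefinitions.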

    kubectl apply -f crd.yaml

    crd.yaml template for clusters whose Kubernetes versions are earlier than 1.22

    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: verticalpodautoscalers.autoscaling.k8s.io
      annotations:
        "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
    spec:
      group: autoscaling.k8s.io
      scope: Namespaced
      names:
        plural: verticalpodautoscalers
        singular: verticalpodautoscaler
        kind: VerticalPodAutoscaler
        shortNames:
          - vpa
      version: v1beta1
      versions:
        - name: v1beta1
          served: false
          storage: false
        - name: v1beta2
          served: true
          storage: true
        - name: v1
          served: true
          storage: false
      validation:
        # openAPIV3Schema is the schema for validating custom objects.
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              required: []
              properties:
                targetRef:
                  type: object
                updatePolicy:
                  type: object
                  properties:
                    updateMode:
                      type: string
                resourcePolicy:
                  type: object
                  properties:
                    containerPolicies:
                      type: array
                      items:
                        type: object
                        properties:
                          containerName:
                            type: string
                          controlledValues:
                            type: string
                            enum: ["RequestsAndLimits", "RequestsOnly"]
                          mode:
                            type: string
                            enum: ["Auto", "Off"]
                          minAllowed:
                            type: object
                          maxAllowed:
                            type: object
                          controlledResources:
                            type: array
                            items:
                              type: string
                              enum: ["cpu", "memory"]
    ---
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
      annotations:
        "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/63797"
    spec:
      group: autoscaling.k8s.io
      scope: Namespaced
      names:
        plural: verticalpodautoscalercheckpoints
        singular: verticalpodautoscalercheckpoint
        kind: VerticalPodAutoscalerCheckpoint
        shortNames:
          - vpacheckpoint
      version: v1beta1
      versions:
        - name: v1beta1
          served: false
          storage: false
        - name: v1beta2
          served: true
          storage: true
        - name: v1
          served: true
          storage: false

    crd.yaml template for clusters whose Kubernetes versions are 1.22 and later

    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      annotations:
        api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797
        controller-gen.kubebuilder.io/version: v0.9.2
      creationTimestamp: null
      name: verticalpodautoscalercheckpoints.autoscaling.k8s.io
    spec:
      group: autoscaling.k8s.io
      names:
        kind: VerticalPodAutoscalerCheckpoint
        listKind: VerticalPodAutoscalerCheckpointList
        plural: verticalpodautoscalercheckpoints
        shortNames:
        - vpacheckpoint
        singular: verticalpodautoscalercheckpoint
      scope: Namespaced
      versions:
      - name: v1
        schema:
          openAPIV3Schema:
            description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal
              state of VPA that is used for recovery after recommender's restart.
            properties:
              apiVersion:
                description: 'APIVersion defines the versioned schema of this representation
                  of an object. Servers should convert recognized schemas to the latest
                  internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
                type: string
              kind:
                description: 'Kind is a string value representing the REST resource this
                  object represents. Servers may infer this from the endpoint the client
                  submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
                type: string
              metadata:
                type: object
              spec:
                description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
                properties:
                  containerName:
                    description: Name of the checkpointed container.
                    type: string
                  vpaObjectName:
                    description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint
                      object.
                    type: string
                type: object
              status:
                description: Data of the checkpoint.
                properties:
                  cpuHistogram:
                    description: Checkpoint of histogram for consumption of CPU.
                    properties:
                      bucketWeights:
                        description: Map from bucket index to bucket weight.
                        type: object
                        x-kubernetes-preserve-unknown-fields: true
                      referenceTimestamp:
                        description: Reference timestamp for samples collected within
                          this histogram.
                        format: date-time
                        nullable: true
                        type: string
                      totalWeight:
                        description: Sum of samples to be used as denominator for weights
                          from BucketWeights.
                        type: number
                    type: object
                  firstSampleStart:
                    description: Timestamp of the first sample from the histograms.
                    format: date-time
                    nullable: true
                    type: string
                  lastSampleStart:
                    description: Timestamp of the last sample from the histograms.
                    format: date-time
                    nullable: true
                    type: string
                  lastUpdateTime:
                    description: The time when the status was last refreshed.
                    format: date-time
                    nullable: true
                    type: string
                  memoryHistogram:
                    description: Checkpoint of histogram for consumption of memory.
                    properties:
                      bucketWeights:
                        description: Map from bucket index to bucket weight.
                        type: object
                        x-kubernetes-preserve-unknown-fields: true
                      referenceTimestamp:
                        description: Reference timestamp for samples collected within
                          this histogram.
                        format: date-time
                        nullable: true
                        type: string
                      totalWeight:
                        description: Sum of samples to be used as denominator for weights
                          from BucketWeights.
                        type: number
                    type: object
                  totalSamplesCount:
                    description: Total number of samples in the histograms.
                    type: integer
                  version:
                    description: Version of the format of the stored data.
                    type: string
                type: object
            type: object
        served: true
        storage: true
      - name: v1beta2
        schema:
          openAPIV3Schema:
            description: VerticalPodAutoscalerCheckpoint is the checkpoint of the internal
              state of VPA that is used for recovery after recommender's restart.
            properties:
              apiVersion:
                description: 'APIVersion defines the versioned schema of this representation
                  of an object. Servers should convert recognized schemas to the latest
                  internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
                type: string
              kind:
                description: 'Kind is a string value representing the REST resource this
                  object represents. Servers may infer this from the endpoint the client
                  submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
                type: string
              metadata:
                type: object
              spec:
                description: 'Specification of the checkpoint. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
                properties:
                  containerName:
                    description: Name of the checkpointed container.
                    type: string
                  vpaObjectName:
                    description: Name of the VPA object that stored VerticalPodAutoscalerCheckpoint
                      object.
                    type: string
                type: object
              status:
                description: Data of the checkpoint.
                properties:
                  cpuHistogram:
                    description: Checkpoint of histogram for consumption of CPU.
                    properties:
                      bucketWeights:
                        description: Map from bucket index to bucket weight.
                        type: object
                        x-kubernetes-preserve-unknown-fields: true
                      referenceTimestamp:
                        description: Reference timestamp for samples collected within
                          this histogram.
                        format: date-time
                        nullable: true
                        type: string
                      totalWeight:
                        description: Sum of samples to be used as denominator for weights
                          from BucketWeights.
                        type: number
                    type: object
                  firstSampleStart:
                    description: Timestamp of the first sample from the histograms.
                    format: date-time
                    nullable: true
                    type: string
                  lastSampleStart:
                    description: Timestamp of the last sample from the histograms.
                    format: date-time
                    nullable: true
                    type: string
                  lastUpdateTime:
                    description: The time when the status was last refreshed.
                    format: date-time
                    nullable: true
                    type: string
                  memoryHistogram:
                    description: Checkpoint of histogram for consumption of memory.
                    properties:
                      bucketWeights:
                        description: Map from bucket index to bucket weight.
                        type: object
                        x-kubernetes-preserve-unknown-fields: true
                      referenceTimestamp:
                        description: Reference timestamp for samples collected within
                          this histogram.
                        format: date-time
                        nullable: true
                        type: string
                      totalWeight:
                        description: Sum of samples to be used as denominator for weights
                          from BucketWeights.
                        type: number
                    type: object
                  totalSamplesCount:
                    description: Total number of samples in the histograms.
                    type: integer
                  version:
                    description: Version of the format of the stored data.
                    type: string
                type: object
            type: object
        served: true
        storage: false
    ---
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      annotations:
        api-approved.kubernetes.io: https://github.com/kubernetes/kubernetes/pull/63797
        controller-gen.kubebuilder.io/version: v0.9.2
      creationTimestamp: null
      name: verticalpodautoscalers.autoscaling.k8s.io
    spec:
      group: autoscaling.k8s.io
      names:
        kind: VerticalPodAutoscaler
        listKind: VerticalPodAutoscalerList
        plural: verticalpodautoscalers
        shortNames:
        - vpa
        singular: verticalpodautoscaler
      scope: Namespaced
      versions:
      - additionalPrinterColumns:
        - jsonPath: .spec.updatePolicy.updateMode
          name: Mode
          type: string
        - jsonPath: .status.recommendation.containerRecommendations[0].target.cpu
          name: CPU
          type: string
        - jsonPath: .status.recommendation.containerRecommendations[0].target.memory
          name: Mem
          type: string
        - jsonPath: .status.conditions[?(@.type=='RecommendationProvided')].status
          name: Provided
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
        name: v1
        schema:
          openAPIV3Schema:
            description: VerticalPodAutoscaler is the configuration for a vertical pod
              autoscaler, which automatically manages pod resources based on historical
              and real time resource utilization.
            properties:
              apiVersion:
                description: 'APIVersion defines the versioned schema of this representation
                  of an object. Servers should convert recognized schemas to the latest
                  internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
                type: string
              kind:
                description: 'Kind is a string value representing the REST resource this
                  object represents. Servers may infer this from the endpoint the client
                  submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
                type: string
              metadata:
                type: object
              spec:
                description: 'Specification of the behavior of the autoscaler. More info:
                  https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
                properties:
                  recommenders:
                    description: Recommender responsible for generating recommendation
                      for this object. List should be empty (then the default recommender
                      will generate the recommendation) or contain exactly one recommender.
                    items:
                      description: VerticalPodAutoscalerRecommenderSelector points to
                        a specific Vertical Pod Autoscaler recommender. In the future
                        it might pass parameters to the recommender.
                      properties:
                        name:
                          description: Name of the recommender responsible for generating
                            recommendation for this object.
                          type: string
                      required:
                      - name
                      type: object
                    type: array
                  resourcePolicy:
                    description: Controls how the autoscaler computes recommended resources.
                      The resource policy may be used to set constraints on the recommendations
                      for individual containers. If not specified, the autoscaler computes
                      recommended resources for all containers in the pod, without additional
                      constraints.
                    properties:
                      containerPolicies:
                        description: Per-container resource policies.
                        items:
                          description: ContainerResourcePolicy controls how autoscaler
                            computes the recommended resources for a specific container.
                          properties:
                            containerName:
                              description: Name of the container or DefaultContainerResourcePolicy,
                                in which case the policy is used by the containers that
                                don't have their own policy specified.
                              type: string
                            controlledResources:
                              description: Specifies the type of recommendations that
                                will be computed (and possibly applied) by VPA. If not
                                specified, the default of [ResourceCPU, ResourceMemory]
                                will be used.
                              items:
                                description: ResourceName is the name identifying various
                                  resources in a ResourceList.
                                type: string
                              type: array
                            controlledValues:
                              description: Specifies which resource values should be controlled.
                                The default is "RequestsAndLimits".
                              enum:
                              - RequestsAndLimits
                              - RequestsOnly
                              type: string
                            maxAllowed:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Specifies the maximum amount of resources that
                                will be recommended for the container. The default is
                                no maximum.
                              type: object
                            minAllowed:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Specifies the minimal amount of resources that
                                will be recommended for the container. The default is
                                no minimum.
                              type: object
                            mode:
                              description: Whether autoscaler is enabled for the container.
                                The default is "Auto".
                              enum:
                              - Auto
                              - "Off"
                              type: string
                          type: object
                        type: array
                    type: object
                  targetRef:
                    description: TargetRef points to the controller managing the set of
                      pods for the autoscaler to control - e.g. Deployment, StatefulSet.
                      VerticalPodAutoscaler can be targeted at controller implementing
                      scale subresource (the pod set is retrieved from the controller's
                      ScaleStatus) or some well known controllers (e.g. for DaemonSet
                      the pod set is read from the controller's spec). If VerticalPodAutoscaler
                      cannot use specified target it will report ConfigUnsupported condition.
                      Note that VerticalPodAutoscaler does not require full implementation
                      of scale subresource - it will not use it to modify the replica
                      count. The only thing retrieved is a label selector matching pods
                      grouped by the target resource.
                    properties:
                      apiVersion:
                        description: API version of the referent
                        type: string
                      kind:
                        description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
                        type: string
                      name:
                        description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names'
                        type: string
                    required:
                    - kind
                    - name
                    type: object
                    x-kubernetes-map-type: atomic
                  updatePolicy:
                    description: Describes the rules on how changes are applied to the
                      pods. If not specified, all fields in the `PodUpdatePolicy` are
                      set to their default values.
                    properties:
                      minReplicas:
                        description: Minimal number of replicas which need to be alive
                          for Updater to attempt pod eviction (pending other checks like
                          PDB). Only positive values are allowed. Overrides global '--min-replicas'
                          flag.
                        format: int32
                        type: integer
                      updateMode:
                        description: Controls when autoscaler applies changes to the pod
                          resources. The default is 'Auto'.
                        enum:
                        - "Off"
                        - Initial
                        - Recreate
                        - Auto
                        type: string
                    type: object
                required:
                - targetRef
                type: object
              status:
                description: Current information about the autoscaler.
                properties:
                  conditions:
                    description: Conditions is the set of conditions required for this
                      autoscaler to scale its target, and indicates whether or not those
                      conditions are met.
                    items:
                      description: VerticalPodAutoscalerCondition describes the state
                        of a VerticalPodAutoscaler at a certain point.
                      properties:
                        lastTransitionTime:
                          description: lastTransitionTime is the last time the condition
                            transitioned from one status to another
                          format: date-time
                          type: string
                        message:
                          description: message is a human-readable explanation containing
                            details about the transition
                          type: string
                        reason:
                          description: reason is the reason for the condition's last transition.
                          type: string
                        status:
                          description: status is the status of the condition (True, False,
                            Unknown)
                          type: string
                        type:
                          description: type describes the current condition
                          type: string
                      required:
                      - status
                      - type
                      type: object
                    type: array
                  recommendation:
                    description: The most recently computed amount of resources recommended
                      by the autoscaler for the controlled pods.
                    properties:
                      containerRecommendations:
                        description: Resources recommended by the autoscaler for each
                          container.
                        items:
                          description: RecommendedContainerResources is the recommendation
                            of resources computed by autoscaler for a specific container.
                            Respects the container resource policy if present in the spec.
                            In particular the recommendation is not produced for containers
                            with `ContainerScalingMode` set to 'Off'.
                          properties:
                            containerName:
                              description: Name of the container.
                              type: string
                            lowerBound:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Minimum recommended amount of resources. Observes
                                ContainerResourcePolicy. This amount is not guaranteed
                                to be sufficient for the application to operate in a stable
                                way, however running with less resources is likely to
                                have significant impact on performance/availability.
                              type: object
                            target:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Recommended amount of resources. Observes ContainerResourcePolicy.
                              type: object
                            uncappedTarget:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: The most recent recommended resources target
                                computed by the autoscaler for the controlled pods, based
                                only on actual resource usage, not taking into account
                                the ContainerResourcePolicy. May differ from the Recommendation
                                if the actual resource usage causes the target to violate
                                the ContainerResourcePolicy (lower than MinAllowed or
                                higher than MaxAllowed). Used only as status indication,
                                will not affect actual resource assignment.
                              type: object
                            upperBound:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Maximum recommended amount of resources. Observes
                                ContainerResourcePolicy. Any resources allocated beyond
                                this value are likely wasted. This value may be larger
                                than the maximum amount the application is actually capable
                                of consuming.
                              type: object
                          required:
                          - target
                          type: object
                        type: array
                    type: object
                type: object
            required:
            - spec
            type: object
        served: true
        storage: true
        subresources: {}
      - deprecated: true
        deprecationWarning: autoscaling.k8s.io/v1beta2 API is deprecated
        name: v1beta2
        schema:
          openAPIV3Schema:
            description: VerticalPodAutoscaler is the configuration for a vertical pod
              autoscaler, which automatically manages pod resources based on historical
              and real time resource utilization.
            properties:
              apiVersion:
                description: 'APIVersion defines the versioned schema of this representation
                  of an object. Servers should convert recognized schemas to the latest
                  internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
                type: string
              kind:
                description: 'Kind is a string value representing the REST resource this
                  object represents. Servers may infer this from the endpoint the client
                  submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
                type: string
              metadata:
                type: object
              spec:
                description: 'Specification of the behavior of the autoscaler. More info:
                  https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status.'
                properties:
                  resourcePolicy:
                    description: Controls how the autoscaler computes recommended resources.
                      The resource policy may be used to set constraints on the recommendations
                      for individual containers. If not specified, the autoscaler computes
                      recommended resources for all containers in the pod, without additional
                      constraints.
                    properties:
                      containerPolicies:
                        description: Per-container resource policies.
                        items:
                          description: ContainerResourcePolicy controls how autoscaler
                            computes the recommended resources for a specific container.
                          properties:
                            containerName:
                              description: Name of the container or DefaultContainerResourcePolicy,
                                in which case the policy is used by the containers that
                                don't have their own policy specified.
                              type: string
                            maxAllowed:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Specifies the maximum amount of resources that
                                will be recommended for the container. The default is
                                no maximum.
                              type: object
                            minAllowed:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Specifies the minimal amount of resources that
                                will be recommended for the container. The default is
                                no minimum.
                              type: object
                            mode:
                              description: Whether autoscaler is enabled for the container.
                                The default is "Auto".
                              enum:
                              - Auto
                              - "Off"
                              type: string
                          type: object
                        type: array
                    type: object
                  targetRef:
                    description: TargetRef points to the controller managing the set of
                      pods for the autoscaler to control - e.g. Deployment, StatefulSet.
                      VerticalPodAutoscaler can be targeted at controller implementing
                      scale subresource (the pod set is retrieved from the controller's
                      ScaleStatus) or some well known controllers (e.g. for DaemonSet
                      the pod set is read from the controller's spec). If VerticalPodAutoscaler
                      cannot use specified target it will report ConfigUnsupported condition.
                      Note that VerticalPodAutoscaler does not require full implementation
                      of scale subresource - it will not use it to modify the replica
                      count. The only thing retrieved is a label selector matching pods
                      grouped by the target resource.
                    properties:
                      apiVersion:
                        description: API version of the referent
                        type: string
                      kind:
                        description: 'Kind of the referent; More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
                        type: string
                      name:
                        description: 'Name of the referent; More info: http://kubernetes.io/docs/user-guide/identifiers#names'
                        type: string
                    required:
                    - kind
                    - name
                    type: object
                    x-kubernetes-map-type: atomic
                  updatePolicy:
                    description: Describes the rules on how changes are applied to the
                      pods. If not specified, all fields in the `PodUpdatePolicy` are
                      set to their default values.
                    properties:
                      updateMode:
                        description: Controls when autoscaler applies changes to the pod
                          resources. The default is 'Auto'.
                        enum:
                        - "Off"
                        - Initial
                        - Recreate
                        - Auto
                        type: string
                    type: object
                required:
                - targetRef
                type: object
              status:
                description: Current information about the autoscaler.
                properties:
                  conditions:
                    description: Conditions is the set of conditions required for this
                      autoscaler to scale its target, and indicates whether or not those
                      conditions are met.
                    items:
                      description: VerticalPodAutoscalerCondition describes the state
                        of a VerticalPodAutoscaler at a certain point.
                      properties:
                        lastTransitionTime:
                          description: lastTransitionTime is the last time the condition
                            transitioned from one status to another
                          format: date-time
                          type: string
                        message:
                          description: message is a human-readable explanation containing
                            details about the transition
                          type: string
                        reason:
                          description: reason is the reason for the condition's last transition.
                          type: string
                        status:
                          description: status is the status of the condition (True, False,
                            Unknown)
                          type: string
                        type:
                          description: type describes the current condition
                          type: string
                      required:
                      - status
                      - type
                      type: object
                    type: array
                  recommendation:
                    description: The most recently computed amount of resources recommended
                      by the autoscaler for the controlled pods.
                    properties:
                      containerRecommendations:
                        description: Resources recommended by the autoscaler for each
                          container.
                        items:
                          description: RecommendedContainerResources is the recommendation
                            of resources computed by autoscaler for a specific container.
                            Respects the container resource policy if present in the spec.
                            In particular the recommendation is not produced for containers
                            with `ContainerScalingMode` set to 'Off'.
                          properties:
                            containerName:
                              description: Name of the container.
                              type: string
                            lowerBound:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Minimum recommended amount of resources. Observes
                                ContainerResourcePolicy. This amount is not guaranteed
                                to be sufficient for the application to operate in a stable
                                way, however running with less resources is likely to
                                have significant impact on performance/availability.
                              type: object
                            target:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Recommended amount of resources. Observes ContainerResourcePolicy.
                              type: object
                            uncappedTarget:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: The most recent recommended resources target
                                computed by the autoscaler for the controlled pods, based
                                only on actual resource usage, not taking into account
                                the ContainerResourcePolicy. May differ from the Recommendation
                                if the actual resource usage causes the target to violate
                                the ContainerResourcePolicy (lower than MinAllowed or
                                higher than MaxAllowed). Used only as status indication,
                                will not affect actual resource assignment.
                              type: object
                            upperBound:
                              additionalProperties:
                                anyOf:
                                - type: integer
                                - type: string
                                pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                                x-kubernetes-int-or-string: true
                              description: Maximum recommended amount of resources. Observes
                                ContainerResourcePolicy. Any resources allocated beyond
                                this value are likely wasted. This value may be larger
                                than the maximum amount the application is actually capable
                                of consuming.
                              type: object
                          required:
                          - target
                          type: object
                        type: array
                    type: object
                type: object
            required:
            - spec
            type: object
        served: true
        storage: false

  3. Install the components of vertical-pod-autoscaler.

    vertical-pod-autoscaler contains the following components: admission-controller, recommender, and updater.

    Note

    Before you install the admission-controller component, you must use a script to generate a certificate for a webhook.

    • YAML template for clusters that run a Kubernetes version earlier than 1.22

      Install admission-controller

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: vpa-admission-controller
        namespace: kube-system
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: vpa-admission-controller
        template:
          metadata:
            labels:
              app: vpa-admission-controller
          spec:
            serviceAccountName: admin
            containers:
              - name: admission-controller
                image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.7.0
                imagePullPolicy: Always
                env:
                  - name: NAMESPACE
                    valueFrom:
                      fieldRef:
                        fieldPath: metadata.namespace
                volumeMounts:
                  - name: tls-certs
                    mountPath: "/etc/tls-certs"
                    readOnly: true
                resources:
                  limits:
                    cpu: 200m
                    memory: 500Mi
                  requests:
                    cpu: 50m
                    memory: 200Mi
                ports:
                  - containerPort: 8000
            volumes:
              - name: tls-certs
                secret:
                  secretName: vpa-tls-certs
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: vpa-webhook
        namespace: kube-system
      spec:
        ports:
          - port: 443
            targetPort: 8000
        selector:
          app: vpa-admission-controller

      Install recommender

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: vpa-recommender
        namespace: kube-system
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: vpa-recommender
        template:
          metadata:
            labels:
              app: vpa-recommender
          spec:
            serviceAccountName: admin
            containers:
            - name: recommender
              image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.7.0
              imagePullPolicy: Always
              resources:
                limits:
                  cpu: 200m
                  memory: 1000Mi
                requests:
                  cpu: 50m
                  memory: 500Mi
              ports:
              - containerPort: 8080

      Install updater

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: vpa-updater
        namespace: kube-system
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: vpa-updater
        template:
          metadata:
            labels:
              app: vpa-updater
          spec:
            serviceAccountName: admin
            containers:
              - name: updater
                image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.7.0
                imagePullPolicy: Always
                resources:
                  limits:
                    cpu: 200m
                    memory: 1000Mi
                  requests:
                    cpu: 50m
                    memory: 500Mi
                ports:
                  - containerPort: 8080

    • YAML template for clusters that run Kubernetes 1.22 or later

      Install admission-controller

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: vpa-admission-controller
        namespace: kube-system
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: vpa-admission-controller
        template:
          metadata:
            labels:
              app: vpa-admission-controller
          spec:
            serviceAccountName: vpa-admission-controller
            securityContext:
              runAsNonRoot: true
              runAsUser: 65534 # nobody
            containers:
              - name: admission-controller
                image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-admission-controller:0.13.0
                imagePullPolicy: Always
                env:
                  - name: NAMESPACE
                    valueFrom:
                      fieldRef:
                        fieldPath: metadata.namespace
                volumeMounts:
                  - name: tls-certs
                    mountPath: "/etc/tls-certs"
                    readOnly: true
                resources:
                  limits:
                    cpu: 200m
                    memory: 500Mi
                  requests:
                    cpu: 50m
                    memory: 200Mi
                ports:
                  - containerPort: 8000
                  - name: prometheus
                    containerPort: 8944
            volumes:
              - name: tls-certs
                secret:
                  secretName: vpa-tls-certs
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: vpa-webhook
        namespace: kube-system
      spec:
        ports:
          - port: 443
            targetPort: 8000
        selector:
          app: vpa-admission-controller

      Install recommender

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: vpa-recommender
        namespace: kube-system
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: vpa-recommender
        template:
          metadata:
            labels:
              app: vpa-recommender
          spec:
            serviceAccountName: vpa-recommender
            securityContext:
              runAsNonRoot: true
              runAsUser: 65534 # nobody
            containers:
            - name: recommender
              image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-recommender:0.13.0
              imagePullPolicy: Always
              resources:
                limits:
                  cpu: 200m
                  memory: 1000Mi
                requests:
                  cpu: 50m
                  memory: 500Mi
              ports:
              - name: prometheus
                containerPort: 8942

      Install updater

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: vpa-updater
        namespace: kube-system
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: vpa-updater
        template:
          metadata:
            labels:
              app: vpa-updater
          spec:
            serviceAccountName: vpa-updater
            securityContext:
              runAsNonRoot: true
              runAsUser: 65534 # nobody
            containers:
              - name: updater
                image: registry.cn-hangzhou.aliyuncs.com/acs/vpa-updater:0.13.0
                imagePullPolicy: Always
                env:
                  - name: NAMESPACE
                    valueFrom:
                      fieldRef:
                        fieldPath: metadata.namespace
                resources:
                  limits:
                    cpu: 200m
                    memory: 1000Mi
                  requests:
                    cpu: 50m
                    memory: 500Mi
                ports:
                  - name: prometheus
                    containerPort: 8943
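
After the CRD and the three components are installed, a VPA object can use the fields that the CRD schema above defines, such as resourcePolicy and updatePolicy. The following manifest is a minimal sketch, not a recommended configuration: the object name, the target Deployment name, and all resource values are hypothetical placeholders, and "*" is the upstream default container name that applies the policy to containers without their own policy.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa                # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment       # hypothetical workload
  updatePolicy:
    # Allowed values per the CRD: "Off", Initial, Recreate, Auto (default).
    # Initial applies recommendations only when pods are created.
    updateMode: Initial
  resourcePolicy:
    containerPolicies:
      - containerName: "*"         # default policy for all containers
        minAllowed:                # lower bound for recommendations (assumed values)
          cpu: 50m
          memory: 64Mi
        maxAllowed:                # upper bound for recommendations (assumed values)
          cpu: "1"
          memory: 1Gi
```

With minAllowed and maxAllowed set, the target in the status is kept within these bounds, while uncappedTarget still reports the value computed from actual usage alone.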

Verify that vertical-pod-autoscaler is installed

  1. Use the following YAML file to create a Deployment named nginx-deployment-basic and a VPA resource named nginx-deployment-basic-vpa:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-basic
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
            - containerPort: 80
    ---
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: nginx-deployment-basic-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind:       Deployment
        name:       nginx-deployment-basic
      updatePolicy:
        updateMode: "Off"

    Note

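    In this example, updateMode is set to Off so that vertical-pod-autoscaler only generates recommendations and does not modify the pods. Leave the requests and limits fields empty in the configuration of the Deployment.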

  2. Run the following command to query the CPU requests and memory requests that vertical-pod-autoscaler recommends for the Deployment.

    Note

    The recommendation may take about 2 minutes to appear. If the output does not contain a recommendation, run the command again later.

    kubectl describe vpa nginx-deployment-basic-vpa

    The following output shows an example of the recommended resource requests:

      Recommendation:
        Container Recommendations:
          Container Name:  nginx
          Lower Bound:
            Cpu:     25m
            Memory:  262144k
          Target:
            Cpu:     25m
            Memory:  262144k
          Uncapped Target:
            Cpu:     25m
            Memory:  262144k
          Upper Bound:
            Cpu:     11601m
            Memory:  12128573170

    You can specify resource requests for the Deployment based on the recommendation. vertical-pod-autoscaler continuously monitors the resource usage of the Deployment and provides suggestions on how to improve resource utilization.
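
    As an illustration of applying the recommendation, the sketch below adds a resources.requests block to the nginx-deployment-basic Deployment from step 1, using the Target values from the sample output above. The values in your cluster will differ, and leaving limits unset here is an assumption made to keep the example short.

    ```yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-basic
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.7.9
            ports:
            - containerPort: 80
            resources:
              requests:
                cpu: 25m          # Target CPU from the sample recommendation
                memory: 262144k   # Target memory from the sample recommendation
    ```

    Because updateMode is Off in the example VPA, the requests change only when you update the Deployment yourself.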