Container Service for Kubernetes: FAQ about auto scaling

Last Updated: Aug 22, 2024

This topic provides answers to some frequently asked questions (FAQ) about auto scaling in Container Service for Kubernetes (ACK).

The questions are organized into the following categories:

  • Node auto scaling

  • Node instant scaling (for example, How do I specify the nodes that I want to delete during the scale-in activities of node instant scaling?)

  • Workload scaling

  • HPA based on Alibaba Cloud metrics

FAQ about node auto scaling

How do I update cluster-autoscaler to the latest version?

For a cluster with auto scaling enabled, perform the following steps to update cluster-autoscaler:

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.

  3. Click Edit on the right side of Node Scaling. In the Node Scaling Configuration panel, click OK to update cluster-autoscaler to the latest version.

What resources can cluster-autoscaler simulate?

cluster-autoscaler can simulate and evaluate the following resources:

cpu
memory
sigma/eni
ephemeral-storage
aliyun.com/gpu-mem (shared GPUs only)
nvidia.com/gpu

For more information, see the How do I add custom resources to node pools for which auto scaling is enabled? section of this topic.

Does cluster-autoscaler support custom resources?

cluster-autoscaler supports only Kubernetes standard objects and does not support Kubernetes custom resources.

How do I prevent cluster-autoscaler from removing nodes?

To prevent cluster-autoscaler from removing nodes, add the "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true" annotation to the node configurations. Run the following command to add the annotation:

kubectl annotate node <nodename> cluster-autoscaler.kubernetes.io/scale-down-disabled=true

How do I set a scale-out delay in cluster-autoscaler for unschedulable pods?

You can use the cluster-autoscaler.kubernetes.io/pod-scale-up-delay annotation to set a scale-out delay for each pod. If pods are still unschedulable after the delay ends, cluster-autoscaler may add nodes to schedule the pods. For example, you can use the "cluster-autoscaler.kubernetes.io/pod-scale-up-delay": "600s" annotation.
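
The following minimal pod manifest is a sketch of where the annotation is placed. The pod name, container name, and image are placeholders.

apiVersion: v1
kind: Pod
metadata:
  name: example-pod  # Hypothetical pod name.
  annotations:
    cluster-autoscaler.kubernetes.io/pod-scale-up-delay: "600s"  # cluster-autoscaler waits 600 seconds before it adds nodes for this pod.
spec:
  containers:
  - name: app
    image: nginx:1.7.9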

Why does cluster-autoscaler fail to add nodes after a scale-out activity is triggered?

Check whether the following situations exist:

  • The instance types in the scaling group cannot fulfill the resource requests of the pods. Some of the resources provided by the specified Elastic Compute Service (ECS) instance type are reserved for or occupied by system components and the operating system. As a result, the allocatable resources of a node are less than the resources defined in the instance specifications.

  • Cross-zone scale-out activities cannot be triggered for pods that have zone restrictions.

  • The Resource Access Management (RAM) role does not have the permissions to manage the Kubernetes cluster. You must configure RAM roles for each Kubernetes cluster that is involved in the scale-out activity. For more information, see Enable node auto scaling.

  • The following issues occur when you enable node auto scaling:

    • The instance fails to be added to the cluster and a timeout error occurs.

    • The node is not ready and a timeout error occurs.

    To ensure that scaling remains accurate, cluster-autoscaler does not perform any scaling activities until the abnormal nodes are fixed.

Why does cluster-autoscaler fail to remove nodes after a scale-in activity is triggered?

Check whether the following situations exist:

  • The ratio of the resources requested by the pods on the node to the node's allocatable resources is higher than the specified scale-in threshold.

  • Pods that belong to the kube-system namespace are running on the node.

  • A scheduling policy forces the pods to run on the current node. Therefore, the pods cannot be scheduled to other nodes.

  • A PodDisruptionBudget is configured for the pods on the node, and evicting the pods would violate the minimum availability that the PodDisruptionBudget requires.

For more answers to frequently asked questions, see autoscaler.

How does the system choose a scaling group for a scaling activity?

When pods cannot be scheduled to nodes, the auto scaling component simulates the scheduling of the pods based on the configurations of the scaling groups. The configurations include labels, taints, and instance specifications. If a scaling group meets the requirements, it is selected for the scale-out activity. If more than one scaling group meets the requirements, the system selects the scaling group that has the fewest idle resources after the simulation.

What types of pods can prevent cluster-autoscaler from removing nodes?

If a pod is not created by a native Kubernetes controller, such as a Deployment, ReplicaSet, Job, or StatefulSet, or if the pods on a node cannot be safely terminated or migrated, cluster-autoscaler may not remove the node. For more information, see What types of pods can prevent CA from removing a node?

What scheduling policies does cluster-autoscaler use to determine whether unschedulable pods can be scheduled to a node pool for which node auto scaling is enabled?

The following list describes the scheduling policies used by cluster-autoscaler:

  • PodFitsResources

  • GeneralPredicates

  • PodToleratesNodeTaints

  • MaxGCEPDVolumeCount

  • NoDiskConflict

  • CheckNodeCondition

  • CheckNodeDiskPressure

  • CheckNodeMemoryPressure

  • CheckNodePIDPressure

  • CheckVolumeBinding

  • MaxAzureDiskVolumeCount

  • MaxEBSVolumeCount

  • ready

  • NoVolumeZoneConflict

How do I add custom resources to node pools for which auto scaling is enabled?

To enable cluster-autoscaler to identify the custom resources provided by a node pool for which auto scaling is enabled, or to identify the amounts of specific resource types provided by the node pool, add an ECS tag in the following format to the node pool:

k8s.io/cluster-autoscaler/node-template/resource/{Resource name}:{Resource amount}

Example:

k8s.io/cluster-autoscaler/node-template/resource/hugepages-1Gi:2Gi

Why does a pod fail to be scheduled to a node that is added by cluster-autoscaler?

Due to the precision limits of the underlying resource calculation, the amount of available resources that cluster-autoscaler estimates for a newly added node may be greater than the amount of resources that is actually available on the node. For more information about the precision of the underlying resource calculation, see the Why does a purchased instance have a memory size different from the memory size defined in the instance type? section of the "Instance FAQ" topic. If the resources requested by the pods on a node exceed 70% of the computing resources provided by the node, we recommend that you check whether the pods can be scheduled to another node of the same instance type.

cluster-autoscaler checks only the resource requests of pending pods and of pods created by DaemonSets when it determines whether the nodes in your cluster can provide sufficient resources for pod scheduling. If the nodes in your cluster host static pods that are not created by DaemonSets, we recommend that you reserve resources for these pods.

How do I use pod annotations to allow cluster-autoscaler to remove the node that hosts the pod or prevent cluster-autoscaler from removing the node that hosts the pod?

You can configure a pod to allow cluster-autoscaler to remove the node that hosts the pod or to prevent cluster-autoscaler from removing the node that hosts the pod. A manifest sketch follows the list below.

  • To configure a pod to prevent cluster-autoscaler from removing the node that hosts the pod, add the "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" annotation to the pod configuration.

  • To configure a pod to allow cluster-autoscaler to remove the node that hosts the pod, add the "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" annotation to the pod configuration.
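
The following Deployment sketch shows where the annotation is placed in the pod template. The workload name, labels, and image are placeholders; set the annotation value to "true" or "false" based on your requirements.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: critical-app  # Hypothetical workload name.
spec:
  replicas: 1
  selector:
    matchLabels:
      app: critical-app
  template:
    metadata:
      labels:
        app: critical-app
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"  # Prevent cluster-autoscaler from removing the node that hosts this pod.
    spec:
      containers:
      - name: app
        image: nginx:1.7.9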

What operations can trigger the system to automatically update cluster-autoscaler?

The following operations can trigger the system to automatically update cluster-autoscaler. This ensures that the configurations of cluster-autoscaler are up-to-date and its version is compatible with the cluster.

  • Updating the auto scaling settings

  • Creating, deleting, or updating node pools for which auto scaling is enabled

  • Successfully updating the cluster

How do I enable or disable pod eviction for a DaemonSet pod?

cluster-autoscaler decides whether to evict DaemonSet pods based on the Evict DaemonSet Pods setting. The setting takes effect on all DaemonSet pods in the cluster. For more information, see Step 1: Enable node auto scaling. If you want to enable the DaemonSet pod eviction feature for a DaemonSet pod, add the "cluster-autoscaler.kubernetes.io/enable-ds-eviction": "true" annotation to the pod configuration.

If you want to disable the DaemonSet pod eviction feature for a DaemonSet pod, add the "cluster-autoscaler.kubernetes.io/enable-ds-eviction": "false" annotation to the pod configuration. A manifest sketch follows the note below.

Note
  • If the DaemonSet pod eviction feature is disabled, DaemonSet pods with the annotation are evicted only if the node hosts pods other than DaemonSet pods. If you want to use the annotation to evict a node that hosts only DaemonSet pods, you need to first enable the DaemonSet pod eviction feature.

  • You need to add the preceding annotation to the DaemonSet pod instead of the DaemonSet.

  • This annotation does not take effect on pods that are not created by DaemonSets.

  • By default, cluster-autoscaler does not delay other tasks when it evicts DaemonSet pods. DaemonSet pod eviction is performed simultaneously with other tasks. If you want cluster-autoscaler to wait until all DaemonSet pods are evicted, you need to add the "cluster-autoscaler.kubernetes.io/wait-until-evicted":"true" annotation to the pod configuration.
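
The following DaemonSet sketch shows one way to apply the annotation. The annotation is placed in spec.template.metadata.annotations so that it is added to the DaemonSet pods rather than to the DaemonSet object itself. The DaemonSet name, labels, and image are placeholders.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-agent  # Hypothetical DaemonSet name.
spec:
  selector:
    matchLabels:
      app: example-agent
  template:
    metadata:
      labels:
        app: example-agent
      annotations:
        cluster-autoscaler.kubernetes.io/enable-ds-eviction: "true"  # Allow cluster-autoscaler to evict the pods created by this DaemonSet.
    spec:
      containers:
      - name: agent
        image: nginx:1.7.9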

How does Auto Scaling evaluate the resource capacity of a scaling group that uses multiple types of instances?

For a scaling group that uses multiple types of instances, Auto Scaling evaluates the resource capacity of the scaling group based on the least amount of resources that the scaling group can provide.

For example, a scaling group uses two types of instances. One instance type provides 4 vCores and 32 GB of memory, and the other provides 8 vCores and 16 GB of memory. In this scenario, Auto Scaling considers that the scaling group can add instances each of which provides 4 vCores and 16 GB of memory. If a pending pod requests more than 4 vCores or 16 GB of memory, the pod is not scheduled.

You still need to take resource reservation into consideration after you specify multiple instance types for a scaling group. For more information, see the Why does cluster-autoscaler fail to add nodes after a scale-out activity is triggered? section of this topic.

FAQ about node instant scaling

How do I specify the nodes that I want to delete during the scale-in activities of node instant scaling?

You can add the goatscaler.io/force-to-delete:true:NoSchedule taint to the nodes that you want to delete. After you add this taint, node instant scaling deletes the node directly, without checking the pod status or verifying that the pods have been evicted from the drained node. Use this feature with caution because it may cause service interruptions or data loss.
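
For example, assuming that you know the name of the node to delete, you can run the following command to add the taint:

kubectl taint nodes <node-name> goatscaler.io/force-to-delete=true:NoSchedule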

FAQ about workload scaling

What do I do if unknown is displayed in the current field in the HPA metrics?

If unknown is displayed in the current field, kube-controller-manager cannot access the data sources from which resource metrics are collected, and Horizontal Pod Autoscaler (HPA) fails to scale up or down. The following output shows an example:

Name:                                                  kubernetes-tutorial-deployment
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 10 Jun 2019 11:46:48 +0530
Reference:                                             Deployment/kubernetes-tutorial-deployment
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 2%
Min replicas:                                          1
Max replicas:                                          4
Deployment pods:                                       1 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
  Type     Reason                   Age                      From                       Message
  ----     ------                   ----                     ----                       -------
  Warning  FailedGetResourceMetric  3m3s (x1009 over 4h18m)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)

Possible causes:

  • Cause 1: The data sources from which resource metrics are collected are unavailable. Run the kubectl top pod command to check whether metric data is returned for the monitored pods. If no metric data is returned, run the kubectl get apiservice command to check whether the metrics-server component is available. The following output shows an example of the returned data:

    NAME                                   SERVICE                      AVAILABLE   AGE
    v1.                                    Local                        True        29h
    v1.admissionregistration.k8s.io        Local                        True        29h
    v1.apiextensions.k8s.io                Local                        True        29h
    v1.apps                                Local                        True        29h
    v1.authentication.k8s.io               Local                        True        29h
    v1.authorization.k8s.io                Local                        True        29h
    v1.autoscaling                         Local                        True        29h
    v1.batch                               Local                        True        29h
    v1.coordination.k8s.io                 Local                        True        29h
    v1.monitoring.coreos.com               Local                        True        29h
    v1.networking.k8s.io                   Local                        True        29h
    v1.rbac.authorization.k8s.io           Local                        True        29h
    v1.scheduling.k8s.io                   Local                        True        29h
    v1.storage.k8s.io                      Local                        True        29h
    v1alpha1.argoproj.io                   Local                        True        29h
    v1alpha1.fedlearner.k8s.io             Local                        True        5h11m
    v1beta1.admissionregistration.k8s.io   Local                        True        29h
    v1beta1.alicloud.com                   Local                        True        29h
    v1beta1.apiextensions.k8s.io           Local                        True        29h
    v1beta1.apps                           Local                        True        29h
    v1beta1.authentication.k8s.io          Local                        True        29h
    v1beta1.authorization.k8s.io           Local                        True        29h
    v1beta1.batch                          Local                        True        29h
    v1beta1.certificates.k8s.io            Local                        True        29h
    v1beta1.coordination.k8s.io            Local                        True        29h
    v1beta1.events.k8s.io                  Local                        True        29h
    v1beta1.extensions                     Local                        True        29h
    ...
    v1beta1.metrics.k8s.io                 kube-system/metrics-server   True        29h
    ...
    v1beta1.networking.k8s.io              Local                        True        29h
    v1beta1.node.k8s.io                    Local                        True        29h
    v1beta1.policy                         Local                        True        29h
    v1beta1.rbac.authorization.k8s.io      Local                        True        29h
    v1beta1.scheduling.k8s.io              Local                        True        29h
    v1beta1.storage.k8s.io                 Local                        True        29h
    v1beta2.apps                           Local                        True        29h
    v2beta1.autoscaling                    Local                        True        29h
    v2beta2.autoscaling                    Local                        True        29h

    If kube-system/metrics-server is not displayed in the SERVICE column of v1beta1.metrics.k8s.io, check whether the APIService registration of metrics-server has been overwritten by Prometheus Operator. If it has been overwritten, use the following YAML template to restore the APIService registration of metrics-server:

    apiVersion: apiregistration.k8s.io/v1beta1
    kind: APIService
    metadata:
      name: v1beta1.metrics.k8s.io
    spec:
      service:
        name: metrics-server
        namespace: kube-system
      group: metrics.k8s.io
      version: v1beta1
      insecureSkipTLSVerify: true
      groupPriorityMinimum: 100
      versionPriority: 100

    If the issue persists, go to the Add-ons page of the cluster in the ACK console to check whether metrics-server is installed. For more information, see metrics-server.

  • Cause 2: Metrics cannot be fetched during a rolling update or scale-out activity.

    By default, metrics-server collects metrics at intervals of 1 minute. However, metrics-server must wait a few minutes before it can collect metrics after a rolling update or scale-out activity. We recommend that you query metrics 2 minutes after a rolling update or scale-out activity is complete.

  • Cause 3: The request field is not specified for the pod.

    By default, HPA obtains the CPU or memory usage of a pod by calculating the ratio of used resources to requested resources. If the requested resources are not specified in the pod configuration, HPA cannot calculate the resource usage. Therefore, make sure that the request field is specified in the resources section of the pod configuration, as shown in the sketch after this list.

  • Cause 4: The metric name is incorrect. Check whether the metric name is correct. The metric name is case-sensitive. For example, if the cpu metric supported by HPA is accidentally written as CPU, unknown is displayed in the current field.
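
To illustrate Cause 3, the following pod sketch sets the request field that HPA uses to calculate resource usage. The pod name, container name, image, and request values are placeholders.

apiVersion: v1
kind: Pod
metadata:
  name: example-pod  # Hypothetical pod name.
spec:
  containers:
  - name: app
    image: nginx:1.7.9
    resources:
      requests:
        cpu: 250m     # HPA calculates CPU usage as used CPU divided by this request.
        memory: 256Mi # HPA calculates memory usage as used memory divided by this request.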

What do I do if HPA fails to scale up or down with abnormal metrics?

If unknown is displayed in the current field of the HPA metrics, the metrics are abnormal: HPA cannot access the metrics that it uses to make scaling decisions and therefore fails to adjust the number of pods. For more information about how to troubleshoot this issue, see What do I do if unknown is displayed in the current field in the HPA metrics?.

What do I do if excess pods are added by HPA during a rolling update?

During a rolling update, kube-controller-manager performs zero filling on pods whose monitoring data cannot be collected. This may cause HPA to add an excessive number of pods. You can use the following methods to fix this issue.

  • Fix this issue for all workloads in the cluster:

    To fix this issue, we recommend that you update metrics-server to the latest version and add the following parameter to the startup settings of metrics-server.

    The following configuration takes effect on all workloads in the cluster.

    ## Add the following configuration to the startup settings of metrics-server. 
    --enable-hpa-rolling-update-skipped=true

  • Fix this issue for specific workloads by using one of the following methods:

    • Method 1: Add the following annotation to the template of a workload to skip HPA during rolling updates.

      ## Add the following annotation to the spec.template.metadata.annotations parameter of the workload configuration to skip HPA during rolling updates. 
      HPARollingUpdateSkipped: "true"

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: nginx-deployment-basic
        labels:
          app: nginx
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: nginx
        template:
          metadata:
            labels:
              app: nginx
            annotations:
              HPARollingUpdateSkipped: "true"  # Skip HPA during rolling updates. 
          spec:
            containers:
            - name: nginx
              image: nginx:1.7.9
              ports:
              - containerPort: 80

    • Method 2: Add the following annotation to the template of a workload to skip the warm-up period before rolling updates.

      ## Add the following annotation to the spec.template.metadata.annotations parameter of the workload configuration to skip the warm-up period before rolling updates. 
      HPAScaleUpDelay: 3m # You can change the value based on your business requirements.

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: nginx-deployment-basic
        labels:
          app: nginx
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: nginx
        template:
          metadata:
            labels:
              app: nginx
            annotations:
              HPAScaleUpDelay: 3m  # This setting indicates that HPA takes effect 3 minutes after the pods are created. Valid units: s and m. s indicates seconds and m indicates minutes. 
          spec:
            containers:
            - name: nginx
              image: nginx:1.7.9
              ports:
              - containerPort: 80

What do I do if HPA does not scale pods when the scaling threshold is reached?

HPA may not scale pods even if the CPU or memory usage drops below the scale-in threshold or exceeds the scale-out threshold. HPA also takes other factors into consideration when it scales pods. For example, HPA checks whether the current scale-out activity triggers a scale-in activity or the current scale-in activity triggers a scale-out activity. This prevents repetitive scaling and unnecessary resource consumption.

For example, if the scale-out threshold is 80% and you have two pods whose CPU utilization is 70%, the pods are not scaled in even though 70% is below the threshold. If one pod were removed, the CPU utilization of the remaining pod would rise to about 140% (assuming the total load is unchanged), which is higher than 80% and would immediately trigger another scale-out activity.

How do I configure the metric collection interval of HPA?

For metrics-server whose version is later than 0.2.1-b46d98c-aliyun, specify the --metric_resolution parameter in the startup settings. Example: --metric_resolution=15s.

Can CronHPA and HPA interact without conflicts?

Yes. CronHPA and HPA can interact without conflicts. To prevent conflicts, ACK sets the scaleTargetRef of CronHPA to the HPA object that scales the application. This way, only HPA directly scales the application specified by its scaleTargetRef, and CronHPA can detect the state of HPA. CronHPA does not directly change the number of pods of the Deployment. Instead, CronHPA triggers HPA to scale the pods. This prevents conflicts between CronHPA and HPA.
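
The following CronHPA sketch illustrates this pattern. It assumes that the application is already managed by an HPA named nginx-deployment-basic-hpa; the CronHPA name, the HPA name, the schedule, and the target size are placeholders, and the apiVersion values depend on the component and Kubernetes versions in your cluster.

apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: cronhpa-sample  # Hypothetical CronHPA name.
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: nginx-deployment-basic-hpa  # Point CronHPA at the HPA instead of at the Deployment.
  jobs:
  - name: scale-up-every-morning
    schedule: "0 0 8 * * *"  # Cron expression with a seconds field: 08:00:00 every day.
    targetSize: 10           # The desired number of replicas at the scheduled time. CronHPA reconciles this value with the HPA.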

How do I fix the issue that excess pods are added by HPA when CPU or memory usage rapidly increases?

When the pods of Java applications or applications powered by Java frameworks start, the CPU and memory usage may be high for a few minutes during the warm-up period. This may trigger HPA to scale out the pods. To fix this issue, update the version of metrics-server provided by ACK to 0.3.9.6 or later and add annotations to the pod configurations to prevent HPA from accidentally triggering scaling activities. For more information about how to update metrics-server, see Update the metrics-server component before you update the Kubernetes version to 1.12.

The following YAML template provides the sample pod configurations that prevent HPA from accidentally triggering scaling activities in this scenario.

## In this example, a Deployment is used.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        HPAScaleUpDelay: 3m # This setting indicates that HPA takes effect 3 minutes after the pods are created. Valid units: s and m. s indicates seconds and m indicates minutes. 
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9 # Replace the value with your own image in the <image_name:tag> format.
        ports:
        - containerPort: 80 

What do I do if HPA scales out an application while the metric value in the audit log is lower than the threshold?

Cause

HPA calculates the desired number of replicated pods based on the ratio of the current metric value to the desired metric value: Desired number of replicated pods = ceil[Current number of replicated pods × (Current metric value/Desired metric value)].

The formula indicates that the accuracy of the result depends on the accuracies of the current number of replicated pods, the current metric value, and the desired metric value. For example, when HPA queries metrics about the current number of replicated pods, HPA first queries the subresource named scale of the object specified by the scaleTargetRef parameter and then selects pods based on the label specified in the Selector field in the status section of the subresource. If some pods queried by HPA do not belong to the object specified by the scaleTargetRef parameter, the desired number of replicated pods calculated by HPA may not meet your expectations. For example, HPA may scale out the application while the real-time metric value is lower than the threshold.

The number of matching pods may be inaccurate for the following reasons:

  • A rolling update is in progress.

  • Pods that do not belong to the object specified by the scaleTargetRef parameter have the label specified in the Selector field in the status section of the scale subresource. Run the following command to query the pods:

    kubectl get pods -n {Namespace} -l {Value of the Selector field in the status section of the subresource named scale}

Solution

  • If an in-progress rolling update causes the number of matching pods to be inaccurate, fix this issue by referring to the What do I do if excess pods are added by HPA during a rolling update? section of this topic.

  • If pods that do not belong to the object specified by the scaleTargetRef parameter have the label specified in the Selector field in the status section of the scale subresource, locate these pods and then change the label. You can also delete the pods that you no longer require.

Can HPA determine the order in which pods are scaled in?

No, HPA cannot determine the order in which pods are scaled in. HPA can automatically increase or decrease the number of pods based on defined metrics. However, HPA cannot determine which pods are terminated first. The order in which pods are terminated and the graceful shutdown time of the pods are determined by the controller that manages the pods.

What is the unit of HPA usage metrics?

Usage metrics are integers. Unit: m or none, with a conversion ratio of 1000m = 1. For example, tcp_connection_counts of 70000m equals 70.

FAQ about HPA based on Alibaba Cloud metrics

What do I do if unknown is displayed in the TARGETS column after I run the kubectl get hpa command?

Perform the following operations to troubleshoot the issue:

  1. Run the kubectl describe hpa <hpa_name> command to check why HPA becomes abnormal.

    • If the value of AbleToScale is False in the Conditions field, check whether the Deployment is created as expected.

    • If the value of ScalingActive is False in the Conditions field, proceed to the next step.

  2. Run the kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/" command. If Error from server (NotFound): the server could not find the requested resource is returned, verify the status of alibaba-cloud-metrics-adapter.

    If the status of alibaba-cloud-metrics-adapter is normal, check whether the HPA metrics are relevant to the Ingress. If the metrics are relevant to the Ingress, make sure that you deploy the Simple Log Service component before ack-alibaba-cloud-metrics-adapter is deployed. For more information, see Analyze and monitor the access log of nginx-ingress-controller.

  3. Make sure that the values of the HPA metrics are valid. The value of sls.ingress.route must be in the <namespace>-<svc>-<port> format. An example HPA configuration is provided after this list.

    • namespace specifies the namespace to which the Ingress belongs.

    • svc specifies the name of the Service that you selected when you created the Ingress.

    • port specifies the port of the Service.
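
Putting these pieces together, the following HPA sketch scales a workload based on the sls_ingress_qps metric. The HPA name, the workload name, the sls.ingress.route value, and the target value are placeholders, and the autoscaling API version depends on your cluster version.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-qps-hpa  # Hypothetical HPA name.
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment-basic  # Hypothetical workload name.
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: sls_ingress_qps
        selector:
          matchLabels:
            sls.ingress.route: "default-nginx-svc-80"  # <namespace>-<svc>-<port>. Placeholder values.
      target:
        type: AverageValue
        averageValue: "10"  # Scale out when the average QPS per pod exceeds 10.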

How do I find the metrics that are supported by HPA?

For more information about the metrics that are supported by HPA, see Alibaba Cloud metrics adapter. The following list describes the commonly used metrics and the additional parameter that each metric requires.

  • sls_ingress_qps: the number of requests that the Ingress can process per second based on a specific routing rule. Additional parameter: sls.ingress.route.

  • sls_alb_ingress_qps: the number of requests that the Application Load Balancer (ALB) Ingress can process per second based on a specific routing rule. Additional parameter: sls.ingress.route.

  • sls_ingress_latency_avg: the average latency of all requests. Additional parameter: sls.ingress.route.

  • sls_ingress_latency_p50: the maximum latency for the fastest 50% of all requests. Additional parameter: sls.ingress.route.

  • sls_ingress_latency_p95: the maximum latency for the fastest 95% of all requests. Additional parameter: sls.ingress.route.

  • sls_ingress_latency_p99: the maximum latency for the fastest 99% of all requests. Additional parameter: sls.ingress.route.

  • sls_ingress_latency_p9999: the maximum latency for the fastest 99.99% of all requests. Additional parameter: sls.ingress.route.

  • sls_ingress_inflow: the inbound bandwidth of the Ingress. Additional parameter: sls.ingress.route.

What do I do if the NGINX Ingress logs use a custom format?

For more information about how to perform horizontal pod autoscaling based on the Ingress metrics that are collected by Simple Log Service, see Implement horizontal auto scaling based on Alibaba Cloud metrics. You must configure Simple Log Service to collect NGINX Ingress logs.

  • When you create an ACK cluster, Simple Log Service is enabled for the cluster by default. If you use the default log collection settings, you can view the log analysis reports and real-time status of NGINX Ingresses in the Simple Log Service console after you create the cluster.

  • If you disable Simple Log Service when you create an ACK cluster, you cannot perform horizontal pod autoscaling based on the Ingress metrics that are collected by Simple Log Service. You must enable Simple Log Service for the cluster before you can use this feature. For more information, see Analyze and monitor the access log of nginx-ingress-controller.

  • The AliyunLogConfig that is generated the first time you enable Simple Log Service applies only to the default log format that ACK defines for the Ingress controller. If you have changed the log format, you must modify the processor_regex settings in the AliyunLogConfig. For more information, see Use CRDs to collect container logs in DaemonSet mode.