To handle sudden traffic bursts effectively, a precise and responsive auto scaling strategy is crucial. A well-configured Horizontal Pod Autoscaler (HPA) can improve application responsiveness while optimizing cluster resource utilization. This topic explains how to use the Kubernetes External Metrics API to integrate with key business metrics, such as HTTP request rate and Ingress queries per second (QPS), enabling a more intelligent and automated scaling policy.
This guide will walk you through the following steps, using an nginx application as an example. You will deploy an nginx Deployment, Service, and Ingress, then configure an HPA to automatically scale the Deployment based on the Ingress QPS metric collected from Simple Log Service (SLS).
Step 1: Deploy ack-alibaba-cloud-metrics-adapter
ack-alibaba-cloud-metrics-adapter enables the Kubernetes HPA to scale workloads based on metrics from Alibaba Cloud services, such as Elastic Compute Service (ECS), Server Load Balancer (SLB), and ApsaraDB RDS (RDS), via the External Metrics API.
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of the target cluster. In the navigation pane on the left, choose .
On the Helm page, click Deploy. Configure the Basic Information parameters, select ack-alibaba-cloud-metrics-adapter, then click Next.
In the Parameters step, configure the Chart Version parameter and click OK.
ack-alibaba-cloud-metrics-adapter does not support in-place upgrades. To upgrade to the latest version, you must first uninstall the existing version and then install the new one.
Step 2: Create an application and a service
Create a file named nginx-test.yaml.
Run the following command to create the Deployment and Service.
kubectl apply -f nginx-test.yaml
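The contents of nginx-test.yaml are not reproduced in this topic. As a minimal sketch only, the file could be written as shown below. The Deployment name nginx-deployment-basic is taken from the expected HPA output in Step 5, and the Service name nginx and port 80 are taken from the default-nginx-80 route example in Step 4. The app: nginx label, the replica count, and the nginx:stable image tag are assumptions made for illustration.

```shell
# Write a hypothetical nginx-test.yaml (Deployment + Service).
# Quoted 'EOF' keeps the manifest text literal (no shell expansion).
cat > nginx-test.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment-basic   # name shown in the HPA output in Step 5
  labels:
    app: nginx                   # assumed label
spec:
  replicas: 2                    # assumed; matches the HPA minimum in Step 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:stable    # assumed image tag
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx                    # Service part of the route default-nginx-80
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: nginx
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
EOF
```

If your real manifest uses different names, adjust the sls.ingress.route label in Step 4 to match.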
Step 3: Create an Ingress
In the left navigation pane of the cluster management page, choose . In the upper-left corner of the Ingresses page, click Create Ingress.
In the Create Ingress panel, complete the required fields and click OK. After the Ingress is created, you are redirected to the Ingresses page.
Under the Name column, click the Ingress name to view its routing rules. For Ingress details, see Ingress management.
Step 4: Configure the HPA
In this step, you will configure the HPA to use two metrics from your SLS project for scaling decisions: sls_ingress_qps and sls_ingress_latency_p9999.
The target type for each metric is configured differently:
sls_ingress_qps:
Target type: AverageValue
Description: This target type tells the HPA to divide the total QPS metric by the current number of pod replicas. The scaling decision is then based on this per-pod average.
sls_ingress_latency_p9999:
Target type: Value
Description: This target type tells the HPA to use the metric's raw value directly, without dividing it by the number of pods.
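To make the difference between the two target types concrete, the following sketch reproduces the replica arithmetic with shell integer math. It is a simplification under stated assumptions: it follows the standard Kubernetes HPA formula (desired = ceil(current replicas x metric / target) for Value, and ceil(total metric / per-pod target) for AverageValue), ignores the controller's tolerance band and stabilization behavior, and uses hypothetical function names and sample numbers.

```shell
# Ceiling of the integer division $1 / $2 (positive integers only).
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

# Clamp $1 into the range [$2, $3] (minReplicas, maxReplicas).
clamp() {
  if [ "$1" -lt "$2" ]; then echo "$2"
  elif [ "$1" -gt "$3" ]; then echo "$3"
  else echo "$1"
  fi
}

# AverageValue: args are total_metric per_pod_target min max.
desired_for_average_value() {
  clamp "$(ceil_div "$1" "$2")" "$3" "$4"
}

# Value: args are current_replicas raw_metric target min max.
desired_for_value() {
  clamp "$(ceil_div $(( $1 * $2 )) "$3")" "$4" "$5"
}

# 210 QPS in total, target of 10 QPS per pod, limits 2..10:
# 21 replicas would be needed, so the result is capped at 10.
desired_for_average_value 210 10 2 10

# 2 replicas, raw latency metric 15, target 10, limits 2..10.
desired_for_value 2 15 10 2 10
```

The first call corresponds to the TARGETS value of 21/10 (avg) with 10 replicas shown in Step 5: the per-pod average stays above the target because the HPA has already reached its maximum.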
Create a file named ingress-hpa.yaml and copy the following content into it.
The following table describes the parameters in the HPA configuration.
Parameter | Required | Description |
sls.ingress.route | Yes | Specifies the target Ingress route in the format <namespace>-<svc>-<port>. <namespace>: The namespace where the Ingress is located. <svc>: The name of the Service targeted by the Ingress. <port>: The name of the port on the Service. Example: default-nginx-80. |
sls.logstore | Yes | The name of the SLS Logstore where the metrics are stored. The default value in a cluster is nginx-ingress. |
sls.project | Yes | The name of the SLS Project. The default value in a cluster is k8s-log-{{ClusterId}}, where {{ClusterId}} is the ID of your cluster. |
sls.internal.endpoint | No | Specifies whether to access SLS via a private (internal) or public endpoint. true (default): Access SLS via the internal VPC endpoint. false: Access SLS via the public endpoint. |
Run the following command to create the HPA.
kubectl apply -f ingress-hpa.yaml
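The HPA manifest itself is not reproduced in this topic. The sketch below shows one way ingress-hpa.yaml could be generated, assuming a cluster that serves the autoscaling/v2 API. The HPA name, Deployment name, replica limits, and the QPS target of 10 are inferred from the expected output in Step 5. The latency target of 10, the variable names, and the k8s-log-<cluster-id> placeholder are assumptions for illustration and must be replaced with your own values.

```shell
# All values below are assumptions for illustration; replace them with your own.
NAMESPACE="default"                  # namespace where the Ingress is located
SERVICE="nginx"                      # Service targeted by the Ingress
PORT="80"                            # Service port used in the route
SLS_PROJECT="k8s-log-<cluster-id>"   # placeholder, not a real project name
ROUTE="${NAMESPACE}-${SERVICE}-${PORT}"   # <namespace>-<svc>-<port>

# Unquoted EOF so that the shell substitutes the variables above.
cat > ingress-hpa.yaml <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment-basic
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: sls_ingress_qps
          selector:
            matchLabels:
              sls.project: "${SLS_PROJECT}"
              sls.logstore: "nginx-ingress"
              sls.ingress.route: "${ROUTE}"
        target:
          type: AverageValue       # total QPS divided by the replica count
          averageValue: "10"
    - type: External
      external:
        metric:
          name: sls_ingress_latency_p9999
          selector:
            matchLabels:
              sls.project: "${SLS_PROJECT}"
              sls.logstore: "nginx-ingress"
              sls.ingress.route: "${ROUTE}"
        target:
          type: Value              # raw metric value, not divided by replicas
          value: "10"              # assumed latency threshold
EOF
```

Building the route from separate variables keeps the label in the <namespace>-<svc>-<port> format described in the table above.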
Step 5: Verify the results
After configuring the HPA, you can verify its behavior by generating load and observing the scaling status.
Run the following command to start a load test against the service exposed by your Ingress.
# This command uses Apache Benchmark (ab) to simulate 10 concurrent users (-c 10) for a duration of 300 seconds (-t 300).
# Replace <your-ingress-domain> with the hostname you configured. ab expects a URL that includes a path, so keep the trailing slash.
ab -t 300 -c 10 http://<your-ingress-domain>/
Verify the scaling status.
Run the following command to check the HPA status:
kubectl get hpa ingress-hpa
Expected output:
NAME          REFERENCE                           TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
ingress-hpa   Deployment/nginx-deployment-basic   21/10 (avg)   2         10        10         7m49s
The scale-out is considered successful if the value in the REPLICAS column has reached the value in the MAXPODS column. This confirms that the HPA has responded to the increased load by scaling out the application to its maximum configured limit.
FAQ
How do I query the sls_ingress_qps metric from the command line?
You can query the external metric directly from the Kubernetes API. The following example shows how to query for sls_ingress_qps.
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/sls_ingress_qps?labelSelector=sls.project={{SLS_Project}},sls.logstore=nginx-ingress"
The path is quoted so that the shell does not try to expand the * and ? characters.
{{SLS_Project}}: The name of the SLS Project associated with this ACK cluster. If you have not customized it, the default name is k8s-log-{{ClusterId}}, where {{ClusterId}} is the ID of your cluster.
Analyzing the results
If you receive an error similar to this:
Error from server: {
  "httpCode": 400,
  "errorCode": "ParameterInvalid",
  "errorMessage": "key (slb_pool_name) is not config as key value config,if symbol : is in your log,please wrap : with quotation mark \"",
  "requestID": "xxxxxxx"
}
This indicates that no data was found for this metric. A common cause is trying to query for an ALB Ingress metric (sls_alb_ingress_qps) when you are not using an ALB Ingress.
If the query is successful, the output will look like this:
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "sls_ingress_qps",
      "timestamp": "2025-02-26T16:45:00Z",
      "value": "50",
      "metricLabels": {
        "sls.project": "your-sls-project-name",
        "sls.logstore": "nginx-ingress"
      }
    }
  ]
}
This output confirms that the Kubernetes external metrics API successfully retrieved the QPS data. The value field contains the current QPS.
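If you query several metrics or projects, it can help to build the request path in one place. The following is a small sketch, not part of the adapter or of kubectl: the function name external_metric_path is hypothetical, and the project name k8s-log-<cluster-id> is a placeholder. The function only prints the path, and the kubectl call that would use it is shown as a comment because it requires access to a cluster.

```shell
# Print the External Metrics API path for an SLS-backed metric.
# Args: metric_name sls_project [sls_logstore, defaults to nginx-ingress]
external_metric_path() {
  printf '/apis/external.metrics.k8s.io/v1beta1/namespaces/*/%s?labelSelector=sls.project=%s,sls.logstore=%s\n' \
    "$1" "$2" "${3:-nginx-ingress}"
}

# Show the path that would be requested (placeholder project name).
external_metric_path sls_ingress_qps "k8s-log-<cluster-id>"

# With cluster access, pass the path to kubectl. The double quotes keep
# the shell from expanding the * and ? characters:
#   kubectl get --raw "$(external_metric_path sls_ingress_qps "k8s-log-<cluster-id>")"
```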
What do I do if unknown is displayed in the TARGETS column after I run the kubectl get hpa command?
Perform the following operations to troubleshoot the issue:
Run the kubectl describe hpa <hpa_name> command to check why the HPA is abnormal.
If the value of AbleToScale is False in the Conditions field, check whether the Deployment is created as expected.
If the value of ScalingActive is False in the Conditions field, proceed to the next step.
Run the kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/" command. If Error from server (NotFound): the server could not find the requested resource is returned, verify the status of alibaba-cloud-metrics-adapter.
If the status of alibaba-cloud-metrics-adapter is normal, check whether the HPA metrics are related to the Ingress. If the metrics are related to the Ingress, make sure that you deploy the Simple Log Service (SLS) add-on before deploying ack-alibaba-cloud-metrics-adapter. For more information, see Analyze and monitor the access log of nginx-ingress.
Make sure that the values of the HPA metrics are valid. The value of sls.ingress.route must be in the <namespace>-<svc>-<port> format.
namespace: the namespace to which the Ingress belongs.
svc: the name of the Service that you selected when you created the Ingress.
port: the port of the Service.
How do I find the metrics that are supported by HPA?
For more information about the metrics that are supported by HPA, see Alibaba Cloud metrics adapter. The following table describes the commonly used metrics.
Metric | Description | Additional parameter |
sls_ingress_qps | The number of requests that the Ingress can process per second based on a specific routing rule. | sls.ingress.route |
sls_alb_ingress_qps | The number of requests that the ALB Ingress can process per second based on a specific routing rule. | sls.ingress.route |
sls_ingress_latency_avg | The average latency of all requests. | sls.ingress.route |
sls_ingress_latency_p50 | The maximum latency for the fastest 50% of all requests. | sls.ingress.route |
sls_ingress_latency_p95 | The maximum latency for the fastest 95% of all requests. | sls.ingress.route |
sls_ingress_latency_p99 | The maximum latency for the fastest 99% of all requests. | sls.ingress.route |
sls_ingress_latency_p9999 | The maximum latency for the fastest 99.99% of all requests. | sls.ingress.route |
sls_ingress_inflow | The inbound bandwidth of the Ingress. | sls.ingress.route |
How do I configure horizontal autoscaling after I customize the format of NGINX Ingress logs?
Refer to Horizontally scale pods with Alibaba Cloud metrics to perform horizontal pod autoscaling based on the Ingress metrics that are collected by SLS. You must configure SLS to collect NGINX Ingress logs.
By default, SLS is enabled when you create a cluster. If you use the default log collection settings, you can view the log analysis reports and real-time status of NGINX Ingresses in the SLS console after you create the cluster.
If you disable SLS when you create an ACK cluster, you cannot perform horizontal pod autoscaling based on the Ingress metrics that are collected by SLS. You must enable SLS for the cluster before you can use this feature. For more information, see Analyze and monitor the access log of nginx-ingress-controller.
The AliyunLogConfig that is generated when you enable SLS for the cluster for the first time applies only to the default log format that ACK defines for the Ingress controller. If you have changed the log format, you must modify the processor_regex settings in the AliyunLogConfig. For more information, see Use CRDs to collect container logs in DaemonSet mode.
Failed to pull alibaba-cloud-metrics-adapter image
Symptom
When attempting to upgrade ack-alibaba-cloud-metrics-adapter to version 1.3.7, the image pull fails with an error similar to this:
Failed to pull image "registry-<region-id>-vpc.ack.aliyuncs.com/acs/alibaba-cloud-metrics-adapter-amd64:v0.2.9-ba634de-aliyun"
Cause
ack-alibaba-cloud-metrics-adapter does not currently support in-place upgrades.
Solution
You must upgrade it by performing a clean reinstallation.
Back up your current add-on configuration.
Uninstall the old version of the add-on.
Install the new version using your backed-up configuration.
During the uninstall and reinstall process, any HPAs that rely on this metrics adapter will be unable to fetch metrics and will temporarily suspend all scaling operations.