Use AHPA to Configure Custom Metrics for Application Scaling - Container Service for Kubernetes

In some scenarios, you may need to scale applications based on custom metrics, such as the queries per second (QPS) of HTTP requests or the length of a message queue. Advanced Horizontal Pod Autoscaler (AHPA) provides the External Metrics mechanism, which works with the alibaba-cloud-metrics-adapter component to scale applications based on custom metrics. This topic describes how to use AHPA to configure custom metrics for application scaling.

Prerequisites

An ACK managed cluster or ACK Serverless cluster is created. For more information, see Create an ACK managed cluster and Create an ACK Serverless cluster.
The AHPA controller is installed.

Step 1: Make preparations

Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, choose Workloads > Deployments.
On the Deployments page, click Create from YAML in the upper-right corner.

On the Create page, copy the following YAML content to create a Deployment named sample-app, a Service, and a Deployment named fib-loader-qps for stress testing, and then click Create.

Note

The sample application exposes the custom metric requests_per_second, which indicates the number of requests per second. This is the metric that is collected and used for scaling in this topic.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  labels:
    app: sample-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/acs/knative-sample-fib-server:v1
        name: metrics-provider
        ports:
        - name: http
          containerPort: 8080
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name

On the Create page, copy the following YAML content to create a ServiceMonitor and click Create:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: sample-app
  namespace: default
spec:
  endpoints:
  - interval: 30s
    port: http
    path: /metrics
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      app: sample-app

Enable the ServiceMonitor.
1. Log on to the ARMS console.
2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.
3. In the top navigation bar, select the region in which your Prometheus instance is deployed and click the name of the instance, which is the same as the name of your ACK cluster.
4. In the left-side navigation pane, click Service Discovery. Click the Configure tab on the right side and then click the ServiceMonitor tab.
5. Turn on the switch in the Operation column of sample-app.

Step 2: Deploy ack-alibaba-cloud-metrics-adapter

Obtain the internal HTTP API endpoint of the Prometheus instance.
1. Log on to the ARMS console.
2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.
3. In the top navigation bar of the Instances page, select the region in which the Prometheus instance is deployed and click the name of the instance. The Prometheus instance is named in the arms_metrics_{RegionId}_XXX format.
4. In the left-side navigation pane, click Settings. In the HTTP API URL (Grafana Read URL) section, do the following:
  - Record the HTTP API endpoint to the right of Internal Network.
  - Optional. If access tokens are enabled, configure an access token for your cluster and record the access token.

Deploy ack-alibaba-cloud-metrics-adapter.

Log on to the ACK console. In the left-side navigation pane, choose Marketplace > Marketplace.
On the App Catalog tab, find and click ack-alibaba-cloud-metrics-adapter.
In the upper-right corner of the ack-alibaba-cloud-metrics-adapter page, click Deploy.
On the Basic Information wizard page, specify Cluster and Namespace, and click Next.

On the Parameters wizard page, specify Chart Version. Configure prometheus.url and prometheus.prometheusHeader in the Parameters section based on the internal HTTP API endpoint that you recorded and click OK.

  prometheus:
    enabled: true
    # Enter the internal HTTP API endpoint, which is the URL of Managed Service for Prometheus. 
    url: http://cn-beijing-intranet.arms.aliyuncs.com:9090/api/v1/prometheus/6b4b40986a3bec4f92ea418534****/115964845466****/arms-metrics-6fae216078e4****/cn-beijing
    # If access tokens are enabled for Managed Service for Prometheus, you need to configure prometheusHeader Authorization. 
    prometheusHeader:
    - Authorization: eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjIwMDc1MTY0MDksImlzcyI6Imh0dHA6******liYWJhY2xvdWQuY29tIiwiaWF0IjoxNjkyMTU2NDA5LCJqdGkiOiI3NmRkOWJkOS0zYzBkLTRjY2MtOTFkYy1lZTU1OGFkNjg3NmMifQ.gltEJ7g4j-QPao2durNk3OiEBYhv2F_nzG-cncVfFtY
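
Before you deploy the adapter, it can help to confirm that the recorded endpoint and token form a valid query. The following Python sketch only builds the request and does not send it. It assumes that the endpoint accepts the standard Prometheus HTTP API path /api/v1/query appended to the recorded URL. The helper name build_query_request, the placeholder URL, and the placeholder token are hypothetical and are not part of ACK or ARMS.

```python
import urllib.parse
import urllib.request


def build_query_request(base_url, promql, token=None):
    """Build (but do not send) a Prometheus instant-query request."""
    # Assumption: the standard Prometheus path /api/v1/query is appended
    # to the internal HTTP API endpoint recorded in the ARMS console.
    query = urllib.parse.urlencode({"query": promql})
    url = base_url.rstrip("/") + "/api/v1/query?" + query
    request = urllib.request.Request(url)
    if token:
        # Mirrors the prometheusHeader Authorization setting of the adapter.
        request.add_header("Authorization", token)
    return request


if __name__ == "__main__":
    # Placeholder values; replace them with the endpoint and token you recorded.
    req = build_query_request(
        "http://prometheus.example.internal:9090/base/path",
        "sum(requests_per_second)",
        token="<your-access-token>",
    )
    print(req.full_url)
    # To send the request from a machine inside the same VPC:
    # print(urllib.request.urlopen(req, timeout=10).read().decode())
```

The internal endpoint is typically reachable only from within the VPC of the cluster, so send the request from a node or pod in that network.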

Configure custom metrics.

On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Applications > Helm.
Click Update in the Actions column of alibaba-cloud-metrics-adapter.

Copy the following YAML content to overwrite the code in the editor. You need to replace requests_per_second in the following sample code with the actual metric that is used in Managed Service for Prometheus. Then, click OK.

  ......
  prometheus:
    adapter:
      rules:
        custom:
        - metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>})
          name:
            as: requests_per_second
          resources:
            overrides:
              namespace:
                resource: namespace
          seriesQuery: requests_per_second # Specify the name of the metric that is used in Managed Service for Prometheus. 
        default: false
    enabled: true    # Set the value to true to enable ack-alibaba-cloud-metrics-adapter. 
    ......
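
To see what the metricsQuery template in this rule might produce, the following Python sketch substitutes the two placeholders by hand. This is only an illustration: the adapter performs the substitution itself, and the label matchers that it generates depend on the rule and on the selector in the scaling resource. The function name expand_metrics_query is hypothetical, and the labels are taken from the AHPA example in Step 3.

```python
def expand_metrics_query(template, series, label_matchers):
    """Substitute the two placeholders used in the metricsQuery rule."""
    # Render matchers as label="value" pairs, sorted for a stable result.
    matchers = ",".join(
        f'{name}="{value}"' for name, value in sorted(label_matchers.items())
    )
    return (
        template.replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", matchers)
    )


query = expand_metrics_query(
    "sum(<<.Series>>{<<.LabelMatchers>>})",
    "requests_per_second",
    {"namespace": "default", "service": "sample-app"},
)
print(query)
# sum(requests_per_second{namespace="default",service="sample-app"})
```

Running the resulting PromQL expression against your Prometheus instance is a quick way to check that the series and labels exist before you rely on the rule.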

Run the following command to query detailed information about the metric:

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/requests_per_second"

Expected output:

{"kind":"ExternalMetricValueList","apiVersion":"external.metrics.k8s.io/v1beta1","metadata":{},"items":[{"metricName":"requests_per_second","metricLabels":{},"timestamp":"2023-08-15T07:59:09Z","value":"10"}]}
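
If you want to check the response in a script instead of reading the JSON by eye, a minimal Python sketch such as the following can extract the metric name and value. The response text is copied from the sample output in this step. The function name metric_values is hypothetical.

```python
import json

# Sample response copied from the command output in this step.
RAW_RESPONSE = (
    '{"kind":"ExternalMetricValueList",'
    '"apiVersion":"external.metrics.k8s.io/v1beta1","metadata":{},'
    '"items":[{"metricName":"requests_per_second","metricLabels":{},'
    '"timestamp":"2023-08-15T07:59:09Z","value":"10"}]}'
)


def metric_values(response_text):
    """Return (metricName, value) pairs from an ExternalMetricValueList."""
    body = json.loads(response_text)
    if body.get("kind") != "ExternalMetricValueList":
        raise ValueError("unexpected kind: %r" % body.get("kind"))
    return [(item["metricName"], item["value"]) for item in body.get("items", [])]


print(metric_values(RAW_RESPONSE))
# [('requests_per_second', '10')]
```

An empty items list would suggest that the adapter is reachable but that no series matched the rule, which usually points back to the seriesQuery setting.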

Step 3: Deploy AHPA

Create the following AHPA resources.

Configure external.metric to specify the metric name and matchLabels. The metric name must be the same as the metric name specified in the Configure custom metrics step. In this example, the requests_per_second metric is specified.
Set the threshold. In this example, averageValue is set to 10. The application is scaled out when the number of requests per second exceeds 10.

apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: AdvancedHorizontalPodAutoscaler
metadata:
  name: customer-deployment
  namespace: default
spec:
  metrics:
  - external:
      metric:
        name: requests_per_second
        selector:
          matchLabels:
            namespace: default
            service: sample-app
      target:
        type: AverageValue
        averageValue: 10
    type: External
  minReplicas: 0
  maxReplicas: 50
  prediction:
    quantile: 95
    scaleUpForward: 180
  scaleStrategy: observer
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  instanceBounds:
  - startTime: "2023-08-01 00:00:00"
    endTime: "2033-08-01 00:00:00"
    bounds:
    - cron: "* 0-8 ? * MON-FRI"
      maxReplicas: 50
      minReplicas: 4
    - cron: "* 9-15 ? * MON-FRI"
      maxReplicas: 50
      minReplicas: 5
    - cron: "* 16-23 ? * MON-FRI"
      maxReplicas: 50
      minReplicas: 1
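
The instanceBounds section is easier to reason about with a small lookup. The following Python sketch is a simplification, not AHPA's own cron evaluation: it assumes that the second cron field is the hour and the last field is the day of the week, and it matches only those two fields. The table and the function names are hypothetical, and the sketch does not model the startTime and endTime range or what AHPA does outside the listed windows.

```python
from datetime import datetime

WEEKDAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]

# (hour range, weekday range, minReplicas, maxReplicas) from the YAML in this step.
BOUNDS = [
    ("0-8", "MON-FRI", 4, 50),
    ("9-15", "MON-FRI", 5, 50),
    ("16-23", "MON-FRI", 1, 50),
]


def in_range(spec, value):
    """Check an inclusive 'low-high' range of integers."""
    low, high = (int(part) for part in spec.split("-"))
    return low <= value <= high


def in_weekday_range(spec, weekday_index):
    """Check an inclusive 'MON-FRI' style range; Monday is index 0."""
    start, end = (WEEKDAYS.index(name) for name in spec.split("-"))
    return start <= weekday_index <= end


def bounds_at(moment):
    """Return (minReplicas, maxReplicas) for the first matching window, or None."""
    for hours, days, min_replicas, max_replicas in BOUNDS:
        if in_range(hours, moment.hour) and in_weekday_range(days, moment.weekday()):
            return (min_replicas, max_replicas)
    return None


print(bounds_at(datetime(2023, 8, 15, 10, 30)))  # Tuesday, 10:30 -> (5, 50)
print(bounds_at(datetime(2023, 8, 19, 10, 30)))  # Saturday -> None
```

Under these assumptions, the three windows cover every hour from Monday to Friday and leave weekends without a window-specific bound.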

Run the following command to query the scaling result:
```
kubectl get ahpa
NAME                  STRATEGY   REFERENCE                   METRIC                TARGETS     DESIREDPODS   REPLICAS   MINPODS   MAXPODS   AGE
customer-deployment   observer   Deployment/sample-app       requests_per_second   60000m/10   6             1          1         50        7h53m
```
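Kubernetes appends a unit suffix such as m (one thousandth) or k (one thousand) to a quantity when needed; the m suffix is used when higher precision is required. For example, 1001m equals 1.001, and 60000m in this example equals 60. The output indicates that the number of requests per second is 60, which exceeds the threshold of 10. The expected number of pods (DESIREDPODS) is therefore 6, which is consistent with 60 / 10 = 6.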
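
The arithmetic behind this output can be sketched in Python. The sketch assumes that AHPA applies the usual AverageValue rule of the Kubernetes Horizontal Pod Autoscaler, that is, the total metric value divided by the per-pod target, rounded up and clamped to MINPODS and MAXPODS. The function names are hypothetical, and only the m and k suffixes mentioned in this topic are handled.

```python
import math
from decimal import Decimal

# Only the two suffixes mentioned in this topic are handled in this sketch.
SUFFIXES = {"m": Decimal("0.001"), "k": Decimal("1000")}


def parse_quantity(text):
    """Convert a quantity such as '60000m' or '10' to a Decimal."""
    suffix = text[-1]
    if suffix in SUFFIXES:
        return Decimal(text[:-1]) * SUFFIXES[suffix]
    return Decimal(text)


def desired_pods(total, average_value_target, min_pods, max_pods):
    """Assumed rule: ceil(total / target), clamped to [min_pods, max_pods]."""
    wanted = math.ceil(total / average_value_target)
    return max(min_pods, min(max_pods, wanted))


current, target = (parse_quantity(part) for part in "60000m/10".split("/"))
print(current, target)                       # 60.000 10
print(desired_pods(current, target, 1, 50))  # 6
```

Because the strategy in this example is observer, AHPA only reports the desired number of pods, which is why REPLICAS stays at 1 while DESIREDPODS is 6.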