You can deploy an application across multiple pods to improve its stability. However, this approach increases costs and wastes resources during off-peak hours. You can also manually scale the pods of your application, but this increases your O&M workload and the pods cannot be scaled in real time. To resolve these issues, you can configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller, so that the pods of your applications are automatically scaled based on their loads. This improves the stability and resilience of your applications, optimizes resource usage, and reduces costs. This topic describes how to configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller.
Prerequisites
To configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller, you must convert Managed Service for Prometheus metrics to metrics that are supported by the Horizontal Pod Autoscaler (HPA) and deploy the required components.
The Managed Service for Prometheus component is installed. For more information, see Enable Managed Service for Prometheus.
alibaba-cloud-metrics-adapter is installed. For more information, see Horizontal pod scaling based on Managed Service for Prometheus metrics.
The stress testing tool Apache Benchmark is installed. For more information, see Apache Benchmark.
Background information
To automatically scale the pods of an application based on the number of requests in a production environment, you can use the http_requests_total metric to collect the number of requests. We recommend that you configure horizontal pod autoscaling based on the metrics of the NGINX Ingress controller.
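For example, the per-service request rate that later drives the scaling decision can be inspected in Prometheus with a query such as the following. This is an illustrative sketch: the 2-minute rate window matches the adapter configuration used later in this topic, and the `service` label follows the NGINX Ingress controller's metric schema.

```promql
# Per-second request rate over a 2-minute window, grouped by backend Service
sum(rate(nginx_ingress_controller_requests[2m])) by (service)
```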
An Ingress is a Kubernetes API object. An Ingress forwards client requests to Services based on the hosts and URL paths of the requests. Then, the Services route the requests to the backend pods.
The NGINX Ingress controller is deployed in a Container Service for Kubernetes (ACK) cluster to control the Ingresses in the cluster. The NGINX Ingress controller provides high-performance and custom traffic management. The NGINX Ingress controller provided by ACK is developed based on the open source version and is integrated with various features of Alibaba Cloud services to provide a simplified user experience.
Procedure
In this topic, two ClusterIP Services are created to forward the external requests received by the NGINX Ingress controller. In the following example, the HPA is configured to automatically scale pods based on the nginx_ingress_controller_requests metric, which indicates the traffic loads.
Use the following YAML templates to create the Deployments and Services.
Create a file named nginx1.yaml based on the following content. Then, run the kubectl apply -f nginx1.yaml command to create an application named test-app and a Service also named test-app.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
  labels:
    app: test-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-app
  template:
    metadata:
      labels:
        app: test-app
    spec:
      containers:
      - image: skto/sample-app:v2
        name: metrics-provider
        ports:
        - name: http
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: test-app
  namespace: default
  labels:
    app: test-app
spec:
  ports:
  - port: 8080
    name: http
    protocol: TCP
    targetPort: 8080
  selector:
    app: test-app
  type: ClusterIP
```
Create a file named nginx2.yaml based on the following content. Then, run the kubectl apply -f nginx2.yaml command to create an application named sample-app and a Service also named sample-app.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  labels:
    app: sample-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - image: skto/sample-app:v2
        name: metrics-provider
        ports:
        - name: http
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: sample-app
  namespace: default
  labels:
    app: sample-app
spec:
  ports:
  - port: 80
    name: http
    protocol: TCP
    targetPort: 8080
  selector:
    app: sample-app
  type: ClusterIP
```
Create a file named ingress.yaml based on the following content. Then, run the kubectl apply -f ingress.yaml command to create an Ingress.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: test.example.com
    http:
      paths:
      - backend:
          service:
            name: sample-app
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific
      - backend:
          service:
            name: test-app
            port:
              number: 8080
        path: /home
        pathType: ImplementationSpecific
```
host: the domain name that is used to enable external access to the backend Services. In this example, test.example.com is used.
path: the URL path that is used to match requests. The requests received by the Ingress are matched against the Ingress rules and forwarded to the corresponding Service. Then, the Service routes the requests to the backend pods.
backend: the name and port of the Service to which the requests that match the path parameter are forwarded.
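The path-based routing described above can be thought of as a longest-prefix match over the configured paths. The following Python sketch illustrates the idea only; it is not the NGINX Ingress controller's actual implementation.

```python
# Illustrative sketch of longest-prefix path routing, assuming the two
# rules configured in ingress.yaml (not the controller's real matcher).
RULES = {
    "/": "sample-app",      # catches requests not matched by a longer path
    "/home": "test-app",
}

def route(path: str) -> str:
    """Return the backend Service for a request path by longest-prefix match."""
    candidates = [p for p in RULES if path.startswith(p)]
    # Prefer the longest configured path that matches the request.
    return RULES[max(candidates, key=len)]
```

For example, a request to /home or /home/page is routed to test-app, while / and any other path fall through to sample-app.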
Run the following command to query the Ingress:
kubectl get ingress -o wide
Expected output:
```
NAME           CLASS   HOSTS              ADDRESS       PORTS   AGE
test-ingress   nginx   test.example.com   10.10.10.10   80      55s
```
After you deploy the preceding resource objects, you can send requests to the / and /home URL paths of the specified host. The NGINX Ingress controller routes the requests to the sample-app and test-app applications based on the URL paths of the requests. You can obtain information about the requests to each application from the Managed Service for Prometheus metric nginx_ingress_controller_requests.

Modify the adapter.config file of the alibaba-cloud-metrics-adapter component to convert the Prometheus metric to a metric that is supported by the HPA.

Note: Before you modify the adapter.config file, make sure that the alibaba-cloud-metrics-adapter component is installed in your cluster and the prometheus.url parameter is configured.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage. In the left-side navigation pane, choose Applications > Helm.
Click ack-alibaba-cloud-metrics-adapter.
In the Resource section, click adapter-config.
On the adapter-config page, click Edit YAML in the upper-right corner.
Replace the code in Value with the following content, and then click OK in the lower part of the page.
For more information about how to configure the ConfigMap, see Horizontal pod scaling based on Managed Service for Prometheus metrics.
```yaml
rules:
- metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))
  name:
    as: ${1}_per_second
    matches: ^(.*)_requests
  resources:
    namespaced: false
    overrides:
      controller_namespace:
        resource: namespace
  seriesQuery: nginx_ingress_controller_requests
```
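In this configuration, the name section renames the matched series: matches is a regular expression applied to the series name, and as rewrites it using the captured group. The effect of that rewrite can be reproduced with a short Python sketch (for illustration only):

```python
import re

# The adapter matches series names against ^(.*)_requests and exposes
# them under the rewritten name ${1}_per_second.
def rename_metric(series_name: str) -> str:
    return re.sub(r"^(.*)_requests", r"\1_per_second", series_name)
```

For example, nginx_ingress_controller_requests is exposed to the HPA as nginx_ingress_controller_per_second.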
Run the following command to query a metric:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/nginx_ingress_controller_per_second" | jq .
Expected output:
```json
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "nginx_ingress_controller_per_second",
      "metricLabels": {},
      "timestamp": "2022-03-31T10:11:37Z",
      "value": "0"
    }
  ]
}
```
Create a file named hpa.yaml based on the following content. Then, run the kubectl apply -f hpa.yaml command to configure the HPA for both the sample-app and test-app applications.

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: nginx_ingress_controller_per_second
        selector:
          matchLabels:
            # You can configure this parameter to filter metrics. The value of this parameter is passed to the <<.LabelMatchers>> field in the adapter.config file.
            service: sample-app
      # You can specify only thresholds of the Value or AverageValue type for external metrics.
      target:
        type: AverageValue
        averageValue: 30
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: nginx_ingress_controller_per_second
        selector:
          matchLabels:
            # You can configure this parameter to filter metrics. The value of this parameter is passed to the <<.LabelMatchers>> field in the adapter.config file.
            service: test-app
      # You can specify only thresholds of the Value or AverageValue type for external metrics.
      target:
        type: AverageValue
        averageValue: 30
```
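With a target of type AverageValue, the HPA divides the current total value of the external metric by the target average value to obtain the desired replica count, clamped between minReplicas and maxReplicas. The following Python sketch shows a simplified version of that calculation, assuming the standard HPA scaling formula:

```python
import math

def desired_replicas(metric_total: float, target_average: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Simplified HPA calculation for an External metric with an
    AverageValue target: ceil(total / target), clamped to the bounds."""
    desired = math.ceil(metric_total / target_average)
    return max(min_replicas, min(max_replicas, desired))
```

For example, at a total of 75 requests per second with averageValue set to 30, the HPA scales the Deployment to 3 replicas.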
Run the following command to query the HPA information:
kubectl get hpa
Expected output:
```
NAME         REFERENCE               TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
sample-hpa   Deployment/sample-app   0/30 (avg)   1         10        1          74s
test-hpa     Deployment/test-app     0/30 (avg)   1         10        1          59m
```
After you configure the HPA, perform stress tests to check whether the pods of the applications are automatically scaled out when the number of requests increases.
Run the following command to perform stress tests on the /home URL path of the host:
ab -c 50 -n 5000 http://test.example.com/home
Run the following command to query the HPA information:
kubectl get hpa
Expected output:
```
NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
sample-hpa   Deployment/sample-app   0/30 (avg)        1         10        1         22m
test-hpa     Deployment/test-app     22096m/30 (avg)   1         10        3         80m
```
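The TARGETS column uses Kubernetes quantity notation, in which the m suffix denotes thousandths: 22096m means an average of about 22.1 requests per second per pod. A minimal parser for this notation (a sketch that covers only the plain and milli forms shown in these outputs):

```python
def parse_quantity(q: str) -> float:
    """Parse a Kubernetes quantity in its plain or milli ("m" suffix) form."""
    if q.endswith("m"):
        return int(q[:-1]) / 1000.0
    return float(q)
```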
Run the following command to perform stress tests on the root path of the host:
ab -c 50 -n 5000 http://test.example.com/
Run the following command to query the HPA information:
kubectl get hpa
Expected output:
```
NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
sample-hpa   Deployment/sample-app   27778m/30 (avg)   1         10        2         38m
test-hpa     Deployment/test-app     0/30 (avg)        1         10        1         96m
```
The output shows that the pods of the applications are automatically scaled out when the number of requests exceeds the scaling threshold.
References
Multi-zone load balancing is a deployment solution commonly used in high availability scenarios for data services. If an application that is deployed across zones does not have sufficient resources to handle heavy workloads, you may want ACK to create a specific number of nodes in each zone of the application. For more information, see Configure auto scaling for cross-zone deployment.
For more information about how to create custom images to accelerate horizontal pod autoscaling in complex scenarios, see Create custom images.