You can deploy an application across multiple pods to improve its stability. However, this approach increases costs and wastes resources during off-peak hours. You can also manually scale the pods of your application, but this increases your O&M workload and the pods cannot be scaled in real time. To resolve these issues, you can configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller, so that the pods of your applications are automatically scaled based on their loads. This improves the stability and resilience of your applications, optimizes resource usage, and reduces costs. This topic describes how to configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller.
Prerequisites
To configure horizontal pod autoscaling for multiple applications based on the metrics of the NGINX Ingress controller, you must convert Managed Service for Prometheus metrics to metrics that are supported by the Horizontal Pod Autoscaler (HPA) and deploy the required components.
The Managed Service for Prometheus component is installed. For more information, see Enable Managed Service for Prometheus.
alibaba-cloud-metrics-adapter is installed. For more information, see Horizontal pod scaling based on Managed Service for Prometheus metrics.
The stress testing tool Apache Benchmark is installed. For more information, see Apache Benchmark.
Background information
To automatically scale the pods of an application based on the number of requests in a production environment, you can use the http_requests_total metric to collect the number of requests. We recommend that you configure horizontal pod autoscaling based on the metrics of the NGINX Ingress controller.
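For example, the per-service request rate that later drives the scaling decision can be inspected in Prometheus with a query such as the following. This is an illustrative sketch: the 2-minute rate window matches the adapter configuration used later in this topic, and the `service` label follows the NGINX Ingress controller's metric schema.

```promql
# Per-second request rate over a 2-minute window, grouped by backend Service
sum(rate(nginx_ingress_controller_requests[2m])) by (service)
```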
An Ingress is a Kubernetes API object. An Ingress forwards client requests to Services based on the hosts and URL paths of the requests. Then, the Services route the requests to the backend pods.
The NGINX Ingress controller is deployed in a Container Service for Kubernetes (ACK) cluster to control the Ingresses in the cluster. The NGINX Ingress controller provides high-performance and custom traffic management. The NGINX Ingress controller provided by ACK is developed based on the open source version and is integrated with various features of Alibaba Cloud services to provide a simplified user experience.
Procedure
In this topic, two ClusterIP Services are created to forward the external requests received by the NGINX Ingress controller. In the following example, the HPA is configured to automatically scale pods based on the nginx_ingress_controller_requests metric, which indicates the traffic loads.
Use the following YAML templates to create the Deployments and Services.
Create a file named nginx1.yaml based on the following content. Then, run the kubectl apply -f nginx1.yaml command to create an application named test-app and a Service also named test-app.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
  labels:
    app: test-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-app
  template:
    metadata:
      labels:
        app: test-app
    spec:
      containers:
      - image: skto/sample-app:v2
        name: metrics-provider
        ports:
        - name: http
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: test-app
  namespace: default
  labels:
    app: test-app
spec:
  ports:
  - port: 8080
    name: http
    protocol: TCP
    targetPort: 8080
  selector:
    app: test-app
  type: ClusterIP
```
Create a file named nginx2.yaml based on the following content. Then, run the kubectl apply -f nginx2.yaml command to create an application named sample-app and a Service also named sample-app.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  labels:
    app: sample-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - image: skto/sample-app:v2
        name: metrics-provider
        ports:
        - name: http
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: sample-app
  namespace: default
  labels:
    app: sample-app
spec:
  ports:
  - port: 80
    name: http
    protocol: TCP
    targetPort: 8080
  selector:
    app: sample-app
  type: ClusterIP
```
Create a file named ingress.yaml based on the following content. Then, run the kubectl apply -f ingress.yaml command to create an Ingress.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: test.example.com
    http:
      paths:
      - backend:
          service:
            name: sample-app
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific
      - backend:
          service:
            name: test-app
            port:
              number: 8080
        path: /home
        pathType: ImplementationSpecific
```
host: the domain name that is used to enable external access to the backend Services. In this example, test.example.com is used.
path: the URL path that is used to match requests. The requests received by the Ingress are matched against the Ingress rules and forwarded to the corresponding Service. Then, the Service routes the requests to the backend pods.
backend: the name and port of the Service to which the requests that match the path parameter are forwarded.
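The path-based routing described above can be thought of as a longest-prefix match over the configured paths. The following Python sketch illustrates the idea only; it is not the NGINX Ingress controller's actual implementation.

```python
# Illustrative sketch of longest-prefix path routing, assuming the two
# rules configured in ingress.yaml (not the controller's real matcher).
RULES = {
    "/": "sample-app",      # catches requests not matched by a longer path
    "/home": "test-app",
}

def route(path: str) -> str:
    """Return the backend Service for a request path by longest-prefix match."""
    candidates = [p for p in RULES if path.startswith(p)]
    # Prefer the longest configured path that matches the request.
    return RULES[max(candidates, key=len)]
```

For example, a request to /home or /home/page is routed to test-app, while / and any other path fall through to sample-app.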
Run the following command to query the Ingress:
kubectl get ingress -o wide
Expected output:
```
NAME           CLASS   HOSTS              ADDRESS       PORTS   AGE
test-ingress   nginx   test.example.com   10.10.10.10   80      55s
```
After you deploy the preceding resource objects, you can send requests to the / and /home URL paths of the specified host. The NGINX Ingress controller routes the requests to the sample-app and test-app applications based on the URL paths of the requests. You can obtain information about the requests to each application from the Managed Service for Prometheus metric nginx_ingress_controller_requests.

Modify the adapter.config file of the alibaba-cloud-metrics-adapter component to convert the Prometheus metric to a metric that is supported by the HPA.

Note: Before you modify the adapter.config file, make sure that the alibaba-cloud-metrics-adapter component is installed in your cluster and the prometheus.url parameter is configured.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage. In the left-side navigation pane, choose Applications > Helm.
Click ack-alibaba-cloud-metrics-adapter.
In the Resource section, click adapter-config.
On the adapter-config page, click Edit YAML in the upper-right corner.
Replace the code in Value with the following content, and then click OK in the lower part of the page.
For more information about how to configure the ConfigMap, see Horizontal pod scaling based on Managed Service for Prometheus metrics.
```yaml
rules:
- metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m]))
  name:
    as: ${1}_per_second
    matches: ^(.*)_requests
  resources:
    namespaced: false
    overrides:
      controller_namespace:
        resource: namespace
  seriesQuery: nginx_ingress_controller_requests
```
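In this configuration, the name section renames the matched series: matches is a regular expression applied to the series name, and as rewrites it using the captured group. The effect of that rewrite can be reproduced with a short Python sketch (for illustration only):

```python
import re

# The adapter matches series names against ^(.*)_requests and exposes
# them under the rewritten name ${1}_per_second.
def rename_metric(series_name: str) -> str:
    return re.sub(r"^(.*)_requests", r"\1_per_second", series_name)
```

For example, nginx_ingress_controller_requests is exposed to the HPA as nginx_ingress_controller_per_second.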
Run the following command to query a metric:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/nginx_ingress_controller_per_second" | jq .
Expected output:
```json
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "nginx_ingress_controller_per_second",
      "metricLabels": {},
      "timestamp": "2022-03-31T10:11:37Z",
      "value": "0"
    }
  ]
}
```
Create a file named hpa.yaml based on the following content. Then, run the kubectl apply -f hpa.yaml command to configure the HPA for both the sample-app and test-app applications.

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: nginx_ingress_controller_per_second
        selector:
          matchLabels:
            # You can configure this parameter to filter metrics. The value of this parameter is passed to the <<.LabelMatchers>> field in the adapter.config file.
            service: sample-app
      # You can specify only thresholds of the Value or AverageValue type for external metrics.
      target:
        type: AverageValue
        averageValue: 30
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: nginx_ingress_controller_per_second
        selector:
          matchLabels:
            # You can configure this parameter to filter metrics. The value of this parameter is passed to the <<.LabelMatchers>> field in the adapter.config file.
            service: test-app
      # You can specify only thresholds of the Value or AverageValue type for external metrics.
      target:
        type: AverageValue
        averageValue: 30
```
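With a target of type AverageValue, the HPA divides the current total value of the external metric by the target average value to obtain the desired replica count, clamped between minReplicas and maxReplicas. The following Python sketch shows a simplified version of that calculation, assuming the standard HPA scaling formula:

```python
import math

def desired_replicas(metric_total: float, target_average: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Simplified HPA calculation for an External metric with an
    AverageValue target: ceil(total / target), clamped to the bounds."""
    desired = math.ceil(metric_total / target_average)
    return max(min_replicas, min(max_replicas, desired))
```

For example, at a total of 75 requests per second with averageValue set to 30, the HPA scales the Deployment to 3 replicas.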
Run the following command to query the HPA information:
kubectl get hpa
Expected output:
```
NAME         REFERENCE               TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
sample-hpa   Deployment/sample-app   0/30 (avg)   1         10        1          74s
test-hpa     Deployment/test-app     0/30 (avg)   1         10        1          59m
```
After you configure the HPA, perform stress tests to check whether the pods of the applications are automatically scaled out when the number of requests increases.
Run the following command to perform stress tests on the /home URL path of the host:
ab -c 50 -n 5000 http://test.example.com/home
Run the following command to query the HPA information:
kubectl get hpa
Expected output:
```
NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
sample-hpa   Deployment/sample-app   0/30 (avg)        1         10        1         22m
test-hpa     Deployment/test-app     22096m/30 (avg)   1         10        3         80m
```
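The TARGETS column uses Kubernetes quantity notation, in which the m suffix denotes thousandths: 22096m means an average of about 22.1 requests per second per pod. A minimal parser for this notation (a sketch that covers only the plain and milli forms shown in these outputs):

```python
def parse_quantity(q: str) -> float:
    """Parse a Kubernetes quantity in its plain or milli ("m" suffix) form."""
    if q.endswith("m"):
        return int(q[:-1]) / 1000.0
    return float(q)
```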
Run the following command to perform stress tests on the root path of the host:
ab -c 50 -n 5000 http://test.example.com/
Run the following command to query the HPA information:
kubectl get hpa
Expected output:
```
NAME         REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
sample-hpa   Deployment/sample-app   27778m/30 (avg)   1         10        2         38m
test-hpa     Deployment/test-app     0/30 (avg)        1         10        1         96m
```
The output shows that the pods of the applications are automatically scaled out when the number of requests exceeds the scaling threshold.
References
Multi-zone load balancing is a deployment solution commonly used in high availability scenarios for data services. If an application that is deployed across zones does not have sufficient resources to handle heavy workloads, you may want ACK to create a specific number of nodes in each zone of the application. For more information, see Configure auto scaling for cross-zone deployment.
For more information about how to create custom images to accelerate horizontal pod autoscaling in complex scenarios, see Create custom images.