If your application needs to dynamically adjust the total amount of computing resources based on the number of requests received per unit of time, you can use the QPS data collected by an Application Load Balancer (ALB) instance to set up auto scaling for the pods of the application.
Before you start
Before you start, we recommend that you read Get started with ALB Ingresses to learn about the basic features of ALB ingresses.
How it works
Queries per second (QPS) is the number of requests received per second. An ALB instance can record client access logs in Simple Log Service (SLS). The alibaba-cloud-metrics-adapter component exposes the QPS data calculated from these access logs as an external metric, and Horizontal Pod Autoscaler (HPA) uses this metric to scale the corresponding workload (such as a Deployment or StatefulSet).
Prerequisites
The alibaba-cloud-metrics-adapter component is installed, and the version is 2.3.0 or later. For more information, see Deploy alibaba-cloud-metrics-adapter.
The ApacheBench (ab) stress testing tool is installed. For more information, see the official Apache HTTP Server document Compiling and Installing.
A kubectl client is connected to the ACK cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Two vSwitches are created in different availability zones within the same VPC as the cluster.
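Before you proceed, you can quickly confirm that the tooling prerequisites are in place. The following is a minimal sketch; the name matched by the grep assumes the default installation of the metrics adapter and may differ in your cluster.
# Confirm that kubectl can reach the cluster
kubectl get nodes
# Confirm that the metrics adapter is deployed (the name assumes the default installation)
kubectl get deploy -n kube-system | grep metrics-adapter
# Confirm that ApacheBench is installed
ab -V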
Step 1: Create an AlbConfig and associate an SLS project
Check the SLS project associated with the cluster.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, click Cluster Information.
On the Cluster Resources tab, find the Log Service Project resource and record the name of the SLS project on the right.
Create an AlbConfig.
Create the alb-qps.yaml file, copy the following content into it, and fill in the SLS project information in the accessLogConfig field.
apiVersion: alibabacloud.com/v1
kind: AlbConfig
metadata:
  name: alb-qps
spec:
  config:
    name: alb-qps
    addressType: Internet
    zoneMappings:
    - vSwitchId: vsw-uf6ccg2a9g71hx8go****  # ID of the vSwitch
    - vSwitchId: vsw-uf6nun9tql5t8nh15****
    accessLogConfig:
      logProject: <LOG_PROJECT>  # Name of the SLS project associated with the cluster
      logStore: <LOG_STORE>      # Custom Logstore name, which must start with "alb_"
  listeners:
  - port: 80
    protocol: HTTP
The following table describes the fields in the preceding code block:
Field      | Type   | Default | Description
logStore   | string | ""      | The name of the Simple Log Service Logstore.
logProject | string | ""      | The name of the Simple Log Service project.
Run the following command to create AlbConfig:
kubectl apply -f alb-qps.yaml
Expected output:
albconfig.alibabacloud.com/alb-qps created
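The ALB instance is created asynchronously after the AlbConfig is applied. As a quick check (the exact output columns depend on the ALB Ingress controller version), you can inspect the AlbConfig resource:
# Check whether the AlbConfig and the associated ALB instance are ready
kubectl get albconfig alb-qps
# View detailed status and events, for example if the SLS project cannot be associated
kubectl describe albconfig alb-qps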
Step 2: Create sample resources
In addition to the AlbConfig, the ALB Ingress requires four resources to work as expected: a Deployment, a Service, an IngressClass, and an Ingress. You can use the following example to quickly create these resources.
Create the qps-quickstart.yaml file with the following content:
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: qps-ingressclass
spec:
  controller: ingress.k8s.alibabacloud/alb
  parameters:
    apiGroup: alibabacloud.com
    kind: AlbConfig
    name: alb-qps  # Must be the same as the name of the AlbConfig
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: qps-ingress
spec:
  ingressClassName: qps-ingressclass  # Must be the same as the name of the IngressClass
  rules:
  - host: demo.alb.ingress.top  # Replace with your domain name
    http:
      paths:
      - path: /qps
        pathType: Prefix
        backend:
          service:
            name: qps-svc
            port:
              number: 80
---
apiVersion: v1
kind: Service
metadata:
  name: qps-svc
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: qps-deploy
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qps-deploy
  labels:
    app: qps-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: qps-deploy
  template:
    metadata:
      labels:
        app: qps-deploy
    spec:
      containers:
      - name: qps-container
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Run the following command to create the sample resources:
kubectl apply -f qps-quickstart.yaml
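You can confirm that the sample resources are running before you continue. This is a minimal check that assumes the app=qps-deploy label and the resource names used in the preceding example; the ADDRESS column of the Ingress may take a few minutes to be populated after the ALB instance is created.
# Check that the two sample pods are running
kubectl get pods -l app=qps-deploy
# Check the Service and the Ingress created by the example
kubectl get svc qps-svc
kubectl get ingress qps-ingress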
Step 3: Create an HPA
Create the qps-hpa.yaml file, then copy the following content into it, and save it:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: qps-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: qps-deploy  # Name of the workload controlled by the HPA
  minReplicas: 2      # Minimum number of pods
  maxReplicas: 10     # Maximum number of pods
  metrics:
  - type: External
    external:
      metric:
        name: sls_alb_ingress_qps  # Metric name for the QPS data. Do not modify.
        selector:
          matchLabels:
            sls.project: <LOG_PROJECT>  # Name of the SLS project associated with the cluster
            sls.logstore: <LOG_STORE>   # Custom Logstore name
            sls.ingress.route: default-qps-svc-80  # Path of the Service, in the <namespace>-<svc>-<port> format
      target:
        type: AverageValue
        averageValue: 2  # Expected target for the metric. In this example, the average QPS per pod is 2.
The following table describes the fields in the preceding code block:
Field                | Description
scaleTargetRef       | The workload that the HPA scales. This example uses the Deployment named qps-deploy created in Step 2.
minReplicas          | The minimum number of pods that the Deployment can be scaled to. Set this value to an integer greater than or equal to 1.
maxReplicas          | The maximum number of pods that the Deployment can be scaled to. This value must be greater than the minimum number of replicas.
external.metric.name | The QPS-based metric for the HPA. Do not modify the value.
sls.project          | The SLS project for the metric. Set the value to the SLS project specified in the AlbConfig.
sls.logstore         | The Logstore for the metric. Set the value to the Logstore specified in the AlbConfig.
sls.ingress.route    | The path of the Service. Specify the value in the <namespace>-<svc>-<port> format. This example uses the qps-svc Service created in Step 2.
external.target      | The target value for the metric. In this example, the target average QPS per pod is 2. The HPA adjusts the number of pods to keep the QPS as close to the target value as possible.
Run the following command to create the HPA:
kubectl apply -f qps-hpa.yaml
Run the following command to check the HPA deployment status.
kubectl get hpa
Expected output:
NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          5m41s
Run the following command to view HPA configuration information.
kubectl describe hpa qps-hpa
Expected output:
Name:                                           qps-hpa
Namespace:                                      default
Labels:                                         <none>
Annotations:                                    <none>
CreationTimestamp:                              ********  # Timestamp of the HPA. Can be ignored.
Reference:                                      Deployment/qps-deploy
Metrics:                                        ( current / target )
  "sls_alb_ingress_qps" (target average value): 0 / 2
Min replicas:                                   2
Max replicas:                                   10
Deployment pods:                                2 current / 2 desired
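If the metric value stays at 0 or shows <unknown>, you can query the external metrics API directly to confirm that the metrics adapter serves the QPS data. This is a diagnostic sketch; replace <LOG_PROJECT> and <LOG_STORE> with the same values used in the HPA, and note that the exact response format depends on the adapter version.
# List the external metrics exposed through the external metrics API
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/"
# Query the QPS metric with the same label selector that the HPA uses
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/sls_alb_ingress_qps?labelSelector=sls.project=<LOG_PROJECT>,sls.logstore=<LOG_STORE>,sls.ingress.route=default-qps-svc-80"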
(Optional) Step 4: Verify auto scaling
Verify application scale-out.
Run the following command to view Ingress information.
kubectl get ingress
Expected output:
NAME          CLASS              HOSTS                  ADDRESS                         PORTS   AGE
qps-ingress   qps-ingressclass   demo.alb.ingress.top   alb-********.alb.aliyuncs.com   80      10m31s
Record the values of HOSTS and ADDRESS for use in subsequent steps.
Run the following command to perform stress testing on the application. Replace demo.alb.ingress.top and alb-********.alb.aliyuncs.com with the values obtained in the previous step.
ab -r -c 5 -n 10000 -H Host:demo.alb.ingress.top http://alb-********.alb.aliyuncs.com/qps
Run the following command to check the scaling status of the application.
kubectl get hpa
Expected output:
NAME      REFERENCE               TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
qps-hpa   Deployment/qps-deploy   14375m/2 (avg)   2         10        10         15m
The result shows that REPLICAS is 10, which indicates that as the QPS increases, the application is scaled out to 10 pods.
Verify application scale-in.
After the stress testing is complete, run the following command to check the scaling status of the application.
kubectl get hpa
Expected output:
NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          28m
The result shows that REPLICAS is 2, which indicates that after the QPS drops to 0, the application is scaled in to 2 pods.
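By default, the HPA waits for a stabilization window of 300 seconds before it scales in, so the replica count may not drop immediately after the stress test ends. If you want faster scale-in while testing, you can optionally add a behavior section to the HPA spec. The following snippet is an example and is not part of the original configuration; 60 seconds is an illustrative value.
# Optional addition to the spec section of qps-hpa.yaml
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60  # Scale in after 60 seconds of reduced load instead of the default 300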
References
If you need to scale your application based on pod CPU or memory load, see Horizontal pod autoscaling (HPA).
If you need to schedule the scaling of applications, see Cron horizontal pod autoscaling (CronHPA).
For node auto scaling, see Overview of node scaling.