To dynamically adjust the total amount of computing resources based on the number of requests received per unit of time, you can use the queries per second (QPS) data collected by an Application Load Balancer (ALB) instance to set up auto scaling for the application pods.
Before you start
This topic assumes a basic understanding of ALB ingress features.
How it works
QPS is the number of requests received per second. ALB instances record client access data through Simple Log Service (SLS). The Horizontal Pod Autoscaler (HPA) derives the QPS from these access logs and scales the corresponding workloads (such as Deployments and StatefulSets) accordingly.
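To make the metric concrete, the following Python sketch derives QPS from a list of request timestamps. It is only an illustration: the function names and sample timestamps are hypothetical, and it does not reproduce how the metrics adapter actually queries SLS.

```python
from collections import Counter
from datetime import datetime


def qps_per_second(timestamps):
    """Count requests that fall into each whole second."""
    counts = Counter(
        datetime.fromisoformat(ts).replace(microsecond=0) for ts in timestamps
    )
    return dict(counts)


def average_qps(timestamps, window_seconds):
    """Average QPS over a window: total requests divided by window length."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return len(timestamps) / window_seconds


# Hypothetical access-log timestamps (not real SLS records).
sample = [
    "2024-01-01T00:00:00.100",
    "2024-01-01T00:00:00.900",
    "2024-01-01T00:00:01.200",
    "2024-01-01T00:00:01.300",
    "2024-01-01T00:00:01.700",
    "2024-01-01T00:00:02.500",
]

print(average_qps(sample, window_seconds=3))  # 6 requests over 3 s
```

The per-second counts show bursts, while the windowed average is closer to the kind of smoothed value an autoscaler would act on.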
Prerequisites
The alibaba-cloud-metrics-adapter component is installed, and the version is 2.3.0 or later.
The Apache Benchmark stress testing tool is installed. For more information, see the official document Compiling and Installing.
A kubectl client is connected to the ACK cluster. For more information, see Get a cluster kubeconfig and connect to the cluster using kubectl.
Two vSwitches are created in different availability zones and are in the same VPC as the cluster.
Step 1: Create an AlbConfig and associate an SLS project
Check the SLS project associated with the cluster.
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, find the target cluster and click its name. In the navigation pane on the left, click Cluster Information.
On the Basic Information tab, find the Log Service Project resource and record the name of the SLS project on the right.
Create an AlbConfig.
Create a file named alb-qps.yaml, copy the following content into it, and fill in the SLS project information in the accessLogConfig field.
apiVersion: alibabacloud.com/v1
kind: AlbConfig
metadata:
  name: alb-qps
spec:
  config:
    name: alb-qps
    addressType: Internet
    zoneMappings:
    - vSwitchId: vsw-uf6ccg2a9g71hx8go**** # ID of the virtual switch
    - vSwitchId: vsw-uf6nun9tql5t8nh15****
    accessLogConfig:
      logProject: <LOG_PROJECT> # Name of the log project associated with the cluster
      logStore: <LOG_STORE> # Custom logstore name, must start with "alb_"
  listeners:
  - port: 80
    protocol: HTTP
The following table describes the fields in the preceding code block:
| Field | Type | Description |
| --- | --- | --- |
| logProject | string | The name of the Simple Log Service project. Default value: "". |
| logStore | string | The name of the Simple Log Service Logstore, which must start with alb_. The Logstore is automatically created if it does not exist. For more information, see Enable Simple Log Service to collect access logs. Default value: "alb_****". |
Run the following command to create the AlbConfig:
kubectl apply -f alb-qps.yaml
Expected output:
albconfig.alibabacloud.com/alb-qps created
Step 2: Create sample resources
In addition to AlbConfig, ALB Ingress requires four types of resources to work as expected: Deployment, Service, IngressClass, and Ingress. Use the following steps to quickly create these resources:
Create the qps-quickstart.yaml file with the following content:
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: qps-ingressclass
spec:
  controller: ingress.k8s.alibabacloud/alb
  parameters:
    apiGroup: alibabacloud.com
    kind: AlbConfig
    name: alb-qps # Same as the name of AlbConfig
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: qps-ingress
spec:
  ingressClassName: qps-ingressclass # Same as the name of Ingress Class
  rules:
  - host: demo.alb.ingress.top # Replace with your domain name
    http:
      paths:
      - path: /qps
        pathType: Prefix
        backend:
          service:
            name: qps-svc
            port:
              number: 80
---
apiVersion: v1
kind: Service
metadata:
  name: qps-svc
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: qps-deploy
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qps-deploy
  labels:
    app: qps-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: qps-deploy
  template:
    metadata:
      labels:
        app: qps-deploy
    spec:
      containers:
      - name: qps-container
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Run the following command to create the sample resources:
kubectl apply -f qps-quickstart.yaml
Step 3: Create an HPA
Create the qps-hpa.yaml file, copy the following content into it, and save it:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: qps-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: qps-deploy # Name of the workload controlled by HPA
  minReplicas: 2 # Minimum number of pods
  maxReplicas: 10 # Maximum number of pods
  metrics:
  - type: External # External metrics (non-Kubernetes native metrics)
    external:
      metric:
        name: sls_alb_ingress_qps # Metric name (ALB Ingress QPS). Do not modify.
        selector:
          matchLabels:
            sls.project: <LOG_PROJECT> # Log Service project name (replace with actual project name)
            sls.logstore: <LOG_STORE> # Logstore name (replace with actual Logstore name)
            sls.ingress.route: default-qps-svc-80 # Service path format: <namespace>-<svc>-<port>
      target:
        type: AverageValue # Target metric type (average value)
        averageValue: "2" # Expected target value. In this example, average QPS per pod is 2.
The following table describes the fields in the preceding code block:
| Field | Description |
| --- | --- |
| scaleTargetRef | The workload used by the application. This example uses the Deployment named qps-deploy created in Step 2. |
| minReplicas | The minimum number of pods that the Deployment can be scaled in to. This value must be an integer greater than or equal to 1. |
| maxReplicas | The maximum number of pods that the Deployment can be scaled out to. This value cannot be less than the minimum number of replicas. |
| external.metric.name | The QPS-based metric for HPA. Do not modify the value. |
| sls.project | The SLS project for the metric. Set the value to the SLS project specified in the AlbConfig. |
| sls.logstore | The Logstore for the metric. Set the value to the Logstore specified in the AlbConfig. |
| sls.ingress.route | The path of the Service. Specify the value in the <namespace>-<svc>-<port> format. This example uses the qps-svc Service created in Step 2. |
| external.target | The target value for the metric. In this example, the target average QPS per pod is 2. The HPA adjusts the number of pods to keep the average QPS per pod as close to the target value as possible. |
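The following Python sketch illustrates how the fields above fit together. It builds the sls.ingress.route label from its three parts and estimates the replica count for an average-value target. The function names are hypothetical, and the calculation is a simplified version of the standard Kubernetes HPA rule (total metric divided by the per-pod target, rounded up, then clamped); it leaves out details such as tolerance and stabilization behavior, so treat it as an approximation rather than the controller's exact logic.

```python
import math


def ingress_route(namespace, service, port):
    """Build the route label in the <namespace>-<svc>-<port> format."""
    return f"{namespace}-{service}-{port}"


def desired_replicas(total_qps, target_per_pod, min_replicas, max_replicas):
    """Estimate replicas so that average QPS per pod is at most the target."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    raw = math.ceil(total_qps / target_per_pod)
    # Clamp the result to the configured minReplicas/maxReplicas range.
    return max(min_replicas, min(max_replicas, raw))


print(ingress_route("default", "qps-svc", 80))  # default-qps-svc-80
print(desired_replicas(7, 2, min_replicas=2, max_replicas=10))    # 4
print(desired_replicas(0, 2, min_replicas=2, max_replicas=10))    # 2
print(desired_replicas(150, 2, min_replicas=2, max_replicas=10))  # 10
```

With the values from this example (target 2, range 2 to 10), any sustained load above roughly 20 QPS in total is enough to reach the maximum of 10 pods.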
Run the following command to create the HPA.
kubectl apply -f qps-hpa.yaml
Run the following command to check the HPA deployment status.
kubectl get hpa
Expected output:
NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          5m41s
Run the following command to view the HPA configuration.
kubectl describe hpa qps-hpa
Expected output:
Name:                                            qps-hpa
Namespace:                                       default
Labels:                                          <none>
Annotations:                                     <none>
CreationTimestamp:                               ******** # Timestamp of HPA, can be ignored
Reference:                                       Deployment/qps-deploy
Metrics:                                         ( current / target )
  "sls_alb_ingress_qps" (target average value):  0 / 2
Min replicas:                                    2
Max replicas:                                    10
Deployment pods:                                 2 current / 2 desired
(Optional) Step 4: Verify auto scaling
Verify application scale-out.
Run the following command to view Ingress information.
kubectl get ingress
Expected output:
NAME          CLASS              HOSTS                  ADDRESS                         PORTS   AGE
qps-ingress   qps-ingressclass   demo.alb.ingress.top   alb-********.alb.aliyuncs.com   80      10m31s
Record the values of HOSTS and ADDRESS for use in subsequent steps.
Run the following command to perform stress testing on the application.
Replace demo.alb.ingress.top and alb-********.alb.aliyuncs.com with the values recorded in the previous step.
ab -r -c 5 -n 10000 -H Host:demo.alb.ingress.top http://alb-********.alb.aliyuncs.com/qps
Run the following command to check the scaling status of the application.
kubectl get hpa
Expected output:
NAME      REFERENCE               TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
qps-hpa   Deployment/qps-deploy   14375m/2 (avg)   2         10        10         15m
The result shows that REPLICAS is 10, indicating that as the QPS increased, the application scaled out to the MAXPODS value (10 pods).
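The TARGETS value 14375m/2 in this output uses the Kubernetes quantity notation, in which the m suffix means thousandths, so 14375m is an average of 14.375 QPS per pod against a target of 2. The following Python sketch shows one way to read such a cell. The helper names are hypothetical, and the parser handles only plain numbers and the m suffix, not the full quantity syntax.

```python
from decimal import Decimal


def parse_milli_quantity(text):
    """Convert a value such as '14375m' to Decimal('14.375')."""
    if text.endswith("m"):
        return Decimal(text[:-1]) / Decimal(1000)
    return Decimal(text)


def parse_targets(targets):
    """Split a TARGETS cell such as '14375m/2 (avg)' into (current, target)."""
    values = targets.split()[0]  # drop the trailing "(avg)" marker
    current, target = values.split("/")
    return parse_milli_quantity(current), parse_milli_quantity(target)


current, target = parse_targets("14375m/2 (avg)")
print(current, target)   # 14.375 2
print(current > target)  # True: the per-pod average is above the target
```

Because the observed per-pod average stays well above the target even at 10 pods, the HPA remains at the MAXPODS limit for the duration of the test.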
Verify application scale-in.
After the stress testing is complete, run the following command to check the scaling status of the application.
kubectl get hpa
Expected output:
NAME      REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
qps-hpa   Deployment/qps-deploy   0/2 (avg)   2         10        2          28m
The result shows that REPLICAS is 2, indicating that after the QPS dropped to 0, the application scaled in to the MINPODS value (2 pods).
References
If you need to scale your application based on pod CPU or memory load, see Horizontal Pod Autoscaler (HPA).
If you need to schedule the scaling of applications, see Cron Horizontal Pod Autoscaler (CronHPA).
For node auto scaling, see Node scaling.