If the number of requests to an API of a Java application surges, you can configure a Horizontal Pod Autoscaler (HPA) to automatically scale the application based on the queries per second (QPS) of the API. This topic describes how to configure an HPA to automatically scale an application based on the monitoring data that is collected by using the application performance management (APM) service of Application Real-Time Monitoring Service (ARMS).
How it works
After you enable ARMS APM for a Java application in your Container Service for Kubernetes (ACK) cluster, you can view the detailed information about the API requests of the application. For more information about how to enable ARMS APM for a Java application, see Monitor applications. ARMS APM converts the collected data to metrics that are supported by Managed Service for Prometheus (Prometheus). Then, the alibaba-cloud-metrics-adapter component converts the Prometheus metrics to metrics that are supported by the HPA. This way, the HPA can scale the application based on the collected metrics.
In this example, an application named arms-springboot-demo is created. Stress tests are performed on the /demo/queryUser/10 API exposed by the application.
Prerequisites
The Prometheus component is installed. For more information, see the Step 1: Enable Managed Service for Prometheus section of the "Managed Service for Prometheus" topic.
alibaba-cloud-metrics-adapter is installed in the kube-system namespace. For more information, see the Deploy alibaba-cloud-metrics-adapter section of the "Implement horizontal auto scaling based on Alibaba Cloud metrics" topic.
A namespace is created. For more information, see Manage namespaces and resource quotas. In this example, the arms-demo namespace is created.
The Java Development Kit (JDK) is installed. For more information about the JDK versions supported by ARMS APM, see Java components and frameworks supported by ARMS.
Procedure
Step 1: Install the ARMS APM component
To enable ARMS APM for an application, you must install the ack-onepilot component in the cluster.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, open the Add-ons page.
On the Add-ons page, find the ack-onepilot component and click Install in the component card. In the dialog box that appears, configure the parameters and click OK.
Step 2: Authorize the cluster to access ARMS
To monitor applications in an ACK Serverless cluster or applications that are deployed on elastic container instances, you must first authorize the cluster to access ARMS on the Cloud Resource Access Authorization page. Then, restart all the pods that are created for ack-onepilot.
To monitor applications in an ACK cluster, you must first check whether ARMS Addon Token exists in the cluster.
If an ACK cluster has ARMS Addon Token, ARMS performs password-free authorization on the cluster.
Note: By default, ARMS Addon Token exists in ACK managed clusters. However, ARMS Addon Token may not exist in some ACK managed clusters that were created a long time ago. You can perform the following steps to authorize the clusters to access ARMS.
If an ACK cluster does not have ARMS Addon Token, you must manually authorize the ACK cluster to access ARMS.
Create a custom policy and add the following content to the policy document. For more information, see the Step 1: Create a custom policy section of the "[Product Changes] Permissions of the worker RAM role of ACK managed clusters are revoked" topic.
{
  "Action": "arms:*",
  "Resource": "*",
  "Effect": "Allow"
}
Attach the custom policy to the worker Resource Access Management (RAM) role. For more information, see the Step 2: Attach the custom policy to the worker RAM role section of the "[Product Changes] Permissions of the worker RAM role of ACK managed clusters are revoked" topic.
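The statement above is only one entry of a policy document. As a minimal sketch, assuming the usual RAM policy envelope with a Version field and a Statement list, the following Python snippet wraps the statement and prints a complete document. The function name arms_policy_document is hypothetical and used only for illustration.

```python
import json


def arms_policy_document():
    """Wrap the ARMS statement in a RAM policy envelope (assumed layout)."""
    return {
        "Version": "1",
        "Statement": [
            # The statement from the step above, unchanged.
            {"Action": "arms:*", "Resource": "*", "Effect": "Allow"},
        ],
    }


print(json.dumps(arms_policy_document(), indent=2))
```

The wildcard action grants all ARMS operations. A narrower action list may be preferable if your security requirements call for it.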
Step 3: Enable ARMS APM for a Java application
When you deploy a Java application in your cluster, you can enable ARMS APM by adding labels to the application.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, open the Deployments page.
In the upper-right corner of the Deployments page, click Create from YAML.
On the page that appears, select a template from the Sample Template drop-down list, and add the following labels to the spec > template > metadata section in the Template code editor:
labels:
  armsPilotAutoEnable: "on"
  armsPilotCreateAppName: "<your-deployment-name>" # Replace <your-deployment-name> with the actual application name.
  one-agent.jdk.version: "OpenJDK11" # This parameter is required if the application uses JDK 11.
  armsSecAutoEnable: "on" # This parameter is required if you want to enable Application Security.
Note: For more information about Application Security, see What is Application Security?
You are charged for using Application Security. For more information about the billing of Application Security, see Billing.
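The labels belong to the pod template (spec > template > metadata), not to the Deployment's own metadata. As a minimal sketch, the following Python helper assembles the label set from the options described above and shows where it sits in the Deployment structure. The helper name arms_labels and the application name arms-springboot-demo are assumptions used for illustration. They are not part of any ARMS tooling.

```python
def arms_labels(app_name, jdk11=False, app_security=False):
    """Build the pod-template labels that enable ARMS APM (illustrative helper)."""
    labels = {
        "armsPilotAutoEnable": "on",
        "armsPilotCreateAppName": app_name,
    }
    if jdk11:
        # Required only if the application uses JDK 11.
        labels["one-agent.jdk.version"] = "OpenJDK11"
    if app_security:
        # Optional and billed separately: enables Application Security.
        labels["armsSecAutoEnable"] = "on"
    return labels


# Skeleton of the Deployment fields that carry the labels.
deployment_fragment = {
    "spec": {
        "template": {
            "metadata": {"labels": arms_labels("arms-springboot-demo", jdk11=True)}
        }
    }
}
print(deployment_fragment)
```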
The following YAML template shows how to create a Deployment that has ARMS APM enabled:
View the monitoring data collected by ARMS APM.
On the Deployments page, find the application and check whether ARMS Console is displayed in the Actions column.
Click ARMS Console to view monitoring data. In the left-side navigation pane, click Interface Invocation to view information about requests, such as HTTP requests, to the API of the application. The following figure shows the requests to the API of the arms-springboot-demo application. The request frequency is stable.
Create a Service to use a Server Load Balancer (SLB) instance to expose the API of the arms-springboot-demo application.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, open the Services page.
In the upper-right corner of the Services page, click Create. In the Create Service dialog box, configure the parameters and click OK. For more information about Service parameters, see Create a Service.
Wait until the Service is created. On the Services page, you can view the external endpoint of the arms-demo-svc Service. Example: 47.94.XX.XX:8080.
Run the following command to send requests to the /demo/queryUser/10 API by using the external endpoint of the Service:
curl http://47.94.XX.XX:8080/demo/queryUser/10
Expected output:
{"id":1,"name":"KeyOfSpectator","password":"12****"}
The expected output indicates that the request is successful.
Step 4: Configure alibaba-cloud-metrics-adapter
Make sure that the Prometheus component is installed. Otherwise, you cannot perform this step. For more information, see the Step 1: Enable Managed Service for Prometheus section of the "Managed Service for Prometheus" topic.
Make sure that alibaba-cloud-metrics-adapter is deployed in the kube-system namespace. Otherwise, you cannot perform this step. For more information, see Deploy alibaba-cloud-metrics-adapter.
Log on to the ARMS console.
In the left-side navigation pane, open the Instances page.
On the Instances page, find the instance that you want to manage and click its name. The name is in the arms_metrics_{RegionId}_XXX format. In the left-side navigation pane, click Settings. On the lower part of the Settings tab, view and record the Prometheus URL in the HTTP API URL (Grafana Read URL) section.
In the configurations of ack-alibaba-cloud-metrics-adapter, specify the Prometheus URL that you recorded from the HTTP API URL (Grafana Read URL) section in the previous step.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, open the Helm page.
On the Helm page, find ack-alibaba-cloud-metrics-adapter and click Update in the Actions column.
In the Update Release panel, add the Prometheus URL that you recorded from the ARMS console.
Modify the configurations of the adapter-config ConfigMap of ack-alibaba-cloud-metrics-adapter.
On the Helm page, click ack-alibaba-cloud-metrics-adapter.
On the Basic Information tab, click adapter-config.
In the upper-right corner of the adapter-config page, click Edit YAML.
Add the following configurations to the adapter-config ConfigMap:
rules:
- metricsQuery: sum by (rpc) (sum_over_time(<<.Series>>{rpc="/demo/queryUser/{id}",service="arms-demo:arms-k8s-demo",prpc="__all__",ppid="__all__",endpoint="__all__",destId="__all__",<<.LabelMatchers>>}[1m]))
  name:
    as: ${1}_per_second_queryuser
    matches: ^(.*)_count
  resources:
    namespaced: false
  seriesQuery: arms_app_requests_count
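The name section of the rule determines the metric name that the HPA later refers to. As a rough illustration, the following Python sketch applies the same regular expression to the series name arms_app_requests_count. The function external_metric_name is hypothetical, and the Python backreference \1 stands in for the ${1} placeholder used in the adapter configuration.

```python
import re


def external_metric_name(series_name,
                         matches=r"^(.*)_count",
                         as_template=r"\1_per_second_queryuser"):
    """Apply the rule's name mapping; return None if the series does not match."""
    if re.search(matches, series_name) is None:
        return None
    # The first capture group keeps the prefix and the template appends the new suffix.
    return re.sub(matches, as_template, series_name)


print(external_metric_name("arms_app_requests_count"))
# arms_app_requests_per_second_queryuser
```

A series whose name does not match the pattern is not exposed under this rule, which is why the sketch returns None in that case.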
Complete sample code:
Query metrics in the cluster.
Run the following command to check whether the arms_app_requests_per_second_queryuser metric exists:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
Expected output:
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"external.metrics.k8s.io/v1beta1","resources":[
{"name":"k8s_workload_memory_working_set","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_memory_rss","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"sls_ingress_latency_p9999","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_memory_usage","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l4_traffic_rx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"sls_ingress_inflow","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_upstream_rt","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_ratio","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l4_traffic_tx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l4_packet_rx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"ahas_sentinel_pass_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_memory_request","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"sls_ingress_latency_avg","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l4_max_connection","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_cpu_limit","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_day","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_month","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l4_connection_utilization","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_status_5xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_cpu_request","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_network_rx_rate","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_network_rx_errors","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_week","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_memory_cache","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_cpu_request","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_percorepricing","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"sls_ingress_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l4_packet_tx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_status_4xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_upstream_5xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_cpu_limit","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"ahas_sentinel_block_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_cpu_utilization","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"sls_alb_ingress_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_memory_utilization","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"sls_ingress_latency_p95","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l4_active_connection","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_memory_limit","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_hour","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_status_2xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_status_3xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_cpu_util","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_memory_usage","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_network_tx_rate","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"ahas_sentinel_total_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_cpu_usage","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_network_tx_errors","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_rt","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_min","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"sls_ingress_latency_p50","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"sls_ingress_latency_p99","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"slb_l7_upstream_4xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"k8s_workload_memory_request","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"ahas_sentinel_avg_rt","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"cost_memory_limit","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},
{"name":"arms_app_requests_per_second_queryuser","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]}
]}
The output shows that the arms_app_requests_per_second_queryuser metric exists.
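Because the resource list is long, scanning it by eye is error-prone. The following Python sketch checks for a metric name programmatically. It assumes that you have saved or piped the JSON returned by the command above. The function has_external_metric is a hypothetical helper, and the sample string is a shortened stand-in for the real output.

```python
import json


def has_external_metric(api_resource_list_json, metric_name):
    """Return True if the APIResourceList JSON lists the given metric name."""
    doc = json.loads(api_resource_list_json)
    return any(res.get("name") == metric_name for res in doc.get("resources", []))


# Shortened stand-in for the kubectl output shown above.
sample = (
    '{"kind":"APIResourceList","resources":['
    '{"name":"slb_l7_qps"},'
    '{"name":"arms_app_requests_per_second_queryuser"}]}'
)
print(has_external_metric(sample, "arms_app_requests_per_second_queryuser"))
```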
Run the following command to view the real-time metric value:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/arms-demo/arms_app_requests_per_second_queryuser"| jq .
Expected output:
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "arms_app_requests_per_second_queryuser",
      "metricLabels": {
        "rpc": "/demo/queryUser/10"
      },
      "timestamp": "2022-11-09T07:49:07Z",
      "value": "6"
    }
  ]
}
The output shows that the metric value is returned as expected.
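The value field is a Kubernetes quantity encoded as a string, so it can carry a suffix such as m (milli) in addition to plain numbers like 6. As a minimal sketch, the following Python code extracts the per-API values from the response. The helpers parse_quantity and metric_values are hypothetical, and the suffix table covers only a few common decimal suffixes.

```python
import json

# Decimal suffixes that may appear in metric quantities (partial list).
_SUFFIXES = {"m": 1e-3, "k": 1e3, "M": 1e6, "G": 1e9}


def parse_quantity(quantity):
    """Convert a quantity string such as "6" or "300m" to a float."""
    if quantity and quantity[-1] in _SUFFIXES:
        return float(quantity[:-1]) * _SUFFIXES[quantity[-1]]
    return float(quantity)


def metric_values(value_list_json):
    """Return (rpc label, numeric value) pairs from an ExternalMetricValueList."""
    doc = json.loads(value_list_json)
    return [
        (item.get("metricLabels", {}).get("rpc", ""), parse_quantity(item["value"]))
        for item in doc.get("items", [])
    ]


# Shortened copy of the expected output above.
sample = (
    '{"kind": "ExternalMetricValueList", "items": [{'
    '"metricName": "arms_app_requests_per_second_queryuser", '
    '"metricLabels": {"rpc": "/demo/queryUser/10"}, '
    '"timestamp": "2022-11-09T07:49:07Z", "value": "6"}]}'
)
print(metric_values(sample))
```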
Step 5: Configure an HPA to automatically scale the application based on APM metrics
Create a file named hpa.yaml and add the following content to the file:
Note: The metric name that you specify in the hpa.yaml file must be the same as that defined in ack-alibaba-cloud-metrics-adapter in the previous step.
The target parameter in the hpa.yaml file specifies the scale-out threshold. In this example, the target type is AverageValue, so the HPA scales out the application when the average QPS per pod exceeds 40.
# Use autoscaling/v2 on Kubernetes 1.26 and later, where autoscaling/v2beta2 is no longer served.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
  namespace: arms-demo # The HPA must be in the same namespace as the Deployment that it scales.
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: arms-springboot-demo
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: arms_app_requests_per_second_queryuser
      # You can specify only thresholds of the Value or AverageValue type for external metrics.
      target:
        type: AverageValue
        averageValue: 40
Run the following command to deploy an HPA for the arms-springboot-demo application:
kubectl apply -f hpa.yaml
Run the following command to query metric changes:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/arms-demo/arms_app_requests_per_second_queryuser"| jq .
Expected output:
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "arms_app_requests_per_second_queryuser",
      "metricLabels": {
        "rpc": "/demo/queryUser/10"
      },
      "timestamp": "2022-11-09T07:53:16Z",
      "value": "4216"
    }
  ]
}
Run the following command to query detailed information about the HPA:
kubectl get hpa -n arms-demo
Expected output:
NAME       REFERENCE                         TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
test-hpa   Deployment/arms-springboot-demo   300m/40 (avg)   1         10        10         148m
A value is displayed in the TARGETS column, which indicates that the HPA is deployed and can read the metric.
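To see how the threshold translates into a replica count, the following Python sketch approximates the calculation for an external metric with an AverageValue target: the total metric value is divided by the per-pod target, rounded up, and clamped to the configured bounds. This is a simplified model rather than the controller's exact logic. It ignores the tolerance band and the stabilization windows that the HPA also applies, and the function name desired_replicas is hypothetical.

```python
import math


def desired_replicas(metric_total, average_value_target, min_replicas, max_replicas):
    """Approximate the HPA replica count for an AverageValue target."""
    raw = math.ceil(metric_total / average_value_target)
    # Clamp to the minReplicas/maxReplicas bounds from hpa.yaml.
    return max(min_replicas, min(max_replicas, raw))


# Metric value 4216 with a target of 40 per pod, bounded to 1..10 replicas.
print(desired_replicas(4216, 40, 1, 10))  # 10
# Metric value 6 stays at the minimum of one replica.
print(desired_replicas(6, 40, 1, 10))  # 1
```

In the TARGETS column, 300m/40 (avg) reads as a current average of 0.3 per pod against a target of 40 per pod.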
Perform stress tests to verify application scaling
Run the following command to perform stress tests on the demo application:
ab -c 50 -n 2000 http://47.94.XX.XX:8080/demo/queryUser/10
Note: 47.94.XX.XX:8080 is the external endpoint of the arms-demo-svc Service.
Check whether the application is scaled when the metric value exceeds the specified threshold.
After you perform stress tests, the ARMS console shows a surge in the number of requests to the API of the application. The following image shows an example.
The following Prometheus dashboard shows that the HPA scales the application when the QPS of the API of the application exceeds the specified scaling threshold.
When the QPS of the API of the application exceeds the specified scaling threshold, the HPA increases the number of pod replicas of the demo application.
You can run the kubectl describe hpa test-hpa -n arms-demo command to query the scaling events.