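Use HPA in Knative to implement auto scaling based on CPU and memory - Container Service for Kubernetes

Combining Knative with the Horizontal Pod Autoscaler (HPA) gives your application auto scaling based on resource load. Knative already provides request-based auto scaling. With HPA, you can also scale on resource metrics such as CPU utilization and memory usage, which gives you finer control over scaling behavior.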

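Prerequisites

Knative is deployed in the cluster. For more information, see Deploy Knative in an ACK cluster.
A kubectl client is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
To view monitoring data for Knative Services on the Knative dashboard, Knative must be connected to Alibaba Cloud Prometheus monitoring. For more information, see View the Knative dashboard through Alibaba Cloud Prometheus monitoring.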

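Step 1: Deploy a Knative Service

Log on to the Container Service management console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Applications > Knative.

On the Services tab of the Knative page, set Namespace to default and click Create from Template. Paste the following YAML sample into the template, and then click Create.

The sample creates a Service named helloworld-go-hpa.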

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go-hpa # Name of the Knative Service.
spec:
  template:
    metadata:
      labels:
        app: helloworld-go-hpa
      annotations:
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev" # Use HPA as the autoscaler.
        autoscaling.knative.dev/metric: "cpu" # The HPA metric can be CPU or memory. CPU is used in this example.
        autoscaling.knative.dev/target: "30" # Target value for the HPA CPU metric. HPA adjusts the number of replicas based on this target.
        autoscaling.knative.dev/minScale: "1" # Minimum number of instances for the scaling policy.
        autoscaling.knative.dev/maxScale: "4" # Maximum number of instances for the scaling policy.
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/autoscale-go:v1024
          resources:
            requests:
              cpu: '200m'
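
To illustrate how the target annotation translates into a replica count, the following Python sketch applies the scaling rule documented for the Kubernetes HPA, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), and clamps the result to the minScale and maxScale values from the sample. It is a simplified illustration under stated assumptions: the function name desired_replicas and the utilization values are hypothetical, the target is assumed to be a CPU utilization percentage, and the real controller also applies a tolerance band and stabilization windows that are ignored here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_scale, max_scale):
    """Simplified HPA rule: ceil(current * metric / target), clamped to the bounds.

    Hypothetical helper for illustration; tolerance and stabilization are ignored.
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_scale, min(max_scale, raw))


# Values taken from the sample annotations: target 30, minScale 1, maxScale 4.
TARGET, MIN_SCALE, MAX_SCALE = 30, 1, 4

# Assumed average CPU utilization (percent of the CPU request) with 1 replica running.
for utilization in (15, 30, 45, 90, 300):
    replicas = desired_replicas(1, utilization, TARGET, MIN_SCALE, MAX_SCALE)
    print(f"utilization={utilization}% -> replicas={replicas}")
```

Under these assumptions, a single replica at 90% utilization against a 30% target yields 3 replicas, and any higher load is capped at the maxScale of 4.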

Run the following command to check whether the Service is running as expected.

kubectl get ksvc

Expected output:

NAME                   URL                                               LATESTCREATED                LATESTREADY                  READY   REASON
helloworld-go-hpa      http://helloworld-go-hpa.default.example.com      helloworld-go-hpa-00001      helloworld-go-hpa-00001      True

The READY column shows True, which indicates that the Knative Service is running as expected.
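
For scripted checks, the same readiness information can be read from the Service's status conditions instead of the table output. The sketch below assumes the JSON comes from `kubectl get ksvc helloworld-go-hpa -o json`. The helper name is_ready and the trimmed sample document are hypothetical and only show the fields the check relies on.

```python
import json


def is_ready(ksvc: dict) -> bool:
    """Return True if the Knative Service reports a Ready condition with status True."""
    conditions = ksvc.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


# Hypothetical, heavily trimmed stand-in for the `kubectl ... -o json` output.
SAMPLE = """
{
  "metadata": {"name": "helloworld-go-hpa"},
  "status": {
    "conditions": [
      {"type": "ConfigurationsReady", "status": "True"},
      {"type": "Ready", "status": "True"},
      {"type": "RoutesReady", "status": "True"}
    ]
  }
}
"""

print(is_ready(json.loads(SAMPLE)))
```

In practice, the output of the kubectl command can be piped into a script that calls is_ready on the parsed document.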

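Step 2: Implement auto scaling based on CPU

Install the hey load testing tool.
For more information about hey, see Hey.

Run the following command to stress test the Service for 60 seconds, with the request rate of each hey worker limited to 100 QPS (-q 100).

Note

Replace 121.XX.XX.10 with the IP address or domain name of the gateway.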

hey -z 60s -q 100 -host "helloworld-go-hpa.default.example.com" "http://121.XX.XX.10?prime=40000000" # 121.XX.XX.10 is the gateway IP address or domain name.
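
As a rough sizing aid, the following Python sketch estimates the upper bound on the number of requests the command above can send. It assumes hey behaves as described in its usage text: -q is a rate limit per worker, and the number of concurrent workers (-c) defaults to 50 when not specified. The function name max_requests is hypothetical, and the real request count is lower whenever the Service responds more slowly than the rate limit allows.

```python
def max_requests(duration_s: int, qps_per_worker: int, workers: int = 50) -> int:
    """Upper bound on requests sent: duration x per-worker rate limit x worker count.

    The default of 50 workers is an assumption based on hey's documented -c default.
    """
    return duration_s * qps_per_worker * workers


# Flags from the command above: -z 60s, -q 100, and no -c flag.
print(max_requests(duration_s=60, qps_per_worker=100))

# With a single worker (-c 1), the cap would match a flat 100 QPS.
print(max_requests(duration_s=60, qps_per_worker=100, workers=1))
```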

While the stress test is running, run the following command to watch Pod scaling in real time.

kubectl get pods --watch

Expected output for Pod scaling:

NAME                                                     READY   STATUS    RESTARTS   AGE
# This Pod is running and its containers are ready.
helloworld-go-hpa-00001-deployment-67cc8f979b-fxfl5      2/2     Running   0          101m
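# The Service scales out to 4 Pods. New Pods start in the 0/2 Pending state, which means they are waiting to be scheduled and allocated resources.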
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      0/2     Pending   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      0/2     Pending   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      0/2     Pending   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      0/2     Pending   0          0s
# The new Pods change to 0/2 ContainerCreating, which means their containers are being created.
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      0/2     ContainerCreating   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      0/2     ContainerCreating   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      0/2     ContainerCreating   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      0/2     ContainerCreating   0          0s
# The new Pods change to 1/2 Running and then 2/2 Running, which means one or both of their containers have been created and are running.
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      1/2     Running             0          1s
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      2/2     Running             0          1s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      1/2     Running             0          1s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      2/2     Running             0          1s

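The output shows that the Knative Service scales automatically on CPU utilization through HPA. As the load increased and more Pods were needed to handle requests, the number of Pods grew from 1 to 4, which increases the processing capacity and throughput of the system.

(Optional) Step 3: View the Knative dashboard

Knative provides out-of-the-box observability. On the Knative page, click the Dashboard tab to view the monitoring data of the Service. For more information about the dashboard, see View the Knative dashboard through Alibaba Cloud Prometheus monitoring.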