Use HPA in Knative - Container Service for Kubernetes - Alibaba Cloud Documentation Center

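Alibaba Cloud Knative can be integrated with Horizontal Pod Autoscaler (HPA) to enable auto scaling based on resource load. Knative natively supports auto scaling based on the number of requests. Integrating HPA adds finer-grained scaling driven by additional metrics such as CPU and memory usage.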

Prerequisites

Knative is deployed in your cluster. For more information, see "Deploy Knative in an ACK cluster" and "Deploy Knative in an ACK Serverless cluster".
A kubectl client is connected to the ACK cluster. For more information, see "Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster".
To view Knative monitoring data on the Knative dashboard, make sure that Knative is connected to Managed Service for Prometheus. For more information, see "View the Knative dashboard in Prometheus Service".

Step 1: Deploy a Knative Service

Log on to the ACK console. In the left-side navigation pane, click [Clusters].
On the [Clusters] page, click the name of the cluster that you want to manage. In the left-side pane, choose [Applications] > [Knative].

On the [Services] tab of the Knative page, select [default] from the [Namespace] drop-down list and click [Create from Template]. Copy the following YAML content to the code editor, and then click [Create].

The following sample code creates a Knative Service named helloworld-go-hpa.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go-hpa # Specify the name of the Knative Service. 
spec:
  template:
    metadata:
      labels:
        app: helloworld-go-hpa
      annotations:
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev" # Specify HPA as the scaler. 
        autoscaling.knative.dev/metric: "cpu" # The metrics supported by HPA include CPU utilization and memory utilization. In this example, HPA is configured to work based on CPU utilization. 
        autoscaling.knative.dev/target: "30" # Specify the threshold of CPU utilization. HPA automatically scales pods for the Knative Service when the threshold is exceeded. 
        autoscaling.knative.dev/minScale: "1" # Specify the minimum number of pods that must be guaranteed. 
        autoscaling.knative.dev/maxScale: "4" # Specify the maximum number of pods that are allowed. 
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/autoscale-go:v1024
          resources:
            requests:
              cpu: '200m'
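
The CPU target in the manifest above is a percentage of the CPU request, so it can help to see what it means in absolute terms. The following is a minimal sketch, not part of the original procedure: the helper names are hypothetical, and it considers only the container shown in the manifest. HPA computes utilization against the summed requests of all containers in a pod, so any sidecar request shifts the absolute number.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '200m' or '0.5' to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Plain numbers are whole or fractional cores.
    return round(float(quantity) * 1000)


def scale_out_threshold_millicores(cpu_request: str, target_percent: int) -> float:
    """Average CPU usage (in millicores) that corresponds to the target percentage."""
    return cpu_to_millicores(cpu_request) * target_percent / 100


# Values taken from the manifest above: requests.cpu is '200m', target is "30".
print(scale_out_threshold_millicores("200m", 30))  # 60.0
```

Under these assumptions, average usage above roughly 60 millicores for the sample container corresponds to exceeding the 30% target.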

Run the following command to check whether the Knative Service runs as expected:

kubectl get ksvc

Expected output:

NAME                   URL                                               LATESTCREATED                LATESTREADY                  READY   REASON
helloworld-go-hpa      http://helloworld-go-hpa.default.example.com      helloworld-go-hpa-00001      helloworld-go-hpa-00001      True

If True is displayed in the READY column, the Knative Service is running as expected.
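
If you want to automate the READY check, for example in a deployment script, the table printed by kubectl get ksvc can be parsed. The sketch below is an illustration only: the function name is hypothetical, the sample text is copied from the expected output above, and the whitespace-based parsing assumes that every column to the left of READY is non-empty.

```python
SAMPLE_OUTPUT = """\
NAME                   URL                                               LATESTCREATED                LATESTREADY                  READY   REASON
helloworld-go-hpa      http://helloworld-go-hpa.default.example.com      helloworld-go-hpa-00001      helloworld-go-hpa-00001      True
"""


def ksvc_is_ready(table: str, name: str) -> bool:
    """Return True if the named Knative Service shows True in the READY column."""
    lines = [line for line in table.splitlines() if line.strip()]
    if not lines:
        return False
    # Locate the READY column from the header row.
    ready_index = lines[0].split().index("READY")
    for line in lines[1:]:
        fields = line.split()
        if fields[0] == name:
            return len(fields) > ready_index and fields[ready_index] == "True"
    return False  # The service is not listed at all.


print(ksvc_is_ready(SAMPLE_OUTPUT, "helloworld-go-hpa"))  # True
```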

Step 2: Send requests to test auto scaling based on CPU utilization

Install the hey load testing tool.
For more information about hey, see "Hey".

Run the following command to perform a load test that sends requests for 60 seconds with the rate limit set to 100 queries per second (QPS):

Note

Replace 121.XX.XX.10 with the IP address or domain name of the gateway.

hey -z 60s -q 100 -host "helloworld-go-hpa.default.example.com" "http://121.XX.XX.10?prime=40000000" # 121.XX.XX.10 is the IP address or domain name of the gateway.

During the load test, you can run the following command to watch in real time whether pods are scaled:

kubectl get pods --watch

Expected output:

NAME                                                     READY   STATUS    RESTARTS   AGE
# The pod is running as expected and containers are in the ready state. 
helloworld-go-hpa-00001-deployment-67cc8f979b-fxfl5      2/2     Running   0          101m
# The number of pods is scaled out to four. The READY column displays 0/2 for each pod and the STATUS column displays Pending for each pod. This means that the pods are pending and resources are not allocated to the pods. 
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      0/2     Pending   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      0/2     Pending   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      0/2     Pending   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      0/2     Pending   0          0s
# The READY column displays 0/2 for each pod and the STATUS column displays ContainerCreating for each pod. This means that the containers in the pods are being created. 
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      0/2     ContainerCreating   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      0/2     ContainerCreating   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      0/2     ContainerCreating   0          0s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      0/2     ContainerCreating   0          0s
# The READY column displays 1/2 for two pods and 2/2 for two pods, and the STATUS column displays Running for each pod. This means that at least one container is created and runs as expected for each pod. 
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      1/2     Running             0          1s
helloworld-go-hpa-00001-deployment-67cc8f979b-kv6rj      2/2     Running             0          1s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      1/2     Running             0          1s
helloworld-go-hpa-00001-deployment-67cc8f979b-fxq85      2/2     Running             0          1s

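The output shows that HPA automatically scales the pods of the Knative Service. As the load increases, HPA scales the number of pods from 1 to 4 to increase the processing capacity and throughput of the Knative Service.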
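
The scale-out behavior follows the standard Kubernetes HPA calculation: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up and then clamped to the configured minimum and maximum. The following is a simplified sketch with a hypothetical function name. It ignores the HPA tolerance band, stabilization windows, and the handling of pods that are not ready, and the utilization figures in the examples are made up for illustration.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Simplified HPA rule: ceil(current * observed / target), clamped to [min, max]."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# target=30, minScale=1, and maxScale=4 come from the sample Knative Service.
print(desired_replicas(1, 90, 30, 1, 4))   # 3: one pod at 90% needs three pods at 30%
print(desired_replicas(1, 150, 30, 1, 4))  # 4: the raw result 5 is capped by maxScale
print(desired_replicas(4, 5, 30, 1, 4))    # 1: low load scales back in to minScale
```

Because of the clamp, the pod count for this Service never exceeds 4, however high the CPU utilization climbs during the load test.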

(Optional) Step 3: View the Knative dashboard

Knative provides out-of-the-box observability for Knative Services. You can view the Knative dashboard on the [Monitoring Dashboards] tab of the Knative page. For more information about the Knative dashboard, see "View the Knative dashboard in Prometheus Service".
