Configure GPU resources for a Knative Service and enable GPU sharing - Container Service for Kubernetes

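When you deploy AI tasks, high-performance computing jobs, or other workloads that require GPU resources in Knative, you can specify a GPU-accelerated instance type in the Knative Service to create GPU-accelerated instances. You can also enable GPU sharing so that multiple pods share a single GPU, which maximizes GPU utilization.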

Prerequisites

Knative is deployed in your cluster. For more information, see "Deploy Knative".

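Configure GPU resources

To specify a GPU-accelerated ECS instance type, add the k8s.aliyun.com/eci-use-specs annotation to the spec.template.metadata.annotations section of the Knative Service configuration. To specify the amount of GPU resources that the Knative Service requires, add the nvidia.com/gpu field to the resources.limits section of the container under spec.template.spec.containers.

The following code block provides an example: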

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
spec:
  template:
    metadata:
      labels:
        app: helloworld-go
      annotations:
        k8s.aliyun.com/eci-use-specs: ecs.gn5i-c4g1.xlarge  # Specify a GPU-accelerated ECS instance type that is supported by Knative. 
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/helloworld-go:73fbdd56
          ports:
          - containerPort: 8080
          resources:
            limits:
              nvidia.com/gpu: '1'    # Specify the number of GPUs that are required by the container. This field is required. If you do not specify this field, an error is returned when the pod is launched.

The following GPU-accelerated ECS instance families are supported:

gn7i: a GPU-accelerated compute-optimized instance family that uses NVIDIA A10 GPUs. This instance family includes instance types such as ecs.gn7i-c8g1.2xlarge.
gn7: This instance family includes instance types such as ecs.gn7-c12g1.3xlarge.
gn6v: a GPU-accelerated compute-optimized instance family that uses NVIDIA V100 GPUs. This instance family includes instance types such as ecs.gn6v-c8g1.2xlarge.
gn6e: a GPU-accelerated compute-optimized instance family that uses NVIDIA V100 GPUs. This instance family includes instance types such as ecs.gn6e-c12g1.3xlarge.
gn6i: a GPU-accelerated compute-optimized instance family that uses NVIDIA T4 GPUs. This instance family includes instance types such as ecs.gn6i-c4g1.xlarge.
gn5i: a GPU-accelerated compute-optimized instance family that uses NVIDIA P4 GPUs. This instance family includes instance types such as ecs.gn5i-c2g1.large.
gn5: a GPU-accelerated compute-optimized instance family that uses NVIDIA P100 GPUs. This instance family includes instance types such as ecs.gn5-c4g1.xlarge.
The gn5 instance family is equipped with local disks. You can mount the local disks to elastic container instances. For more information, see "Create an elastic container instance that has local disks attached".

Note

GPU-accelerated elastic container instances support NVIDIA GPU driver version 460.73.01 and CUDA Toolkit version 11.2.
For more information about GPU-accelerated ECS instance families, see "ECS instance types available for each region" and "Overview of instance families".
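
To target a different family from the list above, only the annotation value needs to change. The following is a minimal sketch, not a verified configuration: it reuses the sample image from the example above, swaps in the gn6i instance type named in the list, and uses a hypothetical Service name (helloworld-go-t4). Whether that instance type is available in your region is an assumption you should check against the instance-type references mentioned in the note.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go-t4   # hypothetical name, chosen for this sketch
spec:
  template:
    metadata:
      annotations:
        # Instance type taken from the gn6i (NVIDIA T4) entry in the list above.
        k8s.aliyun.com/eci-use-specs: ecs.gn6i-c4g1.xlarge
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/helloworld-go:73fbdd56
          ports:
            - containerPort: 8080
          resources:
            limits:
              nvidia.com/gpu: '1'   # still required, as in the example above
```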

Enable GPU sharing

To enable GPU sharing for nodes, see "Examples".

You can set the aliyun.com/gpu-mem field in the Knative Service to specify the GPU memory size. Example:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "100"
        autoscaling.knative.dev/minScale: "0"
    spec:
      containerConcurrency: 1
      containers:
      - image: registry-vpc.cn-hangzhou.aliyuncs.com/hz-suoxing-test/test:helloworld-go
        name: user-container
        ports:
        - containerPort: 6666
          name: http1
          protocol: TCP
        resources:
          limits:
            aliyun.com/gpu-mem: "3" # Specify the GPU memory size.
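
To illustrate what sharing might look like, a second Service could request its own slice of GPU memory through the same field. This is only a sketch: the Service name (helloworld-go-b) and the requested value are hypothetical, the image is reused from the example above, and whether both pods actually land on the same GPU depends on the scheduler and on how much GPU memory is free on the node.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go-b   # hypothetical second Service
  namespace: default
spec:
  template:
    spec:
      containers:
      - image: registry-vpc.cn-hangzhou.aliyuncs.com/hz-suoxing-test/test:helloworld-go
        ports:
        - containerPort: 6666
        resources:
          limits:
            aliyun.com/gpu-mem: "5"   # hypothetical value; same unit as the example above
```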
