Use Terraform to create a node pool that has auto scaling enabled

By default, nodes in node pools and managed node pools of Container Service for Kubernetes (ACK) are not automatically scaled in or out. This topic describes how to use Terraform to create a node pool that has auto scaling enabled.

Note

You can run the sample code in this topic with a few clicks. For more information, see "Terraform Explorer".

Prerequisites

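Auto Scaling is activated. The auto scaling feature depends on the Alibaba Cloud Auto Scaling service. Therefore, before you enable auto scaling for nodes, you must activate Auto Scaling and assign the default Auto Scaling role to your account. For more information, see "Activate Auto Scaling".
Note
If you previously used the alicloud_cs_kubernetes_autoscaler component, Auto Scaling is already activated.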
Permissions to access CloudOps Orchestration Service (OOS) are granted. To create the AliyunOOSLifecycleHook4CSRole role, which provides the OOS access permissions, perform the following steps:
1. Click AliyunOOSLifecycleHook4CSRole.
  Note
  If the current account is an Alibaba Cloud account, click AliyunOOSLifecycleHook4CSRole.
  If the current account is a RAM user, make sure that the AliyunOOSLifecycleHook4CSRole role is assigned to the Alibaba Cloud account, and then attach the AliyunRAMReadOnlyAccess policy to the RAM user. For more information, see "Grant permissions to a RAM user".
2. On the [Cloud Resource Access Authorization] page, click [Agree to Authorization].
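The Terraform runtime environment is prepared by using one of the following methods:
- Use Terraform in Terraform Explorer: Alibaba Cloud provides an online runtime environment for Terraform. You can log on to the environment and use Terraform without installing it. This method suits scenarios in which you want to use and debug Terraform quickly and conveniently at low cost.
- Use Terraform in Cloud Shell: Terraform is preinstalled in Cloud Shell and identity credentials are already configured, so you can run Terraform commands directly in Cloud Shell. This method suits scenarios in which you want to access and use Terraform quickly and conveniently at low cost.
- Install and configure Terraform on an on-premises machine: This method suits scenarios in which network connections are unstable or a custom development environment is required.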

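Background information

Terraform is an open source tool that supports new infrastructure through Terraform providers. You can use Terraform to preview, configure, and manage cloud infrastructure and resources. For more information, see "What is Terraform?".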

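In earlier versions of Alibaba Cloud Provider, ACK provides a component named alicloud_cs_kubernetes_autoscaler. You can use this component to enable auto scaling for nodes. However, the following limits apply:

- The configuration is complex and costly.
- Each node that is added by a scale-out activity joins the default node pool and cannot be maintained separately.
- Some parameters cannot be modified.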

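In Alibaba Cloud Provider 1.111.0 and later, you can use the alicloud_cs_kubernetes_node_pool component to create a node pool that has auto scaling enabled. This component has the following benefits:

- Provides simple scaling configuration. You only need to set the lower and upper limits on the number of nodes in the scaling group.
- Uses default settings for optional parameters to keep the environments of nodes consistent. This prevents user errors such as configuring different OS images for different nodes.
- Lets you view changes to the nodes in a node pool directly in the ACK console.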
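
The version requirement above can be pinned in the configuration so that Terraform refuses to run with an older provider. The following is a minimal sketch, not part of the original sample: it assumes that any release at or above 1.111.0 is acceptable, and it also declares the hashicorp/random provider because the cluster example later in this topic uses a random_integer resource.

```hcl
terraform {
  required_providers {
    alicloud = {
      source = "aliyun/alicloud"
      # Assumption: every release from 1.111.0 onward is acceptable.
      # Tighten the constraint if you need a specific provider version.
      version = ">= 1.111.0"
    }
    random = {
      # Needed only by configurations that use random_integer.
      source = "hashicorp/random"
    }
  }
}
```

With this block in place, terraform init downloads a provider release that satisfies the constraint and reports an error if none is available.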

Resources

Note

You are charged for specific resources. If you no longer need the resources, release or unsubscribe from them at the earliest opportunity.

- alicloud_instance_types: queries Elastic Compute Service (ECS) instance types that meet the specified conditions.
- alicloud_vpc: creates a virtual private cloud (VPC).
- alicloud_vswitch: creates a vSwitch, which serves as a subnet of the VPC.
- alicloud_cs_managed_kubernetes: creates an ACK managed cluster.
- alicloud_cs_kubernetes_node_pool: creates a node pool for the ACK managed cluster.

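If alicloud_cs_kubernetes_autoscaler was previously used

If you previously used the alicloud_cs_kubernetes_autoscaler component, authorize the cluster to access Auto Scaling and perform the following steps to switch to the alicloud_cs_kubernetes_node_pool component. You can then create node pools that have auto scaling enabled in the cluster.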

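Modify the autoscaler-meta ConfigMap.
1. Log on to the ACK console. In the left-side navigation pane, click [Clusters].
2. On the [Clusters] page, click the name of the cluster that you want to manage. In the left-side navigation pane, choose [Configurations] > [ConfigMaps].
3. In the upper-left corner of the ConfigMap page, select kube-system from the Namespace drop-down list. Find the autoscaler-meta ConfigMap and click [Edit] in the [Actions] column.
4. In the [Edit] panel, modify the value of the autoscaler-meta ConfigMap.
  You must change the value of taints from the string type to the array type. To do so, change "taints":"" to "taints":[] in the [Value] text box.
5. Click [OK].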
Synchronize the node pool.
1. In the left-side navigation pane of the details page, choose [Nodes] > [Node Pools].
2. In the upper-right corner of the [Node Pools] page, click [Sync Node Pool].

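If alicloud_cs_kubernetes_autoscaler was not previously used

You can use Terraform to create a node pool that has auto scaling enabled.

Create a node pool configuration file.

Create a node pool that has auto scaling enabled in an existing ACK cluster.

The following code provides an example of how to create a node pool that has auto scaling enabled in an existing ACK cluster: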

provider "alicloud" {
}
# Create a node pool that has auto scaling enabled in an existing ACK cluster. 
resource "alicloud_cs_kubernetes_node_pool" "at1" {
  # The ID of the ACK cluster where you want to create the node pool. 
  cluster_id           = ""
  name                 = "np-test"
  # The vSwitches that are used by nodes in the node pool. You must specify at least one vSwitch. 
  vswitch_ids          = ["vsw-bp1mdigyhmilu2h4v****"]
  instance_types       = ["ecs.e3.medium"]
  password             = "Hello1234"
 
  scaling_config {
    # The minimum number of nodes in the node pool. 
    min_size     = 1
    # The maximum number of nodes in the node pool. 
    max_size     = 5
  }

}
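
The example above leaves the cluster ID empty and hard-codes the node password. One possible variation, sketched here under the assumption that Terraform 0.14 or later is in use (required for the sensitive argument), moves both values into input variables. The variable names cluster_id and node_password are hypothetical and do not appear in the original sample.

```hcl
# Hypothetical input variables; the names are illustrative only.
variable "cluster_id" {
  type        = string
  description = "ID of the existing ACK cluster that hosts the node pool."
}

variable "node_password" {
  type        = string
  description = "Logon password for the nodes in the node pool."
  sensitive   = true # Redacts the value in plan and apply output.
}

# In the alicloud_cs_kubernetes_node_pool resource, reference the variables
# instead of literal values:
#   cluster_id = var.cluster_id
#   password   = var.node_password
```

The values can then be supplied at run time, for example through the TF_VAR_cluster_id and TF_VAR_node_password environment variables. Note that a sensitive variable is hidden in command output but is still written to the Terraform state.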

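Create a cluster that contains a node pool that has auto scaling enabled.

The following code provides an example of how to create a cluster that contains a node pool that has auto scaling enabled: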

provider "alicloud" {
  region = var.region_id
}

variable "region_id" {
  type    = string
  default = "cn-shenzhen"
}

variable "cluster_spec" {
  type        = string
  description = "The cluster specifications of kubernetes cluster,which can be empty. Valid values:ack.standard : Standard managed clusters; ack.pro.small : Professional managed clusters."
  default     = "ack.pro.small"
}

# Specify the zones of vSwitches. 
variable "availability_zone" {
  description = "The availability zones of vswitches."
  default     = ["cn-shenzhen-c", "cn-shenzhen-e", "cn-shenzhen-f"]
}

# The CIDR blocks used to create vSwitches. 
variable "node_vswitch_cidrs" {
  type        = list(string)
  default     = ["172.16.0.0/23", "172.16.2.0/23", "172.16.4.0/23"]
}

# This variable specifies the CIDR blocks in which Terway vSwitches are created. 
variable "terway_vswitch_cidrs" {
  type        = list(string)
  default     = ["172.16.208.0/20", "172.16.224.0/20", "172.16.240.0/20"]
}

# Specify the ECS instance types of worker nodes. 
variable "worker_instance_types" {
  description = "The ecs instance types used to launch worker nodes."
  default     = ["ecs.g6.2xlarge", "ecs.g6.xlarge"]
}

# Specify a password for the worker node.
variable "password" {
  description = "The password of ECS instance."
  default     = "Test123456"
}

# Specify the prefix of the name of the ACK managed cluster. 
variable "k8s_name_prefix" {
  description = "The name prefix used to create managed kubernetes cluster."
  default     = "tf-ack-shenzhen"
}

# Specify the components that you want to install in the ACK managed cluster. The components include Terway (network plug-in), csi-plugin (volume plug-in), csi-provisioner (volume plug-in), logtail-ds (logging plug-in), the NGINX Ingress controller, ack-arms-prometheus (monitoring plug-in), and ack-node-problem-detector (node diagnostics plug-in). 
variable "cluster_addons" {
  type = list(object({
    name   = string
    config = string
  }))

  default = [
    {
      "name"   = "terway-eniip",
      "config" = "",
    },
    {
      "name"   = "logtail-ds",
      "config" = "{\"IngressDashboardEnabled\":\"true\"}",
    },
    {
      "name"   = "nginx-ingress-controller",
      "config" = "{\"IngressSlbNetworkType\":\"internet\"}",
    },
    {
      "name"   = "arms-prometheus",
      "config" = "",
    },
    {
      "name"   = "ack-node-problem-detector",
      "config" = "{\"sls_project_name\":\"\"}",
    },
    {
      "name"   = "csi-plugin",
      "config" = "",
    },
    {
      "name"   = "csi-provisioner",
      "config" = "",
    }
  ]
}

# The default resource names. 
locals {
  k8s_name_terway = "k8s_name_terway_${random_integer.default.result}"
  vpc_name = "vpc_name_${random_integer.default.result}"
  autoscale_nodepool_name = "autoscale-node-pool-${random_integer.default.result}"
}

# The ECS instance specifications of the worker nodes. Terraform searches for ECS instance types that fulfill the CPU and memory requests. 
data "alicloud_instance_types" "default" {
  cpu_core_count       = 8
  memory_size          = 32
  availability_zone    = var.availability_zone[0]
  kubernetes_node_role = "Worker"
}

resource "random_integer" "default" {
  min = 10000
  max = 99999
}

# The VPC. 
resource "alicloud_vpc" "default" {
  vpc_name   = local.vpc_name
  cidr_block = "172.16.0.0/12"
}

# The node vSwitch. 
resource "alicloud_vswitch" "vswitches" {
  count      = length(var.node_vswitch_cidrs)
  vpc_id     = alicloud_vpc.default.id
  cidr_block = element(var.node_vswitch_cidrs, count.index)
  zone_id    = element(var.availability_zone, count.index)
}

# The pod vSwitch. 
resource "alicloud_vswitch" "terway_vswitches" {
  count      = length(var.terway_vswitch_cidrs)
  vpc_id     = alicloud_vpc.default.id
  cidr_block = element(var.terway_vswitch_cidrs, count.index)
  zone_id    = element(var.availability_zone, count.index)
}

# The ACK managed cluster. 
resource "alicloud_cs_managed_kubernetes" "default" {
  name                         = local.k8s_name_terway # The ACK cluster name. 
  cluster_spec                 = var.cluster_spec      # Create an ACK Pro cluster. 
  worker_vswitch_ids           = split(",", join(",", alicloud_vswitch.vswitches.*.id))        # The vSwitch to which the node pool belongs. Specify one or more vSwitch IDs. The vSwitches must reside in the zone specified by availability_zone. 
  pod_vswitch_ids              = split(",", join(",", alicloud_vswitch.terway_vswitches.*.id)) # The vSwitch of the pod. 
  new_nat_gateway              = true                                                          # Specify whether to create a NAT gateway when the Kubernetes cluster is created. Default value: true. 
  service_cidr                 = "10.11.0.0/16"                                                # The Service CIDR block. It cannot overlap with the VPC CIDR block or the CIDR blocks of other Kubernetes clusters in the VPC, and it cannot be changed after the cluster is created. 
  slb_internet_enabled         = true                                                          # Specify whether to create an Internet-facing SLB instance for the API server of the cluster. Default value: false. 
  enable_rrsa                  = true
  control_plane_log_components = ["apiserver", "kcm", "scheduler", "ccm"] # The control plane logs. 
  dynamic "addons" {                                                      # Component management. 
    for_each = var.cluster_addons
    content {
      name   = addons.value.name
      config = addons.value.config
    }
  }
}

# Create a node pool for which auto scaling is enabled. The node pool can be scaled out to a maximum of 10 nodes and must contain at least 1 node. 
resource "alicloud_cs_kubernetes_node_pool" "autoscale_node_pool" {
  cluster_id     = alicloud_cs_managed_kubernetes.default.id
  node_pool_name = local.autoscale_nodepool_name
  vswitch_ids    = split(",", join(",", alicloud_vswitch.vswitches.*.id))

  scaling_config {
    min_size = 1
    max_size = 10
  }

  instance_types        = var.worker_instance_types
  password              = var.password # The password that is used to log on to the cluster by using SSH. 
  install_cloud_monitor = true         # Specify whether to install the CloudMonitor agent on the nodes in the cluster. 
  system_disk_category  = "cloud_efficiency"
  system_disk_size      = 100
  image_type            = "AliyunLinux3"

  data_disks {              # The data disk configuration of the node. 
    category = "cloud_essd" # The disk category. 
    size     = 120          # The disk size. 
  }
}
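
To make the IDs of the created resources easy to retrieve, output values can be appended to the configuration above. This is a minimal sketch that relies only on the resource names already declared in the preceding code; the output names cluster_id and autoscale_node_pool_id are hypothetical.

```hcl
# Hypothetical output names; the referenced resources are declared above.
output "cluster_id" {
  description = "ID of the ACK managed cluster."
  value       = alicloud_cs_managed_kubernetes.default.id
}

output "autoscale_node_pool_id" {
  description = "Terraform resource ID of the node pool that has auto scaling enabled."
  value       = alicloud_cs_kubernetes_node_pool.autoscale_node_pool.id
}
```

After the configuration is applied, the terraform output command prints these values.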

Run the following command to initialize the Terraform runtime environment:

terraform init

If the following information is returned, Terraform is initialized:

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Run the terraform apply command to create the node pool.
Verify the result.
After the node pool is created, you can find it on the [Node Pools] page. Auto Scaling Enabled is displayed below the name of the node pool.

Clear resources

If you no longer need the preceding resources that were created or managed by Terraform, run the terraform destroy command to release them. For more information about the terraform destroy command, see "Common commands".

terraform destroy

Sample code

Note

You can run the sample code in this topic with a few clicks. For more information, see "Terraform Explorer".

Complete code

provider "alicloud" {
  region = var.region_id
}

variable "region_id" {
  type    = string
  default = "cn-shenzhen"
}

variable "cluster_spec" {
  type        = string
  description = "The cluster specifications of kubernetes cluster,which can be empty. Valid values:ack.standard : Standard managed clusters; ack.pro.small : Professional managed clusters."
  default     = "ack.pro.small"
}

# Specify the zones of vSwitches. 
variable "availability_zone" {
  description = "The availability zones of vswitches."
  default     = ["cn-shenzhen-c", "cn-shenzhen-e", "cn-shenzhen-f"]
}

# The CIDR blocks used to create vSwitches. 
variable "node_vswitch_cidrs" {
  type        = list(string)
  default     = ["172.16.0.0/23", "172.16.2.0/23", "172.16.4.0/23"]
}

# This variable specifies the CIDR blocks in which Terway vSwitches are created. 
variable "terway_vswitch_cidrs" {
  type        = list(string)
  default     = ["172.16.208.0/20", "172.16.224.0/20", "172.16.240.0/20"]
}

# Specify the ECS instance types of worker nodes. 
variable "worker_instance_types" {
  description = "The ecs instance types used to launch worker nodes."
  default     = ["ecs.g6.2xlarge", "ecs.g6.xlarge"]
}

# Specify a password for the worker node.
variable "password" {
  description = "The password of ECS instance."
  default     = "Test123456"
}

# Specify the prefix of the name of the ACK managed cluster. 
variable "k8s_name_prefix" {
  description = "The name prefix used to create managed kubernetes cluster."
  default     = "tf-ack-shenzhen"
}

# Specify the components that you want to install in the ACK managed cluster. The components include Terway (network plug-in), csi-plugin (volume plug-in), csi-provisioner (volume plug-in), logtail-ds (logging plug-in), the NGINX Ingress controller, ack-arms-prometheus (monitoring plug-in), and ack-node-problem-detector (node diagnostics plug-in). 
variable "cluster_addons" {
  type = list(object({
    name   = string
    config = string
  }))

  default = [
    {
      "name"   = "terway-eniip",
      "config" = "",
    },
    {
      "name"   = "logtail-ds",
      "config" = "{\"IngressDashboardEnabled\":\"true\"}",
    },
    {
      "name"   = "nginx-ingress-controller",
      "config" = "{\"IngressSlbNetworkType\":\"internet\"}",
    },
    {
      "name"   = "arms-prometheus",
      "config" = "",
    },
    {
      "name"   = "ack-node-problem-detector",
      "config" = "{\"sls_project_name\":\"\"}",
    },
    {
      "name"   = "csi-plugin",
      "config" = "",
    },
    {
      "name"   = "csi-provisioner",
      "config" = "",
    }
  ]
}

# The default resource names. 
locals {
  k8s_name_terway = "k8s_name_terway_${random_integer.default.result}"
  vpc_name = "vpc_name_${random_integer.default.result}"
  autoscale_nodepool_name = "autoscale-node-pool-${random_integer.default.result}"
}

# The ECS instance specifications of the worker nodes. Terraform searches for ECS instance types that fulfill the CPU and memory requests. 
data "alicloud_instance_types" "default" {
  cpu_core_count       = 8
  memory_size          = 32
  availability_zone    = var.availability_zone[0]
  kubernetes_node_role = "Worker"
}

resource "random_integer" "default" {
  min = 10000
  max = 99999
}

# The VPC. 
resource "alicloud_vpc" "default" {
  vpc_name   = local.vpc_name
  cidr_block = "172.16.0.0/12"
}

# The node vSwitch. 
resource "alicloud_vswitch" "vswitches" {
  count      = length(var.node_vswitch_cidrs)
  vpc_id     = alicloud_vpc.default.id
  cidr_block = element(var.node_vswitch_cidrs, count.index)
  zone_id    = element(var.availability_zone, count.index)
}

# The pod vSwitch. 
resource "alicloud_vswitch" "terway_vswitches" {
  count      = length(var.terway_vswitch_cidrs)
  vpc_id     = alicloud_vpc.default.id
  cidr_block = element(var.terway_vswitch_cidrs, count.index)
  zone_id    = element(var.availability_zone, count.index)
}

# The ACK managed cluster. 
resource "alicloud_cs_managed_kubernetes" "default" {
  name                         = local.k8s_name_terway # The ACK cluster name. 
  cluster_spec                 = var.cluster_spec      # Create an ACK Pro cluster. 
  worker_vswitch_ids           = split(",", join(",", alicloud_vswitch.vswitches.*.id))        # The vSwitch to which the node pool belongs. Specify one or more vSwitch IDs. The vSwitches must reside in the zone specified by availability_zone. 
  pod_vswitch_ids              = split(",", join(",", alicloud_vswitch.terway_vswitches.*.id)) # The vSwitch of the pod. 
  new_nat_gateway              = true                                                          # Specify whether to create a NAT gateway when the Kubernetes cluster is created. Default value: true. 
  service_cidr                 = "10.11.0.0/16"                                                # The Service CIDR block. It cannot overlap with the VPC CIDR block or the CIDR blocks of other Kubernetes clusters in the VPC, and it cannot be changed after the cluster is created. 
  slb_internet_enabled         = true                                                          # Specify whether to create an Internet-facing SLB instance for the API server of the cluster. Default value: false. 
  enable_rrsa                  = true
  control_plane_log_components = ["apiserver", "kcm", "scheduler", "ccm"] # The control plane logs. 
  dynamic "addons" {                                                      # Component management. 
    for_each = var.cluster_addons
    content {
      name   = addons.value.name
      config = addons.value.config
    }
  }
}

# Create a node pool for which auto scaling is enabled. The node pool can be scaled out to a maximum of 10 nodes and must contain at least 1 node. 
resource "alicloud_cs_kubernetes_node_pool" "autoscale_node_pool" {
  cluster_id     = alicloud_cs_managed_kubernetes.default.id
  node_pool_name = local.autoscale_nodepool_name
  vswitch_ids    = split(",", join(",", alicloud_vswitch.vswitches.*.id))

  scaling_config {
    min_size = 1
    max_size = 10
  }

  instance_types        = var.worker_instance_types
  password              = var.password # The password that is used to log on to the cluster by using SSH. 
  install_cloud_monitor = true         # Specify whether to install the CloudMonitor agent on the nodes in the cluster. 
  system_disk_category  = "cloud_efficiency"
  system_disk_size      = 100
  image_type            = "AliyunLinux3"

  data_disks {              # The data disk configuration of the node. 
    category = "cloud_essd" # The disk category. 
    size     = 120          # The disk size. 
  }
}

To view more complete examples, visit the directory of the corresponding service on the Landing with Terraform page.