Container Service for Kubernetes: Use Terraform to create a node pool that has auto scaling enabled

Last Updated: Dec 20, 2024

By default, nodes in node pools and managed node pools of Container Service for Kubernetes (ACK) cannot automatically scale in or out. This topic describes how to use Terraform to create a node pool that has auto scaling enabled.

Note

You can run the sample code in this topic with a few clicks. For more information, visit Terraform Explorer.

Prerequisites

  • The node auto scaling feature relies on the Auto Scaling service of Alibaba Cloud. Therefore, you must activate Auto Scaling and assign the default Auto Scaling role to your account before you enable auto scaling for nodes. For more information, see Activate Auto Scaling.

    Note

    If you previously used the alicloud_cs_kubernetes_autoscaler component, Auto Scaling is already activated.

  • Permissions to access CloudOps Orchestration Service (OOS) are granted. You can perform the following steps to create the AliyunOOSLifecycleHook4CSRole role, which provides the required OOS access permissions.

    1. Click AliyunOOSLifecycleHook4CSRole.

      Note
      • If the current account is an Alibaba Cloud account, click AliyunOOSLifecycleHook4CSRole.

      • If the current account is a RAM user, make sure that your Alibaba Cloud account is assigned the AliyunOOSLifecycleHook4CSRole role. Then, attach the AliyunRAMReadOnlyAccess policy to the RAM user. For more information, see Grant permissions to a RAM user.

    2. On the Cloud Resource Access Authorization page, click Agree to Authorization.

  • The runtime environment for Terraform is prepared by using one of the following methods:

    • Use Terraform in Terraform Explorer: Alibaba Cloud provides an online runtime environment for Terraform. You can log on to the environment to use Terraform without the need to install Terraform. This method is suitable for scenarios where you need to use and debug Terraform in a low-cost, efficient, and convenient manner.

    • Use Terraform in Cloud Shell: Cloud Shell is preinstalled with Terraform and configured with your identity credentials. You can directly run Terraform commands in Cloud Shell. This method is suitable for scenarios where you want to run Terraform commands without installing Terraform or configuring credentials.

    • Install and configure Terraform on your on-premises machine: This method is suitable for scenarios where network connections are unstable or a custom development environment is needed.
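
      The following commands show one way to verify an on-premises installation and pass Alibaba Cloud credentials to the alicloud provider through environment variables. The variable names are the ones supported by the alicloud provider; the values are placeholders that you must replace with your own credentials and region.

      # Check that Terraform is installed and print its version.
      terraform version
      # Provide credentials and a default region for the alicloud provider.
      export ALICLOUD_ACCESS_KEY="<your AccessKey ID>"
      export ALICLOUD_SECRET_KEY="<your AccessKey secret>"
      export ALICLOUD_REGION="cn-shenzhen"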

Background information

Terraform is an open source tool that can provision and manage new infrastructure through Terraform providers. You can use Terraform to preview, configure, and manage cloud infrastructure and resources. For more information, see What is Terraform?

In earlier versions of the Alibaba Cloud Provider, ACK provided a component named alicloud_cs_kubernetes_autoscaler. This component can be used to enable auto scaling for nodes. However, the following limits apply:

  • The configuration is complex and the cost is high.

  • Each node to be scaled is added to the default node pool and cannot be separately maintained.

  • Some parameters cannot be modified.

Alibaba Cloud Provider 1.111.0 and later allow you to create node pools that have auto scaling enabled by using the alicloud_cs_kubernetes_node_pool component. This component has the following benefits:

  • Provides simple scaling configurations. You need to set only the minimum and maximum numbers of nodes in the scaling group.

  • Uses default settings for optional parameters to keep node environments consistent. This prevents user errors, such as configuring different OS images for different nodes.

  • Allows you to view node changes in a node pool directly in the ACK console.

Resources

The sample code in this topic involves the following resources: a virtual private cloud (VPC) and vSwitches (alicloud_vpc and alicloud_vswitch), an ACK managed cluster (alicloud_cs_managed_kubernetes), and a node pool that has auto scaling enabled (alicloud_cs_kubernetes_node_pool).

Note

You are charged for specific resources. If you no longer require the resources, you must release or unsubscribe from the resources at the earliest opportunity.

Use Terraform to create a node pool that has auto scaling enabled

alicloud_cs_kubernetes_autoscaler was previously used

If you previously used the alicloud_cs_kubernetes_autoscaler component, authorize your cluster to access Auto Scaling and perform the following steps to switch to the alicloud_cs_kubernetes_node_pool component. Then, you can create node pools that have auto scaling enabled in your cluster.

  1. Modify the autoscaler-meta ConfigMap.

    1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

    2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Configurations > ConfigMaps.

    3. In the upper-left corner of the ConfigMap page, select kube-system from the Namespace drop-down list. Find the autoscaler-meta ConfigMap and click Edit in the Actions column.

    4. In the Edit panel, modify the value of the autoscaler-meta ConfigMap.

      Change the value of the taints field from the string type to the array type. To do this, change "taints":"" to "taints":[] in the Value text box.

    5. Click OK.

  2. Synchronize the node pool.

    1. In the left-side navigation pane of the details page, choose Nodes > Node Pools.

    2. In the upper-right corner of the Node Pools page, click Sync Node Pool.

alicloud_cs_kubernetes_autoscaler was not previously used

Perform the following steps to use Terraform to create a node pool that has auto scaling enabled.

  1. Create a node pool configuration file.

    Create a node pool that has auto scaling enabled in an existing ACK cluster.

    The following code provides an example on how to create a node pool that has auto scaling enabled in an existing ACK cluster:

    provider "alicloud" {
    }
    # Create a node pool that has auto scaling enabled in an existing ACK cluster. 
    resource "alicloud_cs_kubernetes_node_pool" "at1" {
      # The ID of the ACK cluster where you want to create the node pool. 
      cluster_id           = ""
      name                 = "np-test"
      # The vSwitches that are used by nodes in the node pool. You must specify at least one vSwitch. 
      vswitch_ids          = ["vsw-bp1mdigyhmilu2h4v****"]
      instance_types       = ["ecs.e3.medium"]
      password             = "Hello1234"
     
      scaling_config {
        # The minimum number of nodes in the node pool. 
        min_size     = 1
        # The maximum number of nodes in the node pool. 
        max_size     = 5
      }
    
    }
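
    Before you apply this example, replace the value of cluster_id with the ID of your existing ACK cluster, and replace the vSwitch ID in vswitch_ids with the ID of a vSwitch in the virtual private cloud (VPC) of the cluster. The password and instance type are only sample values.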

    Create a cluster that contains a node pool with auto scaling enabled

    The following code provides an example on how to create a cluster that contains a node pool with auto scaling enabled:

    provider "alicloud" {
      region = var.region_id
    }
    
    variable "region_id" {
      type    = string
      default = "cn-shenzhen"
    }
    
    variable "cluster_spec" {
      type        = string
      description = "The cluster specifications of kubernetes cluster,which can be empty. Valid values:ack.standard : Standard managed clusters; ack.pro.small : Professional managed clusters."
      default     = "ack.pro.small"
    }
    
    # Specify the zones of vSwitches. 
    variable "availability_zone" {
      description = "The availability zones of vswitches."
      default     = ["cn-shenzhen-c", "cn-shenzhen-e", "cn-shenzhen-f"]
    }
    
    # The CIDR blocks used to create vSwitches. 
    variable "node_vswitch_cidrs" {
      type        = list(string)
      default     = ["172.16.0.0/23", "172.16.2.0/23", "172.16.4.0/23"]
    }
    
    # This variable specifies the CIDR blocks in which Terway vSwitches are created. 
    variable "terway_vswitch_cidrs" {
      type        = list(string)
      default     = ["172.16.208.0/20", "172.16.224.0/20", "172.16.240.0/20"]
    }
    
    # Specify the ECS instance types of worker nodes. 
    variable "worker_instance_types" {
      description = "The ecs instance types used to launch worker nodes."
      default     = ["ecs.g6.2xlarge", "ecs.g6.xlarge"]
    }
    
    # Specify a password for the worker node.
    variable "password" {
      description = "The password of ECS instance."
      default     = "Test123456"
    }
    
    # Specify the prefix of the name of the ACK managed cluster. 
    variable "k8s_name_prefix" {
      description = "The name prefix used to create managed kubernetes cluster."
      default     = "tf-ack-shenzhen"
    }
    
    # Specify the components that you want to install in the ACK managed cluster. The components include Terway (network plug-in), csi-plugin (volume plug-in), csi-provisioner (volume plug-in), logtail-ds (logging plug-in), the NGINX Ingress controller, ack-arms-prometheus (monitoring plug-in), and ack-node-problem-detector (node diagnostics plug-in). 
    variable "cluster_addons" {
      type = list(object({
        name   = string
        config = string
      }))
    
      default = [
        {
          "name"   = "terway-eniip",
          "config" = "",
        },
        {
          "name"   = "logtail-ds",
          "config" = "{\"IngressDashboardEnabled\":\"true\"}",
        },
        {
          "name"   = "nginx-ingress-controller",
          "config" = "{\"IngressSlbNetworkType\":\"internet\"}",
        },
        {
          "name"   = "arms-prometheus",
          "config" = "",
        },
        {
          "name"   = "ack-node-problem-detector",
          "config" = "{\"sls_project_name\":\"\"}",
        },
        {
          "name"   = "csi-plugin",
          "config" = "",
        },
        {
          "name"   = "csi-provisioner",
          "config" = "",
        }
      ]
    }
    
    # The default resource names. 
    locals {
      k8s_name_terway = "k8s_name_terway_${random_integer.default.result}"
      vpc_name = "vpc_name_${random_integer.default.result}"
      autoscale_nodepool_name = "autoscale-node-pool-${random_integer.default.result}"
    }
    
    # The ECS instance specifications of the worker nodes. Terraform searches for ECS instance types that fulfill the CPU and memory requests. 
    data "alicloud_instance_types" "default" {
      cpu_core_count       = 8
      memory_size          = 32
      availability_zone    = var.availability_zone[0]
      kubernetes_node_role = "Worker"
    }
    
    resource "random_integer" "default" {
      min = 10000
      max = 99999
    }
    
    # The VPC. 
    resource "alicloud_vpc" "default" {
      vpc_name   = local.vpc_name
      cidr_block = "172.16.0.0/12"
    }
    
    # The node vSwitch. 
    resource "alicloud_vswitch" "vswitches" {
      count      = length(var.node_vswitch_cidrs)
      vpc_id     = alicloud_vpc.default.id
      cidr_block = element(var.node_vswitch_cidrs, count.index)
      zone_id    = element(var.availability_zone, count.index)
    }
    
    # The pod vSwitch. 
    resource "alicloud_vswitch" "terway_vswitches" {
      count      = length(var.terway_vswitch_cidrs)
      vpc_id     = alicloud_vpc.default.id
      cidr_block = element(var.terway_vswitch_cidrs, count.index)
      zone_id    = element(var.availability_zone, count.index)
    }
    
    # The ACK managed cluster. 
    resource "alicloud_cs_managed_kubernetes" "default" {
      name                         = local.k8s_name_terway # The ACK cluster name. 
      cluster_spec                 = var.cluster_spec      # Create an ACK Pro cluster. 
      worker_vswitch_ids           = split(",", join(",", alicloud_vswitch.vswitches.*.id))        # The vSwitch to which the node pool belongs. Specify one or more vSwitch IDs. The vSwitches must reside in the zone specified by availability_zone. 
      pod_vswitch_ids              = split(",", join(",", alicloud_vswitch.terway_vswitches.*.id)) # The vSwitch of the pod. 
      new_nat_gateway              = true                                                          # Specify whether to create a NAT gateway when the Kubernetes cluster is created. Default value: true. 
      service_cidr                 = "10.11.0.0/16"                                                # The Service CIDR block. It cannot overlap with the VPC CIDR block or with the CIDR blocks of other Kubernetes clusters in the VPC. You cannot change the Service CIDR block after the cluster is created. 
      slb_internet_enabled         = true                                                          # Specify whether to create an Internet-facing SLB instance for the API server of the cluster. Default value: false. 
      enable_rrsa                  = true
      control_plane_log_components = ["apiserver", "kcm", "scheduler", "ccm"] # The control plane logs. 
      dynamic "addons" {                                                      # Component management. 
        for_each = var.cluster_addons
        content {
          name   = addons.value.name
          config = addons.value.config
        }
      }
    }
    
    # Create a node pool for which auto scaling is enabled. The node pool can be scaled out to a maximum of 10 nodes and must contain at least 1 node. 
    resource "alicloud_cs_kubernetes_node_pool" "autoscale_node_pool" {
      cluster_id     = alicloud_cs_managed_kubernetes.default.id
      node_pool_name = local.autoscale_nodepool_name
      vswitch_ids    = split(",", join(",", alicloud_vswitch.vswitches.*.id))
    
      scaling_config {
        min_size = 1
        max_size = 10
      }
    
      instance_types        = var.worker_instance_types
      password              = var.password # The password that is used to log on to the cluster by using SSH. 
      install_cloud_monitor = true         # Specify whether to install the CloudMonitor agent on the nodes in the cluster. 
      system_disk_category  = "cloud_efficiency"
      system_disk_size      = 100
      image_type            = "AliyunLinux3"
    
      data_disks {              # The data disk configuration of the node. 
        category = "cloud_essd" # The disk category. 
        size     = 120          # The disk size. 
      }
    }
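
    Save the example that matches your scenario to a Terraform configuration file, for example main.tf, in an empty working directory, and run the following commands in that directory.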
  2. Run the following command to initialize the Terraform runtime environment:

    terraform init

    If the following information is returned, Terraform is initialized:

    Terraform has been successfully initialized!
    
    You may now begin working with Terraform. Try running "terraform plan" to see
    any changes that are required for your infrastructure. All Terraform commands
    should now work.
    
    If you ever set or change modules or backend configuration for Terraform,
    rerun this command to reinitialize your working directory. If you forget, other
    commands will detect it and remind you to do so if necessary.
  3. Run the terraform apply command to create the node pool.
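
    terraform apply

    When you are prompted to confirm the execution plan, enter yes.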

  4. Verify the result.

    After the node pool is created, you can find the node pool on the Node Pools page. Auto Scaling Enabled appears below the name of the node pool.

Clear resources

If you no longer require the preceding resources created or managed by Terraform, run the terraform destroy command to release the resources. For more information about the terraform destroy command, see Common commands.

terraform destroy
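
When you are prompted to confirm that the resources will be destroyed, enter yes.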

Sample code

Note

You can run the sample code in this topic with a few clicks. For more information, visit Terraform Explorer.

Full code

provider "alicloud" {
  region = var.region_id
}

variable "region_id" {
  type    = string
  default = "cn-shenzhen"
}

variable "cluster_spec" {
  type        = string
  description = "The cluster specifications of kubernetes cluster,which can be empty. Valid values:ack.standard : Standard managed clusters; ack.pro.small : Professional managed clusters."
  default     = "ack.pro.small"
}

# Specify the zones of vSwitches. 
variable "availability_zone" {
  description = "The availability zones of vswitches."
  default     = ["cn-shenzhen-c", "cn-shenzhen-e", "cn-shenzhen-f"]
}

# The CIDR blocks used to create vSwitches. 
variable "node_vswitch_cidrs" {
  type        = list(string)
  default     = ["172.16.0.0/23", "172.16.2.0/23", "172.16.4.0/23"]
}

# This variable specifies the CIDR blocks in which Terway vSwitches are created. 
variable "terway_vswitch_cidrs" {
  type        = list(string)
  default     = ["172.16.208.0/20", "172.16.224.0/20", "172.16.240.0/20"]
}

# Specify the ECS instance types of worker nodes. 
variable "worker_instance_types" {
  description = "The ecs instance types used to launch worker nodes."
  default     = ["ecs.g6.2xlarge", "ecs.g6.xlarge"]
}

# Specify a password for the worker node.
variable "password" {
  description = "The password of ECS instance."
  default     = "Test123456"
}

# Specify the prefix of the name of the ACK managed cluster. 
variable "k8s_name_prefix" {
  description = "The name prefix used to create managed kubernetes cluster."
  default     = "tf-ack-shenzhen"
}

# Specify the components that you want to install in the ACK managed cluster. The components include Terway (network plug-in), csi-plugin (volume plug-in), csi-provisioner (volume plug-in), logtail-ds (logging plug-in), the NGINX Ingress controller, ack-arms-prometheus (monitoring plug-in), and ack-node-problem-detector (node diagnostics plug-in). 
variable "cluster_addons" {
  type = list(object({
    name   = string
    config = string
  }))

  default = [
    {
      "name"   = "terway-eniip",
      "config" = "",
    },
    {
      "name"   = "logtail-ds",
      "config" = "{\"IngressDashboardEnabled\":\"true\"}",
    },
    {
      "name"   = "nginx-ingress-controller",
      "config" = "{\"IngressSlbNetworkType\":\"internet\"}",
    },
    {
      "name"   = "arms-prometheus",
      "config" = "",
    },
    {
      "name"   = "ack-node-problem-detector",
      "config" = "{\"sls_project_name\":\"\"}",
    },
    {
      "name"   = "csi-plugin",
      "config" = "",
    },
    {
      "name"   = "csi-provisioner",
      "config" = "",
    }
  ]
}

# The default resource names. 
locals {
  k8s_name_terway = "k8s_name_terway_${random_integer.default.result}"
  vpc_name = "vpc_name_${random_integer.default.result}"
  autoscale_nodepool_name = "autoscale-node-pool-${random_integer.default.result}"
}

# The ECS instance specifications of the worker nodes. Terraform searches for ECS instance types that fulfill the CPU and memory requests. 
data "alicloud_instance_types" "default" {
  cpu_core_count       = 8
  memory_size          = 32
  availability_zone    = var.availability_zone[0]
  kubernetes_node_role = "Worker"
}

resource "random_integer" "default" {
  min = 10000
  max = 99999
}

# The VPC. 
resource "alicloud_vpc" "default" {
  vpc_name   = local.vpc_name
  cidr_block = "172.16.0.0/12"
}

# The node vSwitch. 
resource "alicloud_vswitch" "vswitches" {
  count      = length(var.node_vswitch_cidrs)
  vpc_id     = alicloud_vpc.default.id
  cidr_block = element(var.node_vswitch_cidrs, count.index)
  zone_id    = element(var.availability_zone, count.index)
}

# The pod vSwitch. 
resource "alicloud_vswitch" "terway_vswitches" {
  count      = length(var.terway_vswitch_cidrs)
  vpc_id     = alicloud_vpc.default.id
  cidr_block = element(var.terway_vswitch_cidrs, count.index)
  zone_id    = element(var.availability_zone, count.index)
}

# The ACK managed cluster. 
resource "alicloud_cs_managed_kubernetes" "default" {
  name                         = local.k8s_name_terway # The ACK cluster name. 
  cluster_spec                 = var.cluster_spec      # Create an ACK Pro cluster. 
  worker_vswitch_ids           = split(",", join(",", alicloud_vswitch.vswitches.*.id))        # The vSwitch to which the node pool belongs. Specify one or more vSwitch IDs. The vSwitches must reside in the zone specified by availability_zone. 
  pod_vswitch_ids              = split(",", join(",", alicloud_vswitch.terway_vswitches.*.id)) # The vSwitch of the pod. 
  new_nat_gateway              = true                                                          # Specify whether to create a NAT gateway when the Kubernetes cluster is created. Default value: true. 
  service_cidr                 = "10.11.0.0/16"                                                # The Service CIDR block. It cannot overlap with the VPC CIDR block or with the CIDR blocks of other Kubernetes clusters in the VPC. You cannot change the Service CIDR block after the cluster is created. 
  slb_internet_enabled         = true                                                          # Specify whether to create an Internet-facing SLB instance for the API server of the cluster. Default value: false. 
  enable_rrsa                  = true
  control_plane_log_components = ["apiserver", "kcm", "scheduler", "ccm"] # The control plane logs. 
  dynamic "addons" {                                                      # Component management. 
    for_each = var.cluster_addons
    content {
      name   = addons.value.name
      config = addons.value.config
    }
  }
}

# Create a node pool for which auto scaling is enabled. The node pool can be scaled out to a maximum of 10 nodes and must contain at least 1 node. 
resource "alicloud_cs_kubernetes_node_pool" "autoscale_node_pool" {
  cluster_id     = alicloud_cs_managed_kubernetes.default.id
  node_pool_name = local.autoscale_nodepool_name
  vswitch_ids    = split(",", join(",", alicloud_vswitch.vswitches.*.id))

  scaling_config {
    min_size = 1
    max_size = 10
  }

  instance_types        = var.worker_instance_types
  password              = var.password # The password that is used to log on to the cluster by using SSH. 
  install_cloud_monitor = true         # Specify whether to install the CloudMonitor agent on the nodes in the cluster. 
  system_disk_category  = "cloud_efficiency"
  system_disk_size      = 100
  image_type            = "AliyunLinux3"

  data_disks {              # The data disk configuration of the node. 
    category = "cloud_essd" # The disk category. 
    size     = 120          # The disk size. 
  }
}

If you want to view more complete examples, visit the directory of the corresponding service on the Landing with Terraform page.