Container Service for Kubernetes:FAQ about nodes and node pools

Last Updated: Sep 13, 2024

This topic provides answers to some frequently asked questions (FAQ) about nodes and node pools. For example, you can obtain answers to questions such as how to change the maximum number of pods that are supported by a node, how to change the operating system for a node pool, and how to solve the timeout error related to a node.

How do I change the operating system for a node pool?

The method used to change the operating system for a node pool is similar to the method used to update a node pool. To change the operating system for a node pool, perform the following steps:

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.

  3. On the Node Pools page, find the node pool that you want to modify and choose More > Upgrade in the Actions column.

  4. Select Change Operating System, select the image that is used to replace the original image, and then click Start Update.

    Note

    By default, Kubelet Update and Upgrade Node Pool by Replacing System Disk are selected when you change the operating system for a node pool. Select Create Snapshot before Update based on your business requirements.

Can I leave the Expected Nodes parameter empty when I create a node pool?

No, you cannot leave the Expected Nodes parameter empty when you create a node pool.

For more information about how to remove or release a node, see Remove a node. For more information about how to add a node, see Add existing ECS instances to an ACK cluster. After you remove nodes from or add existing nodes to a cluster, the value of the Expected Nodes parameter is automatically set to the actual number of nodes after the modification.

What are the differences between node pools that are configured with the Expected Nodes parameter and those that are not configured with this parameter?

The Expected Nodes parameter specifies the number of nodes that you want to keep in a node pool. You can change the value of this parameter to modify the number of nodes in the node pool. This feature is disabled for existing node pools that are not configured with the Expected Nodes parameter.

Node pools that are configured with the Expected Nodes parameter and those that are not configured with this parameter have different reactions to operations such as removing nodes and releasing Elastic Compute Service (ECS) instances. The following table describes the details.

Operation: Decrease the expected number of nodes by calling the API operations of Container Service for Kubernetes (ACK) or by using the ACK console.

  • Node pool configured with the Expected Nodes parameter: Nodes are removed from the node pool until the number of existing nodes equals the specified expected number of nodes.

  • Node pool not configured with the Expected Nodes parameter: If the number of existing nodes is greater than the expected number of nodes, the system removes nodes until the number of existing nodes equals the expected number. At the same time, the system enables the Expected Nodes feature.

  • Suggestion: None.

Operation: Remove specific nodes in the ACK console or by calling the API operations of ACK.

  • Node pool configured with the Expected Nodes parameter: The value of the Expected Nodes parameter automatically decreases by the number of nodes that you remove. For example, if the value is 10 before you remove nodes and you remove three nodes, the value changes to 7.

  • Node pool not configured with the Expected Nodes parameter: The specified nodes are removed as expected.

  • Suggestion: None.

Operation: Remove nodes by running the kubectl delete node command.

  • Node pool configured with the Expected Nodes parameter: The value of the Expected Nodes parameter remains unchanged.

  • Node pool not configured with the Expected Nodes parameter: The nodes are not removed.

  • Suggestion: We recommend that you do not use this method to remove nodes.

Operation: Manually release ECS instances in the ECS console or by calling the API operations of ECS.

  • Node pool configured with the Expected Nodes parameter: New ECS instances are automatically added to the node pool to maintain the expected number of nodes.

  • Node pool not configured with the Expected Nodes parameter: The node pool does not respond to the operation, and no ECS instances are added to the node pool. After the subscriptions to the ECS instances expire, the nodes remain in the Unknown state until they are removed from the Nodes list on the node pool details page in the ACK console.

  • Suggestion: We recommend that you remove nodes in the ACK console or by calling the ACK API instead of releasing ECS instances directly. Otherwise, the data of ACK and Auto Scaling may become inconsistent with the actual resources. For more information, see Remove nodes.

Operation: The subscriptions to ECS instances expire.

  • Node pool configured with the Expected Nodes parameter: New ECS instances are automatically added to the node pool to maintain the expected number of nodes.

  • Node pool not configured with the Expected Nodes parameter: The node pool does not respond to the operation, and no ECS instances are added to the node pool. Nodes that are deleted from the node pool remain in the Unknown state for a period of time.

  • Suggestion: We recommend that you remove nodes in the ACK console or by calling the ACK API instead of waiting for subscriptions to expire. Otherwise, the data of ACK and Auto Scaling may become inconsistent with the actual resources. For more information, see Remove nodes.

Operation: Manually enable the health check feature of Auto Scaling for ECS instances in a scaling group, and the ECS instances fail the health checks, for example, because they are suspended.

  • Node pool configured with the Expected Nodes parameter: New ECS instances are automatically added to the node pool to maintain the expected number of nodes.

  • Node pool not configured with the Expected Nodes parameter: New ECS instances are automatically added to replace the suspended ECS instances.

  • Suggestion: We recommend that you do not perform operations on the scaling group of a node pool.

Operation: Remove ECS instances from a scaling group by using Auto Scaling without modifying the expected number of nodes.

  • Node pool configured with the Expected Nodes parameter: New ECS instances are automatically added to the node pool to maintain the expected number of nodes.

  • Node pool not configured with the Expected Nodes parameter: No ECS instances are added to the node pool.

  • Suggestion: We recommend that you do not perform operations on the scaling group of a node pool.
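
If you manage node pools through the API, the expected number of nodes can also be adjusted programmatically. The following is a minimal sketch that assumes the ModifyClusterNodePool operation and a scaling_group.desired_size field invoked through the aliyun CLI; verify the exact operation path and field names in the ACK API reference before use, and replace the placeholder IDs with your own values.

# hypothetical IDs; replace <cluster_id> and <nodepool_id> with your own
aliyun cs PUT /clusters/<cluster_id>/nodepools/<nodepool_id> \
  --header "Content-Type=application/json" \
  --body '{"scaling_group": {"desired_size": 5}}'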

How do I add free nodes to a node pool?

Free nodes exist in clusters created before the node pool feature was released. If you no longer need free nodes, you can release the Elastic Compute Service (ECS) instances that are used to deploy the nodes. If you want to retain free nodes, we recommend that you add them to node pools. This way, you can manage the nodes in groups.

You can create and scale out a node pool, remove free nodes, and then add the corresponding ECS instances to the node pool. For more information, see Add free nodes to a node pool.

How do I use preemptible instances in a node pool?

You can use preemptible instances when you create a node pool. You can also use preemptible instances in an existing node pool by using the spot-instance-advisor command-line tool. For more information, see Best practices for preemptible instance-based node pools.

Note

When you create a cluster, you cannot select preemptible instances for the node pool of the cluster.

How do I adjust the maximum number of pods that can be used when the number of pods reaches the upper limit?

The maximum number of pods on a worker node varies based on the network plug-in and cannot be adjusted in most cases. In Terway mode, the maximum number of pods on a node depends on the number of elastic network interfaces (ENIs) provided by the Elastic Compute Service (ECS) instance. In Flannel mode, the maximum number of pods on a node depends on the cluster configurations that you specify when you create the cluster, and the upper limit cannot be modified after the cluster is created. When the number of pods in your cluster reaches the upper limit, we recommend that you scale out the node pool to increase the pod capacity of the cluster.

For more information, see Adjust the number of pods that can be used.
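
You can check the current upper limit on a specific node by querying its allocatable pod count. The following command is a minimal sketch; the node name is a placeholder that you must replace with a node from your cluster.

kubectl get node cn-beijing.i-2ze19qyi8votgjz12345 -o jsonpath='{.status.allocatable.pods}'
# Example output (the actual value depends on the network plug-in and node configuration):
# 64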

How do I modify the configurations of a node?

To ensure smooth business operation and facilitate node management:

  • Some configuration items, such as the container runtime and the virtual private cloud (VPC) to which the nodes belong, cannot be modified after the node pool is created.

  • Some configuration items can be modified only in limited ways. For example, when you change the operating system of a node pool, you can only upgrade the original image to the latest version; you cannot change the image type.

  • Some configuration items, such as the vSwitches, billing method, and instance types of a node pool, can be modified after the node pool is created.

In addition, modifications to specific configuration items take effect only on nodes that are added to the node pool after the modifications are made. For example, if you change the public IP address settings of a node pool or install or uninstall the CloudMonitor agent for a node pool, the change applies only to nodes that are added afterward. For more information, see Modify a node pool.

If you need nodes with new configurations, we recommend that you create a node pool with the required configurations, set the nodes in the old node pool to the Unschedulable state, and then drain the old nodes. After your workloads are running on the new nodes, release the old nodes.
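
The following commands are a minimal sketch of the cordon-and-drain step described above; the node name is a placeholder that you must replace with the name of an old node.

# mark the old node as unschedulable so that no new pods are scheduled to it
kubectl cordon cn-beijing.i-old-node-example
# evict the pods that are running on the old node
kubectl drain cn-beijing.i-old-node-example --ignore-daemonsets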

How do I release a specific ECS instance?

You can release a specific ECS instance by removing the corresponding node. After an ECS instance is released, the expected number of nodes automatically changes to the actual number of nodes. You do not need to modify the expected number of nodes. You cannot release an ECS instance by modifying the expected number of nodes.
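
To confirm which ECS instance backs a node before you remove it, you can read the provider ID of the node. The following command is a sketch with a placeholder node name; on ACK nodes, the provider ID typically takes the form <region-id>.<instance-id>.

kubectl get node cn-beijing.i-2ze19qyi8votgjz12345 -o jsonpath='{.spec.providerID}'
# Example output: cn-beijing.i-2ze19qyi8votgjz12345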

How do I update the container runtime of a worker node that does not belong to a node pool?

Perform the following steps:

  1. Remove the worker node. When you remove the worker node, the system sets the node to the Unschedulable state and drains the node. If the node fails to be drained, the system stops removing the node. If the node is drained, the system continues to remove the node from the cluster.

  2. Add the node to a node pool. You can add the node to an existing node pool. Alternatively, you can create an empty node pool and add the node to the node pool. After the node is added to a node pool, the container runtime of the node automatically becomes the same as that of the node pool.

    Note

    Node pools are free of charge. However, you are charged for the cloud resources such as ECS instances that are used in node pools. For more information, see Cloud service billing.
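
After the node is added to the node pool, you can verify that its container runtime matches that of the node pool. The following command is a minimal sketch; the CONTAINER-RUNTIME column of the output shows values such as containerd://1.6.x or docker://19.3.x.

kubectl get nodes -o wide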

What do I do if a timeout error occurs after I add an existing node?

Check whether the network of the node and the network of the Classic Load Balancer (CLB) instance of the API server are connected. Check whether the security groups meet the requirement. For more information about the limits on security groups, see Limits on security groups. For more information about other network connectivity issues, see FAQ about network management.
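
The following commands are a minimal sketch of a basic connectivity check from the node to the API server; <apiserver-address> is a placeholder for the CLB address of your cluster's API server, which listens on port 6443.

# test whether the node can reach the API server endpoint
nc -vz <apiserver-address> 6443
# alternatively, check the TLS endpoint directly (an HTTP 401/403 response still proves connectivity)
curl -k -m 5 https://<apiserver-address>:6443/version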

How do I change the hostname of a worker node in an ACK cluster?

After you create an ACK cluster, you cannot directly change the hostnames of worker nodes. If you want to change the hostname of a worker node, modify the node naming rule of the relevant node pool, remove the worker node from the node pool, and then add the worker node to the node pool again.

Note

When you create an ACK cluster, you can modify the hostnames of worker nodes in the Custom Node Name section. For more information, see Create an ACK managed cluster.

  1. Remove the worker node.

    1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

    2. In the left-side navigation pane of the details page, choose Nodes > Nodes.

    3. On the Nodes page, find the worker node that you want to remove and choose More > Remove in the Actions column.

    4. In the dialog box that appears, select I understand the above information and want to remove the node(s). and click OK.

  2. Add the worker node to the node pool again. For more information, see Manually add ECS instances.

    Then, the worker node is renamed based on the new node naming rule of the node pool.

How do I manually update the kernel version of GPU-accelerated nodes in a cluster?

To manually update the kernel version of GPU-accelerated nodes in a cluster, perform the following steps:

Note

This procedure applies only if the current kernel version is earlier than 3.10.0-957.21.3.

Confirm the kernel version to which you want to update. Proceed with caution when you perform the update.

The following procedure shows how to update the NVIDIA driver. Details about how to update the kernel version are not shown.

  1. Obtain the kubeconfig file of the cluster and use kubectl to connect to the cluster.

  2. Set the GPU-accelerated node that you want to manage to the Unschedulable state. In this example, the node cn-beijing.i-2ze19qyi8votgjz12345 is used.

    kubectl cordon cn-beijing.i-2ze19qyi8votgjz12345
    
    node/cn-beijing.i-2ze19qyi8votgjz12345 already cordoned
  3. Migrate the pods on the GPU-accelerated node to other nodes.

    kubectl drain cn-beijing.i-2ze19qyi8votgjz12345 --grace-period=120 --ignore-daemonsets=true
    
    node/cn-beijing.i-2ze19qyi8votgjz12345 cordoned
    WARNING: Ignoring DaemonSet-managed pods: flexvolume-9scb4, kube-flannel-ds-r2qmh, kube-proxy-worker-l62sf, logtail-ds-f9vbg
    pod/nginx-ingress-controller-78d847fb96-5fkkw evicted
  4. Uninstall the existing nvidia-driver.

    Note

    In this example, the driver version to be uninstalled is 384.111. If your driver version is not 384.111, download the installation package for your driver version from the official NVIDIA website and replace 384.111 in the following commands with your driver version.

    1. Log on to the GPU-accelerated node and run the nvidia-smi command to check the driver version.

      sudo nvidia-smi -a | grep 'Driver Version'
      Driver Version                      : 384.111
    2. Download the driver installation package.

      cd /tmp/
      sudo curl -O https://cn.download.nvidia.cn/tesla/384.111/NVIDIA-Linux-x86_64-384.111.run
      Note

      The installation package is required for uninstalling the NVIDIA driver.

    3. Uninstall the driver.

      sudo chmod u+x NVIDIA-Linux-x86_64-384.111.run
      sudo sh ./NVIDIA-Linux-x86_64-384.111.run --uninstall -a -s -q
  5. Update the kernel.

    Update the kernel version based on your business requirements.

  6. Restart the GPU-accelerated node.

    sudo reboot
  7. Log on to the GPU node and run the following command to install the kernel-devel package.

    sudo yum install -y kernel-devel-$(uname -r)
  8. Go to the official NVIDIA website to download the required driver and install it on the GPU-accelerated node. In this example, the driver version 410.79 is used.

    cd /tmp/
    sudo curl -O https://cn.download.nvidia.cn/tesla/410.79/NVIDIA-Linux-x86_64-410.79.run
    sudo chmod u+x NVIDIA-Linux-x86_64-410.79.run
    sudo sh ./NVIDIA-Linux-x86_64-410.79.run -a -s -q
    
    # warm up the GPU
    sudo nvidia-smi -pm 1 || true
    sudo nvidia-smi -acp 0 || true
    sudo nvidia-smi --auto-boost-default=0 || true
    sudo nvidia-smi --auto-boost-permission=0 || true
    sudo nvidia-modprobe -u -c=0 -m || true
  9. Make sure that the /etc/rc.d/rc.local file includes the following configurations. If any of them are missing, add them to the file.

    sudo nvidia-smi -pm 1 || true
    sudo nvidia-smi -acp 0 || true
    sudo nvidia-smi --auto-boost-default=0 || true
    sudo nvidia-smi --auto-boost-permission=0 || true
    sudo nvidia-modprobe -u -c=0 -m || true
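    Note that on systemd-based operating systems, /etc/rc.d/rc.local is executed at startup only if the file is executable. If it is not, you can grant the permission as shown in the following sketch.

    sudo chmod +x /etc/rc.d/rc.local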
  10. Restart the kubelet and Docker.

    sudo service kubelet stop
    sudo service docker restart
    sudo service kubelet start
  11. Set the GPU-accelerated node to schedulable.

    kubectl uncordon cn-beijing.i-2ze19qyi8votgjz12345
    
    node/cn-beijing.i-2ze19qyi8votgjz12345 already uncordoned
  12. Run the following command in the nvidia-device-plugin container to check the version of the driver installed on the GPU-accelerated node.

    kubectl exec -n kube-system -t nvidia-device-plugin-cn-beijing.i-2ze19qyi8votgjz12345 -- nvidia-smi
    Thu Jan 17 00:33:27 2019
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: N/A      |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla P100-PCIE...  On   | 00000000:00:09.0 Off |                    0 |
    | N/A   27C    P0    28W / 250W |      0MiB / 16280MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
    Note

    If no container is launched on the GPU-accelerated node after you run the docker ps command, see What do I do if no container is launched on a GPU-accelerated node?

What do I do if no container is launched on a GPU-accelerated node?

In specific Kubernetes versions, no containers are started on GPU-accelerated nodes after you restart the kubelet and Docker, as shown in the following example.

sudo service kubelet stop
Redirecting to /bin/systemctl stop kubelet.service
sudo service docker stop
Redirecting to /bin/systemctl stop docker.service
sudo service docker start
Redirecting to /bin/systemctl start docker.service
sudo service kubelet start
Redirecting to /bin/systemctl start kubelet.service

sudo docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

Run the following command to check the cgroup driver:

sudo docker info | grep -i cgroup
Cgroup Driver: cgroupfs

The returned results indicate that the cgroup driver is set to cgroupfs.
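
In this situation, the kubelet is typically configured to use the systemd cgroup driver, so the mismatch with Docker's cgroupfs driver prevents containers from starting. You can check the driver that the kubelet is configured to use; the following commands are a sketch, and the location of the kubelet configuration may differ depending on the cluster version.

# check the cgroup driver in the kubelet configuration file, if present
grep -i cgroupDriver /var/lib/kubelet/config.yaml
# or check the kubelet startup flags
ps -ef | grep kubelet | grep -o "cgroup-driver=[a-z]*"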

To resolve the issue, perform the following steps:

  1. Create a copy of /etc/docker/daemon.json. Then, run the following command to update /etc/docker/daemon.json.
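
    A copy can be created as shown in the following sketch (the .bak path is only an example); the cat command after it then overwrites the original file.

    sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak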

    sudo cat >/etc/docker/daemon.json <<-EOF
    {
        "default-runtime": "nvidia",
        "runtimes": {
            "nvidia": {
                "path": "/usr/bin/nvidia-container-runtime",
                "runtimeArgs": []
            }
        },
        "exec-opts": ["native.cgroupdriver=systemd"],
        "log-driver": "json-file",
        "log-opts": {
            "max-size": "100m",
            "max-file": "10"
        },
        "oom-score-adjust": -1000,
        "storage-driver": "overlay2",
        "storage-opts":["overlay2.override_kernel_check=true"],
        "live-restore": true
    }
    EOF
  2. Run the following commands to restart the Docker runtime and the kubelet:

    sudo service kubelet stop
    Redirecting to /bin/systemctl stop kubelet.service
    sudo service docker restart
    Redirecting to /bin/systemctl restart docker.service
    sudo service kubelet start
    Redirecting to /bin/systemctl start kubelet.service
  3. Run the following command to check whether the cgroup driver is set to systemd.

    sudo docker info | grep -i cgroup
    Cgroup Driver: systemd

How do I migrate multiple pods to other nodes when a node fails?

You can set the faulty node to unschedulable and drain the node. This way, ACK migrates application pods from the faulty node to other nodes.

  1. Log on to the ACK console. On the Nodes page, choose More > Drain in the Actions column. ACK sets the node to unschedulable and migrates applications from the node to other nodes.

  2. Troubleshoot node exceptions. For more information, see Troubleshoot node exceptions.

    You can also submit a ticket to contact the ACK technical team.

When a cluster that contains nodes in different zones fails, how does the cluster evict pods from nodes?

In most scenarios, when a node fails, the node controller evicts pods from the node. The default value of --node-eviction-rate is 0.1 node per second, which indicates that pods are evicted from at most one node every 10 seconds.

When an ACK cluster that contains nodes residing in multiple zones fails, the node controller determines how to evict pods based on the zone status and the cluster size.

A zone can be in one of the following states:

  • FullDisruption: No healthy node resides in the zone and at least one unhealthy node exists.

  • PartialDisruption: At least two unhealthy nodes exist in the zone, and the ratio of unhealthy nodes (unhealthy nodes/(unhealthy nodes + healthy nodes)) is greater than 0.55.

  • Normal: All nodes in the zone are healthy.

A cluster can be classified into two types based on the cluster size:

  • Large cluster: The cluster contains more than 50 nodes.

  • Small cluster: The cluster contains 50 or fewer nodes.

The eviction rate of the node controller is calculated based on the following rules:

  • If all zones are in the FullDisruption state, the eviction feature is disabled for all zones.

  • If not all zones are in the FullDisruption state, the eviction rate is determined in the following ways.

    • If a zone is in the FullDisruption state, the eviction rate is set to the default value (0.1), regardless of the cluster size.

    • If a zone is in the PartialDisruption state, the eviction rate depends on the cluster size. In a large cluster, the eviction rate of the zone is 0.01. In a small cluster, the eviction rate of the zone is 0, which indicates that no pod is evicted.

    • If a zone is in the Normal state, the eviction rate is set to the default value (0.1), regardless of the cluster size.

For more information, see Rate limits on eviction.
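
To see how your nodes are distributed across zones and which of them are not in the Ready state, you can list the nodes together with their zone labels. The following command is a sketch; on older Kubernetes versions the zone label may be failure-domain.beta.kubernetes.io/zone instead.

kubectl get nodes -L topology.kubernetes.io/zone
# The STATUS column shows Ready or NotReady, and the ZONE column shows the zone of each node.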

What is the path of the kubelet in an ACK cluster? Can I use a custom path?

ACK does not allow you to customize the path of the kubelet. The default path of the kubelet is /var/lib/kubelet. Do not change the path.

Can I mount a data disk to a custom directory on a node in a node pool?

This feature is in canary release. To use this feature, submit a ticket. After you enable this feature, you can automatically format the data disks attached to a node pool and mount the data disks to specified custom directories on the operating system. When you use this feature, the following limits apply:

  • Do not mount data disks to the following directories on the operating system:

    • /

    • /etc

    • /var/run

    • /run

    • /boot

  • Do not mount data disks to the following directories or their subdirectories used by the system and the container runtime:

    • /usr

    • /bin

    • /sbin

    • /lib

    • /lib64

    • /ostree

    • /sysroot

    • /proc

    • /sys

    • /dev

    • /var/lib/kubelet

    • /var/lib/docker

    • /var/lib/containerd

    • /var/lib/container

  • Multiple data disks cannot be mounted to the same directory.

  • The mount directory must be an absolute path that starts with a forward slash (/).

  • The mount directory cannot contain carriage returns (the \r escape character in C) or line feeds (the \n escape character in C), and cannot end with a backslash (\).
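
After a node is created with this feature enabled, you can verify the mount on the node. The following commands are a sketch that assumes /mnt/data01 as the custom mount directory; replace it with the directory that you configured.

# list block devices and their mount points
lsblk -f
# confirm that the custom directory is backed by the data disk
df -h /mnt/data01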

How do I modify the maximum number of file handles?

The maximum number of file handles equals the maximum number of files that can be opened. Alibaba Cloud Linux and CentOS have two file handle limits:

  • System level: The maximum number of files that can be opened simultaneously by all user processes.

  • User level: The maximum number of files that can be opened by a single user process.

In a container environment, there is an additional file handle limit, which limits the maximum number of file handles for a single process within a container.
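
You can view the current values of these limits on a node. The following commands are a minimal sketch; replace <pid> with the PID of the container process that you want to inspect.

# system-level limit
cat /proc/sys/fs/file-max
# user-level limit for the current shell session
ulimit -n
# limit of a specific process, such as a containerized process
cat /proc/<pid>/limits | grep "Max open files"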

Note

When you update a node pool, the maximum number of file handles modified by using the CLI may be overwritten. We recommend that you use the User Data parameter to set the upper limit.

Modify the maximum number of system-level file handles for a node

For more information, see Customize the OS parameters of a node pool.

Modify the maximum number of file handles for a single process on a node

  1. Log on to the node and view the /etc/security/limits.conf file.

    cat /etc/security/limits.conf

    Use the following parameters to configure the maximum number of file handles for a single process on a node:

    ...
    root soft nofile 65535
    root hard nofile 65535
    * soft nofile 65535
    * hard nofile 65535
  2. Run the sed command to modify the maximum number of file handles. We recommend that you set the maximum number of file handles to 65535.

    sed -i "s/nofile.[0-9]*$/nofile 65535/g" /etc/security/limits.conf
  3. Log on to the node again and run the following command to check whether the modification takes effect:

    If the returned value is the same as the modified value, the modification takes effect.

    # ulimit -n
    65535

Modify the maximum number of file handles for a container

Important

If you modify the maximum number of file handles for a container, the Docker or containerd processes are restarted. Perform operations during off-peak hours.

  1. Log on to the node and run the following command to view the configuration file:

    • Nodes that use containerd: cat /etc/systemd/system/containerd.service

    • Nodes that use Docker: cat /etc/systemd/system/docker.service

    Configure the maximum number of file handles for a single process in a container by using the following parameters:

    ...
    LimitNOFILE=1048576    # Maximum number of file handles for a single process
    LimitNPROC=1048576     # Maximum number of processes
    ...
  2. Run the following command to modify the value of the corresponding parameter. We recommend that you set the maximum number of file handles to 1048576.

    • Nodes that use containerd:

       sed -i "s/LimitNOFILE=[0-9a-Z]*$/LimitNOFILE=65536/g" /etc/systemd/system/containerd.service;sed -i "s/LimitNPROC=[0-9a-Z]*$/LimitNPROC=65537/g" /etc/systemd/system/containerd.service && systemctl daemon-reload && systemctl restart containerd
    • Nodes that use Docker:

      sed -i "s/LimitNOFILE=[0-9a-Z]*$/LimitNOFILE=1048576/g" /etc/systemd/system/docker.service;sed -i "s/LimitNPROC=[0-9a-Z]*$/LimitNPROC=1048576/g" /etc/systemd/system/docker.service && systemctl daemon-reload && systemctl restart docker
  3. Run the following command to view the maximum number of file handles for a single process in a container:

    If the returned value is the same as the modified value, the modification takes effect.

    • Nodes that use containerd:

      # cat /proc/`pidof containerd`/limits | grep files
      Max open files            1048576              1048576              files
    • Nodes that use Docker:

      # cat /proc/`pidof dockerd`/limits | grep files
      Max open files            1048576              1048576              files