Overview of operations on ACK-supported K8s worker nodes

This topic summarizes common operations for managing worker nodes in the Container Service for Kubernetes (ACK) console. For each operation, it describes where to find it in the console, lists the relevant usage notes, and links to the detailed procedure.

Most operations are available on the Nodes page. To open the Nodes page, perform the following steps:

Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Nodes.

Node logon

For scenarios such as node troubleshooting, performance monitoring, or executing custom scripts, you can log on to the corresponding ECS instance of the node.

Workbench connection: In the Actions column of the node list, choose More > Workbench Connection.
VNC connection: In the Actions column of the node list, select More > VNC Connection.

For additional remote connection methods to ECS instances, see Methods for connecting to an ECS instance.

Note

To mitigate security risks, ContainerOS does not support direct logon for untraceable operations and does not provide SSH. If a node runs ContainerOS and you need to perform maintenance operations on it, see Work with the administrative container of ContainerOS.

Node draining and scheduling status

Node draining

In the Actions column of the node list, select More > Node Draining, and follow the on-screen prompts to drain the node. Draining marks the node as unschedulable so that no new pods are scheduled to it, and evicts the existing pods from the node.

Note the following precautions:

Make sure that the other nodes in the cluster have sufficient resources. Otherwise, the evicted application pods may become unschedulable.
Check the node affinity rules and scheduling policies of the pods on the node to be drained, and make sure that the pods can still be scheduled to other nodes after they are evicted.
Pods managed by a DaemonSet are not evicted.
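
The precautions above can be illustrated with a minimal Python sketch. It is not how the ACK console implements draining. The `Pod` record, the function names, and the sample values are hypothetical and exist only for illustration. The sketch separates the pods that a drain would evict from DaemonSet-managed pods, and then runs a rough capacity check against the free CPU on other nodes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pod:
    """Hypothetical, simplified pod record (not a Kubernetes client type)."""
    name: str
    owner_kind: Optional[str]  # for example "ReplicaSet", "DaemonSet", or None
    cpu_request_m: int         # CPU request in millicores


def pods_to_evict(pods: List[Pod]) -> List[Pod]:
    """Return the pods a drain would evict; DaemonSet-managed pods stay."""
    return [p for p in pods if p.owner_kind != "DaemonSet"]


def may_fit_elsewhere(evicted: List[Pod], free_cpu_m_per_node: List[int]) -> bool:
    """Rough aggregate check: total evicted CPU requests vs. total free CPU.

    Passing this check is necessary but not sufficient, because each pod must
    still fit on a single node and affinity rules are ignored here.
    """
    needed = sum(p.cpu_request_m for p in evicted)
    return needed <= sum(free_cpu_m_per_node)


# Illustrative data only.
pods = [
    Pod("web-1", "ReplicaSet", 500),
    Pod("log-agent-1", "DaemonSet", 100),
    Pod("batch-1", "Job", 1500),
]
evicted = pods_to_evict(pods)
print([p.name for p in evicted])
print(may_fit_elsewhere(evicted, [800, 1000]))
print(may_fit_elsewhere(evicted, [800, 1500]))
```

The aggregate check is deliberately conservative in what it claims: a `False` result means the evicted pods cannot all be rescheduled, while a `True` result still needs to be confirmed against per-node capacity and the affinity rules of each pod.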

Change node scheduling status

In the node list, select the node that you want to manage, and then click Set Scheduling Status at the bottom of the page. Read the precautions in the dialog box carefully, and follow the on-screen prompts to complete the operation.

Note the following precautions:

Perform this operation during off-peak hours because it may affect your business.
After a node is set to unschedulable, its status shows SchedulingDisabled. Existing pods on the node continue to serve traffic, but new pods are not scheduled to the node.
Pods managed by a DaemonSet are not removed.
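
In Kubernetes, the scheduling status of a node is stored in the `spec.unschedulable` field of the Node object. The following Python sketch shows the shape of a patch body that toggles this field and how the resulting status is commonly displayed. The function names are hypothetical, and the sketch only builds data. It does not call the Kubernetes API, and it is an assumption that the console performs an equivalent update.

```python
import json
from typing import Any, Dict


def scheduling_patch(unschedulable: bool) -> Dict[str, Any]:
    """Build a patch body that sets the node's spec.unschedulable field."""
    return {"spec": {"unschedulable": unschedulable}}


def status_text(ready: bool, unschedulable: bool) -> str:
    """Mimic the node status string, for example "Ready,SchedulingDisabled"."""
    parts = ["Ready" if ready else "NotReady"]
    if unschedulable:
        parts.append("SchedulingDisabled")
    return ",".join(parts)


print(json.dumps(scheduling_patch(True)))
print(status_text(ready=True, unschedulable=True))
print(status_text(ready=True, unschedulable=False))
```

Setting the field back to `False` makes the node schedulable again. Existing pods are not affected in either direction.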

Node removal

If you no longer need a worker node, you can remove it from the node pool or cluster in the ACK console. Perform this operation during off-peak hours. In the Actions column of the node list, choose More > Remove, or select the node and click Batch Remove in the lower part of the page. Then follow the on-screen prompts to complete the process.

For related precautions and feature details, see Remove a node.

Node monitoring

Click Monitor in the Actions column to install the monitoring component and enable Managed Service for Prometheus. You can then view the resource dashboards of the node. To configure monitoring alerts with Managed Service for Prometheus, see Step 3: (Optional) Configure alert rules in Managed Service for Prometheus.

To create custom PromQL alert rules for abnormal node status, see Best practices for configuring alert rules in Prometheus.

Node fault diagnosis

To diagnose an abnormal node, click Exception Diagnosis in the Actions column. ACK then inspects the node and provides a repair plan. For details about the supported diagnostic scenarios, inspection items, and repair plans, see Node diagnostics.

Manage node labels and taints

To manage and schedule cluster resources by using labels and taints, go to the Nodes page, click Manage Labels and Taints, and follow the on-screen instructions to configure labels and taints. For more information, see Manage node labels and taints.
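
To show how labels and taints influence scheduling, the following Python sketch models a simplified version of the standard Kubernetes matching rules: a node selector requires every listed label to match, and a node with a NoSchedule or NoExecute taint accepts a pod only if the pod tolerates that taint. The class and function names are hypothetical, and the sketch ignores other scheduler inputs such as resource requests and affinity rules.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # "NoSchedule", "PreferNoSchedule", or "NoExecute"


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # an empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        # An empty key with Exists tolerates every taint.
        return tol.key == "" or tol.key == taint.key
    return tol.key == taint.key and tol.value == taint.value


def can_schedule(node_labels: Dict[str, str], node_taints: List[Taint],
                 node_selector: Dict[str, str],
                 tolerations: List[Toleration]) -> bool:
    """Simplified check: node selector first, then hard taints."""
    if any(node_labels.get(k) != v for k, v in node_selector.items()):
        return False
    for taint in node_taints:
        if taint.effect == "PreferNoSchedule":
            continue  # soft preference, does not block scheduling
        if not any(tolerates(t, taint) for t in tolerations):
            return False
    return True


# Illustrative data only.
labels = {"disktype": "ssd"}
taints = [Taint("dedicated", "gpu", "NoSchedule")]
selector = {"disktype": "ssd"}
print(can_schedule(labels, taints, selector, []))
print(can_schedule(labels, taints, selector,
                   [Toleration("dedicated", "Equal", "gpu", "NoSchedule")]))
```

In this example, the first call is rejected because the pod has no toleration for the `dedicated=gpu:NoSchedule` taint, and the second call is accepted because the toleration matches the taint's key, value, and effect.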

Batch operations on nodes

To perform batch operations on worker nodes in an ACK cluster, such as updating the operating system kernel for security reasons or installing custom monitoring, security, and audit packages, select the nodes in the node list. Then click Batch Operations at the bottom of the page and follow the on-screen instructions to complete the process. For more information, see Manage nodes in batches.

View node information

In the Actions column of the node list, select More > View in YAML to view the YAML template of the node.

In the Actions column of the node list, select More > Details to view the node information.

CPU and memory usage

- CPU request = sum(CPU resources requested by all pods on the node) / total CPU resources on the node
- CPU utilization = sum(CPU resources used by all pods on the node) / total CPU resources on the node
- Memory request = sum(memory resources requested by all pods on the node) / total memory resources on the node
- Memory utilization = sum(memory resources used by all pods on the node) / total memory resources on the node

Note

Allocatable resources = Resource capacity - Reserved resources - Eviction threshold. For more details, see Resource reservation policy.
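
The formulas above can be worked through with a short Python sketch. All numbers are illustrative values and are not taken from a real node. The helper names are hypothetical. The sketch expresses each ratio as a percentage and computes allocatable resources from capacity, reserved resources, and the eviction threshold.

```python
from typing import Iterable


def ratio_percent(per_pod_values: Iterable[float], node_total: float) -> float:
    """Return sum(per-pod values) / node total, as a percentage."""
    if node_total <= 0:
        raise ValueError("node_total must be positive")
    return round(100.0 * sum(per_pod_values) / node_total, 2)


def allocatable(capacity: float, reserved: float, eviction_threshold: float) -> float:
    """Allocatable resources = capacity - reserved - eviction threshold."""
    return capacity - reserved - eviction_threshold


# Illustrative node with 4000 millicores of CPU and three pods.
cpu_total_m = 4000
cpu_requests_m = [500, 1000, 250]  # CPU requested by each pod
cpu_used_m = [120, 640, 40]        # CPU currently used by each pod

cpu_request_pct = ratio_percent(cpu_requests_m, cpu_total_m)
cpu_utilization_pct = ratio_percent(cpu_used_m, cpu_total_m)
print(f"CPU request: {cpu_request_pct}%")
print(f"CPU utilization: {cpu_utilization_pct}%")

# Illustrative memory figures in MiB.
mem_allocatable_mib = allocatable(capacity=16384, reserved=1024, eviction_threshold=100)
print(f"Allocatable memory: {mem_allocatable_mib} MiB")
```

A request percentage that is much higher than the utilization percentage, as in this example, suggests that pods on the node request more than they use.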

Basic node information

The node name, IP address, instance ID, container runtime version, operating system, kernel, and more.

Other information

Details of node CPU and memory resource allocation (requests and limits), node status, pod list, node events, and more.

References

You can use the resource profiling feature provided by ACK to get resource configuration suggestions for containers based on the historical data of resource usage. This simplifies the configuration of resource requests and limits for containers. For more information, see Resource profiling.
For more information about how to configure resources for application pods, see Create a Stateless Application by Using a Deployment.
Configure node labels and a node selector to schedule application pods to specific nodes. For more information, see Schedule pods to specific nodes.
For guidance on scaling up or down worker node resources, see Upgrade the configurations of a worker node.
To add a data disk to a node for storing resources like the container runtime and kubelet, see Attach data disks to nodes.
For more information about how to resize the data disk or system disk, see Extend the system disk of a node.
Node upgrades, including kubelet and runtime versions, are managed at the node pool level. For more information, see Update a node pool.