How to configure edge node autonomy - Container Service for Kubernetes

ACK Edge clusters support edge node autonomy. Edge node autonomy ensures that applications on edge nodes keep running as expected, without being evicted or migrated to other edge nodes, during cloud-edge network disconnections. When a node with node autonomy disabled disconnects from the cloud, the application pods on the node are evicted after the toleration time ends. This topic describes how to configure node autonomy for edge nodes in the Container Service for Kubernetes (ACK) console.

Prerequisites

An ACK Edge cluster has been created. For more information, see Create an ACK Edge cluster in the console.
Edge nodes have been added to the cluster. For more information, see Add edge nodes.

Background information

You can enable or disable node autonomy for edge nodes. By default, node autonomy is disabled for edge nodes that are newly added to a cluster.

When an edge node with node autonomy enabled disconnects from the cloud, the system ensures that the application pods on the node are not evicted and that the applications can automatically recover. Node autonomy is suitable for edge computing scenarios where the network connection is weak.
When an edge node with node autonomy disabled disconnects from the cloud, the node cannot send heartbeats to the control plane in the cloud. As a result, the status of the node changes to Not ready, and the application pods on the node are evicted or migrated to other edge nodes after the toleration time ends.

Procedure

Enable node autonomy through the console

Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Nodes.
On the Nodes page, find the node that you want to manage, and choose More > Node Autonomy Settings in the Actions column.
Note
The Node Autonomy Settings button is displayed only when the current node is an edge node.
In the Node Autonomy Settings dialog box that appears, click OK.

Enable node autonomy through kubectl

Add the following annotation to the edge node. Replace xxx with the name of the edge node:

kubectl annotate node xxx node.beta.openyurt.io/autonomy=true --overwrite
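The command above can be wrapped in small helper functions so that the annotation can be set and then read back. The following is a minimal sketch, not an official tool: the function names enable_autonomy and autonomy_annotation are hypothetical, and only the annotation key comes from the command above.

```shell
#!/bin/sh
# Annotation key taken from the kubectl command in this topic.
AUTONOMY_KEY="node.beta.openyurt.io/autonomy"

# enable_autonomy NODE (hypothetical helper)
# Sets the autonomy annotation to "true" on NODE, overwriting any existing value.
enable_autonomy() {
  kubectl annotate node "$1" "${AUTONOMY_KEY}=true" --overwrite
}

# autonomy_annotation NODE (hypothetical helper)
# Prints the current value of the annotation; prints nothing if it is not set.
autonomy_annotation() {
  # Dots inside the key are escaped so that JSONPath reads it as one field name.
  kubectl get node "$1" \
    -o jsonpath='{.metadata.annotations.node\.beta\.openyurt\.io/autonomy}'
}
```

For example, after running enable_autonomy with the name of an edge node, autonomy_annotation with the same name is expected to print true.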

Check the node autonomy status

On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Nodes.
On the Nodes page, find the node that you want to manage, and choose More > Details in the Actions column.
Under the Overview tab, scroll down and find the Status section. If the type is Autonomy and the corresponding status is True, autonomy has been successfully enabled.
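The same check can probably be made from the command line. The sketch below assumes that the Status section in the console mirrors the node conditions in .status.conditions of the Node object, which this topic does not state explicitly. The function names are hypothetical.

```shell
#!/bin/sh
# autonomy_condition NODE (hypothetical helper)
# Prints the status of the node condition whose type is "Autonomy".
# Assumption: the console's Status section reflects .status.conditions.
autonomy_condition() {
  kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Autonomy")].status}'
}

# is_autonomy_enabled NODE (hypothetical helper)
# Succeeds (exit code 0) only if the Autonomy condition reports "True".
is_autonomy_enabled() {
  [ "$(autonomy_condition "$1")" = "True" ]
}
```

A call such as is_autonomy_enabled followed by the node name can then be used in a script condition, for example to print a warning when autonomy is not yet in effect.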

Configure the cache component

EdgeHub caches the data required by the components on the node to ensure that these components can run as expected during cloud-edge network disconnections. The cache directory is /etc/kubernetes/cache.

Note

The cached data is the data that components exchange with the API server, such as pod and ConfigMap resource information. It does not include business data.

If you run components that rely on data from the API server and need them to keep working while the edge node is disconnected from the network, configure the cache as follows:

Obtain the User-Agent of your component from the developer tools in your browser or from the API server logs.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Configurations > ConfigMaps.
Select kube-system from the Namespace drop-down list, find the ConfigMap named edge-hub-cfg in the Name column, and click Edit YAML in the Actions column.
Add your User-Agent to the cache_agents key and click OK.
Log on to the node, go to the /etc/kubernetes/cache directory, and check whether a directory named after your User-Agent exists.
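The last two steps can also be checked from a shell. The following is a minimal sketch under stated assumptions: the function names are hypothetical, cached_agents must run where kubectl can reach the cluster, and agent_cache_exists must run on the edge node itself. The ConfigMap name, namespace, key, and cache directory are the ones given in this topic.

```shell
#!/bin/sh
# cached_agents (hypothetical helper)
# Prints the value of the cache_agents key in the edge-hub-cfg ConfigMap.
cached_agents() {
  kubectl -n kube-system get configmap edge-hub-cfg \
    -o jsonpath='{.data.cache_agents}'
}

# agent_cache_exists AGENT [CACHE_ROOT] (hypothetical helper)
# Succeeds if a directory named after AGENT exists under the cache directory.
# CACHE_ROOT defaults to /etc/kubernetes/cache, the directory named in this topic.
agent_cache_exists() {
  [ -d "${2:-/etc/kubernetes/cache}/$1" ]
}
```

For example, for a component whose User-Agent is my-agent (a placeholder name), running agent_cache_exists my-agent on the node succeeds once EdgeHub has created the cache directory for that component.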

After you complete this configuration, the data exchanged between the components and the API server is saved to the disk on the node. If node autonomy is enabled, the components retrieve this data from the local disk so that they can continue to run as expected during network disconnections.