Restore a cluster

If an Elastic High Performance Computing (E-HPC) cluster or any of its nodes are in the Exception state, you can restore the cluster. This topic describes how to restore a cluster.

Prerequisites

The cluster restoration feature is enabled. By default, the feature is disabled. To enable it, submit a ticket.
Job data is exported. Restoring a cluster deletes the job information that is stored on the cluster nodes.

Precautions

When you restore a cluster, take note of the following impacts:

While a cluster is being restored, the system disks of all nodes are replaced. By default, the new system disks use the settings that you specified when you created the cluster.
The self-managed queues in the cluster are deleted. All nodes are retained and migrated to the default queue of the cluster.
After a cluster is restored, the data on the system disks and data disks of all cluster nodes is lost. The data includes user information, job information, scheduler queue information, and configuration data of auto-scaling queues. However, the data on File Storage NAS file systems is retained.
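
The queue behavior described above can be illustrated with a small sketch. It is not an E-HPC API: the function queues_after_restore, the queue names, and the node names are hypothetical, and the sketch only models the stated rule that self-managed queues are deleted and all nodes move to the default queue.

```python
from typing import Dict, List


def queues_after_restore(
    queues: Dict[str, List[str]],
    default_queue: str,
) -> Dict[str, List[str]]:
    """Return the queue layout expected after a restore.

    `queues` maps a queue name to the names of the nodes in that queue.
    Every node is kept, but only the default queue survives.
    """
    all_nodes = [node for nodes in queues.values() for node in nodes]
    return {default_queue: all_nodes}


# Hypothetical cluster with one self-managed queue named "gpu-queue".
before = {
    "default": ["compute000"],
    "gpu-queue": ["compute001", "compute002"],
}
after = queues_after_restore(before, default_queue="default")
print(after)  # {'default': ['compute000', 'compute001', 'compute002']}
```

If your jobs depend on self-managed queues, record the queue layout before you restore the cluster so that you can recreate the queues afterward.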

Procedure

Log on to the E-HPC console.
In the left part of the top navigation bar, select a region.
In the left-side navigation pane, click Cluster.
On the Cluster page, find the cluster that you want to restore and choose More > Recover.
In the dialog box that appears, specify Image Type, Image, Scheduler, and Domain Service for the cluster.
All other settings are taken from the settings that you specified when you created the cluster.
Click OK.

Result

After you submit the restoration request, the Cluster page appears automatically. You can check the status of the cluster on this page. If the cluster is in the Uninitialized state, the cluster is being restored. If the cluster is in the Running state, the cluster has been restored.
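
If you prefer to check the status from a script, the two states above map to a simple polling loop. The following is a minimal sketch, not an official client: get_status is a hypothetical callable that you supply (for example, a wrapper around whatever API or CLI you use to query the cluster), and the polling interval and timeout are arbitrary assumptions.

```python
import time
from typing import Callable

# Status name taken from the Result section above.
RESTORED = "Running"


def wait_until_restored(
    get_status: Callable[[], str],  # hypothetical: returns the current cluster status
    poll_seconds: float = 30.0,     # assumed interval, not an E-HPC requirement
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the cluster reports Running.

    Returns True once the status is Running, or False if the timeout
    elapses first. Any other status, including Uninitialized, is treated
    as "still being restored".
    """
    waited = 0.0
    while waited <= timeout_seconds:
        if get_status() == RESTORED:
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False
```

The sleep parameter is injectable only so that the loop can be exercised without real delays. In normal use, pass just get_status.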