Archive data sources

You must add a data source the first time you archive data. The following data sources are supported: on-premises NAS file systems such as Isilon (PowerScale), Hadoop Distributed File System (HDFS) file systems, and S3-Compatible Storage buckets. This topic describes how to use Cloud Backup to add a data source for archiving.

Prerequisites

Cloud Backup is authorized, and a Cloud Backup client is installed. For more information, see Install a Cloud Backup client.

Procedure

Log on to the Cloud Backup console.
In the left-side navigation pane, click Archive.
In the top navigation bar, select a region.
On the Analyze and Archive tab, click Create Data Source.

In the Create Data Source panel, configure the parameters and click Next.

Source Type: Local NAS

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select Local NAS.
Data Source Name	Enter a name for the data source that you want to add.
NAS Type	The type of the NAS file system. Valid values: Isilon (PowerScale), for Isilon (PowerScale) NAS file systems, and Other NAS, for all other NAS file systems.
NAS Management Username	This parameter is required only if you set the NAS Type parameter to Isilon (PowerScale). This parameter specifies the username of the account that is used to manage the NAS file system.
NAS Management Password	This parameter is required only if you set the NAS Type parameter to Isilon (PowerScale). This parameter specifies the password of the account.
NAS Network Address	The network address of the NAS file system.
NAS Management Port	This parameter is required only if you set the NAS Type parameter to Isilon (PowerScale). This parameter specifies the number of the port that is used to manage the NAS file system.
NAS Share Path	The name of the NAS directory from which you want to archive data. The name can contain letters, digits, and the following special characters: `, - _ = / . : \`. If you set the NAS Type parameter to Isilon (PowerScale), the directory is relative to the /ifs directory. For example, if you enter /myshare, Cloud Backup archives data from the /ifs/myshare directory. If you set the NAS Type parameter to Other NAS, the directory is relative to the / root directory. For example, if you enter /myshare, Cloud Backup archives data from the /myshare directory.
Protocol Type	The protocol type of the NAS file system. Valid values: NFS (Network File System) and SMB (Server Message Block). Important: If you want to archive data from a File Storage NAS file system, configure the vers parameter in the Advanced Settings section.
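
The NAS Share Path rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not Cloud Backup code: the helper name `effective_archive_path` is hypothetical, and the character check assumes that the spaces in the special-character list are separators rather than allowed characters.

```python
import posixpath
import re

# Characters the NAS Share Path field accepts, per the table above:
# letters, digits, and , - _ = / . : \
# Assumption: a space is not an allowed character.
_ALLOWED = re.compile(r"^[A-Za-z0-9,\-_=/.:\\]+$")

# Directory that the share path is relative to, keyed by NAS Type.
_BASE_DIR = {"Isilon (PowerScale)": "/ifs", "Other NAS": "/"}


def effective_archive_path(nas_type: str, share_path: str) -> str:
    """Return the directory that is archived for a given share path (hypothetical helper)."""
    if nas_type not in _BASE_DIR:
        raise ValueError(f"unknown NAS type: {nas_type!r}")
    if not _ALLOWED.match(share_path):
        raise ValueError(f"unsupported characters in share path: {share_path!r}")
    # Drop the leading slash so the path is joined under the base directory.
    return posixpath.join(_BASE_DIR[nas_type], share_path.lstrip("/"))


print(effective_archive_path("Isilon (PowerScale)", "/myshare"))  # /ifs/myshare
print(effective_archive_path("Other NAS", "/myshare"))            # /myshare
```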

Optional. Click Advanced Settings and then click +Set Mount Parameters.

The following table describes the mount parameters that you can configure.

Parameter	Description
vers	The protocol version that is used to mount the file system. Valid values: vers=3 (NFSv3), vers=4 (NFSv4), and vers=4.0 (NFSv4.0). Note: The protocol versions of File Storage NAS differ in features, security, and namespace. For more information, see Differences between NFSv3 and NFSv4.0.
nolock	Specifies whether to disable file locking. If you set this parameter, NFS file locking is disabled.
proto	The protocol that you want to use to mount the file system.
rsize	The size of each data block that a client can read from the file system. Recommended value: 1048576. Unit: bytes.
wsize	The size of each data block that a client can write to the file system. Recommended value: 1048576. Unit: bytes.
hard	Specifies that applications that access the file system wait while the file system is unavailable and resume access when the file system becomes available again. We recommend that you enable this parameter.
timeo	The period of time that the NFS client waits for a response before it retries a request. Unit: deciseconds (tenths of a second). Recommended value: 600 (60 seconds).
retrans	The number of times that the NFS client retries a failed request. Recommended value: 2.
noresvport	Specifies that a new TCP port is used to ensure network continuity between the file system and the ECS instance when the network recovers from a failure. We recommend that you enable this parameter.
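
As a rough illustration, the mount parameters in the table can be combined into the comma-separated option string that NFS clients conventionally take. The sketch below uses the recommended values from the table; `vers=3` and `proto=tcp` are assumptions chosen for the example, and the helper names are hypothetical. How the console itself passes these parameters to the client is not described here.

```python
# Recommended values from the table above. "vers" and "proto" are
# illustrative assumptions; pick the values that match your file system.
RECOMMENDED = {
    "vers": "3",
    "nolock": True,       # flag option: present or absent
    "proto": "tcp",
    "rsize": 1048576,     # bytes
    "wsize": 1048576,     # bytes
    "hard": True,
    "timeo": 600,         # deciseconds
    "retrans": 2,
    "noresvport": True,
}


def mount_option_string(params: dict) -> str:
    """Join mount parameters into a comma-separated option string."""
    parts = []
    for name, value in params.items():
        if value is True:
            parts.append(name)              # flag such as "hard"
        elif value is False or value is None:
            continue                        # flag not set
        else:
            parts.append(f"{name}={value}")  # key=value option
    return ",".join(parts)


def timeo_seconds(timeo: int) -> float:
    """Convert a timeo value in deciseconds to seconds."""
    return timeo / 10


print(mount_option_string(RECOMMENDED))
print(timeo_seconds(RECOMMENDED["timeo"]))  # 60.0
```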

Source Type: HDFS

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select HDFS.
Data Source Name	The name of the HDFS data source. You can specify a name based on your business requirements. Example: back-end-hdfs.
NameNode Network Address	The network address of an HDFS primary server. NameNode serves as a primary server that is used to manage the namespaces of HDFS file systems and control access from clients to files in the file systems.
NameNode Port	The port number of the HDFS primary server.
Secondary NameNode Network Address	The network address of a secondary HDFS node. A secondary HDFS node helps the primary server perform management tasks.
Secondary NameNode Port	The port number of the secondary HDFS node.
HDFS Username	The name of the HDFS user. Note: Make sure that the specified HDFS user has sufficient permissions. Otherwise, you may be unable to read files when you archive data, write files when you retrieve data, or restore the information of user groups. We recommend that you set this parameter to hadoop or hdfs.
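
Before you enter the NameNode values, it can help to check them for obvious mistakes. The following sketch validates an address and port pair and formats it as an `hdfs://` URI. The helper name, host names, and port numbers are placeholders for illustration only, and the check assumes a host name or IPv4 address.

```python
def hdfs_uri(host: str, port: int) -> str:
    """Build an hdfs:// URI from a network address and a port number (hypothetical helper)."""
    # Assumption: the address is a host name or IPv4 address, without a
    # scheme, path, or embedded port.
    if not host or "/" in host or ":" in host:
        raise ValueError(f"invalid network address: {host!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"hdfs://{host}:{port}"


# Placeholder values; replace them with the addresses of your own cluster.
print(hdfs_uri("namenode.example.internal", 8020))
print(hdfs_uri("secondary-namenode.example.internal", 8020))
```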

Source Type: S3

Configure the parameters described in the following table.

Parameter	Description
Source Type	The type of the data source. Select S3.
Data Source Name	The name of the S3 data source. You can specify a name based on your business requirements. Example: awss3.
Use HTTPS	Specifies whether to encrypt transmitted data by using HTTPS. Valid values: No and Yes.
S3 Bucket	The name of the S3-Compatible Storage bucket.
S3 Endpoint	The endpoint of the bucket that can be used to perform operations on S3 objects. Examples: s3.us-east-1.amazonaws.com and 11.238.XXX.XXX:9000.
Access Key	The access key ID that is used to access the S3-Compatible Storage bucket.
Secret Key	The secret access key that is paired with the access key ID and used to access the S3-Compatible Storage bucket.
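
To see how the Use HTTPS, S3 Endpoint, and S3 Bucket values relate, the sketch below combines them into a base URL. It assumes path-style addressing (bucket name after the endpoint), which is common for S3-compatible storage but is not confirmed as what Cloud Backup uses. The helper name and the bucket name are hypothetical.

```python
def s3_base_url(endpoint: str, bucket: str, use_https: bool) -> str:
    """Combine the S3 parameters into a path-style base URL (hypothetical helper)."""
    # The endpoint examples in the table have no scheme; the Use HTTPS
    # parameter decides it.
    if "://" in endpoint:
        raise ValueError("enter the endpoint without http:// or https://")
    if not bucket:
        raise ValueError("bucket name is required")
    scheme = "https" if use_https else "http"
    return f"{scheme}://{endpoint.rstrip('/')}/{bucket}"


# "my-archive-bucket" is a placeholder bucket name.
print(s3_base_url("s3.us-east-1.amazonaws.com", "my-archive-bucket", use_https=True))
print(s3_base_url("11.238.XXX.XXX:9000", "my-archive-bucket", use_https=False))
```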

Associate a client group and click Next.
You can add multiple clients to a client group to concurrently run an archive job. You can also select an existing client group. In this example, set the Client Group From parameter to Create Backup Client Group. Then, enter a name for Client Group Name and select the clients that you want to add to the client group.

Configure a data analysis plan and click OK.

In the Configure Analysis Plan step, configure the parameters. The following table describes the parameters.

Parameter	Description
Enable Data Source Analysis	Specifies whether to enable the data analysis feature. If you turn on Enable Data Source Analysis, Cloud Backup analyzes data after the data source is added. You can use Cloud Backup to scan, analyze, or search a data source only if the data analysis feature is enabled for the data source. Note: If you turn off Enable Data Source Analysis, Cloud Backup directly archives data from the data source.
Meta Index Start Time	The time when Cloud Backup starts to perform an index operation on metadata.
Meta Index Interval	The interval at which Cloud Backup performs index operations on metadata. Unit: days or weeks.
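
To get a feel for how Meta Index Start Time and Meta Index Interval interact, the following sketch lists the first few index times for a given plan. It assumes that runs occur at fixed multiples of the interval after the start time; the actual Cloud Backup scheduler is not described in this topic, and the helper name and sample values are hypothetical.

```python
from datetime import datetime, timedelta


def meta_index_runs(start: datetime, interval: int, unit: str, count: int) -> list:
    """Return the first `count` index times, assuming fixed-interval scheduling."""
    if unit not in ("days", "weeks"):
        raise ValueError("unit must be 'days' or 'weeks'")
    if interval < 1:
        raise ValueError("interval must be at least 1")
    step = timedelta(**{unit: interval})  # timedelta(days=n) or timedelta(weeks=n)
    return [start + i * step for i in range(count)]


# Sample plan: start on 2024-01-01 at 02:00 and index every 2 weeks.
for run in meta_index_runs(datetime(2024, 1, 1, 2, 0), 2, "weeks", 3):
    print(run.isoformat())
```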

After you add a data source, the data source is displayed on the Analyze and Archive tab.

Related operations

After a data source is added, you can click More in the Actions column and then select the required operation. The following table describes the available operations.

Operation	Description
Configure Analysis Plan	Configures a data analysis plan for the data source.
Run Meta Indexing	Creates an index for the data source. This way, you can efficiently analyze and search for data.
View Data Source	Views the details of the data source. The details include the type, NAS network address, NAS share path, and backup client group.
Edit Data Source	Modifies the parameters of the data source.
Unregister Data Source	Unregisters a data source. If you no longer require an archive plan, you can perform this operation.
Edit Backup Client Group	Changes the name of a client or a client group.

What to do next

Analyze a data source