Alibaba Cloud Data Lake Formation (DLF) is a fully managed platform for centrally storing and managing data and metadata. DLF provides features such as metadata management, data storage management, permission management, storage analysis, and storage optimization. In DataWorks, a DLF 2.0 data source supports only data writes. This topic describes how to add a DLF 2.0 data source and configure a synchronization task for it.
Limits
DLF 2.0 data sources that you add to DataWorks can be used only for data synchronization.
Add a data source
Go to the Data Sources page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.
In the left-side navigation pane of the SettingCenter page, go to the Data Sources page.
On the Data Sources page, click Add Data Source. In the Add Data Source dialog box, search for Data Lake Formation 2.0 and click Data Lake Formation 2.0 (DLF 2.0). On the Add Data Lake Formation 2.0 (DLF 2.0) Data Source page, configure the parameters that are described in the following table to add a DLF 2.0 data source.
Parameter: Description
Data Source Name: Specify a name for the data source based on your business requirements. The name must be unique within the current workspace. The name can contain only letters, digits, and underscores (_) and must start with a letter.
Configuration Mode: The mode in which you want to add the data source. The value of this parameter can be only Alibaba Cloud Instance Mode.
Access Identity: The identity that you want to use to access the data source in DataWorks. Valid values: Alibaba Cloud Account, Alibaba Cloud RAM User, and Alibaba Cloud RAM Role. Select a value based on your business requirements.
DLF Catalog: The name of a DLF catalog. By default, only DLF catalogs that reside in the same region as the DataWorks workspace are displayed. You can select the desired DLF catalog.
Database Name: The name of a database that belongs to the DLF catalog.
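The parameter rules above can be checked locally before you enter the values in the console. The following Python snippet is a minimal sketch of such a check, not part of any DataWorks or DLF SDK. The class name DlfDataSourceSettings, the function validate, and the sample values are hypothetical, and the snippet assumes that "letters" means ASCII letters.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical local pre-check; it does not call DataWorks or DLF.
# Assumption: "letters" in the naming rule means ASCII letters.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Allowed values as listed in the parameter descriptions.
CONFIGURATION_MODES = {"Alibaba Cloud Instance Mode"}
ACCESS_IDENTITIES = {
    "Alibaba Cloud Account",
    "Alibaba Cloud RAM User",
    "Alibaba Cloud RAM Role",
}


@dataclass
class DlfDataSourceSettings:
    data_source_name: str
    configuration_mode: str
    access_identity: str
    dlf_catalog: str
    database_name: str


def validate(settings: DlfDataSourceSettings) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if not NAME_PATTERN.fullmatch(settings.data_source_name):
        problems.append(
            "Data Source Name must start with a letter and contain only "
            "letters, digits, and underscores."
        )
    if settings.configuration_mode not in CONFIGURATION_MODES:
        problems.append("Configuration Mode must be Alibaba Cloud Instance Mode.")
    if settings.access_identity not in ACCESS_IDENTITIES:
        problems.append("Access Identity is not one of the valid values.")
    if not settings.dlf_catalog:
        problems.append("DLF Catalog must not be empty.")
    if not settings.database_name:
        problems.append("Database Name must not be empty.")
    return problems


if __name__ == "__main__":
    # Sample values are placeholders, not real resource names.
    sample = DlfDataSourceSettings(
        data_source_name="dlf_sales_01",
        configuration_mode="Alibaba Cloud Instance Mode",
        access_identity="Alibaba Cloud RAM Role",
        dlf_catalog="example_catalog",
        database_name="example_db",
    )
    for problem in validate(sample):
        print(problem)
```

The check covers only the rules stated in the table. Uniqueness of the name within the workspace and the region of the DLF catalog can be verified only in the console.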
After the parameters are configured, you must test the network connectivity between the data source and a serverless resource group in the Connection Configuration section. If the network connectivity test is successful, you can click Complete Creation. If the network connectivity test fails, you can refer to the topics in the Network connectivity directory for troubleshooting.
Create a synchronization task
After a DLF 2.0 data source is added to DataWorks, you can configure a synchronization task to synchronize data to the data source. For more information, see the topics in the Synchronize data to DLF 2.0 directory.