All Products
Search
Document Center

DataWorks:Add an AnalyticDB for MySQL V3.0 data source

Last Updated:Sep 06, 2024

Before you can develop and manage AnalyticDB for MySQL V3.0 tasks in DataWorks, you must add an AnalyticDB for MySQL V3.0 cluster to the desired workspace as a data source. This way, you can use the AnalyticDB for MySQL V3.0 data source in different services of DataWorks and perform operations such as data synchronization, data development, and data analysis based on the data source.

Prerequisites

  • An AnalyticDB for MySQL V3.0 cluster is purchased, and an AnalyticDB for MySQL V3.0 database is created in the cluster. For more information, see Create a cluster and CREATE DATABASE.

    Note

    We recommend that you create an AnalyticDB for MySQL V3.0 cluster in the same region as the workspace to which you want to add an AnalyticDB for MySQL V3.0 data source. If the regions are different, you can add only a cross-region data source to the workspace. The data source cannot be associated with DataStudio for data development or periodic task scheduling. The data source can be used only for data synchronization.

  • The required resource group is purchased and configured.

    An AnalyticDB for MySQL V3.0 data source supports serverless resource groups and old-version exclusive resource groups. Old-version exclusive resource groups include exclusive resource groups for Data Integration, exclusive resource groups for scheduling, and exclusive resource groups for DataService Studio. We recommend that you use serverless resource groups. After the AnalyticDB for MySQL V3.0 data source is added, you can use the data source in scenarios such as data synchronization, development and scheduling of computing tasks, and generation of DataService Studio APIs.

    Note

    You must purchase and configure the required resource group based on the use scenario of the AnalyticDB for MySQL V3.0 data source and establish a network connection between the data source and the resource group in advance. For information about resource groups provided by DataWorks and how to configure a resource group, see Overview.

  • A DataWorks workspace is created, or the account that you use is added to the desired workspace as a member.

    You must add the desired AnalyticDB for MySQL V3.0 cluster to the workspace as a data source. This way, you can use the data source to perform data development operations in the workspace. In addition, you must associate the purchased resource group with the workspace and establish a network connection between the resource group and the data source. For information about how to create a workspace, see Create and manage workspaces.

    Note

    You can add the same AnalyticDB for MySQL V3.0 cluster to multiple DataWorks workspaces as a data source.

Limits

  • An AnalyticDB for MySQL V3.0 data source can be associated with DataStudio only if the data source meets the following conditions: The AnalyticDB for MySQL V3.0 cluster based on which the data source is added resides in the same region and belongs to the same Alibaba Cloud account as the workspace. This way, the AnalyticDB for MySQL V3.0 data source can be used for data development and periodic task scheduling.

  • You can add an AnalyticDB for MySQL V3.0 cluster that does not belong to the current Alibaba Cloud account to a workspace within the current Alibaba Cloud account as a data source. After the data source is added, you can use only a RAM role to access the related AnalyticDB for MySQL V3.0 cluster. AnalyticDB for MySQL V3.0 data sources that are added across accounts cannot be used for data development or periodic task scheduling. These data sources can be used only for data synchronization.

  • AnalyticDB for MySQL V3.0 data sources that are added in connection string mode cannot be used for data development or periodic task scheduling. These data sources can be used only for data synchronization. For more information, see Add a data source.

  • You can use only a serverless resource group or an exclusive resource group for Data Integration or exclusive resource group for scheduling to run an AnalyticDB for MySQL V3.0 task. We recommend that you use a serverless resource group.

  • If you want to create APIs in DataService Studio based on an AnalyticDB for MySQL V3.0 data source, we recommend that you use a serverless resource group.

Preparations: Permission description and configuration

  • If you want to add a data source as a RAM user or by using a RAM role, make sure that the RAM user or RAM role meets one of the following requirements:

  • If you want to add a data source across accounts, you must also make the following permission configurations:

    If you add an AnalyticDB for MySQL V3.0 data source across Alibaba Cloud accounts, you can use only a RAM role that is granted the required permissions to access the related AnalyticDB for MySQL V3.0 cluster.

    Sample scenario

    In this sample scenario, an AnalyticDB for MySQL V3.0 cluster that belongs to Alibaba Cloud Account B is added to a DataWorks workspace that belongs to Alibaba Cloud Account A as a data source, and Alibaba Cloud Account A needs to access the AnalyticDB for MySQL V3.0 cluster.

    • Alibaba Cloud Account A: A DataWorks workspace is created within Alibaba Cloud Account A, and Alibaba Cloud Account A needs to access an AnalyticDB for MySQL V3.0 cluster that is created within Alibaba Cloud Account B.

    • Alibaba Cloud Account B: An AnalyticDB for MySQL V3.0 cluster is created within Alibaba Cloud Account B, and an AnalyticDB for MySQL V3.0 database is created in the cluster. In addition, a RAM role is created within Alibaba Cloud Account B and is assigned to Alibaba Cloud Account A. This way, Alibaba Cloud Account A can access the AnalyticDB for MySQL V3.0 cluster by assuming the RAM role.

    After the preceding operations are performed, you can use Alibaba Cloud Account A to add the AnalyticDB for MySQL V3.0 cluster to DataWorks as a data source and then access the AnalyticDB for MySQL V3.0 cluster across Alibaba Cloud accounts.

    Create and grant permissions to a RAM role

    Requirements on a RAM role of Alibaba Cloud Account B and permission configuration for the RAM role:

    1. You must create a RAM role within Alibaba Cloud Account B and grant the RAM role the permissions to access a specified AnalyticDB for MySQL V3.0 cluster. You must add Alibaba Cloud Account A as the trusted cloud account of the RAM role to allow Alibaba Cloud Account A to assume the RAM role. For more information, see Create a RAM role for a trusted Alibaba Cloud account and Create custom policies.

    2. You must modify the trust policy of the RAM role to allow Alibaba Cloud Account A to assume the RAM role to access the AnalyticDB for MySQL V3.0 cluster. For more information, see Edit the trust policy of a RAM role. The following code shows the document of the trust policy:

      {
          "Version": "1",
          "Statement": [
              {
                  "Action": [
                      "sts:AssumeRole",
                      "adb:DescribeDBClusters",
                      "adb:DescribeDBClusterAttribute",
                      "adb:DescribeDBClusterAccessWhiteList",
                      "adb:ModifyDBClusterAccessWhiteList",
                      "adb:DescribeDBClusterNetInfo"
                  ],
                  "Effect": "Allow",
                  "Principal": {
                      "Service": [
                          "ID of Alibaba Cloud Account A@engine.dataworks.aliyuncs.com"
                      ]
                  }
              }
          ]
      }

Entry points for adding a data source

  1. Go to the Data Source page.

    1. Log on to the DataWorks console. In the top navigation bar, select the desired region. Then, click Management Center in the left-side navigation pane. On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.

    2. In the left-side navigation pane of the SettingCenter page, choose Data Sources > Data Sources.

  2. On the Data Sources page, click Add Data Source. In the Add Data Source dialog box, click AnalyticDB for MySQL (V3.0). On the page that appears, configure the parameters to add an AnalyticDB for MySQL V3.0 data source.

    You can also go to the Data Sources page in Data Integration to add an AnalyticDB for MySQL V3.0 data source. You can add a data source only to the production environment on the Data Sources page in Data Integration. After the data source is added, you must manage the data source on the Data Sources page in Management Center. You can go to Data Integration to view the types of data sources that you can add in this service.

Add a data source

DataWorks allows you to add a data source in one of the following modes.

Note

If you use a workspace in standard mode, you must separately add a data source in the development environment and production environment. For information about the modes of workspaces, see Differences between workspaces in basic mode and workspaces in standard mode.

Mode 1: Alibaba Cloud instance mode

Perform the following steps to add an AnalyticDB for MySQL V3.0 data source in this mode.

  1. Configure the parameters in the Basic Information section.image.png

    Parameter

    Description

    Data Source Name

    The name of the data source in DataWorks. The name must be unique within the current tenant.

    Configuration Mode

    The mode in which you want to add the data source. Select Alibaba Cloud Instance Mode.

    Alibaba Cloud Account

    Specifies whether the AnalyticDB for MySQL V3.0 cluster you want to use belongs to the current Alibaba Cloud account or another Alibaba Cloud account. Valid values:

    • Current Alibaba Cloud Account: The AnalyticDB for MySQL V3.0 cluster belongs to the current Alibaba Cloud account.

    • Another Alibaba Cloud Account: The AnalyticDB for MySQL V3.0 cluster belongs to another Alibaba Cloud account.

      Note

      If you set this parameter to Another Alibaba Cloud Account, you must add the AnalyticDB for MySQL V3.0 data source across accounts. After the data source is added, you can use only a RAM role to access the related AnalyticDB for MySQL V3.0 cluster. In addition, the data source cannot be associated with DataStudio for data development or periodic task scheduling. The data source can be used only in Data Integration for data synchronization.

    Other parameters that you must configure vary based on the value of the Alibaba Cloud Account parameter.

    Region

    The region in which the AnalyticDB for MySQL V3.0 cluster that you want to use resides.

    Note

    If the region that you selected is different from the region in which the workspace resides, you cannot associate the data source with DataStudio after you add the data source. In this case, the data source cannot be used for data development or periodic task scheduling. The data source can be used only in Data Integration for data synchronization.

    Other items such as the AnalyticDB for MySQL V3.0 cluster and the access identity

    Other parameters that you must configure vary based on the value of the Alibaba Cloud Account parameter.

    Value of the Alibaba Cloud Account parameter: Current Alibaba Cloud Account

    • Instance: Select the AnalyticDB for MySQL V3.0 cluster that you want to add as a data source from the Instance drop-down list.

    • Database Name, Username, Password: Enter the name of the desired AnalyticDB for MySQL V3.0 database in the Database Name field, and separately enter the username and password that are used to access the database in the Username and Password fields. You can log on to the AnalyticDB for MySQL to obtain information about the AnalyticDB for MySQL V3.0 database.

    Value of the Alibaba Cloud Account parameter: Another Alibaba Cloud Account

    • UID Of Alibaba Cloud Account, Instance ID, RAM Role: Enter the UID of the Alibaba Cloud account to which the AnalyticDB for MySQL V3.0 cluster you want to add as a data source belongs in the UID Of Alibaba Cloud Account field, enter the ID of the cluster in the Instance ID field, and then enter the RAM role that you want to use to access the cluster in the RAM Role field. The AnalyticDB for MySQL V3.0 cluster is added to DataWorks as a data source, and you must use the RAM role to access the cluster.

      Note

      For information about the permission configuration required by a RAM role to access a data source that is added across accounts, see the Preparations: Permission description and configuration section in this topic.

    • Database Name, Username, Password: Enter the name of the desired AnalyticDB for MySQL V3.0 database in the Database Name field, and separately enter the username and password that are used to access the database in the Username and Password fields. You can log on to the AnalyticDB for MySQL to obtain information about the AnalyticDB for MySQL V3.0 database.

      Note

      If you set this parameter to Another Alibaba Cloud Account, you must add the AnalyticDB for MySQL V3.0 data source across accounts. After the data source is added, you can use only a RAM role to access the related AnalyticDB for MySQL V3.0 cluster. In addition, the data source cannot be associated with DataStudio for data development or periodic task scheduling. The data source can be used only in Data Integration for data synchronization.

  2. Test the network connectivity between the data source and a resource group.

    Resource groups provided by DataWorks can be classified into resource groups for Data Integration, resource groups for scheduling, and resource groups for DataService Studio based on the use scenarios of the resource groups. For more information about resource groups, see Overview.

    You can find the resource group that you want to use in the Connection Configuration section based on the scenario in which you want to use the data source, and then test the network connectivity between the resource group and the data source. If the network connectivity test fails, tasks that use the data source cannot be run.

Mode 2: Connection string mode

AnalyticDB for MySQL V3.0 data sources that are added in connection string mode cannot be used for data development or periodic task scheduling. These data sources can be used only for data synchronization. Perform the following steps to add an AnalyticDB for MySQL V3.0 data source in this mode.

  1. Configure the parameters in the Basic Information section.image.png

    Parameter

    Description

    Data Source Name

    The name of the data source in DataWorks. The name must be unique within the current tenant.

    Configuration Mode

    Set this parameter to Connection String Mode.

    JDBC URL

    The JDBC URL that is used to connect to the AnalyticDB for MySQL V3.0 database. You can log on to the AnalyticDB for MySQL to obtain information about the AnalyticDB for MySQL database and the port number over which you can access the database.

    Username

    The username that you can use to access the database.

    Password

    The password that you can use to access the database.

  2. Test the network connectivity between the data source and a resource group.

    Resource groups provided by DataWorks can be classified into resource groups for Data Integration, resource groups for scheduling, and resource groups for DataService Studio based on the use scenarios of the resource groups. For more information about resource groups, see Overview.

    You can find the resource group that you want to use in the Connection Configuration section based on the scenario in which you want to use the data source, and then test the network connectivity between the resource group and the data source. If the network connectivity test fails, tasks that use the data source cannot be run.

What to do next

After the data source is added, you can perform the following operations based on your business requirements:

  • Develop and schedule computing tasks:

    DataWorks DataStudio and Operation Center provide the capabilities of developing and scheduling AnalyticDB for MySQL V3.0 tasks. If you want to develop AnalyticDB for MySQL V3.0 tasks based on the AnalyticDB for MySQL V3.0 data source or periodically schedule AnalyticDB for MySQL V3.0 tasks, you must go to the DataStudio page in the DataWorks console and associate the AnalyticDB for MySQL V3.0 data source with DataStudio.

    Note

    You can associate an AnalyticDB for MySQL V3.0 data source with DataStudio only if the AnalyticDB for MySQL V3.0 cluster based on which the data source is added resides in the same region and belongs to the same Alibaba Cloud account as the workspace to which the data source is added.

  • Perform data synchronization:

    DataWorks Data Integration provides AnalyticDB for MySQL V3.0 Reader and AnalyticDB for MySQL V3.0 Writer for you to read data from and write data to the AnalyticDB for MySQL V3.0 data source. You can configure a batch or real-time synchronization task for the AnalyticDB for MySQL V3.0 data source in DataStudio or configure a synchronization task for the AnalyticDB for MySQL V3.0 data source in Data Integration based on your business requirements to perform data synchronization.

  • Manage the data source: You can go to the Data Sources page in Management Center to perform management operations on the data source. For example, you can edit or delete the data source.