
DataWorks:Create and use AnalyticDB for PostgreSQL nodes

Last Updated: Nov 15, 2024

DataWorks allows you to use AnalyticDB for PostgreSQL nodes to develop and periodically schedule AnalyticDB for PostgreSQL tasks and to integrate these tasks with other types of tasks. This topic describes how to use AnalyticDB for PostgreSQL nodes to develop tasks.

Prerequisites

  • A workflow is created.

    DataStudio organizes development operations for different types of compute engines by workflow. Therefore, you must create a workflow before you create a node. For more information, see Create a workflow.

  • An AnalyticDB for PostgreSQL data source is added and associated with DataStudio.

    You must add an AnalyticDB for PostgreSQL database to your DataWorks workspace as an AnalyticDB for PostgreSQL data source and associate the data source with DataStudio. This way, you can use the data source to access AnalyticDB for PostgreSQL data in subsequent development operations. For more information, see Add an AnalyticDB for PostgreSQL data source and Preparations before data development: Associate a data source or cluster.

    Note

    Only AnalyticDB for PostgreSQL data sources that are added in connection string mode are supported for data development.

  • A serverless resource group (recommended) or an exclusive resource group for scheduling is purchased.

Background information

AnalyticDB for PostgreSQL nodes are used to connect to the Alibaba Cloud AnalyticDB for PostgreSQL service. For more information, see AnalyticDB for PostgreSQL.

Step 1: Create an AnalyticDB for PostgreSQL node

  1. Go to the DataStudio page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and Governance > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. In the Scheduled Workflow pane, right-click the desired workflow and choose Create Node > AnalyticDB for PostgreSQL > ADB for PostgreSQL.

  3. In the Create Node dialog box, enter a node name in the Name field and click Confirm. The node is created. You can develop and configure a task based on the node.

Step 2: Develop an AnalyticDB for PostgreSQL task based on the node

(Optional) Select an AnalyticDB for PostgreSQL data source

If multiple AnalyticDB for PostgreSQL data sources are added to your workspace, you must select the data source that you want to use on the configuration tab of the AnalyticDB for PostgreSQL node. If only one AnalyticDB for PostgreSQL data source is added, that data source is used by default.

Develop SQL code

In the code editor of the AnalyticDB for PostgreSQL node, write SQL statements based on the syntax supported by AnalyticDB for PostgreSQL.
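
The node code is standard SQL that runs against the selected AnalyticDB for PostgreSQL data source. The following sketch shows what the node code might look like; the sales_order table and its columns are hypothetical:

    -- Hypothetical example: the table and columns are placeholders.
    -- AnalyticDB for PostgreSQL supports the DISTRIBUTED BY clause
    -- to specify the distribution key of a table.
    CREATE TABLE IF NOT EXISTS sales_order (
        order_id   BIGINT,
        order_date DATE,
        amount     NUMERIC(18,2)
    ) DISTRIBUTED BY (order_id);

    INSERT INTO sales_order VALUES (1, DATE '2024-11-15', 99.90);

    -- Aggregate the inserted data to verify the result.
    SELECT order_date, SUM(amount) AS total_amount
    FROM sales_order
    GROUP BY order_date;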

Step 3: Configure scheduling properties for the AnalyticDB for PostgreSQL task

If you want to periodically run the AnalyticDB for PostgreSQL task on the created node, click Properties in the right-side navigation pane of the configuration tab of the node and configure the scheduling properties for the task based on your business requirements. For more information, see Overview.

Note

You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit the task.

Step 4: Debug the task code

You can perform the following operations to debug the task and check whether it is configured as expected:

  1. Optional. Select a resource group and assign custom parameters to variables.

    • Click the Advanced Run icon in the top toolbar of the configuration tab of the node. In the Parameters dialog box, select the resource group for scheduling that you want to use to debug and run the task code.

    • If you use scheduling parameters in your task code, you can assign values to the related variables for debugging, as shown in the sketch after this list. For information about the value assignment logic of scheduling parameters, see Debugging procedure.

  2. Save and execute the SQL statements.

    In the top toolbar, click the Save icon to save the SQL statements. Then, click the Run icon to execute the SQL statements.

  3. Optional. Perform smoke testing.

    You can perform smoke testing on the task in the development environment when you commit the task or after you commit the task. For more information, see Perform smoke testing.
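
For example, if your task code uses a scheduling parameter, the code references a variable in the ${variable_name} format, and the variable is assigned a value such as $[yyyymmdd-1] in the scheduling properties. The following sketch is hypothetical; it assumes a bizdate variable and sales_order tables that may not exist in your workspace:

    -- ${bizdate} is replaced at run time with the value assigned to the
    -- bizdate variable in the scheduling properties, such as $[yyyymmdd-1].
    -- During debugging, you assign the value in the Parameters dialog box.
    DELETE FROM sales_order_daily WHERE ds = '${bizdate}';

    INSERT INTO sales_order_daily
    SELECT order_id, order_date, amount, '${bizdate}' AS ds
    FROM sales_order
    WHERE order_date = TO_DATE('${bizdate}', 'YYYYMMDD');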

Step 5: Commit and deploy the task

After you configure the task on the node, commit and deploy the task. The task is then periodically run based on the scheduling configurations.

  1. Click the Save icon in the top toolbar to save the task.

  2. Click the Submit icon in the top toolbar to commit the task.

    In the Submit dialog box, configure the Change description parameter. Then, determine whether you want the task code to be reviewed after you commit the task, based on your business requirements.

    Note
    • You can commit the task only after you configure the Rerun and Parent Nodes parameters.

    • You can use the code review feature to ensure the code quality of tasks and prevent task execution errors caused by invalid task code. If you enable the code review feature, the task code that is committed can be deployed only after the task code passes the code review. For more information, see Code review.

If you use a workspace in standard mode, you must deploy the task to the production environment after you commit the task. To deploy a task on a node, click Deploy in the upper-right corner of the configuration tab of the node. For more information, see Deploy nodes.

What to do next

After you commit and deploy the task, the task is periodically run based on the scheduling configurations. You can click Operation Center in the upper-right corner of the configuration tab of the node to go to Operation Center and view the scheduling status of the task. For more information, see View and manage auto triggered nodes.