DataWorks: CDH Presto node

Last Updated: Feb 05, 2026

In DataWorks, a CDH Presto node lets you run queries on Presto, a distributed SQL query engine, for real-time data analytics on your CDH cluster. This topic explains how to configure and use a CDH Presto node.

Prerequisites

  • You have created an Alibaba Cloud CDH cluster and bound it to a DataWorks workspace. For more information, see Data Studio: Associate a CDH computing resource.

    Important

    Make sure that the Presto component is installed on your CDH cluster and that you configured the Presto settings when you bound the cluster to the workspace.

  • (Optional) If you use a RAM user, add the user to the workspace and grant the user the Developer or Workspace Administrator role. The Workspace Administrator role carries extensive permissions, so grant it with caution. For more information about how to add members to a workspace, see Add members to a workspace.

    Note

    If you are using your root account, you can skip this step.

  • You have configured a Hive data source in DataWorks and passed the connectivity test. For more information, see Data Source Management.

Create a node

For instructions, see Create a node.

Node development

Develop your task code in the SQL editor. You can define variables in the code using the ${variable_name} format and assign values to them in Scheduling configuration > Scheduling parameters on the right side of the node editor. Each variable is replaced with its assigned value when the node runs, so the same code can receive different parameter values on each scheduled run. To learn more about scheduling parameters, see Sources and expressions of scheduling parameters. For example:

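-- List the tables in the current schema.
SHOW TABLES;

-- Query a table.
SELECT * FROM userinfo;

-- Use a scheduling parameter; ${var} is replaced with its assigned value at run time.
SELECT '${var}';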
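
A common use of scheduling parameters is to filter on a date partition. The following is a minimal sketch, not part of the original example. It assumes a variable named bizdate that is assigned the built-in value $bizdate under Scheduling parameters, and a hypothetical string partition column named ds on the userinfo table. Neither name comes from this topic, so adjust both to match your own tables and parameter assignments.

```sql
-- Assumption: bizdate=$bizdate is set under
-- Scheduling configuration > Scheduling parameters.
-- Assumption: userinfo has a string partition column named ds (hypothetical).
SELECT *
FROM userinfo
WHERE ds = '${bizdate}'
LIMIT 100;
```

Quoting the variable keeps the substituted value a string literal, which matches the SELECT '${var}' pattern shown above.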

Debug the node

  1. In Run Configuration > Compute resource, set the Compute resource and Resource group.

    1. For Compute resource, select your registered CDH cluster.

    2. For Resource group, select a scheduling resource group that passed the data source connectivity test. For more information, see Network connectivity solutions.

  2. On the node editor's toolbar, click Run.
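
Before you debug the real task, it can help to run a few lightweight statements to confirm that the selected compute resource and resource group can reach Presto. The following sketch uses standard Presto statements. The catalog name hive is an assumption based on the Hive data source prerequisite, so replace it with a catalog that your cluster actually exposes.

```sql
-- Confirm that the node can reach Presto at all.
SELECT 1;

-- List the catalogs that the Presto coordinator exposes.
SHOW CATALOGS;

-- List the schemas in one catalog; hive is an assumed catalog name.
SHOW SCHEMAS FROM hive;
```

If these statements succeed but a query on your own table fails, the problem is more likely in the table, schema, or permissions than in network connectivity.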

Next steps

  • Node scheduling configuration: To run a node on a recurring schedule, configure its Time Property and related scheduling properties in the Scheduling configuration panel on the right side of the page.

  • Publish a node: To publish a node to the production environment, click the publish icon on the toolbar. Only nodes that are published to the production environment are scheduled.

  • Task O&M: After you publish a node, you can monitor its scheduled runs in Operation Center. For more information, see Getting started with Operation Center.