All Products
Search
Document Center

DataWorks:Develop a script template

Last Updated:Feb 14, 2025

A script template defines an SQL code process that includes multiple input and output parameters. You can introduce one or more source tables to an SQL code process and use the SQL code process to filter source table data, join source tables, and aggregate data to generate an output table required for business. This topic provides an overview of a script template and describes the definition and usage of a script template.

Overview of a script template

Introduction

In actual business scenarios, a large number of SQL code processes are similar. The input tables or output tables of these processes may have the same schema or compatible data types but different table names. In this case, developers can create a script template based on an SQL code process to reuse SQL code. The script template extracts input parameters from input tables and output parameters from output tables.

When you create an SQL Snippet node, you need to only select an existing script template from the script template list based on your business process and configure specific parameters for input tables and output tables in your business for the selected script template. This way, you do not need to repeatedly edit the code. This helps prevent repeated operations during development and improve development efficiency. You can deploy and run a created SQL Snippet node in the same manner in which you deploy and run other SQL nodes.

Permissions

To create and use a script template, you must be assigned the Development role. For information about how to assign a role, see the Add a RAM user to a workspace as a member and assign roles to the member section of the "Manage permissions on workspace-level services" topic.

Limits

  • Only DataWorks Standard Edition or a more advanced edition supports SQL Snippet nodes. For more information, see Differences among DataWorks editions.

  • The script templates that are created by using member accounts in the current workspace are available on the Workspace-Specific tab.

  • The script templates that are created within tenants are available on the Public tab.

Script template types

Script templates are classified into workspace-level and public script templates. You can specify the type of a script template when you create the script template.

  • Workspace-level script template: After this type of script template is deployed, only the members in the current DataWorks workspace can use this type of script template. To use this type of script template, you must be a member in the current DataWorks workspace. For more information, see the Add a RAM user to a workspace as a member and assign roles to the member section of the "Manage permissions on workspace-level services" topic.

  • Public script template: The developers of this type of script template can go to the Public tab and make a general-purpose script template public within the current tenant. This way, all users within the current tenant can use the script template.

Use a script template

Script templates that are created in DataWorks must work together with SQL Snippet nodes.

  1. Define a script template.

    In the Snippets pane of DataStudio, developers can abstract an SQL code process and define input and output parameters. This way, the SQL code process can process a specific input table based on input parameters to generate an output table with business value based on output parameters. The input and output parameters are configured in the @@{Parameter name} format.

    • The following types of input parameters are supported:

      • Table: Use this type of input parameters if output results are of a single type.

      • String: Use this type of input parameters if the values of input parameters are controlled based on variables.

    • The type of an output parameter must be Table.

  2. Reference the script template.

    On the DataStudio page, you can create an SQL Snippet node to reference the desired script template and replace input and output parameters in the script template to reuse code.

Define a script template

Go to the Snippets pane

  1. Go to the DataStudio page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. In the left-side navigation pane of the DataStudio page, click Snippets.

    Note

    If the Snippets module is not displayed in the left-side navigation pane, click the 设置 icon in the lower-left corner and follow the instructions that are described in Configure settings in the DataStudio Modules section to add the module.

Create and configure a script template

This section describes how to create a script template in the Snippets pane. In most cases, a script template consists of input parameters, output parameters, and an SQL code process. An SQL code process defines the code that implements the features of a script template. In an SQL code process, variable input tables or input character strings are abstracted as input parameters of the script template, and variable output tables are abstracted as output parameters of the script template. This allows you to reuse SQL code. Input parameters and output parameters are configured in the @@{Variable name} format.

Note

A script template can contain multiple input parameters and output parameters. You can configure the parameters based on your business requirements.

过程体

Step 1: Create a script template

In the Snippets pane, you can use one of the methods shown in the following figures to create and name a script template.创建组件

Note
  • The script templates that are created by using member accounts in the current workspace are available on the Workspace-Specific tab.

  • The script templates that are created within tenants are available on the Public tab.

Step 2: Configure the script template

  1. Configure an SQL code process.

    An SQL code process defines the code that implements the features of a script template. You can define an abstract SQL code process to introduce input parameters and output parameters in the @@{Parameter name} format to a script template. This way, the script template can control and process an input table based on input parameters to generate an output table with business value. When you use the script template, you need to only configure specific input and output parameters. Then, executable SQL code can be generated.

  2. Configure input parameters.

    In this step, you can configure input parameters. Input parameters are of the Table or String type.

    • Table: Use this type if output results are of a single type.

      The following table describes the parameter definition for table-type input parameters.

      Item

      Description

      Example

      Parameter Definition

      The parameter definition is a text definition of the table schema, including the field names, field types, and field descriptions. When you use the script template later, you must specify an input table that contains the same number of fields and compatible field types that are defined for the table-type parameter. If you do not specify an input table that contains the same number of fields and compatible field types that are defined for the table-type parameter, an error is returned when the script template is run.

      Note

      The parameter definition is only for reference and does not trigger an immediate force check.

      We recommend that you define parameters in the following format:

      Name of Field 1 Type of Field 1 Description of Field 1 
      Name of Field 2 Type of Field 2 Description of Field 2 
      …
      Name of Field n Type of Field n Description of Field n

      Example:

      area_id string 'Region ID' 
      city_id string 'City ID' 
      order_amt double 'Order amount'
    • String: Use this type if the values of input parameters are controlled based on variables.

      The following table describes the configurations for string-type input parameters.

      Item

      Description

      Default Value

      You can specify a default value for a string-type parameter. The default value is used by default when the script template is used.

      Sample scenario

      • Scenario 1: To export only the sales amount of the top N cities in each region in an output table of an SQL code process, you can use a string-type parameter to specify the value of N.

      • Scenario 2: To export the total sales amount of a province in an output table of an SQL code process, you can use a string-type parameter to specify the name of the province and obtain the sales data of the specified province.

  3. Configure output parameters.

    In this step, you can configure output parameters to define a table to be generated in an SQL code process. To facilitate the use of a script template, you can specify the schema of an output table for reference.

    The following table describes the parameter definition for output parameters.

    Item

    Description

    Example

    Parameter Definition

    The parameter definition is a text definition of the table schema, including the field names, field types, and field descriptions. When you use the script template later, you must specify an output table that contains the same number of fields and compatible field types that are defined for the table-type parameter. If you do not specify an output table that contains the same number of fields and compatible field types that are defined for the table-type parameter, an error is returned.

    Note

    The parameter definition is only for reference and does not trigger an immediate force check.

    We recommend that you define parameters in the following format:

    Name of Field 1 Type of Field 1 Description of Field 1 
    Name of Field 2 Type of Field 2 Description of Field 2 
    …
    Name of Field n Type of Field n Description of Field n

    In addition, you can add fields that indicate aggregated output results to the parameter definition based on your business requirements. For example, you can add a ranking field or a revenue field.

    Example:

    area_id string 'Region ID' 
    city_id string 'City ID' 
    order_amt double 'Order amount' 
    rank bigint 'Ranking'

Step 3: Save and commit the script template

In the top toolbar on the configuration tab of the script template, click the 保存 icon to save the script template. Then, click the 提交 icon to commit the script template. After the script template is created, you can reference the script template in an SQL Snippet node. This helps generate a table required for your business. For more information, see the Reference a script template section in this topic.

Reference a script template

Prerequisites

Reference a script template

On the configuration tab of the SQL Snippet node, follow the steps that are shown in the following figure to reference a script template.image.png

  1. Select the script template that you want to reference.

    If no script templates are available, create a script template. For more information, see the Define a script template section in this topic.

    • If the script template that you selected has the latest version, determine whether to reference the latest version of the script template based on your business requirements. To reference the latest version of the script template, click Update Code.

    • Click Edit Snippet to view the details of the script template.

  2. Configure parameters in the script template based on your business requirements.

What to do next

After you complete the development of a task by using the created node, you can perform the following operations:

  • Configure scheduling properties: You can configure properties for periodic scheduling of the node. If you want the system to periodically schedule and run the task on the node, you must configure items for the node, such as rerun settings and scheduling dependencies. For more information, see Overview.

  • Debug the node: You can debug and test the code of the node to check whether the code logic meets your expectations. For more information, see Debugging procedure.

  • Deploy the node: After you complete all development operations, you can deploy the node. After the node is deployed, the system periodically schedules the node based on the scheduling properties of the node. For more information, see Deploy nodes.

Manage script templates

Share a script template and view the reference records of the script template

You can share a script template or view the reference records of a script template based on your business requirements.分享及查看引用记录

  • Share a script template: After a script template is deployed, the script template becomes a workspace-level script template. Only members in the current workspace can use the script template. You can click the Publish Snippet icon in the top toolbar on the configuration tab of a general-purpose script template that you developed to make the script template public within the current tenant. This way, all users within the current tenant can use the script template. The Publish Snippet icon is marked as 1 in the preceding figure.

  • View the reference records of a script template: You can view the nodes that reference the current script template on the Snippet Nodes tab. This way, you can estimate the impacts of a script template change that you want to make. The Snippet Nodes tab is marked as 2 in the preceding figure.

Upgrade a script template

Operator: Developer

You can edit the code of a script template that you developed and modify the parameter settings of the script template based on your business requirements. After the modifications are saved and the script template is recommitted, the script template is upgraded to a new version. You can view the details of each version on the Versions tab in the right-side navigation pane on the configuration tab of the script template. The Versions tab is marked as 3 in the following figure.组件升级

Impacts imposed by version changes on the usage of a script template

If a script template is upgraded and you want to reference the script template in your SQL Snippet node, you can determine whether to use the latest version of the script template. If you want to use the latest version of the script template, check whether the parameter settings in the latest version of the script template take effect for your SQL Snippet node and modify configurations based on the template description. Then, commit and deploy the SQL Snippet node. You can commit and deploy the SQL Snippet node in the same manner in which you commit and deploy other SQL nodes.image.png

Sample scenario

Developer C creates a script template of the V1.0 version. User A uses the V1.0 script template. Then, Developer C upgrades the script template from V1.0 to V2.0. User A finds that the script template has the latest version V2.0 during usage of the script template. User A can open the script template and view the details about the latest version. If the latest version of the script template can achieve a better business effect, User A can modify configurations to use the latest version.

More references

Features on the configuration tab of a script template

界面

Feature

Description

Save

Saves the settings of the current script template.

Steal Lock

Allows you to steal the lock of the current script template and then edit the script template if you are not the owner of the script template.

Submit

Commits the current script template to the development environment.

Publish Snippet

Makes a general-purpose script template public to the current tenant. This way, all users within the tenant can view and use the script template.

Parse I/O Parameters

Parses input and output parameters from the code.

Run

Runs the current script template in the development environment.

Stop

Stops the current script template that is running.

Format Code

Formats the code by keyword.

Parameters

Allows you to view the basic information and configure input and output parameters for the current script template.

Versions

Allows you to view the deployed versions of the current script template.

Snippet Nodes

Displays the reference records of the current script template.

Best practices

Prerequisites

Step 1: Define a script template

Create a script template named get_top_n. For more information, see the Define a script template section in this topic. The following descriptions show the configurations of the script template.image.png

  • Parameter settings

    Category

    Parameter

    Type

    Description

    Parameter definition

    Input parameter

    myinputtable

    table

    Specifies a sales data table.

    area_id string
    city_id string
    order_amt double
    rank bigint

    topn

    string

    Specifies the number of top cities.

    N/A

    Output parameter

    myoutput

    table

    Specifies an output table that displays the rankings of the specified top cities in each region.

    area_id string
    city_id string
    order_amt double
    rank bigint
  • Definition of an SQL code process

    INSERT OVERWRITE TABLE  @@{myoutput}  PARTITION (pt='${bizdate}')
     SELECT r3.area_id,
     r3.city_id,
     r3.order_amt,
     r3.rank
    from (
    SELECT
       area_id,
       city_id,
       rank,
       order_amt_1505468133993_sum as order_amt ,
       order_number_1505468133991_sum,
       profit_amt_1505468134000_sum
    FROM
       (SELECT
       area_id,
       city_id,
       ROW_NUMBER() OVER (PARTITION BY r1.area_id ORDER BY r1.order_amt_1505468133993_sum DESC) AS rank,
       order_amt_1505468133993_sum,
       order_number_1505468133991_sum,
       profit_amt_1505468134000_sum
    FROM
       (SELECT
       area AS area_id,
       city AS city_id,
       SUM(order_amt) AS order_amt_1505468133993_sum,
       SUM(order_number) AS order_number_1505468133991_sum,
       SUM(profit_amt) AS profit_amt_1505468134000_sum
    FROM
      @@{myinputtable}
    WHERE
       SUBSTR(pt, 1, 8) IN ( '${bizdate}' )
    GROUP BY
       area,
       city )
       r1 ) r2
    WHERE
       r2.rank >= 1 AND r2.rank <= @@{topn}
    ORDER BY
       area_id,
       rank limit 10000) r3;

Step 2: Use the script template

Create an SQL Snippet node named xc_Referenced script template_get_top_n, reference the get_top_n script template that you created in Step 1, and then configure parameters. The following figure shows the parameter configuration procedure. For more information about how to create an SQL Snippet node and reference a script template, see the Reference a script template section in this topic. image.pngParameter settings:

  • Input parameter myinputtable: specifies an input table named company_sales_record as the data source. Schema of the input table:

    company_sales_record

    CREATE TABLE IF NOT EXISTS company_sales_record
    (
        order_id         STRING COMMENT 'Order ID (PK)',
        report_date      STRING COMMENT 'Order generation date',
        customer_name    STRING COMMENT 'Customer name',
        order_level      STRING COMMENT 'Order level',
        order_number     DOUBLE COMMENT 'Number of orders',
        order_amt        DOUBLE COMMENT 'Order amount',
        back_point       DOUBLE COMMENT 'Discount',
        shipping_type    STRING COMMENT 'Transportation method',
        profit_amt       DOUBLE COMMENT 'Amount of profit',
        price            DOUBLE COMMENT 'Unit price',
        shipping_cost    DOUBLE COMMENT 'Cost of transportation',
        area             STRING COMMENT 'Region',
        province         STRING COMMENT 'Province',
        city             STRING COMMENT 'City',
        product_type     STRING COMMENT 'Product type',
        product_sub_type STRING COMMENT 'Product subtype',
        product_name     STRING COMMENT 'Product name',
        product_box      STRING COMMENT 'Product packaging',
        shipping_date    STRING COMMENT 'Date of transportation'
    ) 
    COMMENT 'Detailed sales data'
    PARTITIONED BY
    (
        pt               STRING
    )
    LIFECYCLE 365;
  • Input parameter topn: specifies the number of top cities in each region based on the rankings of total sales amount of the cities in ascending order. In this example, this parameter is set to 10.

  • Output parameter myoutput: specifies an output table named company_sales_top_n. The output table contains data processing results. Schema of the output table:

    company_sales_top_n

    CREATE TABLE IF NOT EXISTS company_sales_top_n
    ( 
    area STRING COMMENT 'Region', 
    city STRING COMMENT 'City', 
    sales_amount DOUBLE COMMENT 'Sales amount', 
    rank BIGINT COMMENT 'Ranking'
    )
    COMMENT 'Company sales rankings'
    PARTITIONED BY (pt STRING COMMENT '')
    LIFECYCLE 365;