DataWorks: Develop and deploy an extension based on a self-managed service

Last Updated: Oct 24, 2024

You can configure custom logic for an extension to manage operations performed in DataWorks, such as blocking specific operations. Extensions return processing results on specific events to implement process control in DataWorks. This topic describes how to develop and deploy an extension based on a self-managed service.

Background information

Prerequisites

The event message subscription feature is enabled. The development and deployment of an extension based on a self-managed service depends on the event message distribution capability of EventBridge. Make sure that you specify an event bus in EventBridge to receive event messages from DataWorks, and specify an on-premises or cloud-based service to receive the event messages from the event bus.

Limits

  • Only users of DataWorks Enterprise Edition can use the Extensions module.

  • The Extensions module is available in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Zhangjiakou), China (Shenzhen), China (Chengdu), US (Silicon Valley), US (Virginia), Germany (Frankfurt), Japan (Tokyo), China (Hong Kong), and Singapore.

Precautions

  • Only the Open Platform administrator, tenant administrator, Alibaba Cloud accounts, and RAM users to which the AliyunDataWorksFullAccess policy is attached have read and write permissions on the developer backend. For more information about permission management, see Manage permissions on global-level services and Manage permissions on the DataWorks services and the entities in the DataWorks console by using RAM policies.

  • If DataWorks Enterprise Edition expires, extensions become invalid and cannot be triggered to check extension point events. If an extension is triggered to check an event and does not complete the check before DataWorks Enterprise Edition expires, the check is terminated and the result Check Passed is returned.

  • If a combined node, such as a Platform for AI node, do-while node, or for-each node, triggers an extension check, you must wait until all inner nodes of the combined node pass the check before you can perform subsequent operations.

  • You can associate multiple extensions with the same extension point event. This way, the associated extensions are triggered when the event occurs.

Process

The following figure shows how an extension that is developed and deployed based on a self-managed service consumes event messages by using EventBridge.

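[Figure: event message flow from DataWorks through EventBridge to an extension deployed on a self-managed service]
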
Note

After an extension is triggered by an extension point event, the event process enters the Checking state. After the extension sends the processing result to DataWorks by calling an API operation, DataWorks determines whether to block the process.

User

Before you deploy an extension based on a self-managed service in DataWorks, you must develop an extension and deploy the extension on an on-premises or cloud-based service. You can refer to Appendix: DataWorks Open Platform sample code library to initialize the project code and obtain sample code for Open Platform from GitHub. You must develop and deploy an extension based on the type of the service that you want to use to receive event messages from the event bus that you configure.

Step 1: Configure dependencies for an extension

When you develop an extension, you must add the following dependency to the pom.xml file. EventBridge allows you to use various types of event targets to process and consume events. You can configure other dependencies based on the event targets that you specify.

DataWorks dependency library

<dependency>
  <groupId>com.aliyun</groupId>
  <artifactId>dataworks_public20200518</artifactId>
  <version>5.6.0</version>
</dependency>

Dependency packaging

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.1</version>
      <executions>
        <execution>
          <!-- Build the shaded (all-in-one) JAR during the package phase. -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <filters>
              <filter>
                <artifact>*:*</artifact>
                <!-- Drop signature files from dependencies. Stale signatures in a shaded JAR cause a SecurityException at startup. -->
                <excludes>
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                  <exclude>META-INF/*.RSA</exclude>
                </excludes>
              </filter>
            </filters>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

Step 2: Develop an extension

The event messages that are routed by an event bus of EventBridge are pushed to an on-premises or cloud-based service. The service receives DataWorks event messages that are pushed by the event bus and calls a specific DataWorks API operation to send the processing results to DataWorks.

  1. Develop the code of an extension.

    Parse event messages

    For information about the formats of event messages in DataWorks, see the Appendix: Message formats section of the "Development reference: Event messages and formats of event messages" topic. In an event message, the data field specifies the message content. During actual development, you can use the data.eventCode field to confirm the message type and the id field to obtain the message details.

    Note

    The OpenEvent module of DataWorks Open Platform distributes DataWorks event messages by using EventBridge. Before you develop an extension, you must enable the event message subscription feature in DataWorks.

    Write the processing logic

    Write the processing logic for event messages that are pushed by the event bus based on your business requirements. The following methods can simplify development and make the extension easier to apply:

    • Use extension parameters. For example, you can configure the extension.project.disabled parameter to prevent the extension from taking effect for the specified workspace. For more information, see Advanced feature: Configure extension parameters.

    • For extension point events in DataStudio, call the GetIDEEventDetail operation and set the MessageId parameter to obtain a snapshot of the extension point event.

    Note

    The MessageId parameter corresponds to the id field in an event message. For more information, see the Appendix: Message formats section of the "Development reference: Event messages and formats of event messages" topic.

    Return the processing result of the extension to DataWorks

    The service where the extension is deployed can call an API operation to return the processing results of extension point events that are specified in the extension to DataWorks. The API operation varies based on the DataWorks service to which the extension point events belong.

    • Extension point events in DataStudio: Call the UpdateIDEEventResult API operation to return the processing results.

    • Extension point events in Operation Center: Call the UpdateWorkbenchEventResult API operation to return the processing results.

    • Extension point events in other DataWorks services: Call the CallbackExtension API operation to return the processing results.

    When you call the API operation, specify the following request parameters:

    • CheckResult: The processing result of event messages. Valid values:

      • OK: An extension point event passes the check of an extension.

      • FAIL: An extension point event fails the check of an extension. You must check and handle the reported error at the earliest opportunity to ensure that your service runs as expected.

      • WARN: An extension point event passes the check of an extension, but a warning is reported for the event.

    • ExtensionCode: The code of an extension. After you register an extension, you can view the code of the extension in the List of Extensions section of the Extensions page.

    • MessageId: The ID of the message. This parameter corresponds to the id field in an event message. For more information, see the Appendix: Message formats section of the "Development reference: Event messages and formats of event messages" topic.

  2. Package the extension as an executable .jar file.
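
The processing flow in step 1 (read the event, apply a rule, build the result to return) can be sketched in plain Java. This is a minimal sketch under stated assumptions, not the DataWorks SDK API: the class ExtensionCheck, the Service labels, and the blocklist rule are hypothetical and exist only for illustration. Only the three operation names and the CheckResult, ExtensionCode, and MessageId parameters come from this topic. The actual SDK call that sends the result is left out.

```java
import java.util.Set;

// Hypothetical helper; none of these names belong to the DataWorks SDK.
final class ExtensionCheck {

    /** CheckResult values listed in this topic. */
    enum CheckResult { OK, WARN, FAIL }

    /** Hypothetical labels for the DataWorks service that an event belongs to. */
    enum Service { DATASTUDIO, OPERATION_CENTER, OTHER }

    /** Holds the three values that this topic says are returned to DataWorks. */
    static final class Result {
        final String messageId;      // the id field of the event message
        final String extensionCode;  // shown in the List of Extensions section
        final CheckResult checkResult;

        Result(String messageId, String extensionCode, CheckResult checkResult) {
            this.messageId = messageId;
            this.extensionCode = extensionCode;
            this.checkResult = checkResult;
        }
    }

    /** Returns the name of the API operation to call for the given service. */
    static String operationFor(Service service) {
        switch (service) {
            case DATASTUDIO:
                return "UpdateIDEEventResult";
            case OPERATION_CENTER:
                return "UpdateWorkbenchEventResult";
            default:
                return "CallbackExtension";
        }
    }

    /** Hypothetical rule: fail events whose code is on a blocklist, pass all others. */
    static Result check(String messageId, String eventCode,
                        String extensionCode, Set<String> blockedEventCodes) {
        CheckResult result = blockedEventCodes.contains(eventCode)
                ? CheckResult.FAIL
                : CheckResult.OK;
        return new Result(messageId, extensionCode, result);
    }
}
```

In a real extension, the fields of Result would be copied into the request of the operation that operationFor names, and the blocklist would be replaced by your own business rule.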

Step 3: Deploy the extension

After you develop and debug the code of the extension, you must deploy the code package as an application service on an Alibaba Cloud Elastic Compute Service (ECS) instance or another cloud service.
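
As an illustration of what such an application service might look like, the following sketch uses the HTTP server that ships with the JDK. It assumes that the event bus delivers each event message as an HTTP POST request with the message in the request body. That delivery method, the /events path, and the class name EventReceiver are assumptions made for this example and are not taken from this topic. A production service would also need authentication, error handling, and the call that returns the check result to DataWorks.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical receiver; the /events path is an arbitrary choice for this sketch.
final class EventReceiver {

    /**
     * Starts an HTTP server on the given port (0 picks a free port) and hands
     * the body of every POST request on /events to the onEvent callback.
     */
    static HttpServer start(int port, Consumer<String> onEvent) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/events", exchange -> {
            try {
                if (!"POST".equals(exchange.getRequestMethod())) {
                    // Reject other methods; -1 means the response has no body.
                    exchange.sendResponseHeaders(405, -1);
                    return;
                }
                byte[] body = exchange.getRequestBody().readAllBytes();
                onEvent.accept(new String(body, StandardCharsets.UTF_8));
                // Acknowledge receipt without a response body.
                exchange.sendResponseHeaders(200, -1);
            } finally {
                exchange.close();
            }
        });
        server.start();
        return server;
    }
}
```

For example, EventReceiver.start(8080, body -> System.out.println(body)) would print each received message. The callback is where the extension's parsing and processing logic would be plugged in.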

DataWorks

After you develop the code, you can register and manage the extension in DataWorks.

Step 1: Register an extension

Before you use an extension, you must register an extension in DataWorks and obtain the extension code for subsequent code development. To register an extension, perform the following steps:

  1. Go to the Open Platform page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose More > Open Platform. The Developer Backend tab appears.

  2. Register the extension.

    1. In the left-side navigation pane of the page that appears, click Extensions.

    2. Click Register Extension in the List of Extensions section. In the Select Deployment Method step of the Register Extension wizard, set the Select a deployment method for your extension parameter to Deploy Based on Self-managed Service and configure the parameters in the Register Extension step.

      Parameters

      • Extension Name: The name of the custom extension, which is used to identify the extension.

      • Processed Extension Points: The type of the extension point event that you want the extension to check. For more information about supported extension point events, see Supported extension point events. You can configure this parameter based on your business requirements.

        Note

        • After you configure this parameter, the system automatically configures the Event and Applicable Service parameters.

        • Extension point events are classified into tenant-level and workspace-level events. You can select only one type of event when you register an extension. For information about the types of extension point events that are supported by DataWorks, see Development reference: Event messages and formats of event messages.

        • Extensions that are deployed based on Function Compute can process only a pre-event for data download.

      • Owner: The extension owner. Users of the extension can contact the extension owner at the earliest opportunity when they encounter problems.

      • Workspace for Testing: The workspace that is used to test the extension. To check whether the extension is effective, you do not need to publish the extension because the test workspace supports an end-to-end test that allows you to check whether the extension works as expected.

        In the test workspace, developers can trigger events to check whether DataWorks sends related messages to the extension by using EventBridge and whether the extension checks the events and sends the check results to DataWorks.

        Note

        If the extension point event that you want to process is a tenant-level extension point event, you do not need to configure the Workspace for Testing parameter.

      • Extension Details Address: The URL of the extension details page.

        After you develop and deploy the extension, you can develop a web page to display how the extension works. Set this parameter to the page URL so that users can visit this web page to better understand and use the extension. For example, users can visit the web page to view the reason why a process is checked and blocked by the extension.

      • Extension Documentation Address: The URL of the extension documentation page.

        After you develop and deploy the extension, you can develop a help documentation page. Set this parameter to the page URL so that users can know the business logic and properties of the extension.

      • Parameters for Extension: The extension parameters that you want to use to improve the extension development and application efficiency.

        You can enter both the built-in parameters for typical scenarios and custom parameters for future reference.

        You can enter multiple parameters in the format of key=value. Make sure that each parameter occupies a separate line.

        Note

        For example, you can use the built-in parameter extension.project.disabled of an extension to prevent the extension from taking effect for the specified workspace. For more information about how to use these parameters, see Advanced feature: Configure extension parameters.

      • Options for Extension: The configuration items for the extension. The configuration items are provided for users to implement custom process management in different workspaces based on their business requirements. The extension developer must define each configuration item in a JSON string in the Register Extension dialog box.

        For example, the extension developer can allow users to manage the length of an SQL statement based on this parameter. For more information about the JSON format, see Advanced feature: Define options for an extension.

  3. Click OK.

    Note

    After you register the extension, you can view the extension in the List of Extensions section of the Extensions page.
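
The key=value format described for the Parameters for Extension field (one parameter per line) can be illustrated with a small parser. This is a sketch for checking parameter text before you paste it into the console. The class name ExtensionParameters is hypothetical. This topic does not describe how an extension reads these parameters at run time, so the sketch only covers the text format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that reads "key=value" text, one pair per line.
final class ExtensionParameters {

    /**
     * Parses the text into an ordered map. Blank lines, lines without '=',
     * and lines with an empty key are skipped. Keys and values are trimmed.
     */
    static Map<String, String> parse(String text) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // not a key=value pair
            }
            String key = line.substring(0, eq).trim();
            if (key.isEmpty()) {
                continue; // a pair needs a key
            }
            params.put(key, line.substring(eq + 1).trim());
        }
        return params;
    }
}
```

Only the first equal sign on a line separates the key from the value, so a value may itself contain equal signs.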

Step 2: Publish the extension

After you develop, deploy, and register the extension in DataWorks, you must test the extension, submit it for review, and then publish it. After the extension is published, administrators, and not only the extension owner, can enable the extension in Management Center. For more information, see Use an extension.

Appendix: Formats of event messages sent to EventBridge

The following sample code provides an example of a complete event message. The data parameter specifies the content of the event message. EventBridge provides other information based on the content.

{
    "datacontenttype": "application/json;charset=utf-8", // The content type of the data field. Only application/json is supported.
    "aliyunaccountid": "1111", // The ID of the Alibaba Cloud account.
    "aliyunpublishtime": "2024-07-10T07:25:34.915Z", // The time when EventBridge receives the event message.
    "data": {
        "tenantId": 28378****10656, // The ID of the tenant. Each Alibaba Cloud account in DataWorks corresponds to a tenant. Each tenant has its own tenant ID. To view the tenant ID, click the username in the upper-right corner of the DataStudio page and then click User Info in the Menu section.
        "eventCode": "xxxx" // The event code, which you can use to confirm the message type.
    },
    "aliyunoriginalaccountid": "11111",
    "specversion": "1.0",
    "aliyuneventbusname": "default", // The name of the event bus that is used to receive DataWorks event messages.
    "id": "45ef4dewdwe1-7c35-447a-bd93-fab****", // The event ID. The ID is the unique identifier of an event.
    "source": "acs.dataworks", // The event source, which indicates the service that produces events. In this example, event messages are pushed by DataWorks.
    "time": "2024-07-10T15:25:34.897Z", // The time when the event was generated.
    "aliyunregionid": "cn-shanghai", // The region where the event is received.
    "type": "dataworks:ResourcesUpload:UploadDataToTable" // The event type. You can use the event type to filter the messages pushed by DataWorks in the EventBridge console. The type of each event is different.
}
Note

The content of the event message varies based on the event type. For more information about event messages, see Development reference: Event messages and formats of event messages.
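
The fields that an extension typically needs from a message like the one above (id, type, and data.eventCode) can be read as shown in the following sketch. It assumes that a JSON library of your choice has already deserialized the message body into nested java.util.Map objects. The JDK has no built-in JSON parser, so that step is not shown. The class name EventMessage is hypothetical.

```java
import java.util.Map;

// Hypothetical value class holding the fields an extension usually needs.
final class EventMessage {
    final String id;        // the event ID, used as MessageId in later API calls
    final String type;      // the event type, for example dataworks:ResourcesUpload:UploadDataToTable
    final String eventCode; // data.eventCode, used to confirm the message type

    private EventMessage(String id, String type, String eventCode) {
        this.id = id;
        this.type = type;
        this.eventCode = eventCode;
    }

    /** Extracts id, type, and data.eventCode from an already deserialized message. */
    static EventMessage from(Map<String, Object> message) {
        Object data = message.get("data");
        if (!(data instanceof Map)) {
            throw new IllegalArgumentException("event message has no data object");
        }
        Object eventCode = ((Map<?, ?>) data).get("eventCode");
        return new EventMessage(
                asString(message.get("id")),
                asString(message.get("type")),
                asString(eventCode));
    }

    /** Converts a possibly missing field to a string; missing fields stay null. */
    private static String asString(Object value) {
        return value == null ? null : value.toString();
    }
}
```

Fields other than data are treated as optional here and come back as null when absent, which keeps the sketch tolerant of differences between event types.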

Sample code used to develop a custom extension

After you understand the precautions for the extension development procedure, you can develop extension code based on your business requirements. The following topics provide examples on extension registration, development, and application in typical scenarios:
