DataWorks:Develop and deploy an extension based on Function Compute

Last Updated:Oct 24, 2024

You can configure custom logic in an extension to manage operations performed in DataWorks, for example, to block specific operations. An extension returns a processing result for each event that it handles, and DataWorks uses the result to control the process. This topic describes how to develop and deploy an extension based on Function Compute.

Background information

Function Compute is an event-driven, fully managed computing service. After you deploy an extension in Function Compute, DataWorks pushes the messages of extension point events to the extension as ExtensionRequest objects. In the extension code, you implement the handleRequest method of the PojoRequestHandler interface, which receives the ExtensionRequest object and the context. You then write the processing logic and return the result in an ExtensionResponse object. DataWorks reads the processing result from the ExtensionResponse object and determines whether to block the process.

Limits

  • Only users of DataWorks Enterprise Edition can use the Extensions module.

  • The Extensions module is available in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Zhangjiakou), China (Shenzhen), China (Chengdu), US (Silicon Valley), US (Virginia), Germany (Frankfurt), Japan (Tokyo), China (Hong Kong), and Singapore.

Precautions

  • Only the Open Platform administrator, tenant administrator, Alibaba Cloud accounts, and RAM users to which the AliyunDataWorksFullAccess policy is attached have read and write permissions on the developer backend. For more information about permission management, see Manage permissions on global-level services and Manage permissions on the DataWorks services and the entities in the DataWorks console by using RAM policies.

  • If DataWorks Enterprise Edition expires, extensions become invalid and cannot be triggered to check extension point events. If an extension is triggered to check an event and does not complete the check before DataWorks Enterprise Edition expires, the check is terminated and the result Check Passed is returned.

  • If a combined node, such as a Platform for AI node, do-while node, or for-each node, triggers an extension check, you must wait until all inner nodes of the combined node pass the check before you can perform subsequent operations.

  • You can associate multiple extensions with the same extension point event. This way, the associated extensions are triggered when the event occurs.

  • Extensions that are deployed based on Function Compute can process only the following events: pre-events for data download, pre-events for asset publishing and unpublishing, and pre-events for data upload.

Process

The following figure shows how an extension that is developed and deployed based on Function Compute processes extension point events.

[Figure: How an extension deployed based on Function Compute processes extension point events]
Note

After an extension is triggered by an extension point event, the event process enters the Checking state. After the extension sends the processing result to DataWorks by calling an API operation, DataWorks determines whether to block the process.

User

Before you deploy an extension based on Function Compute, you must develop the extension on your on-premises machine. You can use the fc-java-core library to run a handler and download fc_dataworks_demo01-1.0-SNAPSHOT.jar to obtain the sample code. For more information, see Event handlers.

Step 1: Configure dependencies for an extension

When you develop an extension, you must add the following dependencies to the pom.xml file.

DataWorks dependency library

<dependency>
 <groupId>com.aliyun</groupId>
 <artifactId>dataworks_public20200518</artifactId>
 <version>5.6.0</version>
</dependency>

Function Compute dependency library

<!-- https://mvnrepository.com/artifact/com.aliyun.fc.runtime/fc-java-core -->
<dependency>
    <groupId>com.aliyun.fc.runtime</groupId>
    <artifactId>fc-java-core</artifactId>
    <version>1.4.1</version>
</dependency>

<!-- https://mvnrepository.com/artifact/com.aliyun.fc.runtime/fc-java-event -->
<dependency>
    <groupId>com.aliyun.fc.runtime</groupId>
    <artifactId>fc-java-event</artifactId>
    <version>1.2.0</version>
</dependency>

Dependency packaging

<build>
        <plugins>
              <plugin>
                  <groupId>org.apache.maven.plugins</groupId>
                  <artifactId>maven-shade-plugin</artifactId>
                  <version>3.2.1</version>
                  <executions>
                    <execution>
                      <phase>package</phase>
                      <goals>
                        <goal>shade</goal>
                      </goals>
                      <configuration>
                        <filters>
                          <filter>
                            <artifact>*:*</artifact>
                            <excludes>
                              <exclude>META-INF/*.SF</exclude>
                              <exclude>META-INF/*.DSA</exclude>
                              <exclude>META-INF/*.RSA</exclude>
                            </excludes>
                          </filter>
                        </filters>
                      </configuration>
                    </execution>
                  </executions>
              </plugin>
        </plugins>
</build>

You can use Apache Maven Shade or Apache Maven Assembly to package dependency libraries. In the preceding sample code, Apache Maven Shade is used.

Step 2: Develop an extension

To develop an extension based on Function Compute, implement the handleRequest method of the PojoRequestHandler interface. The method receives an ExtensionRequest object and the context. You then write the processing logic for the extension and return the processing result of the extension point event to DataWorks in an ExtensionResponse object.

  1. Develop the code of an extension.

    Parse event messages

    For information about the formats of event messages sent by DataWorks, see Appendix: Message formats. In an event message, the messageBody field specifies the message content. During actual development, you can use the messageBody.eventCode field to confirm the message type and the messageId field to obtain the message details.
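The field lookup described above can be sketched without any JSON library. This is a minimal sketch that assumes the runtime deserializes the messageBody field (declared as Object) into a java.util.Map; the sample App class in this topic instead converts the body through fastjson. MessageBodyReader and eventCodeOf are hypothetical names, not part of the DataWorks or Function Compute SDKs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of any SDK.
final class MessageBodyReader {

    // Returns messageBody.eventCode, or null if the body is not a map
    // or the field is absent.
    static String eventCodeOf(Object messageBody) {
        if (!(messageBody instanceof Map)) {
            return null;
        }
        Object code = ((Map<?, ?>) messageBody).get("eventCode");
        return code == null ? null : code.toString();
    }

    public static void main(String[] args) {
        // Placeholder values only; real messages carry event-specific fields.
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("eventCode", "xxxx");
        System.out.println(eventCodeOf(body));
    }
}
```

Returning null instead of throwing lets the caller decide whether a missing eventCode should fail the check or be ignored.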

    Write the processing logic

    Write the processing logic for the event messages that DataWorks pushes, based on your business requirements. The following methods can make an extension easier to develop and apply:

    • Use extension parameters. For example, you can configure the extension.project.disabled parameter to prevent the extension from taking effect for the specified workspace. For more information, see Advanced feature: Configure extension parameters.

    • Configure the MessageId parameter in the GetIDEEventDetail operation for extension point events in DataStudio to obtain snapshots of extension point events.

    Note

    The MessageId parameter corresponds to the id field in an event message. For more information, see the Appendix: Message formats section of the "Development reference: Event messages and formats of event messages" topic.

    Return the processing result of the extension to DataWorks

    When you build the ExtensionResponse object, you must set the CheckResult parameter. DataWorks reads the processing result from the CheckResult parameter to determine whether to fail the process. Valid values:

    • OK: An extension point event passes the check of an extension.

    • FAIL: An extension point event fails the check of an extension. You must check and handle the reported error at the earliest opportunity to ensure that your service runs as expected.

    • WARN: An extension point event passes the check of an extension, but a warning is reported for the event.
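Because the check result is passed as a plain string, a typo such as "Fail" would only surface at run time. One way to guard against that is a small enum, sketched below. CheckResult and passes are hypothetical names introduced for illustration; the ExtensionResponse class in this topic stores the result as a String.

```java
// Hypothetical enum for illustration; ExtensionResponse.setCheckResult takes a String.
enum CheckResult {
    OK,    // the event passes the check
    FAIL,  // the event fails the check
    WARN;  // the event passes the check, but a warning is reported

    // True when the event passes the check (OK or WARN).
    boolean passes() {
        return this != FAIL;
    }
}
```

With this enum, a call such as extensionResponse.setCheckResult(CheckResult.FAIL.name()) passes the string FAIL to the response object.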

    You can download fc_dataworks_demo01-1.0-SNAPSHOT.jar and upload it to Function Compute for verification and testing. The following sections describe the code in detail.

    Code details

    App

    The following sample code implements the handleRequest method of the PojoRequestHandler interface. The method receives an ExtensionRequest object and the context, applies the processing logic, and returns the processing result of the extension point event to DataWorks in an ExtensionResponse object.

    In this example, uploading data to ADS tables is prohibited.

    package com.aliyun.example;
    
    import com.alibaba.fastjson.JSON;
    import com.alibaba.fastjson.JSONObject;
    import com.aliyun.dataworks.ExtensionRequest;
    import com.aliyun.dataworks.ExtensionResponse;
    import com.aliyun.fc.runtime.Context;
    import com.aliyun.fc.runtime.PojoRequestHandler;
    
    
    public class App implements PojoRequestHandler<ExtensionRequest, ExtensionResponse> {
    
        @Override
        public ExtensionResponse handleRequest(ExtensionRequest extensionRequest, Context context) {
            // Print the request content for debugging.
            System.out.println(JSON.toJSONString(extensionRequest));
            // Create a response object.
            ExtensionResponse extensionResponse = new ExtensionResponse();
            // Check whether the value of the eventType parameter is upload-data-to-table.
            if ("upload-data-to-table".equals(extensionRequest.getEventType())) {
                try {
                    // Serialize the message body to a JSON string and parse the string into a JSONObject.
                    String messageBodyStr = JSON.toJSONString(extensionRequest.getMessageBody());
                    JSONObject messageBody = JSON.parseObject(messageBodyStr);
                    String tableGuid = messageBody.getString("tableGuid");
                    // Check whether the value of the tableGuid parameter contains ads.
                    if (tableGuid != null && tableGuid.contains("ads")) {
                        extensionResponse.setCheckResult("FAIL");
                    } else {
                        extensionResponse.setCheckResult("OK");
                    }
                } catch (Exception e) {
                    extensionResponse.setCheckResult("FAIL");
                    extensionResponse.setErrorMessage("Error processing request: " + e.getMessage());
                    return extensionResponse;
                }
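            } else {
                // In this sample, every other event type fails the check.
                extensionResponse.setCheckResult("FAIL");
            }
    
            // Specify an error message as the feedback information.
            // The sample sets this message for every result, including OK.
            extensionResponse.setErrorMessage("This is a test!");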
    
            // Return the response object.
            return extensionResponse;
        }
    }
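To exercise the check on a local machine before packaging, the decision logic can be separated from the Function Compute and fastjson types. The following is a minimal sketch under that assumption, not the sample package's actual structure. AdsUploadCheck and check are hypothetical names, the table GUIDs in the comments are invented placeholders, and a missing message body is treated as FAIL in this sketch.

```java
import java.util.Map;

// Hypothetical, dependency-free version of the decision logic in the App sample.
final class AdsUploadCheck {

    // Returns "FAIL" for uploads to tables whose GUID contains "ads",
    // "OK" for other uploads, and "FAIL" for any other event type.
    static String check(String eventType, Map<String, ?> messageBody) {
        if (!"upload-data-to-table".equals(eventType)) {
            return "FAIL"; // same as the sample: other event types fail
        }
        if (messageBody == null) {
            return "FAIL"; // assumption of this sketch: no body, no pass
        }
        Object tableGuid = messageBody.get("tableGuid");
        boolean isAds = tableGuid != null && tableGuid.toString().contains("ads");
        return isAds ? "FAIL" : "OK";
    }
}
```

Note that a substring match is loose: a table GUID that contains a word such as "uploads" would also be blocked. A stricter rule, such as a prefix check on the table name, may suit production use better.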
    
    

    ExtensionRequest

    Define the request structure of the extension to encapsulate the event message that is sent by DataWorks.

    package com.aliyun.dataworks;
    
    public class ExtensionRequest {
        private Object messageBody;
        private String messageId;
        private String extensionBizId;
        private String extensionBizName;
        private String eventType;
        private String eventCategoryType;
        private Boolean blockBusiness;
    
        public ExtensionRequest() {
        }
    
        public Object getMessageBody() {
            return this.messageBody;
        }
    
        public void setMessageBody(Object messageBody) {
            this.messageBody = messageBody;
        }
    
        public String getMessageId() {
            return this.messageId;
        }
    
        public void setMessageId(String messageId) {
            this.messageId = messageId;
        }
    
        public String getExtensionBizId() {
            return this.extensionBizId;
        }
    
        public void setExtensionBizId(String extensionBizId) {
            this.extensionBizId = extensionBizId;
        }
    
        public String getExtensionBizName() {
            return this.extensionBizName;
        }
    
        public void setExtensionBizName(String extensionBizName) {
            this.extensionBizName = extensionBizName;
        }
    
        public String getEventType() {
            return this.eventType;
        }
    
        public void setEventType(String eventType) {
            this.eventType = eventType;
        }
    
        public String getEventCategoryType() {
            return this.eventCategoryType;
        }
    
        public void setEventCategoryType(String eventCategoryType) {
            this.eventCategoryType = eventCategoryType;
        }
    
        public Boolean getBlockBusiness() {
            return this.blockBusiness;
        }
    
        public void setBlockBusiness(Boolean blockBusiness) {
            this.blockBusiness = blockBusiness;
        }
    }

    ExtensionResponse

    Define the response structure of the extension to encapsulate the processing results of the extension.

    package com.aliyun.dataworks;
    
    public class ExtensionResponse {
        private String checkResult;
        private String errorMessage;
    
        public ExtensionResponse() {
        }
    
        public String getCheckResult() {
            return this.checkResult;
        }
    
        public void setCheckResult(String checkResult) {
            this.checkResult = checkResult;
        }
    
        public String getErrorMessage() {
            return this.errorMessage;
        }
    
        public void setErrorMessage(String errorMessage) {
            this.errorMessage = errorMessage;
        }
    }
  2. Package the extension code into an executable .jar package or .zip file for subsequent use in Function Compute.

    Open a terminal in your code editor, switch to the root directory of the project, and then run the mvn clean package command to package the extension code.

    • If the output shows that compilation failed, modify the code based on the error message.

    • If the compilation is successful, the compiled JAR file is located in the target directory in the project folder. The file name is derived from the artifactId and version fields in pom.xml, for example, java-example-1.0-SNAPSHOT.jar.

    Note

    For macOS and Linux OSs, make sure that you have the read and execute permissions on the related code file before you package the file.

Function Compute

Step 1: Deploy the extension

  1. Log on to the Function Compute console 2.0. In the left-side navigation pane, click Tasks.

  2. Click Create Function to create a Function Compute service and the required function, and deploy the extension. DataWorks sends the event messages that the extension processes directly to the created service. For more information, see Create a service and Create a function. The following figure shows how to configure a function.

    [Figure: Function configuration]

    • Select a runtime environment based on your code. If you use the sample package fc_dataworks_demo01-1.0-SNAPSHOT.jar provided in Step 2: Develop an extension, set the Runtime parameter to Java 8. You must download the fc_dataworks_demo01-1.0-SNAPSHOT.jar package to your on-premises machine and upload it as the code package. You can configure the other parameters based on your business requirements.

    • For more information about the operations that you can perform on a function, see Manage functions.

  3. Configure the handler of the function.

    You can configure a handler for the function in the Function Compute console. For functions that use the Java language, you can configure a handler in the [Package name].[Class name]::[Method name] format. Example:

    • Package name: example

    • Class name: HelloFC

    • Method name: handleRequest

    The handler is example.HelloFC::handleRequest.

    Note

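    The default handler is example.App::handleRequest. You can modify the handler based on your business requirements. For the sample code in this topic, the App class is declared in the com.aliyun.example package, so the handler that follows the preceding format is com.aliyun.example.App::handleRequest. For more information, see Handler.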
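The handler string can also be derived from the class itself, which helps avoid typos when a package is renamed. The following is a minimal sketch; HandlerNames and handlerFor are hypothetical names and not part of any SDK.

```java
// Hypothetical helper that builds a handler string in the
// [Package name].[Class name]::[Method name] format.
final class HandlerNames {

    // Class.getName() returns the package-qualified name for a top-level class.
    static String handlerFor(Class<?> type, String methodName) {
        return type.getName() + "::" + methodName;
    }
}
```

Passing a top-level handler class together with the method name handleRequest yields the value to enter in the handler field of the Function Compute console.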

Step 2: Test the function

After you create the function, test it as follows: Find the function and click its name. On the page that appears, click the Code tab, and then click Test Function. After the function passes the test, you can register the extension in the DataWorks console.

DataWorks

Step 1: Register the extension

After you deploy the extension in Function Compute, you must register the extension in DataWorks. To register the extension in DataWorks, perform the following steps:

  1. Go to the Open Platform page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose More > Open Platform. The Developer Backend tab appears.

  2. Register the extension.

    1. In the left-side navigation pane of the page that appears, click Extensions.

    2. Click Register Extension in the List of Extensions section. In the Select Deployment Method step of the Register Extension wizard, set the Select a deployment method for your extension parameter to Deploy Based on Function Compute (Highly Recommended) and configure the parameters in the Register Extension step.

    The following table describes the parameters for registering an extension. You can configure the parameters based on your business requirements.

    Parameters

    Parameter

    Description

    Extension Name

    The name of the custom extension, which is used to identify the extension.

    Select Function Compute and Select Function

    The Function Compute service and function on which you want to deploy the extension. DataWorks sends the event messages that the extension processes directly to the service.

    Processed Extension Points

    Only messages that are triggered by pre-events for data download, pre-events for asset publishing and unpublishing, and pre-events for data upload can be processed.

    Note

    After you configure this parameter, the system automatically configures the Event and Applicable Service parameters.

    Owner

    The extension owner. Users of the extension can contact the extension owner at the earliest opportunity when they encounter problems.

    Workspace for Testing

    The workspace that is used to test the extension. You do not need to publish the extension to verify it, because the test workspace supports an end-to-end test that shows whether the extension works as expected.

    In the test workspace, developers can trigger events to check whether DataWorks sends related messages to the extension by using EventBridge and whether the extension checks the events and sends the check results to DataWorks.

    Note

    If the extension point event that you want to process is a tenant-level extension point event, you do not need to configure the Workspace for Testing parameter.

    Extension Details Address

    The URL of the extension details page.

    After you develop and deploy the extension, you can develop a web page to display how the extension works. Set this parameter to the page URL so that users can visit this web page to better understand and use the extension. For example, users can visit the web page to view the reason why a process is checked and blocked by the extension.

    Extension Documentation Address

    The URL of the extension documentation page.

    After you develop and deploy the extension, you can develop a help documentation page. Set this parameter to the page URL so that users can know the business logic and properties of the extension.

    Parameters for Extension

    The extension parameters that you want to use to improve the extension development and application efficiency.

    You can enter both the built-in parameters for typical scenarios and custom parameters for reference.

    You can enter multiple parameters in the format of key=value. Make sure that each parameter occupies a separate line. For more information about how to use these parameters, see Advanced feature: Configure extension parameters.

    Options for Extension

    The configuration items for the extension. The configuration items are provided for users to implement custom process management in different workspaces based on their business requirements. The extension developer must define each configuration item in a JSON string in the Register Extension dialog box.

    For example, the extension developer can allow users to manage the length of an SQL statement based on this parameter. For more information about the JSON format, see Advanced feature: Define options for an extension.

  3. Click OK.

    After you register the extension, you can view the extension in the List of Extensions section of the Extensions page.

Step 2: Publish the extension

After you develop, deploy, and register the extension in DataWorks, you must test the extension, submit the extension for review, and then publish the extension. After the extension is published, administrators other than the extension owner can also enable the extension in Management Center. For more information, see Use an extension.

Appendix: Formats of event messages sent to Function Compute

The following sample code shows the general format for the types of event messages that are pushed to Function Compute. The messageBody parameter specifies the details of a DataWorks event message. The content of an event message varies based on the event type.

{
    "blockBusiness": true,
    "eventCategoryType": "resources-download", // The event category.
    "eventType": "upload-data-to-table", // The event type.
    "extensionBizId": "job_6603643923728775070",
    "messageBody": {
        // The content of an event message varies based on the event type. The following fields are the fixed fields in an event message.
        "tenantId": 28378****10656, // The ID of the tenant. Each Alibaba Cloud account in DataWorks corresponds to a tenant. Each tenant has its own tenant ID. To view the tenant ID, click the username in the upper-right corner of the DataStudio page and then click User Info in the Menu section.
        "eventCode": "xxxx" // The event code, which identifies the message type.
    },
    "messageId": "52d44ee7-b51f-4d4d-afeb-*******" // The event ID. The ID is the unique identifier of an event.
}
Note

The content of the messageBody field varies based on the event type. For more information about event messages, see Appendix: Message formats.

References