You can configure custom logic for an extension to manage operations performed in DataWorks, such as blocking specific operations. Extensions return processing results on specific events to implement process control in DataWorks. This topic describes how to develop and deploy an extension based on a self-managed service.
Background information
Prerequisites
The event message subscription feature is enabled. The development and deployment of an extension based on a self-managed service depends on the event message distribution capability of EventBridge. Make sure that you specify an event bus in EventBridge to receive event messages from DataWorks, and specify an on-premises or cloud-based service to receive the event messages from the event bus.
Limits
Only users of DataWorks Enterprise Edition can use the Extensions module.
The Extensions module is available in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Zhangjiakou), China (Shenzhen), China (Chengdu), US (Silicon Valley), US (Virginia), Germany (Frankfurt), Japan (Tokyo), China (Hong Kong), and Singapore.
Precautions
Only the Open Platform administrator, tenant administrator, Alibaba Cloud accounts, and RAM users to which the AliyunDataWorksFullAccess policy is attached have read and write permissions on the developer backend. For more information about permission management, see Manage permissions on global-level services and Manage permissions on the DataWorks services and the entities in the DataWorks console by using RAM policies.
If DataWorks Enterprise Edition expires, extensions become invalid and cannot be triggered to check extension point events. If an extension is triggered to check an event and does not complete the check before DataWorks Enterprise Edition expires, the check is terminated and the result Check Passed is returned.
If a combined node, such as a Platform for AI node, do-while node, or for-each node, triggers an extension check, you must wait until all inner nodes of the combined node pass the check before you can perform subsequent operations.
You can associate multiple extensions with the same extension point event. This way, the associated extensions are triggered when the event occurs.
Process
The following figure shows how an extension that is developed and deployed based on a self-managed service consumes event messages by using EventBridge.
After an extension is triggered by an extension point event, the event process enters the Checking state. After the extension sends the processing result to DataWorks by calling an API operation, DataWorks determines whether to block the process.
User
Before you deploy an extension based on a self-managed service in DataWorks, you must develop an extension and deploy the extension on an on-premises or cloud-based service. You can refer to Appendix: DataWorks Open Platform sample code library to initialize the project code and obtain sample code for Open Platform from GitHub. You must develop and deploy an extension based on the type of the service that you want to use to receive event messages from the event bus that you configure.
Step 1: Configure dependencies for an extension
When you develop an extension, you must add the following dependency to the pom.xml file. EventBridge allows you to use various types of event targets to process and consume events. You can configure other dependencies based on the event targets that you specify.
DataWorks dependency library
<dependency>
<groupId>com.aliyun</groupId>
<artifactId>dataworks_public20200518</artifactId>
<version>5.6.0</version>
</dependency>
Dependency packaging
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
Step 2: Develop an extension
The event messages that are routed by an event bus of EventBridge are pushed to an on-premises or cloud-based service. The service receives DataWorks event messages that are pushed by the event bus and calls a specific DataWorks API operation to send the processing results to DataWorks.
Develop the code of an extension.
Parse event messages
For information about the formats of event messages in DataWorks, see the Appendix: Message formats section of the "Development reference: Event messages and formats of event messages" topic. In an event message, the data field specifies the message content. During actual development, you can use the data.eventCode field to confirm the message type and the id field to obtain the message details.
Note: The OpenEvent module of DataWorks Open Platform distributes DataWorks event messages by using EventBridge. Before you develop an extension, you must enable the event message subscription feature in DataWorks.
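The field lookups described above can be sketched in Java as follows. This is a minimal illustration, not the official sample code: it assumes that a JSON library of your choice has already deserialized the event message into nested Map objects, and the class name EventMessageFields and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal sketch: reads the fields an extension needs from an event message
 * that has already been deserialized into nested Maps (assumption).
 */
final class EventMessageFields {

    /** Returns the top-level id field, which is later passed as MessageId. */
    static String messageId(Map<String, Object> message) {
        Object id = message.get("id");
        if (id == null) {
            throw new IllegalArgumentException("event message has no id field");
        }
        return id.toString();
    }

    /** Returns data.eventCode, or null if the data object or the code is missing. */
    static String eventCode(Map<String, Object> message) {
        Object data = message.get("data");
        if (!(data instanceof Map)) {
            return null;
        }
        Object code = ((Map<?, ?>) data).get("eventCode");
        return code == null ? null : code.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> data = new HashMap<>();
        data.put("eventCode", "xxxx"); // placeholder value, as in the sample message

        Map<String, Object> message = new HashMap<>();
        message.put("id", "example-message-id"); // hypothetical ID for illustration
        message.put("data", data);

        System.out.println(messageId(message) + " " + eventCode(message));
    }
}
```

Returning null for a missing event code lets the caller decide whether to ignore the message, whereas a missing id is treated as an error because the result cannot be reported back to DataWorks without it.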
Write the processing logic
Write the processing logic for event messages that are pushed by the event bus based on your business requirements. When you develop the code of an extension, you can use the following methods to improve the development efficiency and application effect:
Use extension parameters. For example, you can configure the extension.project.disabled parameter to prevent the extension from taking effect for the specified workspace. For more information, see Advanced feature: Configure extension parameters.
Configure the MessageId parameter in the GetIDEEventDetail operation for extension point events in DataStudio to obtain snapshots of extension point events.
Note: The MessageId parameter corresponds to the id field in an event message. For more information, see the Appendix: Message formats section of the "Development reference: Event messages and formats of event messages" topic.
Return the processing result of the extension to DataWorks
The service where the extension is deployed can call an API operation to return the processing results of extension point events that are specified in the extension to DataWorks. The API operation varies based on the DataWorks service to which the extension point events belong.
Extension point events in DataStudio: Call the UpdateIDEEventResult API operation to return the processing results.
Extension point events in Operation Center: Call the UpdateWorkbenchEventResult API operation to return the processing results.
Extension point events in other DataWorks services: Call the CallbackExtension API operation to return the processing results.
When you call the API operation, specify the following parameters to return the processing result:
CheckResult: The processing result of event messages. Valid values:
OK: An extension point event passes the check of an extension.
FAIL: An extension point event fails the check of an extension. You must check and handle the reported error at the earliest opportunity to ensure that your service runs as expected.
WARN: An extension point event passes the check of an extension, but a warning is reported for the event.
ExtensionCode: The code of an extension. After you register an extension, you can view the code of the extension in the List of Extensions section of the Extensions page.
MessageId: The ID of the message. This parameter corresponds to the id field in an event message. For more information, see the Appendix: Message formats section of the "Development reference: Event messages and formats of event messages" topic.
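The following Java sketch shows one way to choose the API operation and assemble the CheckResult, ExtensionCode, and MessageId parameters before the call is made. It is an illustration under stated assumptions: the class ExtensionResultSketch, the EventOrigin enum, and the method names are hypothetical, and the actual SDK call through the dataworks_public20200518 client is intentionally left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal sketch: prepares the result that an extension reports to DataWorks. */
final class ExtensionResultSketch {

    /** The three CheckResult values described in this topic. */
    enum CheckResult { OK, FAIL, WARN }

    /** Hypothetical labels for the DataWorks service that an event belongs to. */
    enum EventOrigin { DATASTUDIO, OPERATION_CENTER, OTHER }

    /** Picks the name of the API operation for the service that raised the event. */
    static String operationFor(EventOrigin origin) {
        switch (origin) {
            case DATASTUDIO:
                return "UpdateIDEEventResult";
            case OPERATION_CENTER:
                return "UpdateWorkbenchEventResult";
            default:
                return "CallbackExtension";
        }
    }

    /** Collects the parameters; sending them with the SDK client is not shown. */
    static Map<String, String> resultParameters(String extensionCode,
                                                String messageId,
                                                CheckResult result) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("MessageId", messageId);         // id field of the event message
        params.put("ExtensionCode", extensionCode); // code shown in List of Extensions
        params.put("CheckResult", result.name());   // OK, FAIL, or WARN
        return params;
    }

    public static void main(String[] args) {
        // Both argument values below are hypothetical placeholders.
        Map<String, String> params =
                resultParameters("example-extension-code", "example-message-id", CheckResult.WARN);
        System.out.println(operationFor(EventOrigin.DATASTUDIO) + " " + params);
    }
}
```

Keeping the operation choice separate from the parameter map means the same check logic can serve events from DataStudio, Operation Center, and other services.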
Package the extension as an executable .jar file.
Step 3: Deploy the extension
After you develop and debug the code of the extension, you must deploy the code package as an application service on an Alibaba Cloud Elastic Compute Service (ECS) instance or another cloud service.
DataWorks
After you develop the code, you can register and manage the extension in DataWorks.
Step 1: Register an extension
Before you use an extension, you must register an extension in DataWorks and obtain the extension code for subsequent code development. To register an extension, perform the following steps:
Go to the Open Platform page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . The Developer Backend tab appears.
Register the extension.
In the left-side navigation pane of the page that appears, click Extensions.
Click Register Extension in the List of Extensions section. In the Select Deployment Method step of the Register Extension wizard, set the Select a deployment method for your extension parameter to Deploy Based on Self-managed Service and configure the parameters in the Register Extension step.
Click OK.
Note: After you register the extension, you can view the extension in the List of Extensions section of the Extensions page.
Step 2: Publish the extension
After you develop, deploy, and register the extension in DataWorks, you must test the extension, submit the extension for review, and then publish the extension. After the extension is published, administrators other than the extension owner can also enable the extension in Management Center. For more information, see Use an extension.
Appendix: Formats of event messages sent to EventBridge
The following sample code provides an example of a complete event message. The data parameter specifies the content of the event message. EventBridge provides other information based on the content.
{
"datacontenttype": "application/json;charset=utf-8", //The content type of the data field. datacontenttype supports only the application/json content type.
"aliyunaccountid": "1111",//The ID of an Alibaba Cloud account.
"aliyunpublishtime": "2024-07-10T07:25:34.915Z",// The time when EventBridge receives the event message.
"data": {
"tenantId": 28378****10656,// The ID of the tenant. Each Alibaba Cloud account in DataWorks corresponds to a tenant. Each tenant has its own tenant ID. To view the tenant ID, click the username in the upper-right corner of the DataStudio page and then click User Info in the Menu section.
"eventCode": "xxxx"//
},
"aliyunoriginalaccountid": "11111",
"specversion": "1.0",
"aliyuneventbusname": "default",// The name of the event bus that is used to receive DataWorks event messages.
"id": "45ef4dewdwe1-7c35-447a-bd93-fab****",// The event ID. The ID is the unique identifier of an event.
"source": "acs.dataworks",// The event source, which indicates the service that produces events. In this example, event messages are pushed by DataWorks.
"time": "2024-07-10T15:25:34.897Z",// The time when the event was generated.
"aliyunregionid": "cn-shanghai",// The region where the event is received.
"type": "dataworks:ResourcesUpload:UploadDataToTable"// The event type. You can use the event type to filter the messages pushed by DataWorks in the EventBridge console. The type of each event is different.
}
The content of the event message varies based on the event type. For more information about event messages, see Development reference: Event messages and formats of event messages.
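If the receiving service filters messages itself rather than relying only on rules in the EventBridge console, the source and type fields of the message can be checked before any processing logic runs. The following Java sketch assumes that both fields have already been read from the message as strings. The class name EventTypeFilter and the method shouldHandle are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Minimal sketch: accepts only DataWorks events of the types an extension handles. */
final class EventTypeFilter {

    /** Value of the source field in the sample event message. */
    static final String DATAWORKS_SOURCE = "acs.dataworks";

    private final Set<String> handledTypes;

    EventTypeFilter(String... handledTypes) {
        this.handledTypes = new HashSet<>(Arrays.asList(handledTypes));
    }

    /** Returns true only if the event comes from DataWorks and has a handled type. */
    boolean shouldHandle(String source, String type) {
        return DATAWORKS_SOURCE.equals(source) && handledTypes.contains(type);
    }

    public static void main(String[] args) {
        // The handled type is taken from the sample event message in this topic.
        EventTypeFilter filter =
                new EventTypeFilter("dataworks:ResourcesUpload:UploadDataToTable");

        System.out.println(filter.shouldHandle(
                "acs.dataworks", "dataworks:ResourcesUpload:UploadDataToTable"));
        System.out.println(filter.shouldHandle("acs.dataworks", "some:other:type"));
    }
}
```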
Sample code used to develop a custom extension
After you understand the precautions for the extension development procedure, you can develop extension code based on your business requirements. The following topics provide examples on extension registration, development, and application in typical scenarios:
References
For information about the message formats of different types of events, see Development reference: Event messages and formats of event messages.
The OpenEvent module allows you to subscribe to extension point events by using EventBridge. For more information, see Overview.
For more information about the extension point events that can be processed by DataWorks extensions, see Overview.
You can also deploy an extension based on Function Compute. For more information, see Develop and deploy an extension based on Function Compute.