DataWorks: Release notes

Last Updated: Dec 13, 2024

This topic describes the release notes for DataWorks and provides links to the relevant references.

2024

2024-11

Feature

Description

Release date

Region

Scope

References

Data Integration

All data in an ApsaraDB for OceanBase database can be synchronized to MaxCompute in real time by OceanBase MySQL users.

2024-11-21

All regions

All users

-

Data Map

The metadata collection feature provided by Data Map can be used to collect and manage metadata of AnalyticDB for Spark data sources.

2024-11-21

All regions

Users participating in the public preview of Data Studio

Metadata collection

Data Map

A data insight task can be created on the details page of a MaxCompute table in Data Map to obtain statistics and distribution of data based on in-depth data analysis and interpretation.

2024-11-21

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Ulanqab), China (Shenzhen)

All users

MaxCompute table data

Data Asset Governance

Data Governance Center is upgraded to Data Asset Governance. Data Asset Governance can detect issues that need to be handled in the data storage, task computing, code development, and data quality dimensions based on the configured governance plans. Data Asset Governance provides health scores to evaluate the effectiveness of data governance and visualizes the governance results from various perspectives. This helps you achieve governance goals in an efficient manner. Data Asset Governance also provides features, such as business asset management, asset analysis, resource consumption details of tasks, and cost estimation, to help you better understand the usage details of various resources.

2024-11-14

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), US (Silicon Valley), and US (Virginia)

All users

-

Computing resources

AnalyticDB for Spark computing resources can be associated.

2024-11-05

All regions

Users participating in the public preview of Data Studio

-

Connect a personal development environment to a Git repository

A personal development environment can be integrated with a Git repository to facilitate code version management and team collaboration.

2024-11-04

All regions

Users participating in the public preview of Data Studio

Connect a personal development environment to a Git repository

2024-10

Feature

Description

Release date

Region

Scope

References

Image management

Custom images can be created as permanent images in DataWorks. This way, the same image environment can be used each time you run a task on a node, which frees you from repeatedly deploying an image environment. This ensures the consistency of the runtime environment and reduces task running duration, computing costs, and traffic costs.

2024.10.18

China (Beijing), China (Shanghai), China (Shenzhen), China (Hangzhou), China (Hong Kong), China (Zhangjiakou), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Japan (Tokyo), Germany (Frankfurt), UK (London), US (Silicon Valley), and US (Virginia)

All DataWorks users

Manage images

Support for serverless synchronization tasks

Serverless synchronization tasks are supported by Data Integration. You do not need to configure a resource group for a serverless synchronization task. This allows you to focus only on your business.

2024.10.12

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Hong Kong), UK (London), US (Silicon Valley), US (Virginia), Japan (Tokyo), Germany (Frankfurt), and Malaysia (Kuala Lumpur)

All DataWorks users

Configure a serverless synchronization task

2024-09

Feature

Description

Release date

Region

Scope

References

New type of real-time synchronization task

Real-time synchronization of data from a Simple Log Service Logstore to Data Lake Formation (DLF) 2.0 is supported. The data is written to DLF 2.0 in the Paimon format, and simple data processing is supported during data synchronization.

2024.9.13

All regions

All DataWorks users

Synchronize data from a Simple Log Service Logstore to DLF 2.0 in real time

Auto resource scaling during the running of a real-time synchronization task

Resources configured for a real-time synchronization task can be automatically scaled while the task is running. You do not need to stop the task. You only need to configure an adjustment plan, and the system adjusts the resources for the running task based on that plan.

2024.9.13

All regions

All DataWorks users

Dynamic scaling conditions for resource tuning

2024-08

Feature

Description

Release date

Region

Scope

References

New type of real-time synchronization task

Real-time synchronization of data from a MySQL database to SelectDB or Apache Doris is supported.

2024.8.29

All regions

All DataWorks users

Synchronize all data in a MySQL database to SelectDB in real time

Permission management on Hologres

DataWorks Security Center allows you to manage the permissions for users to access Hologres data. You can configure an authorized identity, request permissions on Hologres tables, process permission requests, and view permission request records and request processing records.

2024.8.22

All regions

All DataWorks users

Manage permissions on Hologres

Export of SQL query results to a DingTalk sheet

SQL query results in DataWorks can be exported to a DingTalk sheet. After you perform an SQL query in DataWorks, you can export the query result to a DingTalk sheet by using the DingTalk application. This prevents data security issues that are caused by downloading query results as Excel files.

2024.8.14

China (Zhangjiakou) and China (Chengdu)

All DataWorks users

-

Lower permission requirement on the default access identity used to access a MaxCompute data source

The permission requirement on the default access identity used to access a MaxCompute data source is lowered. If you want to set the Default Access Identity parameter to Alibaba Cloud RAM User when you add a MaxCompute data source, you must make sure that the related RAM user has permissions of the Admin or Super_Administrator role of the related MaxCompute project. Before the permission requirement was lowered, the AdministratorAccess policy had to be attached to the related RAM user.

2024.8.8

All regions

All DataWorks users

Add a MaxCompute data source

Support for CloudSSO in DataWorks Enterprise Edition

CloudSSO is supported in DataWorks Enterprise Edition. CloudSSO allows you to use a third-party or self-managed identity provider (IdP) to log on to the Alibaba Cloud Management Console to use DataWorks.

2024.8.8

All regions

All DataWorks users

Differences among DataWorks editions

2024-07

Feature

Description

Release date

Region

Scope

References

RAM policy document update

The ListUserResources permission is added to the AliyunDataWorksReadOnlyAccess policy to allow you to view information about all DataWorks resources of all users.

2024.7.10

All regions

All DataWorks users

Update on features related to the registration of E-MapReduce (EMR) clusters

The following features are supported for the registration of an EMR cluster to DataWorks:

  • Configure the Kyuubi connection information based on your business requirements.

  • Register an EMR Serverless Spark cluster.

2024.7.10

China (Zhangjiakou), which is the only region that supports EMR Serverless Spark

All DataWorks users

New node type in DataStudio

CDH Spark SQL nodes are supported in DataStudio. A CDH Spark SQL node can be used to develop and periodically schedule CDH Spark SQL tasks and integrate the tasks with other types of tasks.

2024.7.10

All regions

All DataWorks users

Create a CDH Spark SQL node

2024-06

Feature

Description

Release date

Region

Scope

References

New synchronization link in Data Integration

All data in a MySQL database can be synchronized to StarRocks in offline mode or in real time by using Data Integration.

2024.06.28

All regions

All DataWorks users

Creation of data push nodes in DataStudio

In a DataStudio workflow, data push nodes can be created and configured as descendant nodes of the nodes that process data and generate data tables. The data push nodes can then periodically push the data generated by their ancestor nodes to DingTalk or Lark groups as message cards.

Note

To use data push nodes, you must submit a ticket to contact technical support to upgrade the resource groups for scheduling.

2024.6.28

  • China (Hangzhou)

  • China (Shanghai)

  • China (Beijing)

  • China (Shenzhen)

  • China (Chengdu)

  • China (Hong Kong)

  • Singapore

  • Malaysia (Kuala Lumpur)

  • US (Silicon Valley)

  • US (Virginia)

  • Germany (Frankfurt)

All DataWorks users

Best practice for configuring data push nodes in a workflow

Release of serverless resource groups

To facilitate the management of resources in DataWorks and improve user experience, serverless resource groups are introduced in DataWorks. A serverless resource group can implement the core features of an exclusive resource group for scheduling, an exclusive resource group for Data Integration, and an exclusive resource group for DataService Studio at the same time. Operations such as data synchronization, task scheduling and running, and API calling and management can be performed by using only one serverless resource group.

2024.6.11

  • China (Beijing)

  • China (Shanghai)

  • China (Shenzhen)

  • China (Hangzhou)

  • China (Hong Kong)

  • China (Zhangjiakou)

  • Singapore

  • Malaysia (Kuala Lumpur)

  • Indonesia (Jakarta)

  • Japan (Tokyo)

  • Germany (Frankfurt)

  • UK (London)

  • US (Silicon Valley)

  • US (Virginia)

All DataWorks users

New best practice for task development based on Lindorm Distributed Processing System (LDPS)

LDPS is compatible with CDH. Operations such as interactive SQL queries, SQL task development, and JAR task execution can be performed in DataWorks based on LDPS after you register a CDH cluster to DataWorks and configure LDPS connection information.

2024.6.5

All regions

All DataWorks users

Develop tasks based on LDPS

New data source type for data synchronization

Azure Blob Storage data sources are supported for data synchronization.

2024.6.3

All regions

All DataWorks users

Azure Blob Storage data source

2024-05

Feature

Description

Release date

Region

Scope

References

Support for reading MySQL binary logs from Object Storage Service (OSS)

MySQL binary logs can be read from OSS.

When you add a MySQL data source, you can turn on Enable Binary Log Reading from OSS if you set the Configuration Mode parameter to Alibaba Cloud Instance Mode and set the Region parameter to the region in which the current DataWorks workspace resides. After you turn on this switch, DataWorks attempts to obtain binary logs from Object Storage Service (OSS) when it cannot read binary logs from ApsaraDB RDS for MySQL. This prevents real-time synchronization tasks from being interrupted.

2024.5.24

All regions

All DataWorks users

MySQL data source

Update on the Data Quality service

The Data Quality service is updated. After the update, a specific range of data in a table can be checked based on monitoring rules. This helps optimize the process of data quality monitoring.

2024.5.21

The new version of Data Quality will be released in phases. You can view the regions where the new version is supported in the DataWorks console. If the features of the new version of Data Quality are unavailable in the region where your business is located, see Data Quality of the previous version.

All DataWorks users

New version of Data Quality

New synchronization link in Data Integration

All data in a Hologres database can be synchronized to another Hologres database in offline mode.

2024.5.20

All regions

All DataWorks users

Synchronize all data in a Hologres database to another Hologres database in offline mode

Support for remote access to a host and triggering of script running on the host by DataWorks

An SSH node can be created and used based on a specific SSH data source in DataWorks to remotely access a host that is connected to the data source and trigger script running on the host.

2024.5.15

All regions

All DataWorks users

Support for EMR Kyuubi nodes in DataStudio

EMR Kyuubi nodes are supported in DataStudio. EMR Kyuubi nodes can be used to develop and periodically schedule Kyuubi tasks and integrate Kyuubi tasks with other types of tasks.

2024.5.11

All regions

All DataWorks users

Create an EMR Kyuubi node

New types of database nodes in DataStudio

Multiple types of database nodes are supported in DataStudio, such as DRDS nodes, PolarDB for MySQL nodes, and Doris nodes. These types of nodes can be used to develop and periodically schedule the related types of tasks and integrate the tasks with other types of tasks.

2024.5.11

All regions

All DataWorks users

2024-04

Feature

Description

Release date

Region

Scope

References

SSL authentication configuration during the addition of a PostgreSQL data source

SSL authentication can be configured when a PostgreSQL data source is added for data synchronization.

2024.4.26

All regions

All DataWorks users

PostgreSQL data source

Support for Hologres data sources in Data Governance Center

Hologres data sources are supported in Data Governance Center.

Before you can use a Hologres data source in Data Governance Center, you must collect metadata of Hologres in Data Map. For more information, see Metadata collection.

2024.4.24

Data Governance Center supports Hologres data sources only in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), and China (Shenzhen).

All DataWorks users

Data Governance Center overview

Support for materialized views in Data Governance Center

DataWorks supports automated governance of materialized views based on intelligent recommendations. This is an intelligent and automated solution for frequent big data computing tasks that contain a large number of similar subqueries. If you enable the intelligent recommendation feature on materialized views, DataWorks can automatically identify and classify similar subqueries in MaxCompute and generate recommendations for creating materialized views. You can create a materialized view with a few clicks based on your business requirements. This significantly improves computing efficiency and saves computing resources.

2024.4.12

All regions

All DataWorks users

Automated governance of materialized views

Support for mapping between an Alibaba Cloud account or a RAM user and an OpenLDAP account of a CDH or Cloudera Data Platform (CDP) cluster

When you register a CDH or CDP cluster to DataWorks, a mapping can be configured between an Alibaba Cloud account or a RAM user and an OpenLDAP account of a CDH or CDP cluster based on your business requirements. After the mapping is configured, the CDH or CDP tasks that are submitted by the Alibaba Cloud account or the RAM user are run by the mapped OpenLDAP account. If you want to isolate permissions on the data that can be accessed by using different Alibaba Cloud accounts or RAM users in a CDH cluster, you can use the OpenLDAP account mapping type.

2024.4.8

China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Zhangjiakou), and China (Chengdu)

All DataWorks users

Configure mappings between tenant member accounts and CDH or CDP cluster accounts

2024-03

Feature

Description

Release date

Region

Scope

References

New version of the data backfill feature

A new version of the data backfill feature is released. After an auto triggered task is developed, committed, and deployed, the auto triggered task is run based on the scheduling configurations. If you want to run the auto triggered task in a specified time range, you can backfill data for the task. Data of a historical or future period of time can be backfilled for an auto triggered task to write the data to time-based partitions. The new version supports the following data backfill methods:

2024.3.28

All regions

All DataWorks users

Backfill data and view data backfill instances (new version)

Development and deployment of extensions based on Function Compute in Open Platform

Custom event message logic can be configured for an extension in DataWorks to manage user behavior, such as intercepting or blocking improper behavior. If Function Compute is used to develop and deploy extensions, specific event messages are automatically sent to the related Function Compute service. Take note of the following items:

  • The operation is simple. Only one function is required to deploy an extension.

  • You are charged for using Function Compute. For more information, see Billing overview.

  • Extensions that are deployed based on Function Compute can process only a pre-event for data download.
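As a rough sketch of what a single-function extension for a data download pre-event might look like, the following Python example decides whether a download may proceed before it runs. The `DownloadEvent` fields, the `handle_pre_download` function, and the returned dictionary are all assumptions made for illustration; the actual event message format and the expected response are defined by DataWorks and Function Compute and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class DownloadEvent:
    """Hypothetical pre-event payload; real DataWorks event messages may differ."""
    user: str
    row_count: int


def handle_pre_download(event: DownloadEvent, max_rows: int = 10_000) -> dict:
    """Decide whether the download may proceed before it actually runs."""
    if event.row_count > max_rows:
        # A pre-event handler can block the operation before any data leaves the system.
        return {"action": "block", "reason": f"{event.row_count} rows exceeds {max_rows}"}
    return {"action": "allow", "reason": ""}


decision = handle_pre_download(DownloadEvent(user="alice", row_count=50_000))
```

The point of the sketch is the ordering: because the handler runs as a pre-event, its decision is made before the download starts, which is what makes interception possible.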

2024.3.19

  • China (Beijing)

  • China (Hangzhou)

  • China (Shanghai)

  • China (Zhangjiakou)

  • China (Shenzhen)

  • China (Chengdu)

  • US (Silicon Valley)

  • US (Virginia)

  • Germany (Frankfurt)

  • Japan (Tokyo)

  • China (Hong Kong)

  • Singapore

Users of DataWorks Enterprise Edition

Develop and deploy an extension based on Function Compute

Support for custom publishing policies for models in Data Modeling

A publishing policy can be defined for a model based on your business requirements in Data Modeling. After you enable a publishing policy, you can select a publishing mode that meets your business requirements based on the configurations of the policy when you publish a model.

2024.3.12

All regions

DataWorks users who activate the Data Modeling service

Publishing policy management

2024-02

Feature

Description

Release date

Region

Scope

References

Addition of usage notes for the development of CDP or CDH tasks in DataWorks

Usage notes for the development of CDP or CDH tasks in DataWorks are added. The usage notes cover the basic development process, fee description, environment preparation, and permission management.

2024.2.21

All regions

All DataWorks users

Usage notes for development of CDP or CDH tasks in DataWorks

Support for StarRocks data sources added in Alibaba Cloud instance mode in DataService Studio

After an EMR Serverless StarRocks cluster is created, the cluster can be added in Alibaba Cloud instance mode to DataWorks as a StarRocks data source. The data source can be quickly encapsulated into an API in DataWorks DataService Studio to achieve data sharing and openness.

2024.2.20

All regions

All DataWorks users

Add a data source

Search feature for data development code in Data Map

The search feature for data development code is supported in DataWorks Data Map. This feature can be used to search for data development code across workspaces and locate the desired code based on keywords. This helps improve development efficiency and reduce project redundancy.

2024.2.20

All regions

Users of DataWorks Standard Edition and more advanced editions

Search for code

Support for data upload and download feature

The data upload and download feature is supported in DataWorks. On-premises CSV files and OSS objects can be uploaded to MaxCompute for processing and analysis. The list of uploaded files and the list of files downloaded by services, such as DataWorks DataAnalysis, can also be managed.

2024.2.20

All regions

All DataWorks users

Support for CDH-related nodes in DataStudio

CDH-related nodes, such as CDH Hive, CDH Spark, CDH MR, CDH Presto, and CDH Impala nodes, are supported in DataStudio. The nodes can be used to develop and periodically schedule CDH-related tasks.

2024.2.19

All regions

All DataWorks users

New version of the System Configuration page in Data Security Guard

The following operations can be performed on the System Configuration page:

  • Enable or disable content-based sensitive data identification, and specify the identification scope.

  • Specify the retention period of watermarked files.

  • Specify whether to display the security level of identified data.

  • Specify the email address and webhook URL for receiving alert notifications.

The preceding operations can help you identify and resolve potential security risks at the earliest opportunity.

2024.2.6

All regions

All DataWorks users

Configure system settings

2024-01

Feature

Description

Release date

Region

Scope

References

Display of masked query results in DataStudio and DataAnalysis

Categorization and sensitivity level classification, sensitive data identification, and display of masked query results for data in EMR tables are supported in Data Security Guard.

If the results obtained after you execute an SQL statement in DataStudio or DataAnalysis to query data contain sensitive data, the system automatically masks or encrypts the sensitive data based on specific data masking rules and returns the masked query results. This helps improve enterprise data security.

2024.1.25

All regions

All DataWorks users

Display of data lineages involved in real-time synchronization links in Data Map

Data lineages involved in the following real-time synchronization links can be parsed and displayed in Data Map:

  • Real-time data synchronization from MySQL to MaxCompute or Hologres

  • Real-time data synchronization from Kafka to MaxCompute or Hologres

  • Real-time data synchronization from LogHub to MaxCompute or Hologres

  • Real-time data synchronization from PolarDB to MaxCompute

Analyzing real-time synchronization lineages together with batch synchronization lineages can help you understand how data flows from end to end.

2024.1.15

All regions

All DataWorks users

View lineages

2023

2023-12

Feature

Description

Release date

Region

Scope

References

Association of data sources with DataStudio

Data sources or clusters can be associated with DataStudio. After the association, you can use the data sources or clusters to perform data modeling or to periodically schedule tasks in Operation Center. You can also read data in the data sources or clusters and perform data development operations.

2023.12.29

All regions

All DataWorks users

Preparations before data development: Associate a data source or a cluster with DataStudio

New version of data sources

MaxCompute, Hologres, AnalyticDB for PostgreSQL, AnalyticDB for MySQL, and ClickHouse compute engines are managed as data sources, and EMR and CDH or CDP compute engines are managed as open source clusters. This can help improve user experience. After the change, operations that are related to compute engines, such as creating and modifying compute engines, are performed on the Data Sources or Open Source Clusters page in Management Center of the DataWorks console.

2023.12.29

All regions

All DataWorks users

New extension point events

The following extension point events are added in Open Platform:

  • DeleteProject: a pre-event for workspace deletion within a tenant

  • ProjectDeleted: a post-event for workspace deletion within a tenant

  • DownloadResources: a pre-event for data download

2023.12.27

All regions

All DataWorks users

New application scopes of extension point events

The following application scopes of extension point events are added:

  • Tenant level: A tenant-level event takes effect for the tenant.

  • Workspace level: A workspace-level event takes effect for a specific workspace.

When you register an extension, you can select only a single type of extension point event.

2023.12.22

  • China (Beijing)

  • China (Hangzhou)

  • China (Shanghai)

  • China (Zhangjiakou)

  • China (Shenzhen)

  • China (Chengdu)

  • US (Silicon Valley)

  • US (Virginia)

  • Germany (Frankfurt)

  • Japan (Tokyo)

  • China (Hong Kong)

  • Singapore

All DataWorks users

New check items for SQL efficiency optimization in Data Governance Center

Five check items are added in Data Governance Center. The check items include DescartesChecker for MaxCompute, EMR Hive, and EMR Spark SQL tasks, MasterTableOnConditionChecker, and Force Scan Checker. The check items can help you run pre-event checks and optimize code promptly in the R&D stage, which improves computing efficiency, reduces wasted computing resources, and ensures the timeliness of data output.

2023.12.22

All regions

All DataWorks users

Configure check items

Full support for StarRocks data sources

StarRocks data sources are fully supported in the following services of DataWorks:

  • Data Integration: Data synchronization from or to a StarRocks data source is supported.

  • DataStudio: Creation and periodic scheduling of a StarRocks task are supported.

  • DataAnalysis: Query and analysis of StarRocks data are supported.

  • DataService Studio: A StarRocks data table can be encapsulated into an API.

  • Data Map: StarRocks metadata can be managed, searched, and displayed in Data Map.

2023.12.15

All regions

All DataWorks users

Support for new EMR Hadoop cluster versions

The following EMR Hadoop cluster versions are supported in DataWorks:

  • EMR-3.26.3

  • EMR-3.27.2

  • EMR-3.29.0

  • EMR-3.32.0

  • EMR-3.35.0

  • EMR-3.38.2

  • EMR-3.38.3

  • EMR-4.3.0

  • EMR-4.4.1

  • EMR-4.5.0

  • EMR-4.5.1

  • EMR-4.6.0

  • EMR-4.8.0

  • EMR-4.9.0

  • EMR-5.2.1

  • EMR-5.4.3

  • EMR-5.6.0

2023.12.15

All regions

All DataWorks users

Usage notes for development of EMR nodes in DataWorks

New check items for Check nodes in DataStudio

New check items are supported for Check nodes in DataStudio. You can use a Check node to check the availability of MaxCompute partitioned tables, FTP files, and OSS objects based on check policies. If the condition that is specified in the check policy for a Check node is met, the task on the Check node is successfully run.

If the running of a task depends on an item, you can use a Check node to check the availability of the item and configure the task as a descendant task of the Check node. If the condition that is specified in the check policy for the Check node is met, the task on the Check node is successfully run and then its descendant task is triggered to run.
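The dependency pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `run_check_node` and `run_with_check` functions are hypothetical, and the retry-until-available loop is an assumed check policy rather than the documented behavior of Check nodes.

```python
from typing import Callable, Iterator, Optional


def run_check_node(is_available: Callable[[], bool], max_attempts: int) -> bool:
    """Poll the check up to max_attempts times; succeed as soon as the item is available."""
    for _ in range(max_attempts):
        if is_available():
            return True
    return False


def run_with_check(
    is_available: Callable[[], bool],
    descendant: Callable[[], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Run the descendant task only if the check succeeds; otherwise skip it."""
    if run_check_node(is_available, max_attempts):
        return descendant()
    return None


# Simulated partition that becomes available on the third check.
_results: Iterator[bool] = iter([False, False, True])
outcome = run_with_check(lambda: next(_results), lambda: "descendant ran")
```

If the item never becomes available within the allowed attempts, `run_with_check` returns `None` and the descendant task is not triggered.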

2023.12.08

All regions

All DataWorks users

Configure a Check node

Support for PAI DLC nodes in DataStudio

PAI DLC nodes are supported in DataStudio. PAI DLC nodes can be used to periodically schedule DLC tasks.

2023.12.08

All regions

All DataWorks users

Create and use a PAI DLC node

Support for risk identification rules in Security Center

Risk identification rules are supported in Security Center. Security Center allows an administrator to register risk identification capabilities to DataWorks as extensions. This way, the extensions can be used as risk identification rules to identify risks in user operations.

You can use a default or custom risk identification rule to identify risks on data download operations and configure a blocking or approval response policy based on your business requirements.

2023.12.08

All regions

All DataWorks users

Risk identification rules

2023-11

Feature

Description

Release date

Region

Scope

References

Support for Check nodes in DataStudio

Check nodes are supported in DataStudio. You can use a Check node to check whether a specific partition exists in a MaxCompute partitioned table or whether data is written to the partition. If a task depends on a MaxCompute partitioned table, you can use a Check node to check whether the partition data in the table is available first. This prevents invalid data from being used.

2023.11.20

  • China (Chengdu)

  • China (Zhangjiakou)

  • China (Beijing)

  • China (Shanghai)

  • Malaysia (Kuala Lumpur)

All DataWorks users

Configure a Check node

2023-08

Feature

Description

Release date

Region

Scope

References

Support for specifying a scheduling cycle

The scheduling calendar feature is supported. You can specify a scheduling cycle by marking dates on a scheduling calendar as scheduling days or non-scheduling days.
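To make the idea concrete, the following Python sketch lists the dates on which a daily task would run when certain dates are marked as non-scheduling days. The `scheduled_dates` function and the sample dates are hypothetical and only illustrate the concept; they do not reflect how DataWorks stores or evaluates a scheduling calendar.

```python
from datetime import date, timedelta


def scheduled_dates(start: date, end: date, non_scheduling_days: set[date]) -> list[date]:
    """Return every date from start to end (inclusive) not marked as a non-scheduling day."""
    days = []
    current = start
    while current <= end:
        if current not in non_scheduling_days:
            days.append(current)
        current += timedelta(days=1)
    return days


# Hypothetical calendar: two dates marked as non-scheduling days.
holidays = {date(2023, 10, 1), date(2023, 10, 2)}
run_days = scheduled_dates(date(2023, 9, 30), date(2023, 10, 3), holidays)
```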

2023.08.24

All regions

Users of DataWorks Enterprise Edition

Configure a scheduling calendar

Support for governance along the data lake development link in Data Governance Center

Proactive governance is supported for issues identified along the data lake development link, which consists of EMR, Data Lake Formation (DLF), and DataWorks. The following governance capabilities are supported:

  • Assessment of the health score for governance

  • Automatic identification of governance issues in the R&D and storage dimensions

  • Pre-event check based on Hive SQL and Spark SQL

2023.08.24

  • China North 2 Ali Gov

  • China East 2 Finance

  • China (Shanghai)

  • China (Hangzhou)

  • China (Beijing)

  • China (Shenzhen)

  • China (Chengdu)

  • China (Hong Kong)

  • Singapore

  • US (Silicon Valley)

  • Germany (Frankfurt)

  • Indonesia (Jakarta)

Users of DataWorks Enterprise Edition or a more advanced edition

Data Governance Center overview

2023-06

Feature

Description

Release date

Region

Scope

References

Real-time synchronization from Kafka to Hologres to implement the extract, transform, load (ETL) process

  • In real-time data synchronization from Kafka to Hologres, JSON-formatted Kafka data can be parsed and other basic data processing operations can be performed to implement the ETL process.

  • Keys and values can be obtained from a specified JSON path and can be dynamically extended. This is suitable for scenarios in which the format of messages from the Kafka data source changes.

  • Simulated running during task configuration is supported. This way, you can check whether the data that you want to write to the destination is correctly processed in advance.
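The JSON path extraction mentioned above can be illustrated with a small Python sketch that parses a JSON-formatted message and reads a value from a nested path. The dotted-path syntax, the sample message, and the `extract_by_path` helper are assumptions for illustration only; they are not the path syntax or API that DataWorks uses.

```python
import json
from typing import Any


def extract_by_path(message: str, path: str, default: Any = None) -> Any:
    """Return the value at a dotted path (for example "payload.user.id") in a JSON message."""
    node: Any = json.loads(message)
    for key in path.split("."):
        # Return the default when the path is missing, so a changed format does not raise.
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


sample = '{"payload": {"user": {"id": 42, "name": "alice"}}, "ts": 1685577600}'
user_id = extract_by_path(sample, "payload.user.id")
missing = extract_by_path(sample, "payload.order.id", default="n/a")
```

Returning a default for a missing path is one way to tolerate messages whose format changes over time, which is the scenario the feature targets.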

2023.06.01

All regions

All DataWorks users

Kafka data source

Real-time synchronization of all data from a MySQL database to an OSS data lake in the Hudi format

All data in a MySQL database can be synchronized to an OSS data lake in real time. The data is written to the data lake in the Hudi format. The following capabilities are supported:

  • Automatic integration with Alibaba Cloud DLF to generate metadata for management.

  • Instance-level synchronization. You can select multiple source MySQL databases at a time.

  • Selection of source MySQL databases and tables based on a regular expression.

  • Automatic database and table addition. After source MySQL databases and tables are added, data is automatically synchronized to OSS without manual intervention.
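Regular-expression-based selection of source databases and tables can be pictured with the following Python sketch. The pattern, the `db.table` naming, and the `select_tables` helper are hypothetical; the regular expression dialect that DataWorks accepts may differ from Python's `re` module.

```python
import re

# Hypothetical pattern: every "orders_<digits>" table in any database named "shop_<suffix>".
TABLE_PATTERN = re.compile(r"shop_\w+\.orders_\d+")


def select_tables(qualified_names: list[str]) -> list[str]:
    """Return the db.table names that fully match TABLE_PATTERN."""
    return [name for name in qualified_names if TABLE_PATTERN.fullmatch(name)]


candidates = ["shop_eu.orders_01", "shop_us.orders_02", "shop_us.customers", "crm.orders_01"]
selected = select_tables(candidates)
```

A pattern like this also matches tables that are created later, which is consistent with the automatic database and table addition capability listed above.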

2023.06.01

All regions

All DataWorks users

OSS data source

Support for Amazon Relational Database Service (Amazon RDS) data sources for data synchronization

An Amazon RDS data source can be added for data synchronization in the same way as a MySQL data source. An Amazon RDS data source provides the same capabilities as a MySQL data source.

2023.06.01

All regions

All DataWorks users

MySQL data source

2023-04

Feature

Description

Release date

Region

Scope

References

Support for saving data analysis results as MaxCompute tables

Data analysis results can be directly saved as MaxCompute tables for subsequent queries or joint analysis, without the need to run code to create tables for saving the data analysis results.

2023.4.20

All regions

All DataWorks users

Perform operations on the query results

Support for downloading millions of SQL query result records in DataAnalysis

By default, a maximum of 10,000 SQL query result records can be downloaded. Administrators can modify the upper limit for different editions in Security Center: 200,000 for DataWorks Standard Edition, 2,000,000 for DataWorks Professional Edition, and 5,000,000 for DataWorks Enterprise Edition and DataWorks Ultimate Edition. The download feature can be disabled.

2023.4.18

All regions

All DataWorks users

SQL query
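
The per-edition download limits in the entry above can be summarized as a small lookup. This is only a sketch: the function name, the edition keys, and the fallback to the default for editions that the entry does not list are assumptions, not DataWorks settings or APIs.

```python
DEFAULT_LIMIT = 10_000  # default maximum number of downloadable records

# Upper limits that an administrator can configure, as listed in the entry above.
MAX_LIMIT_BY_EDITION = {
    "standard": 200_000,
    "professional": 2_000_000,
    "enterprise": 5_000_000,
    "ultimate": 5_000_000,
}


def effective_download_limit(edition, requested=None, download_enabled=True):
    """Return the number of records that can be downloaded under the given settings."""
    if not download_enabled:
        return 0  # the download feature can be disabled entirely
    if requested is None:
        return DEFAULT_LIMIT  # no administrator override
    # Assumption: editions that are not listed keep the default ceiling.
    ceiling = MAX_LIMIT_BY_EDITION.get(edition, DEFAULT_LIMIT)
    return min(requested, ceiling)
```

For example, a requested limit of 3,000,000 in Professional Edition is capped at 2,000,000 in this sketch.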

Launch of public datasets for big data services

Terabytes of data can be quickly analyzed by DataWorks and MaxCompute based on public datasets for big data and AI services from different platforms, such as Taobao, Fliggy, Ali Music, GitHub, and TPC.

2023.4.11

All regions

All DataWorks users

SQL query

2023-03

Feature

Description

Release date

Region

Scope

References

Support for governance issue notifications in Data Governance Center

Administrators and individual users can configure notifications for daily governance issues. The system then sends the notifications to the related engineers by system message, email, DingTalk group message, or webhook, which facilitates the handling of the governance issues.

2023.3.15

All regions

All DataWorks users

Configure a periodic notification for governance issues

Support for the long-lifecycle governance item in the storage dimension in Data Governance Center

The long-lifecycle governance item is supported in the storage dimension in Data Governance Center. The governance item can help users specify an appropriate lifecycle for MaxCompute partitioned tables to reduce the waste of storage resources.

2023.3.15

All regions

All DataWorks users

Handle governance issues

Commercialization of Acceleration Service provided by DataService Studio

The Acceleration Service solution is introduced in DataService Studio. You can use the solution to create an online API to accelerate the query of MaxCompute data without exporting data from MaxCompute. This improves the query performance and efficiency and meets online query requirements.

2023.3.1

China (Shanghai), China (Beijing), China (Hangzhou), and China (Shenzhen)

All DataWorks users

Acceleration solutions for API-based data queries

2023-01

Feature

Description

Release date

Region

Scope

References

Support for managing purchased resources in DataWorks

The following features are supported:

All resources that are not released can be displayed in DataWorks. This way, you can efficiently upgrade or downgrade specifications, apply for refunds, and renew resources.

2023.1.11

All regions

All DataWorks users

Billing overview

Support for graceful undeployment of multiple tasks at a time in Data Governance Center

The following features are supported:

  • A scenario-specific governance plan is provided. The plan is suitable for scenarios in which multiple invalid or repeated tasks need to be undeployed in a secure manner at a time.

  • A graceful undeployment governance plan is provided. The personnel responsible for governance can confirm affected users and assets by selecting governance objects.

  • A phased undeployment plan is provided for you to undeploy tasks in an orderly and smooth manner. An undeployment plan involves the scheduling suspension, scheduling delay, and undeployment phases, and phase status notification.

2023.1.9

All regions

All DataWorks users

Graceful undeployment

Support for code review in DataStudio in a workspace in basic mode

The following features are supported:

The code review feature is supported in DataStudio in a workspace in basic mode. If the forcible code review feature is enabled, the code of a node can take effect in the production environment only after the code of the node passes the code review.

2023.1.5

All regions

All DataWorks users

Code review

2022

2022-11

Feature

Description

Release date

Region

Scope

References

Support for data source-oriented API encapsulation in the development and production environments in DataService Studio

The following features are supported in a workspace in standard mode:

  • Advanced parameters can be configured for an API based on the environment type of a data source that you select. The environment type can be development or production.

  • APIs can be tested based on the data in databases in the development environment and can be published based on the data in databases in the production environment.

2022.11.29

All regions

All DataWorks users

Create an API by using the codeless UI

Support for requesting permissions on Hive tables in Data Map

The Request Permissions button is added on the details page of an EMR Hive table in Data Map. You can click this button to request permissions on the table in Security Center.

2022.11.29

All regions

All DataWorks users

Which types of Hive tables can be previewed in Data Map?

Support for data albums in Data Map

The Data Album page is added in Data Map. The following features are provided:

  • Data tables can be organized and managed in a data album based on business categories, data sensitivity levels, and data categories.

  • Specific tables, such as frequently used tables, team tables, and popular tables, can be added to a data album. This way, you can search for the data tables in an efficient manner.

2022.11.16

All regions

All DataWorks users

Table management from the business perspective: Data albums

Brand-new SQL query experience in the upgraded DataAnalysis service

The following features are supported in the upgraded DataAnalysis service:

  • All SQL files and all data table sets that are commonly used for data retrieval within your account can be managed in a centralized manner.

  • Users that have the required permissions can execute SQL statements to extract business data.

  • Secondary processing of SQL query results and display of processed results in charts are supported.

2022.11.15

All regions

All DataWorks users

SQL query

Support for parsing request and response parameters in an API that is created by using the advanced SQL syntax in script mode in DataService Studio

The following features are supported in DataService Studio:

  • After you create an API by using the advanced SQL syntax in script mode, you can separately click Parse Parameter on the Request Param and Response Param tabs in the right-side navigation pane on the configuration tab of the API to parse request and response parameters.

  • The workload of manually specifying parameters is reduced.

2022.11.10

All regions

All DataWorks users

None

2022-10

Feature

Description

Release date

Region

Scope

References

Support for the EMR Hive compute engine in Data Modeling

The following features are supported by the Dimensional Modeling module in Data Modeling. The features enable Data Modeling to provide the same modeling capabilities for EMR Hive as for MaxCompute.

  • Models can be published to EMR Hive, and an ETL code framework can be generated.

  • Reverse modeling can be performed on existing EMR Hive tables.

2022.11.25

All regions

All DataWorks users

Support for version management of models in Data Modeling

The following features are supported by the Dimensional Modeling module in Data Modeling:

  • Version management of models is supported. Only submitted models can be published.

  • Version comparison and rollback are supported for a model.

2022.11.25

All regions

All DataWorks users

Materialize a table to a compute engine

Display of API call addresses generated based on domain names on the API details page in DataService Studio

The call addresses that are separately generated for an API based on the Internet domain name, VPC domain name, and independent domain name can be displayed on the details page of the API. You can select an address to call the API based on your business requirements.

2022.10.21

All regions

All DataWorks users

View the details of an API

Upgrade of the lineage feature in Data Map

The lineage feature of Data Map is upgraded to provide a better user experience in data lineage analysis. On the lineage details tab, you can perform the following operations:

  • View the ancestor and descendant tables of a table and the ancestor and descendant fields of a table field.

  • View the data source of a table and the destination to which the table data flows.

  • Perform impact analysis for the required levels of descendant tables of a table.

2022.10.21

All regions

All DataWorks users

View the details of a table
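
Impact analysis for a given number of descendant levels, as described in the lineage entry above, amounts to a bounded breadth-first traversal of the lineage graph. The sketch below illustrates that idea only: the `descendants_within` function, the dictionary-based graph, and the table names are hypothetical and do not reflect how Data Map stores or queries lineage.

```python
from collections import deque


def descendants_within(lineage, table, max_levels):
    """Map each descendant of `table` to its level (1 = direct descendant)."""
    levels = {}
    queue = deque([(table, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_levels:
            continue  # do not expand beyond the requested number of levels
        for child in lineage.get(current, []):
            # Record each table once, at the shallowest level where it appears.
            if child not in levels and child != table:
                levels[child] = depth + 1
                queue.append((child, depth + 1))
    return levels


# Hypothetical lineage: each table maps to the tables that read from it.
lineage = {
    "ods_orders": ["dwd_orders"],
    "dwd_orders": ["dws_sales", "dws_refunds"],
    "dws_sales": ["ads_report"],
}

impacted = descendants_within(lineage, "ods_orders", 2)
```

With two levels, `ads_report` is excluded because it sits three levels below `ods_orders`.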

Support for new check items in the R&D dimension in Data Governance Center

The following types of check items are added in the R&D dimension in Data Governance Center:

  • Check the type consistency in JOIN conditions

  • Prohibit the usage of specific assets

  • Check the usage of UDFs with the same name

  • Check the limits on data write in the development environment

The following features are provided:

  • Check capabilities can be managed.

    You can enable and configure new check items on the Setting tab and view the usage details of the check items on the Knowledge tab in Data Governance Center.

  • Proactive data governance in the R&D dimension is supported when tasks are committed and deployed in DataStudio.

2022.10.20

All regions

All DataWorks users

Configure check items

Support for code review in DataStudio in a workspace in basic mode

If the forcible code review feature is enabled in a workspace in basic mode, the code of a node can take effect in the production environment only after the code of the node passes the code review.

2022.9.22

All regions

All DataWorks users

Code review

2022-08

Feature

Description

Release date

Region

Scope

References

Task management from the workflow perspective in Operation Center

In Operation Center, the status of tasks can be viewed and operations such as rerunning, freezing, and terminating tasks can be performed from the workflow perspective.

2022.8.22

All regions

All DataWorks users

View and manage auto triggered instances from the workflow perspective

MaxCompute data source-oriented query acceleration in DataService Studio

An online API can be created in DataService Studio by using an acceleration solution to accelerate the query of MaxCompute data, without the need to export data from MaxCompute. This improves the query performance and efficiency and meets online query requirements. The following acceleration solutions are provided:

  • Acceleration based on a Hologres foreign table

  • Acceleration based on the MaxCompute Query Acceleration (MCQA) feature

2022.8.17

China (Shanghai) and China (Shenzhen)

All DataWorks users

Acceleration solutions for API-based data queries

Intelligent diagnostics and analysis of an API call link in DataService Studio

API call logs can be analyzed in DataService Studio. You can use the log analysis feature to analyze the link of a single API call request. If the API call request fails, you can use this feature to troubleshoot issues at the earliest opportunity and obtain diagnostic results and suggestions.

2022.8.7

All regions

DataWorks users

View and analyze API call logs (public preview)

Fine-grained permission management at the project and table levels in Data Map

Various policies can be configured to manage permissions on metadata at different granularities in Data Map.

  • You can determine whether to allow members in other MaxCompute projects to view the metadata of the current MaxCompute project.

  • You can determine whether to allow users who are not members of a MaxCompute project, users who are not table owners, and the workspace administrator to view table metadata.

2022.8.5

All regions

DataWorks users

Appendix: Overview of permission management in Data Map

Support for creation of a batch synchronization task by using the codeless UI to synchronize data from or to a Dameng database in Data Integration

A batch synchronization task can be created by using the codeless UI to synchronize data from or to a Dameng database in Data Integration. The codeless UI is more convenient than the code editor.

2022.8.2

All regions

DataWorks users

Configure a batch synchronization task by using the codeless UI

2022-07

Feature

Description

Release date

Region

Scope

References

Support for dimensional modeling in Data Modeling

The following features are supported in Data Modeling:

  • You can reference the fields and partition information of an existing Hologres table or view as the fields of a model during the template design.

  • You can perform one-click filling for fields whose display name is empty or whose description is empty during the template design.

    In most cases, a physical table has field descriptions. If no display name is specified for a field, you can use this feature to quickly specify a display name for the field to improve modeling efficiency.

  • You can create a node or associate an existing node in DataStudio in the template development process to improve the ETL development efficiency of a model.

2022.7.29

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, China East 2 Finance, China South 1 Finance, China North 2 Ali Gov, Germany (Frankfurt), and US (Silicon Valley)

All DataWorks users

Support for viewing information about associated fields in Data Modeling

The following operations can be performed to view the information about associated fields: Go to the configuration tab of a derived metric or an atomic metric. Click Associate Tables in the right-side navigation pane to view the names of the fields that are associated with the current metric. You can also go to the details page of the table to which an associated field belongs to manage the association.

2022.7.29

All DataWorks users

Derived metric

Support for the configuration of a naming rule checker for tables and derived metrics in Data Modeling

A checker at a data layer can be configured to define a naming convention and unify the naming formats of tables and derived metrics at the data layer. When you design tables and derived metrics, the checker can constrain and verify entity names to improve naming compliance throughout the development process.

Configuration for rules defined in checkers:

  • Strength type: If you set the Strength Type parameter to Strong Rule for the rules defined in checkers at a data layer, you can select a checker to generate a name for a table or a derived metric when you create the table or the derived metric at the data layer, and a forceful check is performed on the name. If you set the Strength Type parameter to Weak Rule, you can select a checker to generate the name, without a forceful check.

  • Rule definition: The naming convention. You can configure the organization order and composition of names based on various factors.

2022.7.29

All DataWorks users

Configure and use a checker at a data layer
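
The checker entry above combines a rule definition (the order and composition of name parts) with a strength type. The sketch below is one possible reading, not the Data Modeling implementation: the function names, the name components, and the pattern are hypothetical, and the behavior of a weak rule (warn but accept) is an assumption.

```python
import re


def build_name(parts, order, separator="_"):
    """Join the configured name components in the configured order."""
    return separator.join(parts[key] for key in order)


def check_name(name, pattern, strength):
    """Return (accepted, message) for a name checked against a naming rule."""
    if re.fullmatch(pattern, name):
        return True, "ok"
    if strength == "strong":
        # Strong rule: the forceful check rejects non-compliant names.
        return False, f"'{name}' violates the naming rule"
    # Assumption: a weak rule only warns and still accepts the name.
    return True, f"warning: '{name}' does not follow the naming rule"


# Hypothetical rule definition for one data layer.
order = ["layer", "domain", "entity", "period"]
pattern = r"dws_[a-z]+_[a-z]+_(1d|7d|30d)"

generated = build_name(
    {"layer": "dws", "domain": "trade", "entity": "order", "period": "1d"},
    order,
)
```

A generated name such as `dws_trade_order_1d` passes either strength type, whereas a hand-typed name such as `tmp_orders` is rejected only under the strong rule in this sketch.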

Support for the configuration of exclusive resource groups used by tasks of different compute engine types in DataAnalysis

An Alibaba Cloud account can be used to configure the exclusive resource groups used by tasks of different compute engine types on the System Management page in DataAnalysis. You can perform SQL queries on a specific exclusive resource group.

2022.7.29

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and Germany (Frankfurt)

All DataWorks users

System management

Support for synchronization of data in PostgreSQL databases

Synchronization of data in PostgreSQL databases is supported. Two-factor authentication based on the .crt and .key files is supported when SSL authentication is performed.

2022.7.26

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), UK (London), US (Silicon Valley), US (Virginia), and UAE (Dubai)

All DataWorks users

Data Integration overview

Support for EMR DataLake clusters

EMR DataLake clusters can be used as compute engines in DataWorks. The following full-lifecycle capabilities that work based on an EMR DataLake compute engine can be implemented: data synchronization, data modeling, data development and scheduling, data quality monitoring, data map, data security, data analysis (related tasks must be run on exclusive resource groups), and data services.

2022.7.8

China (Chengdu), China (Zhangjiakou), China (Shenzhen), China (Beijing), China (Shanghai), China (Hangzhou), China (Hong Kong), Japan (Tokyo), Germany (Frankfurt), US (Virginia), US (Silicon Valley), Indonesia (Jakarta), UK (London), Singapore, Malaysia (Kuala Lumpur), and UAE (Dubai)

All DataWorks users

Usage notes for development of EMR nodes in DataWorks

Support for field insertion in a visualized manner and verification of permissions on tables by using the intelligent code editor in DataStudio

  • Queries of table fields in code can be automatically identified by the intelligent code editor. You can move the pointer over a table name, select the field that you want to query, and then confirm the selection. The field name is automatically inserted into the code.

  • The permissions on tables can be verified by the intelligent code editor. You can request permissions on tables as prompted.

2022.7.2

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong)

All DataWorks users

DataStudio overview

2022-06

Feature

Description

Release date

Region

Scope

References

Support for Data Governance Center

Data Governance Center is available and provides the following features:

  • Calculates the health score of governance results based on a quantitative assessment model from the storage, computing, development, quality, and security dimensions, and identifies and prevents various types of data governance issues.

  • Estimates the costs of a single task and allows you to view the details of resources consumed by a task and the overall trend of resource consumption. This helps you optimize resource usage and reduce the costs of various types of resources.

Note
  • Data Governance Center is available from July 5, 2022, and a one-month trial period is provided.

  • From August 5, 2022, all capabilities of Data Governance Center are available in DataWorks Enterprise Edition.

2022.6.27

China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), Singapore, and US (Silicon Valley)

All DataWorks users

Data Governance Center overview

Support for a panoramic view of a task in Data Governance Center

A panoramic view of a task is provided on the Task 360 page. You can view the following information about a task on the page: governance issues that are identified on the task, operation records of the task, baselines that are affected by the task, and task execution information. The information helps you perform data governance operations on the task.

2022.6.24

China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), Singapore, and US (Silicon Valley)

All DataWorks users

Obtain a panoramic view of a task

Support for search and creation of views in Data Modeling

  • Existing view fields and partition information can be referenced as fields in a model during model design.

  • A model can be materialized into a view after the model is designed.

2022.6.22

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, US (Silicon Valley), Germany (Frankfurt), China East 2 Finance, China South 1 Finance, and China North 2 Ali Gov 1

All DataWorks users

Materialize a table to a compute engine

Support for generation of models based on table name keywords in Data Modeling

The reverse modeling feature can be used to generate logical models based on fuzzy match of table name keywords.

2022.6.19

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, US (Silicon Valley), Germany (Frankfurt), China East 2 Finance, China South 1 Finance, and China North 2 Ali Gov 1

All DataWorks users

Perform reverse modeling on physical tables

Support for management of operations performed on data synchronization tasks in Approval Center

Request processing policies for data synchronization tasks can be configured in Approval Center to ensure the security of data during data transmission. You can use a combination of a source and a destination to specify a data synchronization task on which an operation request must be processed. For example, if a data synchronization task is saved, the related request processing procedure is triggered. This way, you can manage the data synchronization process in a flexible manner.

2022.6.15

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, Indonesia (Jakarta), Malaysia (Kuala Lumpur), US (Silicon Valley), US (Virginia), and Germany (Frankfurt)

All DataWorks users

Request processing policies for data synchronization tasks

Support for lineage graphs of sensitive data in Data Security Guard

The sensitive data lineage graph feature is supported. The feature supports the following sub-features:

  • The lineages between sensitive fields in data can be parsed based on the production information of the data. Identification results can be propagated along the lineages between sensitive fields that have the same sensitive field type. This greatly improves the efficiency of sensitive data identification.

  • A lineage graph of sensitive data can be drawn based on the lineages between sensitive fields. The lineage graph helps you understand the source and destination of sensitive data.

Note

This feature is available only in DataWorks Enterprise Edition.

2022.6.14

China (Hangzhou) and China (Shanghai)

All DataWorks users

Data lineage

Support for analysis of abnormal lineages between fields in Data Security Guard

The abnormal lineage analysis feature is supported. The feature provides the following capabilities:

  • Analyzes abnormal associations between sensitive fields based on the lineages of the fields. This way, users cannot bypass sensitive data identification and sensitive data use audit by concatenating or disassembling characters.

  • Helps you identify fields that are associated with the queried field but have different sensitive field types from the queried field.

2022.6.14

China (Hangzhou) and China (Shanghai)

All DataWorks users

Data lineage

2022-05

Feature

Description

Release date

Region

Scope

References

New version of the risk identification rule management feature

A new version of the risk identification rule management feature is released. The new version of the feature provides built-in risk identification scenarios. Risk identification from various dimensions, such as the category and sensitivity level of data, operation method, and user permissions, is supported. Alert judgment based on the aggregation degree of alert events is supported to prevent false positive alerts. Fine-grained management for high, medium, and low-level risks is supported. This helps you identify various data risks in your enterprise in an all-around manner.

Note
  • You can use the risk identification rule management feature only in DataWorks Professional Edition or a more advanced edition.

  • The old version of the risk identification rule management feature is no longer supported after June 30, 2022. After the old version goes offline, the system clears all identification rules created by using the old version and the identified risk data. We recommend that you export the identification rules and the identified risk data that you require at the earliest opportunity.

  • To use the new version, you must switch to the new version and create identification rules based on your business requirements.

2022.5.16

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong)

All DataWorks users

Risk identification rule management (new version)

2022-04

Feature

Description

Release date

Region

Scope

References

Optimization of the features on the DataStudio page and the display pattern of the status of nodes on the DataStudio page

  • The node types that are recently used can be displayed after a node is created. You can directly select the node type that you want to use.

  • The My Favorites feature is provided. You can add nodes that you frequently use to favorites. This way, you can modify the nodes in an efficient manner or collaborate with other users to modify the nodes.

  • The display pattern of the status of nodes in the Scheduled Workflow pane is optimized. The Commit and Deploy icons are displayed to the left of the nodes that are not committed or deployed. This allows you to quickly commit or deploy nodes.

2022.4.7

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong)

All DataWorks users

Features on the DataStudio page

Support for the rule list feature and management of multiple monitoring rules in Data Quality

The rule list feature is supported and provides the following functionalities:

  • Monitoring rules created in the current workspace can be displayed in a list. You can enable, disable, and subscribe to multiple monitoring rules at the same time and associate multiple monitoring rules with an auto triggered task at the same time. You can also change the strength of a monitoring rule based on your business requirements.

  • A monitoring rule can be created for multiple tables based on a rule template at the same time, and multiple monitoring rules can be managed in a centralized and efficient manner to resolve different types of data quality issues for enterprises.

2022.4.11

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong)

All DataWorks users

Procedure of configuring a monitoring rule in Data Quality

Support for more flexible alert settings for baselines in the intelligent baseline feature in Operation Center

The intelligent baseline feature is optimized in the following aspects:

  • Baselines, baseline instances, and events are displayed on the Smart Baseline page. You can manage an object on the related tab of the Smart Baseline page.

  • You can configure an alert notification method, such as text message, email, or phone call, for each baseline. You can also associate an alert rule with a shift schedule. This prevents complex operations caused by frequent changes of owners.

2022.4.26

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and China (Hong Kong)

All DataWorks users

Intelligent baseline overview

2022-03

Feature

Description

Release date

Region

Scope

References

Support for cross-workspace deployment of objects on the Deploy page and optimized management of deploy operations

Objects, such as tasks, resources, and functions, in a workspace can be deployed to another workspace.

2022.3.2

All regions

Users who require strict control over deploy operations, such as users in the finance sector or public service sectors

Create a deployment package to deploy objects in the deployment package across workspaces

Integration of DataAnalysis with ActionTrail and monitoring of operation records in DataAnalysis by ActionTrail

DataAnalysis is integrated with ActionTrail and the following types of operation records can be monitored by ActionTrail:

  • Operation record for executing MaxCompute SQL statements

  • Operation record for downloading SQL execution results

  • Operation record for downloading workbooks

2022.3.20

All regions

All DataWorks users

Optimization of the ranking feature for check items and governance items in Data Governance Center

The ranking feature is optimized in the following aspects:

  • Governance items can be filtered by role.

  • Governance items and check items can be sorted from different dimensions.

  • The check items and governance items of all members in a workspace can be displayed.

  • Governance items are displayed in a list. You can view the details of a governance item.

  • The historical trend chart for resource usage and usage details of MaxCompute resources are displayed.

2022.3.21

All regions

Users who participate in the invitational preview of Data Governance Center

View data governance results

Configuration optimization of synchronization tasks in Data Integration

More than 1,000 tables can be synchronized by a real-time synchronization task that runs on an exclusive resource group for Data Integration and synchronizes data to MaxCompute or Hologres. This improves the efficiency of data synchronization.

2022.3.25

All regions

Users who need to synchronize large amounts of data, such as users of Software as a Service (SaaS) platforms or users in the finance sector

Prepare a PostgreSQL environment

2021

2021-12

Feature

Description

Release date

Region

References

Support for configuration of a monitoring rule for multiple tables at the same time based on a rule template in Data Quality

A rule template can be selected to configure a monitoring rule for multiple tables at the same time. This simplifies the configuration.

  • You can select a table-level rule template to configure a monitoring rule for multiple tables at the same time.

  • You can select a field-level rule template to configure a monitoring rule for multiple fields at the same time.

2021.12.14

All regions

Configure a monitoring rule for multiple tables based on a template

Support for the resource usage analysis feature in Data Governance Center

The resource usage analysis feature is provided by DataWorks Data Governance Center. The feature allows you to view the overall resource consumption, resource consumption changes, and resource consumption details in the following dimensions: MaxCompute storage resource consumption, MaxCompute computing resource consumption, resource consumption of DataWorks task scheduling, and resource consumption of DataWorks batch synchronization.

2021.12.9

All regions

Data pivoting from the resource type perspective

2021-11

Feature

Description

Release date

Region

References

Support for the resource group orchestration feature in DataStudio

The resource group orchestration feature is supported. The feature allows you to change resource groups for the scheduling of multiple nodes in a workflow at the same time. If multiple resource groups for scheduling exist in your workspace, you can change the resource groups for scheduling of nodes in the workspace based on your business requirements. This can facilitate reasonable resource usage.

2021.11.30

All regions

Change resource groups for scheduling for nodes

Support for the batch operation feature in DataStudio

Operations can be performed on multiple DataWorks objects at the same time. DataWorks allows you to modify configurations, such as the owners of multiple nodes, resources, or functions, at the same time. After the modification, you can commit and deploy the nodes, resources, or functions to the production environment for the modifications to take effect.

2021.11.11

All regions

Perform operations on multiple DataWorks objects at the same time

2021-10

Feature

Description

Release date

Region

References

Support for the reverse modeling and naming dictionary features in Data Modeling

  • The naming dictionary feature is supported. A naming dictionary is used to manage the roots and morphemes of business terms, physical tables, and fields and the standardized translation of the roots and morphemes.

  • The reverse modeling feature is supported. The feature is used to apply the models generated by using other modeling tools to DataWorks Dimensional Modeling.

2021.10.30

The features are in public preview in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Zhangjiakou), China (Chengdu), Singapore, US (Silicon Valley), Germany (Frankfurt), China (Hong Kong), China East 2 Finance, and China South 1 Finance.

Support for the code search feature in DataStudio

The code search feature is supported. The feature allows you to query code snippets in the code of nodes by keyword. The search results show the details of each code snippet and the nodes whose code contains the code snippets. You can use the feature to trace the node that causes changes in a table.

2021.10.27

All regions

Code search

2021-09

Feature

Description

Release date

Region

References

Support for the display of DataService Studio APIs in Data Map

DataService Studio API assets, such as wizard APIs, script APIs, and registration APIs, can be displayed in Data Map. You can search for and manage APIs from the global perspective or based on business scenarios. In Data Map, you can perform specific operations on APIs, such as globally searching for an API, viewing statistics on popular APIs, viewing information about an API on the details page of the API, and viewing the distribution of APIs by data source type.

2021.09.30

All regions

Release of Data Governance Center of the latest version

Issues that must be handled in the data storage, task computing, code development, data quality, and security dimensions can be detected by Data Governance Center from the global, workspace, and personal perspectives. Data Governance Center provides health scores to evaluate the effectiveness of data governance and visualizes the governance results by providing governance reports and the rankings of governance issues. This helps you troubleshoot issues in an efficient manner and achieve governance goals.

2021.09.12

Data Governance Center of the latest version is in public preview in the China (Shanghai), China (Hangzhou), China (Beijing), and China (Shenzhen) regions.

Data Governance Center overview

2021-08

Feature

Description

Release date

Region

References

Exclusive resource groups for DataService Studio in the China (Hangzhou) and China (Shanghai) regions

Exclusive resource groups for DataService Studio are available in the China (Hangzhou) and China (Shanghai) regions. If high queries per second (QPS) and service level agreement (SLA) guarantees are required when you call APIs in DataService Studio, you can use exclusive resource groups for DataService Studio to ensure successful API calls. Exclusive resource groups for DataService Studio can meet the requirements of highly concurrent, frequent API calls and help ensure that responses are returned at the earliest opportunity.

2021.08.06

China (Hangzhou) and China (Shanghai)

Exclusive resource groups for DataService Studio

Commercial release of DataWorks Migration Assistant

Migration Assistant can be used to migrate data development objects across different DataWorks editions, Alibaba Cloud accounts, regions, and workspaces. You can export the data objects in your workspace, including auto triggered tasks, manually triggered tasks, resources, functions, data sources, table metadata, ad hoc queries, and SQL script templates. You can also create full export tasks, incremental export tasks, or custom export tasks to export your data objects in DataWorks based on your business requirements.

2021.08.01

All regions

Overview

2021-07

Feature

Description

Release date

Region

References

Release of Approval Center in Data Governance Center

The DataWorks Approval Center feature is released. You can use this feature to manage permissions on data and manage high-risk operations. You can also use this feature to specify the scope of requests and customize request processing procedures to meet the request processing requirements of your enterprise in different compliance scenarios.

2021.07.16

All regions

Approval Center overview

Support for issuing tasks to EMR gateway nodes

Parameters on the Advanced Settings tab can be configured to issue tasks to EMR gateway nodes to balance loads. Support for issuing workspace-level tasks to gateway nodes will be provided in the future.

2021.07

All regions

Create an EMR Hive node

2021-06

Feature

Description

Release date

Region

References

Development and O&M of EMR Spark Streaming nodes

EMR Spark Streaming and EMR Streaming SQL nodes are supported in DataWorks.

You can develop an EMR Spark Streaming or EMR Streaming SQL node, test the node, and then commit the node to the production environment. You can rerun the node if the node fails to run. You can also perform the following operations: view the status and details of the node, start, terminate, or undeploy the node, monitor the node, and send notifications if errors occur on the node.

2021.06

All regions

Create an EMR Spark Streaming node

Migration of EMR data development tasks to DataWorks

Workflows (nodes and scheduling settings), manually executed jobs, resources, and data sources can be migrated from an EMR cluster to a DataWorks workspace by using Migration Assistant of DataWorks. You can go to the Migration Assistant page in the DataWorks console to view the migration progress, results, and reports.

2021.06

All regions

Migrate EMR projects to DataWorks

Support for the resource O&M feature in Operation Center

The resource O&M feature is supported in Operation Center. This feature can help you monitor the usage of resource groups that are used to run a node.

2021.06.09

All regions

Resource O&M

MaxCompute data source-based API encapsulation

MaxCompute tables can be accessed and used to encapsulate APIs in DataService Studio. This feature is in canary release. Such APIs query data based on the MaxCompute Query Acceleration (MCQA) feature of MaxCompute. This helps you achieve quick and efficient API calls. You can run MaxCompute tasks only on exclusive resource groups for DataService Studio.

2021.06

All regions

None

Configuration of alert contacts in DataWorks

A RAM user or a RAM role can be added as an alert contact on the Alert Contacts page of the DataWorks console. If an error occurs during the running of a task, DataWorks sends alert notifications to the alert contact that you specified. This way, you can handle exceptions at the earliest opportunity.

2021.06

All regions

Configure and view alert contacts

2021-05

Feature

Description

Release date

Region

References

Real-time data synchronization to AnalyticDB for MySQL V3.0

A real-time synchronization task can be created in DataWorks to synchronize data from a MySQL, OceanBase, or PolarDB database to an AnalyticDB for MySQL data source. The real-time synchronization task synchronizes full data from a database at a time and then synchronizes incremental data from the database in real time to the AnalyticDB for MySQL V3.0 data source. In addition, columns that you add to the source are automatically added to the destination during real-time synchronization.

2021.05.25

All regions

Plan and configure resources

Public preview of Open Message

The Open Message service is supported in DataWorks. You can enable the message subscription feature in DataWorks Open Message. The Open Message service is in public preview. Only users of DataWorks Enterprise Edition can join the public preview. If your DataWorks service is of Enterprise Edition, you can use the Open Message service on a trial basis and are not charged additional fees during the public preview. You can use Open Message to obtain metadata and task change events in DataWorks. This way, DataWorks can be deeply integrated with your system.

2021.05.21

China (Beijing), China (Hangzhou), China (Shenzhen), and China (Shanghai)

Overview of OpenEvent

Support for scheduling tasks to run at a specified point in time on specific days every year or at a specified point in time on the last day of a month

Tasks can be scheduled to run at a specified point in time on specific days every year or at a specified point in time on the last day of a month. This way, you can schedule tasks to run at a specified point in time on the last day of every year, quarter, or month. DataWorks allows you to schedule tasks by minute, hour, day, week, month, or year.

2021.05.19

All regions

Configure time properties
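
As an illustration of the last-day scheduling described in this entry, the following minimal Python sketch checks whether a given date is the last day of a month, quarter, or year. It is not DataWorks code and does not use any DataWorks API. The function names are hypothetical and exist only for this example.

```python
import calendar
from datetime import date


def is_last_day_of_month(d: date) -> bool:
    # calendar.monthrange returns (weekday of the first day, number of days in the month).
    return d.day == calendar.monthrange(d.year, d.month)[1]


def is_last_day_of_quarter(d: date) -> bool:
    # A quarter ends on the last day of March, June, September, or December.
    return d.month in (3, 6, 9, 12) and is_last_day_of_month(d)


def is_last_day_of_year(d: date) -> bool:
    return d.month == 12 and d.day == 31
```

A scheduler that runs a task only on such days could evaluate one of these checks for the current date before it triggers the task at the configured point in time.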

Support for ClickHouse data sources

ClickHouse data sources are supported by DataWorks. You can perform ETL operations on ClickHouse data sources, such as data synchronization, data development, task scheduling, and task O&M, and manage these operations in DataWorks.

  • DataWorks allows you to associate a ClickHouse cluster with a workspace in EMR instance mode or JDBC connection string mode. You can also add a ClickHouse data source in JDBC connection string mode.

  • DataWorks Data Integration allows you to create data synchronization tasks to read data from or write data to ClickHouse.

  • DataWorks allows you to create ClickHouse SQL nodes. A ClickHouse SQL node allows you to use a distributed SQL query engine to process structured data. This improves the running efficiency of jobs.

2021.05.15

All regions

2021-04

Feature

Description

Release date

Region

References

Real-time data synchronization to AnalyticDB for MySQL V3.0

A real-time synchronization task can be created in Data Integration to synchronize data from multiple tables to an AnalyticDB for MySQL V3.0 data source in real time. Columns that you add to a source table by executing DDL statements are automatically added to the destination table during real-time synchronization.

2021.04.20

All regions

Create a real-time synchronization solution to synchronize data to AnalyticDB for MySQL V3.0
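
The automatic column propagation described in this entry can be pictured with a small in-memory model. The following Python sketch is an assumption-laden illustration, not the Data Integration implementation: the event format (`add_column` and `insert` dictionaries) and the function name `apply_events` are hypothetical.

```python
from typing import Any, Dict, List


def apply_events(columns: List[str],
                 rows: List[Dict[str, Any]],
                 events: List[Dict[str, Any]]) -> None:
    """Apply source change events to an in-memory destination table.

    The event shapes used here are hypothetical and serve only as an illustration.
    """
    for event in events:
        if event["type"] == "add_column":
            name = event["name"]
            if name not in columns:
                # Mirror the DDL change: add the column and backfill existing rows.
                columns.append(name)
                for row in rows:
                    row[name] = None
        elif event["type"] == "insert":
            # Write only the columns that the destination currently has.
            rows.append({col: event["row"].get(col) for col in columns})
```

In this model, rows written before the column was added keep an empty value for the new column, and later rows carry the value from the source.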

Support for FTP Check nodes in DataStudio

An FTP Check node can be created in DataStudio to periodically check over FTP whether a specific file exists. If the FTP Check node detects that the file exists, the scheduling system runs the descendant node of the FTP Check node. Otherwise, the FTP Check node checks for the file again at the configured detection interval and keeps retrying until the condition to stop the detection is met. In most cases, FTP Check nodes are used for communications between the DataWorks scheduling system and external scheduling systems.

2021.04.15

China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Zhangjiakou), China (Chengdu), and Singapore

Create an FTP Check node
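
The detection loop described in this entry follows a common polling pattern. The following Python sketch shows that pattern under stated assumptions: it is not how the FTP Check node is implemented, the stop condition is simplified to a maximum number of checks, and `wait_for_file` and its parameters are hypothetical names. The file check is passed in as a callable, so an FTP lookup could be supplied by the caller.

```python
import time
from typing import Callable


def wait_for_file(file_exists: Callable[[], bool],
                  interval_seconds: float,
                  max_checks: int,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until file_exists() returns True or max_checks checks have been made."""
    for attempt in range(max_checks):
        if file_exists():
            return True  # The file was found, so descendant nodes may run.
        if attempt < max_checks - 1:
            sleep(interval_seconds)  # Wait for the detection interval, then retry.
    return False  # The stop condition was met and the file was not found.
```

Injecting `sleep` keeps the function easy to exercise without real waiting.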

2021-03

Feature

Description

Release date

Region

References

Support for custom roles in workspaces of DataWorks Enterprise Edition

Custom roles are supported in DataWorks Enterprise Edition. You can grant permissions to the roles based on your business requirements.

2021.03.22

All regions

Manage permissions on workspace-level services

Kerberos authentication in Data Integration

Kerberos authentication is supported in Data Integration. If you want to perform identity authentication for data sources, such as Hive and Kafka, upload the files required to configure Kerberos authentication when you add the data sources. This ensures that you can access the data sources in a secure manner.

2021.03.16

All regions

Configure Kerberos authentication

Security Center of the latest version

Security Center of the latest version is released. You can use Security Center to build a security system that can secure data and personal privacy in an efficient manner. Security Center can meet various security requirements in high-risk scenarios, such as auditing. You can use Security Center without the need to perform additional configurations.

2021.03.13

All regions

Overview

Support for the node aggregation, ancestor node analysis, and descendant node analysis features in Operation Center

The node aggregation feature is supported. This feature allows you to aggregate nodes in a directed acyclic graph (DAG) from different dimensions, such as workspace, owner, or priority. This way, you can view the total number of nodes from a specified dimension. The ancestor node analysis and descendant node analysis features are also supported. The features allow you to analyze the ancestor and descendant nodes of a specific node. This way, you can quickly find the ancestor node that blocks the running of the node, view the number of the descendant nodes of the node based on the analysis results, and understand the running status of all nodes.

2021.03.10

China (Shenzhen)

View and manage auto triggered nodes
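
The node aggregation and ancestor and descendant analysis described in this entry correspond to standard graph operations. The following Python sketch illustrates them on a small in-memory DAG. It is an illustration only and not the Operation Center implementation: the graph layout (a mapping from each node to its downstream nodes) and the function names are assumptions.

```python
from collections import Counter, deque
from typing import Dict, List, Set


def reachable(edges: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every node that can be reached from start by following edges."""
    seen: Set[str] = set()
    queue = deque(edges.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(edges.get(node, []))
    return seen


def invert(edges: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Reverse every edge so that downstream links become upstream links."""
    inverted: Dict[str, List[str]] = {}
    for node, targets in edges.items():
        for target in targets:
            inverted.setdefault(target, []).append(node)
    return inverted


def count_by(nodes: Dict[str, Dict[str, str]], dimension: str) -> Counter:
    """Aggregate nodes by one attribute, such as owner or priority."""
    return Counter(attrs[dimension] for attrs in nodes.values())
```

With `downstream` as the edge mapping, `reachable(downstream, node)` yields the descendants of a node and `reachable(invert(downstream), node)` yields its ancestors.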

2021-02

Feature

Description

Release date

Region

References

Creation of multiple metadata crawlers at a time by using the data discovery feature

The data discovery feature of Data Map can be used to create multiple metadata crawlers at a time. This way, you can quickly view the table schema and associations between tables.

2021.02.17

All regions

Collect metadata from an EMR data source

Task migration from Airflow by using Migration Assistant

Tasks in Airflow can be migrated to DataWorks by using Migration Assistant.

2021.02.16

All regions

Export tasks from open source engines

Support for view of API statistics in DataService Studio

API statistics can be viewed on the Statistics Dashboard and Statistics Details pages of DataService Studio. The Statistics Dashboard page of DataService Studio provides various charts and tables to show API statistics. For example, you can view the total number of APIs in a workspace and the total number of API calls. This helps you obtain information about API calls from a global perspective. On the Statistics Details page of DataService Studio, you can view the monitoring charts to obtain information about a specific API, such as API gateway status codes and DataService Studio error codes.

2021.02.16

China (Beijing)

Open Platform

The Open Platform service is available in DataWorks. This service allows you to view the metering reports of APIs and the call details on a specified date.

2021.02.13

All regions

DataWorks Open Platform

2021-01

Feature

Description

Release date

Region

References

Support for RestAPI data sources in Data Integration

RestAPI data sources are supported in Data Integration. Such data sources provide or receive data by using RESTful API operations. Data Integration supports batch synchronization of data in these data sources.

2021.01.04

All regions

RestAPI Reader

2020

2020-12

Feature

Description

Release date

Region

References

Full and incremental data synchronization to Elasticsearch

Full and incremental data in all tables or specific tables in a database can be synchronized to Elasticsearch.

2020.12.30

All regions

Create a real-time synchronization solution to synchronize data to Elasticsearch

2020-09

Feature

Description

Release date

Region

References

Real-time synchronization in Data Integration

The real-time synchronization feature is supported in Data Integration. This feature allows you to synchronize data changes from a single table or all tables in a source database to a destination database in real time. This way, data in the destination database is consistent with data in the source database in real time. You can create a synchronization task to synchronize full and incremental data between different data sources.

2021.04.15

All regions

2020-07

Feature

Description

Release date

Region

References

Public preview of API operations

API operations of multiple modules are provided to help you use DataWorks in a flexible manner. These modules include tenants, metadata, DataStudio, Operation Center, Data Quality, and DataService Studio.

Note

You can use the API operations only in DataWorks Enterprise Edition or a more advanced edition.

2020.07.16

China (Hangzhou), China (Shanghai), China (Shenzhen), China (Beijing), and China (Zhangjiakou)

Overview of DataWorks API operations

Public preview of Migration Assistant

You can export the data objects in your workspace, including auto triggered tasks, manually triggered tasks, resources, functions, data sources, table metadata, ad hoc queries, and SQL script templates. You can also create full export tasks, incremental export tasks, or custom export tasks to export your data objects in DataWorks based on your business requirements.

2020.07

China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), and Singapore

Migration Assistant overview

Upgrade of DataService Studio

The items in the left-side navigation pane of DataService Studio are adjusted.

2020.07.28

  • You can create functions and configure filters for APIs only if you activate DataWorks Enterprise Edition or a more advanced edition and create a workspace in the China (Shanghai) region.

  • You can use the service orchestration feature only if you activate DataWorks Enterprise Edition or a more advanced edition and create a workspace in the China (Shanghai) region.

DataService Studio overview

2020-06

Feature

Description

Release date

Region

References

Data source query

The data source query feature is supported. When you modify a workbook, you can use this feature to read data from a data source for analysis.

2020.06.09

China (Shanghai)

Analyze data

2020-04

Feature

Description

Release date

Region

References

Phone call-based alerting in Operation Center

Alert notifications can be sent by phone call, text message, and email.

Important

You can use the phone call-based alerting feature only in DataWorks Professional Edition or a more advanced edition.

2020.04.15

All regions

Create a custom alert rule