To facilitate the management of DataWorks resources and improve user experience, DataWorks offers serverless resource groups. A serverless resource group can implement the core features of an exclusive resource group for scheduling, an exclusive resource group for Data Integration, and an exclusive resource group for DataService Studio simultaneously. You can perform operations such as data synchronization, task scheduling and running, and API calling and management using a single serverless resource group.
Introduction to serverless resource groups
Old-version resource groups consist of exclusive resource groups and shared resource groups. Compared with old-version resource groups, serverless resource groups provide more robust capabilities, support a more unified sales model, and reduce resource waste by improving the utilization of resource fragments. For more information about serverless resource groups and the comparison between serverless resource groups and old-version resource groups, see Resource Management.
Notes
DataWorks supports various types of nodes. Some nodes are issued to the related compute engines for running, while others run on DataWorks resource groups. Data computing fees generated by running nodes on compute engines are charged by the Alibaba Cloud services to which the compute engines belong. Data computing fees generated by running nodes on DataWorks resource groups are charged by DataWorks.
If you use a pay-as-you-go serverless resource group to run your tasks, the tasks may compete for resources during peak hours, and the timeliness of resource usage cannot be guaranteed.
The fees of serverless resource groups do not include task scheduling fees. Regardless of whether you use a pay-as-you-go serverless resource group or a subscription serverless resource group, task scheduling fees are charged based on the number of successful instances. For more information, see Appendix: Billing of task scheduling.
You cannot change the billing method of a serverless resource group between subscription and pay-as-you-go. For example, if you choose the subscription billing method for a serverless resource group, you are charged based on this billing method when you use the resource group. You cannot change the billing method of the resource group to pay-as-you-go.
When a new user activates DataWorks, a pay-as-you-go serverless resource group is purchased by default. This resource group is used to settle data computing fees. You are not charged if you do not use the resource group. For more information about billing, see Billing rules.
Billing scenarios
The resource fees of DataWorks include data computing fees and task scheduling fees. The details are as follows:
Data computing fees: If you run data synchronization tasks (such as batch synchronization tasks), DataService Studio tasks (such as calling DataService Studio APIs), data computing tasks (such as ODPS SQL, PyODPS, and EMR Hive tasks), and Data Quality rule execution tasks in DataWorks, data computing fees are generated.
For information about how to identify data computing tasks and the list of compute-optimized tasks, see Billing of data computing.
Task scheduling fees: If you deploy tasks to the production environment for periodic scheduling, scheduling fees are generated.
Data computing fees are charged based on compute units (CUs), whereas task scheduling fees are charged based on the number of successful instances (excluding dry-run instances). Currently, the fees of serverless resource groups cover only data computing. Therefore, regardless of whether you use a pay-as-you-go serverless resource group or a subscription serverless resource group, task scheduling fees are charged separately. For more information about billing, see Appendix: Billing of task scheduling.
Performance metrics and purchase recommendations
Both subscription and pay-as-you-go serverless resource groups are charged based on the number of CUs. 1 CU = 1 CPU core + 4 GiB memory.
When you use a serverless resource group, you need to plan the specifications of the resource group based on your actual development scenarios and task types.
The recommended specifications are provided as general guidance. You can adjust resources based on your specific business requirements and actual situations to ensure that tasks run efficiently and stably.
Batch synchronization
Concurrency of batch synchronization tasks | Recommended specifications | Minimum specifications |
<4 | 0.5 CU | 0.5 CU |
>=4 | | |
Real-time synchronization
Synchronization task type | Recommended specifications | Minimum specifications | |
Real-time synchronization from MySQL | One source database | 2.5 CU | Minimum specifications that are required to run such a real-time synchronization task: 1 CU |
Two to five source databases | 4 CU | ||
Six or more source databases | 7 CU | ||
Real-time synchronization from PolarDB-X 1.0 | 7 CU | ||
Real-time synchronization from Kafka | 2.5 CU | ||
Real-time synchronization of data in a single table of another source type | 2.5 CU | ||
Real-time data synchronization from a database | - | Minimum specifications that are required to run such a synchronization task: 2 CUs |
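As a rough illustration, the MySQL rows of the table above can be written as a small lookup. The function name and the error handling are hypothetical and not part of DataWorks. Only the CU values come from the table, and they remain general guidance rather than hard limits.

```python
def recommended_cu_mysql_realtime(source_db_count: int) -> float:
    """Return the recommended CUs for real-time synchronization from MySQL.

    Hypothetical helper: the thresholds mirror the table above
    (1 source database, 2 to 5, and 6 or more).
    """
    if source_db_count < 1:
        raise ValueError("source_db_count must be at least 1")
    if source_db_count == 1:
        return 2.5
    if source_db_count <= 5:
        return 4.0
    return 7.0


# Three source databases fall into the "two to five" row.
print(recommended_cu_mysql_realtime(3))
```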
When you use a serverless resource group to run data computing tasks, CUs are consumed. For more information about the list of compute-optimized tasks and the default and actual CUs required to run compute-optimized tasks, see List of compute-optimized tasks.
The maximum number of parallel instances supported by a serverless resource group is 200.
If your scheduling tasks include data computing tasks (such as PyODPS2 and EMR Hive tasks): The data computing tasks use the serverless resource group for computing. You need to plan the CU specifications of the resource group based on your actual tasks.
For more information about the default number of CUs for each computing task, see List of data computing tasks.
If your scheduling tasks do not include data computing tasks, the maximum number of parallel instances supported by a serverless resource group is 200, which is greater than the maximum number of parallel instances supported by an old-version resource group with the highest specifications. In this case, the default specifications of a serverless resource group can meet your business requirements and you do not need to adjust the specifications.
Maximum queries per second (QPS) | Minimum specifications | Service availability (SLA) |
500 | 4 CU | 99.95% |
1,000 | 8 CU | |
2,000 | 16 CU | |
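The QPS table above can be read as a tier lookup. The sketch below assumes that you pick the smallest listed tier whose maximum QPS covers your target. That reading is an interpretation of the table, and the names used here are hypothetical.

```python
import bisect

# (maximum QPS, minimum CUs) rows copied from the table above.
QPS_TIERS = [(500, 4), (1000, 8), (2000, 16)]


def min_cu_for_qps(target_qps: int) -> int:
    """Return the minimum CUs of the smallest tier that covers target_qps."""
    if target_qps <= 0:
        raise ValueError("target_qps must be positive")
    limits = [limit for limit, _ in QPS_TIERS]
    # bisect_left finds the first tier whose limit is >= target_qps.
    idx = bisect.bisect_left(limits, target_qps)
    if idx == len(QPS_TIERS):
        raise ValueError("QPS above 2,000 is not covered by the table")
    return QPS_TIERS[idx][1]


print(min_cu_for_qps(800))
```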
Billing details
Billing methods
Serverless resource groups are divided into subscription resource groups and pay-as-you-go resource groups based on the billing method.
Subscription serverless resource groups
You must determine the number of CUs that you require and the subscription duration in advance, and pay the fee before you can use such a resource group. After you purchase a subscription serverless resource group, you are not charged additional fees for using the resource group to synchronize data, perform data computing, and call and debug DataService Studio APIs in DataWorks.
Pay-as-you-go serverless resource groups
You can use the related features and then pay the fee based on the total number of CUs that are used. If you use a pay-as-you-go serverless resource group to run tasks, such as batch synchronization tasks, DataService Studio tasks, and data development tasks, data computing fees are generated.
The following table compares the features of serverless resource groups that are charged based on different billing methods.
Feature category | Feature scenario/Description | Pay-as-you-go serverless resource group | Subscription serverless resource group |
Quota | Total number of CUs that can be used in a resource group. | The number of CUs that are actually used. | The number of CUs that you specify when you purchase a resource group. |
Usage scenario | Data computing, data synchronization, and DataService Studio | Supported | Supported |
Action | Scale-out, scale-in, and renewal | N/A | Supported |
Quota Management | This feature is used to control the maximum number of CUs that can be used in different scenarios. You can use this feature in data computing, data synchronization, and DataService Studio. | ||
Maximum number of parallel tasks allowed in data scheduling. | Supported. A maximum of 200 task instances can run in parallel. | ||
Network configuration | Number of virtual private clouds (VPCs) that can be associated. | | The number depends on the number of CUs that you purchase. |
Billing rules
The fees of serverless resource groups do not include task scheduling fees. Regardless of whether you use a pay-as-you-go serverless resource group or a subscription serverless resource group, task scheduling fees are charged based on the number of successful instances. For more information, see Appendix: Billing of task scheduling.
Subscription
The fee is charged based on the number of CUs that you purchase. Fee = Monthly unit price × Number of months × Number of CUs purchased per month.
For a subscription serverless resource group, you must purchase a minimum of 2 CUs per month. No upper limit is imposed. However, the maximum number of CUs that you can purchase may be affected by the inventory. If the inventory is insufficient, follow the instructions on the buy page.
If the specifications of the resource group that you purchase do not meet your requirements, you can scale out the resource group at any time. For more information, see Scale out or scale in a resource group.
For more information about the minimum specifications required by different types of tasks that run on serverless resource groups, see Performance metrics and purchase recommendations.
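The subscription formula above can be checked with a short calculation. This is a minimal sketch: the function and constant names are hypothetical, the two prices are copied from the monthly table in the Unit prices section, and the result is a list price that ignores any discounts or taxes.

```python
from decimal import Decimal

# Monthly unit prices (USD/month/CU) from the Unit prices section.
# Only two regions are included here for brevity.
MONTHLY_UNIT_PRICE_USD = {
    "China (Shanghai)": Decimal("37.1517"),
    "Japan (Tokyo)": Decimal("77.45584"),
}

# The document states a minimum purchase of 2 CUs per month.
MIN_SUBSCRIPTION_CUS = 2


def subscription_fee(region: str, months: int, cus: int) -> Decimal:
    """Fee = Monthly unit price x Number of months x Number of CUs."""
    if months < 1:
        raise ValueError("months must be at least 1")
    if cus < MIN_SUBSCRIPTION_CUS:
        raise ValueError("a subscription requires at least 2 CUs")
    return MONTHLY_UNIT_PRICE_USD[region] * months * cus


# 4 CUs for 3 months in China (Shanghai).
print(subscription_fee("China (Shanghai)", months=3, cus=4))  # 445.8204
```

Decimal is used instead of float so that the published prices are multiplied without binary rounding error.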
Pay-as-you-go
The fee is charged based on the number of CU-hours that are consumed. Fee = CU-hours × Unit price per CU-hour. The fee is billed by hour.
Calculation of CU-hours: If a data computing task is configured with 2 CUs and runs for 0.5 hours (regardless of whether the task succeeds, fails, or is manually stopped), the number of CU-hours consumed by the task is 2 CUs × 0.5 hours = 1 CU-hour.
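The CU-hour calculation above can be sketched as follows. The function names are hypothetical, and the unit price is the China (Shanghai) value from the pay-as-you-go table in the Unit prices section. How DataWorks aggregates or rounds usage within each hourly bill is not described here, so the sketch applies the formula to a single task only.

```python
from decimal import Decimal


def cu_hours(cus: Decimal, hours: Decimal) -> Decimal:
    """CU-hours consumed by one task run, regardless of its final status."""
    return cus * hours


def pay_as_you_go_fee(cus: Decimal, hours: Decimal,
                      unit_price_per_cu_hour: Decimal) -> Decimal:
    """Fee = CU-hours x Unit price per CU-hour."""
    return cu_hours(cus, hours) * unit_price_per_cu_hour


# A task configured with 2 CUs that runs for 0.5 hours in China (Shanghai).
consumed = cu_hours(Decimal("2"), Decimal("0.5"))
fee = pay_as_you_go_fee(Decimal("2"), Decimal("0.5"), Decimal("0.077399"))
print(consumed, fee)
```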
Unit prices
Region | Monthly unit price (USD/month/CU) |
China (Shanghai), China (Hangzhou), China (Beijing), China (Shenzhen) | 37.1517 |
UK (London) | 51.01286 |
US (Virginia) | 53.92014 |
Malaysia (Kuala Lumpur) | 63.36534 |
China (Hong Kong), Singapore, Germany (Frankfurt), Indonesia (Jakarta) | 67.61327 |
US (Silicon Valley) | 72.74794 |
Japan (Tokyo) | 77.45584 |
Region | Unit price (USD/CU-hour) | Example |
China (Shanghai), China (Hangzhou), China (Beijing), China (Shenzhen) | 0.077399 | Example: A data synchronization task in the China (Shanghai) region is configured with 2 CUs and runs successfully after 0.5 hours. The unit price of a CU in the China (Shanghai) region is USD 0.077399/CU-hour. The number of CU-hours consumed by the task and the fee are calculated as follows: CU-hours = 2 CUs × 0.5 hours = 1 CU-hour. Fee = 1 CU-hour × USD 0.077399/CU-hour = USD 0.077399. |
UK (London) | 0.106277 | |
US (Virginia) | 0.112334 | |
Malaysia (Kuala Lumpur) | 0.132011 | |
Germany (Frankfurt), Indonesia (Jakarta), China (Hong Kong), Singapore | 0.140861 | |
US (Silicon Valley) | 0.151558 | |
Japan (Tokyo) | 0.161366 |
Appendix: Billing of task scheduling
Billing scenarios
If you deploy data synchronization tasks and data development tasks to the production environment for periodic scheduling in DataWorks, scheduling fees are generated. For more information about the task types that DataWorks supports for scheduling, see Supported node types.
Billing unit and rules
Scheduling fees use tiered pricing based on the number of successfully run instances. DataWorks divides the scheduling fees into 12 tiers based on the number of instances that run successfully each day. The fees are charged daily according to the tier that corresponds to the number of successfully run instances.
Dry-run instances (instances that are returned as successful by the platform without being actually run) are not charged and are not counted as successful instances.
Example of statistics on the number of instances per day: If a task is scheduled by hour and is scheduled once per hour during the period of 00:00 to 23:59 every day, a total of 24 instances are generated every day.
Unit prices
Region | Billing tier | Billing cycle | Fee |
China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong) | 1 to 10 successful instances per day | Day | USD 0.00 per day |
11 to 500 successful instances per day | Day | USD 0.15 per day | |
501 to 5,000 successful instances per day | Day | USD 9.29 per day | |
5,001 to 20,000 successful instances per day | Day | USD 23.22 per day | |
20,001 to 50,000 successful instances per day | Day | USD 41.79 per day | |
50,001 to 120,000 successful instances per day | Day | USD 92.87 per day | |
Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Japan (Tokyo), US (Silicon Valley), US (Virginia), Germany (Frankfurt), UK (London), UAE (Dubai) | 1 to 10 successful instances per day | Day | USD 0.00 per day |
11 to 500 successful instances per day | Day | USD 0.23 per day | |
501 to 5,000 successful instances per day | Day | USD 13.93 per day | |
5,001 to 20,000 successful instances per day | Day | USD 34.82 per day | |
20,001 to 50,000 successful instances per day | Day | USD 62.68 per day | |
50,001 to 120,000 successful instances per day | Day | USD 139.30 per day |
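The tiered daily fee can be sketched as a lookup over the rows of the table above. The names below are hypothetical. Treating a day with zero successful instances as free is an assumption, and the sketch raises an error for counts above 120,000 because the table above lists no tier for them.

```python
from decimal import Decimal

# (upper bound of successful instances per day, daily fee in USD),
# copied from the table above for each region group.
TIERS_CHINA = [
    (10, "0.00"), (500, "0.15"), (5000, "9.29"),
    (20000, "23.22"), (50000, "41.79"), (120000, "92.87"),
]
TIERS_OTHER = [
    (10, "0.00"), (500, "0.23"), (5000, "13.93"),
    (20000, "34.82"), (50000, "62.68"), (120000, "139.30"),
]


def daily_scheduling_fee(successful_instances: int, tiers) -> Decimal:
    """Return the daily fee of the tier that contains the instance count.

    Dry-run instances must already be excluded from the count.
    Assumption: zero successful instances cost nothing.
    """
    if successful_instances < 0:
        raise ValueError("successful_instances must not be negative")
    for upper_bound, fee in tiers:
        if successful_instances <= upper_bound:
            return Decimal(fee)
    raise ValueError("counts above 120,000 are not covered by the table")


# A task scheduled once per hour generates 24 instances per day.
print(daily_scheduling_fee(24, TIERS_CHINA))
```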
Expiration and renewal
If your subscription serverless resource group is about to expire, you can renew the resource group. If you do not renew the resource group, the resource group stops providing services or is released after it expires. For more information about renewal, see Expiration and renewal.
What to do next
You can purchase a serverless resource group and use the resource group to run data synchronization tasks, data development tasks, and DataService Studio tasks. For more information about how to purchase a resource group, associate the resource group with a workspace, and connect the resource group to a network, see Create and use a serverless resource group.