Billing, performance metrics, and purchase recommendations for serverless resource groups - DataWorks

DataWorks introduces serverless resource groups, which consolidate the core features of the legacy exclusive resource groups for scheduling, data integration, and data service. You can now use a single serverless resource group to run all core operations, such as Data Synchronization, periodic scheduling tasks, and API services. This greatly simplifies resource management. Serverless resource groups offer two billing models:

Subscription: Provides stable, predictable, and dedicated compute resources. This model is ideal for production environments.
Pay-as-you-go: Provides flexible and elastic compute resources that you pay for on demand. This model balances flexibility and cost-effectiveness.

Important

When you use a serverless resource group, a task scheduling fee is incurred for any node task published to the production environment for periodic scheduling.

Billing scenarios

The fees for a DataWorks serverless resource group consist of a resource usage fee and a task scheduling fee.

Resource usage fee: Charged for Compute Units (CUs) consumed by tasks running in a serverless resource group. This fee is calculated based on total CU consumption, with the CU as the billable item.
A CU is defined as 1 CU = 1 vCPU + 4 GiB memory.
Task scheduling fee: Tasks deployed to the production environment for periodic scheduling run on a serverless resource group. These tasks incur only a task scheduling fee, not a resource usage fee. This fee is billed based on the number of successfully run instances, excluding any dry runs.
A serverless resource group supports a maximum of 200 concurrent instances. This limit meets the maximum concurrency requirements of all previous resource group specifications. Therefore, CU specifications are not a factor for scheduling concurrency.

The following table describes the relationship between supported task types and the fees they incur in a serverless resource group.

Task type	Task type description	Fee type
Data Integration	Runs a Data Synchronization task, such as an offline synchronization task, in `Data Integration` or `Data Studio`.	Resource usage fee
Data Compute	Runs compute node tasks, such as PyODPS, Shell, and EMR Hive, in `Data Studio`. Runs compute node tasks, such as Hologres SQL and EMR Hive, in the Data Analytics module. Runs custom tasks, such as custom EMR SQL. Important: For information about data compute tasks, see Appendix 1: Task types and CU consumption.	Resource usage fee
Data Service	Calls an API in DataService Studio.	Resource usage fee
Personal development environment	Uses a personal development environment to debug tasks.	Resource usage fee
Large model service	Deploys and uses a large model service.	Resource usage fee
Task Scheduling	A periodic scheduling task runs in the production environment.	Task scheduling fee

Notes

For the pay-as-you-go billing model, resource contention may occur during peak hours, which can delay resource availability.
You can convert a pay-as-you-go resource group to a subscription one, but the reverse is not supported.
When new users activate DataWorks, a pay-as-you-go serverless resource group is created by default. You are not charged for it if it remains unused. For billing details, see Pay-as-you-go resource group billing.

Performance metrics

Serverless resource groups are purchased based on the number of CUs. A CU is defined as 1 CU = 1 vCPU + 4 GiB memory. Plan your resource group specifications based on your development scenarios and task types.

Important

The following recommended specifications are general guidelines. You can adjust the resources based on your specific business requirements to ensure efficient and stable task execution.

Data Integration

Batch synchronization

Batch synchronization task concurrency configuration	Recommended specifications	Minimum required specifications
< 4	0.5 CU	0.5 CU
>= 4	`(Concurrency - 4) × 0.07 + 0.5` CU	0.5 CU
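
The recommendation in the table above can be sketched as a small helper. This is an illustrative calculation only: the function name `recommended_batch_sync_cu` is hypothetical and not part of any DataWorks API, and the result is rounded to two decimal places purely to hide floating-point noise.

```python
def recommended_batch_sync_cu(concurrency: int) -> float:
    """Recommended CUs for one batch synchronization task (hypothetical helper)."""
    if concurrency < 1:
        raise ValueError("concurrency must be a positive integer")
    if concurrency < 4:
        # Below 4 concurrent threads, the table recommends a flat 0.5 CU.
        return 0.5
    # (Concurrency - 4) x 0.07 + 0.5, as given in the table.
    return round((concurrency - 4) * 0.07 + 0.5, 2)


for c in (2, 4, 10, 20):
    print(c, recommended_batch_sync_cu(c))
```

The minimum required specification stays at 0.5 CU regardless of concurrency, so the helper only models the recommended column.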

Real-time synchronization

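Synchronization task type	Number of databases	Recommended specifications	Minimum required specifications
MySQL real-time synchronization	1 database	2 CU	Minimum specifications for running one real-time synchronization task: 1 CU
MySQL real-time synchronization	2 to 5 databases	2 CU	Minimum specifications for running one real-time synchronization task: 1 CU
MySQL real-time synchronization	More than 6 databases	2 CU	Minimum specifications for running one real-time synchronization task: 1 CU
Kafka real-time synchronization	-	1 CU	Minimum specifications for running one real-time synchronization task: 1 CU
Other types of single-table real-time tasks	-	1 CU	Minimum specifications for running one real-time synchronization task: 1 CU
Real-time synchronization for an entire database	-	-	Minimum specifications for running an entire-database synchronization task: 2 CU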

Data Compute

Each data compute task has a default CU value. For more information, see Appendix 1: Task types and CU consumption.

DataService Studio

Maximum queries per second (QPS)	Minimum required specifications	Service Level Agreement (SLA)
500	4 CU	99.95%
1,000	8 CU	99.95%
2,000	16 CU	99.95%
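
As a rough sizing aid, the table can be read as a tier lookup: pick the smallest tier whose maximum QPS covers your expected peak. The sketch below assumes that reading; the names `QPS_TIERS` and `min_cu_for_qps` are hypothetical, and the table does not state what applies above 2,000 QPS, so the helper raises an error in that case.

```python
# (maximum QPS, minimum required CUs), copied from the table above.
QPS_TIERS = [(500, 4), (1000, 8), (2000, 16)]


def min_cu_for_qps(peak_qps: int) -> int:
    """Return the minimum CUs of the smallest tier that covers peak_qps."""
    if peak_qps < 0:
        raise ValueError("peak_qps must not be negative")
    for max_qps, cus in QPS_TIERS:
        if peak_qps <= max_qps:
            return cus
    # The table lists no tier beyond 2,000 QPS, so nothing is assumed here.
    raise ValueError("peak QPS exceeds the largest tier listed in the table")


print(min_cu_for_qps(800))
```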

Personal development environment

For CPU-based personal development environments, resource quotas range from 2 to 100 CUs. For GPU-based personal development environments, resource quotas range from 21 to 60 CUs. Estimate your needs based on the task type:

Lightweight tasks (such as simple SQL queries or Python script debugging): A lower resource quota, such as 2 CUs, is recommended.
Moderately complex tasks (such as data processing or Notebook analysis): A medium resource quota, such as 4 CUs, is recommended.
Deep learning tasks (such as TensorFlow or PyTorch model training): A GPU-based resource type is recommended. Select the appropriate video memory and number of CUs based on the model size.

Large model service

Calculate the required CUs based on the GPU video memory.

A minimum of 24 GB of video memory is required to deploy 0.6B, 1.7B, 4B, and 8B models.
A minimum of 48 GB of video memory is required to deploy a 14B model.
A minimum of 96 GB of video memory is required to deploy a 32B model.
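
The video memory requirements above can be captured in a lookup table, for example to check which model sizes fit a given GPU. This is a minimal sketch: the names are hypothetical, and only the model sizes listed above are covered.

```python
# Minimum video memory (GB) per model size, copied from the list above.
MIN_VIDEO_MEMORY_GB = {
    "0.6B": 24, "1.7B": 24, "4B": 24, "8B": 24,
    "14B": 48,
    "32B": 96,
}


def min_video_memory_gb(model_size: str) -> int:
    """Look up the minimum video memory for a listed model size."""
    if model_size not in MIN_VIDEO_MEMORY_GB:
        raise ValueError(f"no requirement listed for model size {model_size!r}")
    return MIN_VIDEO_MEMORY_GB[model_size]


def deployable_models(available_gb: int) -> list:
    """Return the listed model sizes whose minimum fits in available_gb."""
    return [m for m, gb in MIN_VIDEO_MEMORY_GB.items() if gb <= available_gb]


print(deployable_models(48))
```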

Task scheduling

A serverless resource group supports a maximum of 200 concurrent instances. This limit is independent of the CU specification. The default number of concurrent instances is 50. You can raise the upper limit for concurrent task scheduling to a maximum of 200 on the resource group details page.

Billing models

Serverless resource groups are available in two billing models: subscription (pre-paid) and pay-as-you-go (post-paid).

Subscription: Pre-pay for a specific number of CUs over a set duration. This model covers all resource usage fees for tasks run within the subscribed resource group, including Data Synchronization, Data Compute, and DataService Studio API calls.
Pay-as-you-go: Pay for resources after you use them, based on the total CUs consumed. A resource usage fee is incurred for tasks such as batch synchronization, DataService Studio API calls, and data development.

The following table compares the features of the two billing models.

Item	Pay-as-you-go serverless resource group	Subscription serverless resource group
Total available CUs in the resource group	Calculated based on actual usage.	The number of CUs specified at the time of purchase.
Resizing, scaling, and renewal	Not applicable	Yes
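Quota management	Controls the maximum number of CUs that can be used in different scenarios. Supported for Data Compute, Data Integration, and Data Service.	Controls the maximum number of CUs that can be used in different scenarios. Supported for Data Compute, Data Integration, and Data Service.
Set upper limit for concurrent task scheduling	Yes. A maximum of 200 task instances can run concurrently.	Yes. A maximum of 200 task instances can run concurrently.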
Number of bound Virtual Private Clouds (VPCs)	Data Compute and Data Integration: A maximum of 2 VPCs can be bound in total. Data Service: Only 1 VPC can be bound.	Depends on the number of CUs you purchase. Less than or equal to 10 CUs: A maximum of 4 VPCs can be bound in total. Data Compute: Only 1 VPC can be bound. Task Scheduling and Data Integration: A maximum of 3 VPCs can be bound in total. Greater than 10 CUs: A maximum of 8 VPCs can be bound in total. Data Compute: Only 1 VPC can be bound. Task Scheduling and Data Integration: A maximum of 7 VPCs can be bound in total.

Pricing

Subscription resource group billing

The cost is calculated using the following formula: Cost = Monthly unit price × Number of months × Number of CUs purchased per month.

Note

A minimum purchase of 2 CUs is required. While there is no upper limit on the number of CUs you can purchase, the transaction is subject to available inventory. If inventory is insufficient, a notification will appear on the purchase page.
If the specifications do not meet your requirements after purchase, you can scale up the resources at any time. For more information, see Use serverless resource groups.
For the minimum resource specifications required for different task types when running on a serverless resource group, see Performance metrics.

Region	Monthly unit price (USD/Month/CU)
China (Shanghai), China (Hangzhou), China (Beijing), China (Shenzhen)	37.1517
UK (London)	51.01286
US (Virginia)	53.92014
Malaysia (Kuala Lumpur)	63.36534
China (Hong Kong), Singapore, Germany (Frankfurt), Indonesia (Jakarta)	67.61327
US (Silicon Valley)	72.74794
Japan (Tokyo)	77.45584
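
The subscription formula can be worked through in a few lines. The sketch below uses the list prices from the table above and `Decimal` to avoid floating-point drift; the function name `subscription_cost` is hypothetical, and discounts, taxes, and mid-term scaling are outside its scope.

```python
from decimal import Decimal

# Monthly unit prices (USD/Month/CU), copied from the table above.
_PRICE_GROUPS = {
    "37.1517": ["China (Shanghai)", "China (Hangzhou)", "China (Beijing)", "China (Shenzhen)"],
    "51.01286": ["UK (London)"],
    "53.92014": ["US (Virginia)"],
    "63.36534": ["Malaysia (Kuala Lumpur)"],
    "67.61327": ["China (Hong Kong)", "Singapore", "Germany (Frankfurt)", "Indonesia (Jakarta)"],
    "72.74794": ["US (Silicon Valley)"],
    "77.45584": ["Japan (Tokyo)"],
}
MONTHLY_PRICE = {
    region: Decimal(price)
    for price, regions in _PRICE_GROUPS.items()
    for region in regions
}

MIN_CUS = 2  # minimum purchase stated in the note above


def subscription_cost(region: str, months: int, cus: int) -> Decimal:
    """Cost = monthly unit price x number of months x number of CUs."""
    if cus < MIN_CUS:
        raise ValueError("a minimum purchase of 2 CUs is required")
    if months < 1:
        raise ValueError("months must be at least 1")
    return MONTHLY_PRICE[region] * months * cus


# 4 CUs for 3 months in China (Shanghai)
print(subscription_cost("China (Shanghai)", 3, 4))
```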

Pay-as-you-go resource group billing

The cost is calculated using the following formula: Cost = CU-hour × CU unit price, with bills generated hourly.

Important

When you use resource quota management to allocate CUs to DataService Studio, you are billed for these CUs continuously, even if the service is idle. To stop these charges, you must set the CU allocation for DataService Studio to 0.

Region	Unit price (USD/CU-hour)	Example
China (Shanghai), China (Hangzhou), China (Beijing), China (Shenzhen)	0.077399	A Data Synchronization task in the China (Shanghai) region is configured with 2 CUs and completes in 0.5 hours. The unit price in this region is 0.077399 USD/CU-hour. CU-hours: 2 CUs × 0.5 hours = 1 CU-hour. Cost: 1 CU-hour × 0.077399 USD/CU-hour = 0.077399 USD.
UK (London)	0.106277
US (Virginia)	0.112334
Malaysia (Kuala Lumpur)	0.132011
Germany (Frankfurt), Indonesia (Jakarta), China (Hong Kong), Singapore	0.140861
US (Silicon Valley)	0.151558
Japan (Tokyo)	0.161366
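
The pay-as-you-go formula can be sketched the same way. The helper below is hypothetical (`pay_as_you_go_cost` is not a DataWorks API) and uses the unit prices from the table above. It computes the raw CU-hour product only and makes no assumption about how the hourly bill rounds or aggregates usage.

```python
from decimal import Decimal

# Unit prices (USD/CU-hour), copied from the table above.
_PRICE_GROUPS = {
    "0.077399": ["China (Shanghai)", "China (Hangzhou)", "China (Beijing)", "China (Shenzhen)"],
    "0.106277": ["UK (London)"],
    "0.112334": ["US (Virginia)"],
    "0.132011": ["Malaysia (Kuala Lumpur)"],
    "0.140861": ["Germany (Frankfurt)", "Indonesia (Jakarta)", "China (Hong Kong)", "Singapore"],
    "0.151558": ["US (Silicon Valley)"],
    "0.161366": ["Japan (Tokyo)"],
}
HOURLY_PRICE = {
    region: Decimal(price)
    for price, regions in _PRICE_GROUPS.items()
    for region in regions
}


def pay_as_you_go_cost(region: str, cus, hours) -> Decimal:
    """Cost = CU-hours x CU unit price, where CU-hours = CUs x hours."""
    # Convert through str so that float inputs such as 0.5 stay exact.
    cu_hours = Decimal(str(cus)) * Decimal(str(hours))
    return cu_hours * HOURLY_PRICE[region]


# The example from the table: 2 CUs for 0.5 hours in China (Shanghai)
cost = pay_as_you_go_cost("China (Shanghai)", 2, 0.5)
print(cost.normalize())
```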

View billing details

When you view billing details in the Billing & Cost Management console, the billable items and codes for serverless resource groups are as follows:

Pay-as-you-go: The billable item is General Resource Group CU*H (Pay-as-you-go), and the billing code is exresource_cu_hour_post.
Subscription: The billable item is General Exclusive Resource Group (Subscription and Pay-as-you-go), and the billing code is cu_number.

For more information, see View bill details.

Expiration and renewal

If a subscription serverless resource group is not renewed before it expires, its service will be suspended and it may eventually be released. For more information, see Subscription expiration and renewal.

Next steps

You can purchase a resource group and use it for tasks such as Data Integration, Data Development, and Data Service. For information about how to purchase a resource group, bind it to a workspace, and connect it to a network, see Use serverless resource groups.

More information

Appendix 1: Task types and CU consumption

Tasks created in DataWorks are categorized as either data compute tasks, which consume CUs, or scheduling tasks, which do not consume CUs.

Identify the task type

Go to the node editing page in Data Studio. In the right-side navigation pane, check the Scheduling section to identify the task type.

Compute task: In the Scheduling Policies section, you must specify the compute CUs required to run the task.
- Scenario 1: You can customize the number of compute CUs.
- Scenario 2: You can only use the default number of compute CUs.
Scheduling task: In the Scheduling Policies section, you only need to select a scheduling resource group. CU configuration is not required.

CU configuration for compute tasks

Running a data compute task with a serverless resource group consumes CUs. The following describes the default and running CUs:

Default CU: The recommended number of CUs that the platform allocates each time a task runs. If you set the running CUs lower than the default CU, task efficiency may be compromised.
Running CUs: The actual number of CUs configured to run the task. By default, this is set to the Default CU value, which you can adjust as needed. Follow these principles for configuration:
- The minimum configuration is 0.25 CU, with increments of 0.25 CU. If the message "The CU quota for the current resource group is insufficient" appears, you can adjust the CU quota for the data compute task.
- To avoid under-provisioning or over-provisioning resources, configure this parameter based on the default CU value and the CU quota for the data compute task. For more information, see Assign CU quotas to tasks.
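
The two configuration rules above (a 0.25 CU minimum and 0.25 CU increments) can be checked before you save a node. This is only a sketch of the arithmetic: `is_valid_running_cu` is a hypothetical name, and the check says nothing about whether a particular node type allows customization or whether the resource group quota is sufficient.

```python
from decimal import Decimal

MIN_CU = Decimal("0.25")   # minimum configuration
STEP = Decimal("0.25")     # allowed increment


def is_valid_running_cu(value: str) -> bool:
    """True if value is at least 0.25 CU and a multiple of 0.25 CU."""
    cu = Decimal(value)
    return cu >= MIN_CU and cu % STEP == 0


for candidate in ("0.25", "0.5", "0.75", "0.3", "0"):
    print(candidate, is_valid_running_cu(candidate))
```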

Note

You can adjust the running CUs for only some tasks. For example:

You cannot adjust the running CUs for a Hologres SQL task. It can only be set to 0.25 (the default CU).
The default running CU value for a PyODPS 2 task is 0.5, which you can adjust as needed (for example, to 0.25 or 0.75).

Node type	Node name	Default CU (Unit: CU)	Customizable?
Notebook	Notebook development	0.5	Yes
MaxCompute	PyODPS 2 node	0.5	Yes
	PyODPS 3 node	0.5	Yes
	MaxCompute MR node	0.5	Yes
	Metadata mapping to Hologres	0.25	Yes
	Node for synchronizing data to Hologres	0.25	Yes
Hologres	Hologres SQL node	0.25	-
	Node for synchronizing data to MaxCompute	0.25	-
	Node for synchronizing the schemas of MaxCompute tables	0.25	Yes
	Node for synchronizing data from MaxCompute	0.25	Yes
EMR	EMR Hive node	0.25	-
	EMR Impala node	0.25	-
	EMR MR node	0.25	Yes
	EMR Presto node	0.25	-
	EMR Shell node	0.25	Yes
	EMR Spark nodes	0.5	Yes
	EMR Spark SQL node	0.5	Yes
	EMR Spark Streaming node	0.5	Yes
	EMR Trino node	0.25	-
	EMR Kyuubi node	0.25	-
Serverless Spark	Serverless Spark Batch node	0.25	-
	Serverless Spark SQL node	0.25	-
	Serverless Kyuubi node	0.25	-
Serverless StarRocks	Serverless StarRocks SQL node	0.25	-
Large model	Large language model node	0.5	-
ADB	ADB for PostgreSQL node	0.25	Yes
	AnalyticDB for MySQL node	0.25	Yes
	ADB Spark node	0.25	-
	ADB Spark SQL nodes	0.25	-
CDH	CDH Hive nodes	0.25	-
	CDH Spark node	0.5	Yes
	CDH Spark SQL node	0.25	-
	CDH MR node	0.25	-
	CDH Presto node	0.25	-
	CDH Impala node	0.25	-
Lindorm	Lindorm Spark node	0.25	-
Lindorm	Lindorm Spark SQL node	0.25	-
ClickHouse	ClickHouse SQL	0.25	-
Data Quality	Quality monitoring	0.25	-
Data Quality	Data comparison	0.5	Yes
General	Assignment node	0.25	Yes
	Shell node	0.25	Yes
	OSS object inspection node	0.25	-
	Python nodes	0.5	Yes
	for-each node	0.25	Yes
	do-while node	0.25	Yes
	Function Compute nodes	0.25	-
	SSH Node	0.25	-
	Data push node	0.25	-
Database nodes	MySQL Node	0.25	-
	SQL Server	0.25	-
	Oracle Node	0.25	-
	PostgreSQL Node	0.25	-
	StarRocks Node	0.25	-
	DRDS Node	0.25	-
	PolarDB MySQL Node	0.25	-
	PolarDB PostgreSQL Node	0.25	-
	Doris Node	0.25	-
	MariaDB Node	0.25	-
	SelectDB Node	0.25	-
	Redshift Node	0.25	-
	SAPHANA Node	0.25	-
	DM Node	0.25	-
	KingbaseES Node	0.25	-
	OceanBase Node	0.25	-
	DB2 Node	0.25	-
	GBase 8a Node	0.25	-
Algorithm	PAI DLC node	0.25	-

Configuration for scheduling tasks

Scheduling tasks do not consume CUs from the serverless resource group.

Node type	Node name
Data Integration	Batch synchronization node
Data Integration	Real-time synchronization node
MaxCompute	MaxCompute SQL node
	SQL component nodes
	MaxCompute Script node
	MaxCompute Spark node
Flink	Flink SQL Streaming node
Flink	Flink SQL Batch node
General	Zero load node
	Parameter node
	Merge node
	Branch node
	Check node
	HTTP trigger node
Algorithm	PAI Designer nodes

Appendix 2: Billing modes for task execution

When you run a task in DataWorks, the associated compute fees are not always billed directly by DataWorks. The billing depends on the underlying compute engine that executes the task. There are three possible scenarios:

Note

When a task is published to the production environment for periodic scheduling, a task scheduling fee is always incurred.

Execution mode	Example node	Computing resource provider	Fee composition
Mode 1: A compute task is sent to a serverless resource group for execution	PyODPS, Shell, Data Integration, Data Quality	Serverless resource group	Serverless resource group fees only
Mode 2: A compute task is sent to a third-party engine for execution through a serverless resource group	EMR Hive, Hologres SQL	Serverless resource group + Third-party engine	Serverless resource group fees + Third-party engine fees
Mode 3: A scheduling task is sent to a third-party engine for execution through Operation Center	MaxCompute SQL, Flink SQL	Third-party engine	Third-party engine fees only
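
One way to read the table together with the note above is as a lookup from execution mode to fee components, with the task scheduling fee added whenever the task is published for periodic scheduling. The sketch below encodes that reading. The names are hypothetical, and treating the scheduling fee as a separate list entry is an interpretation of the note, not a statement of how the bill is itemized.

```python
from typing import List

# Fee composition per execution mode, copied from the table above.
FEES_BY_MODE = {
    1: ["serverless resource group fees"],
    2: ["serverless resource group fees", "third-party engine fees"],
    3: ["third-party engine fees"],
}


def fee_components(mode: int, scheduled_in_production: bool) -> List[str]:
    """List the fee components for a task under the given execution mode."""
    if mode not in FEES_BY_MODE:
        raise ValueError("mode must be 1, 2, or 3")
    fees = list(FEES_BY_MODE[mode])
    if scheduled_in_production:
        # Per the note: periodic scheduling in production always incurs this fee.
        fees.append("task scheduling fee")
    return fees


print(fee_components(2, scheduled_in_production=True))
```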

Appendix 3: Fee breakdown for specific modules

When you use a serverless resource group with the following modules, the following fees apply:

Data Integration: When you perform data synchronization, Data Integration tasks run in the Data Integration, Data Studio, and Operation Center modules. This consumes resources from the serverless resource group and incurs a resource usage fee. Periodic synchronization tasks also incur a task scheduling fee.
Data Studio: When you use Data Studio for task development, data compute and scheduling tasks run in the Data Studio, Data Quality, and Operation Center modules. This consumes resources from the serverless resource group and incurs a resource usage fee and a task scheduling fee. Using a personal development environment incurs an additional resource usage fee. Using a large model service or large model node also incurs an additional resource usage fee.
Data Analysis: When you use Data Analysis for SQL query analysis or to download query results, data compute tasks run in the Data Analytics module. This consumes resources from the serverless resource group and incurs a resource usage fee. Using Data Analysis also incurs a task scheduling fee.
DataService Studio: In DataService Studio, you allocate CUs via resource quota management, which consumes serverless resources and incurs a resource usage fee. Using Data Push also incurs a task scheduling fee.