This topic describes the resource measurement method, billable items, billing method, and region-specific unit prices of E-MapReduce (EMR) Serverless Spark.
CU
A compute unit (CU) is the basic unit of computing capabilities in EMR Serverless Spark workspaces and is billed by minute. The unit price of a CU depends on the CPU architecture of an EMR Serverless Spark workspace and the high availability attribute of the zone. By default, the Intel x86 architecture and a single zone are used. The unit price of a CU also varies based on regions.
Measurement method
CUs reflect the CPU compute power of the underlying system of Serverless Spark. The number of CUs consumed by a computing task varies based on the actual amount of data processed by the task, the computing complexity, the distribution of data, and whether you enable the Fusion engine. If you enable the Fusion engine for acceleration, the CU consumption per unit time increases by 25%, but the time required to run a job is reduced by more than 60% in most cases, leading to higher cost-effectiveness.
You can estimate the number of CUs that you need to purchase based on your business scale and the amount of data that you want to process. By default, one CU is equal to 1 CPU core and 4 GiB of memory. If the ratio of CPU cores to memory is not 1:4, you can calculate the number of CUs by using the following formula:
Number of CUs = max(Number of CPU cores/1, Memory size (GiB)/4)
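The CU formula and the Fusion engine trade-off described above can be checked with a short calculation. The following Python sketch is illustrative only: the function names `cu_count` and `fusion_relative_cost` are hypothetical, and the 25% and 60% figures are taken from the preceding paragraph as typical values, not guarantees.

```python
def cu_count(cpu_cores: float, memory_gib: float) -> float:
    """Return the number of CUs for a resource request.

    One CU equals 1 CPU core and 4 GiB of memory, so the larger of the
    two ratios determines the CU count.
    """
    return max(cpu_cores / 1, memory_gib / 4)


def fusion_relative_cost(cu_rate_increase: float = 0.25,
                         runtime_reduction: float = 0.60) -> float:
    """Return total CU consumption with Fusion relative to without it.

    Assumption: total consumption = CU rate x runtime, with the CU rate
    25% higher and the runtime 60% shorter (figures from the text).
    """
    return (1 + cu_rate_increase) * (1 - runtime_reduction)


print(cu_count(1, 4))    # default 1:4 ratio
print(cu_count(2, 16))   # memory-bound request: 16 / 4 CUs
print(fusion_relative_cost())
```

Under these assumptions, a job that runs with the Fusion engine consumes about half the CUs of the same job without it, because 1.25 × 0.4 = 0.5. A larger runtime reduction lowers the ratio further.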
The following table describes the processing capability of one CU.
Scenario | Processing capability (Java Runtime) | Processing capability (Fusion engine) |
Simple data processing, such as filtering and cleansing | One CU can process about 2,000,000 data records per second. | One CU can process about 5,000,000 data records per second. |
Complex data processing, such as aggregation, join, and string-related operations | One CU can process about 700,000 data records per second. | One CU can process about 2,000,000 data records per second. |
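The throughput figures in the table can be turned into a rough sizing estimate. The following Python sketch is a hypothetical helper, not part of any EMR Serverless Spark API. It assumes that throughput scales linearly with the number of CUs, which real jobs only approximate.

```python
import math

# Approximate records per second that one CU can process, from the table above.
RECORDS_PER_SECOND_PER_CU = {
    ("simple", "java"): 2_000_000,
    ("simple", "fusion"): 5_000_000,
    ("complex", "java"): 700_000,
    ("complex", "fusion"): 2_000_000,
}


def estimate_cus(target_records_per_second: int, scenario: str, engine: str) -> int:
    """Estimate the CUs needed to sustain a target throughput.

    Assumption: throughput scales linearly with the number of CUs.
    Data skew and shuffles usually make the real requirement higher.
    """
    per_cu = RECORDS_PER_SECOND_PER_CU[(scenario, engine)]
    return math.ceil(target_records_per_second / per_cu)


print(estimate_cus(10_000_000, "simple", "java"))
print(estimate_cus(3_000_000, "complex", "fusion"))
```

Treat the result as a starting point for capacity planning and validate it against a test run on representative data.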
Billable item
Only the pay-as-you-go billing method is supported. Billing formula:
Workspace fee = Number of CUs consumed per hour × Hourly unit price
The pay-as-you-go billing method supports only the Intel x86 architecture. A pay-as-you-go workspace is billed by minute. The billing cycle is 1 hour.
The fee of one Serverless Spark workspace is calculated based on the preceding formula. If an Alibaba Cloud account has multiple workspaces, the amount due is the sum of the fees of these workspaces.
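The workspace fee formula can be sketched in Python as follows. The function names are hypothetical, and the conversion from per-minute metering to CU-hours (each metered minute contributes 1/60 of a CU-hour) is an assumption based on the statement that workspaces are billed by minute within a one-hour billing cycle.

```python
def workspace_fee(cu_per_minute: list[float], hourly_unit_price: float) -> float:
    """Return the fee for one billing cycle of a single workspace.

    cu_per_minute holds the CUs in use during each metered minute of the
    hour (at most 60 entries). Assumption: each minute contributes 1/60
    of a CU-hour.
    """
    if len(cu_per_minute) > 60:
        raise ValueError("a billing cycle covers at most 60 minutes")
    cu_hours = sum(cu_per_minute) / 60
    return cu_hours * hourly_unit_price


def account_fee(workspace_fees: list[float]) -> float:
    """Return the total fee across all workspaces of one account."""
    return sum(workspace_fees)


# Hypothetical usage: 4 CUs for the first 30 minutes, idle afterwards,
# at an example unit price of 0.067106 USD per CU per hour.
fee = workspace_fee([4.0] * 30 + [0.0] * 30, 0.067106)
print(round(fee, 6))
```

In this example, the workspace consumes 2 CU-hours in the billing cycle, so the fee is twice the hourly unit price.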
Billing method
Pay-as-you-go
The pay-as-you-go billing method allows you to use resources before you pay for the resources. You do not need to purchase a large number of resources in advance. The system calculates the fees based on the actual resource usage of your workspace. The following table describes the scenarios and billing rules of the pay-as-you-go billing method and the billing cycle.
Item | Description |
Scenarios | The pay-as-you-go billing method is suitable for the following scenarios:
|
Billing rules | The bill that is generated in a one-hour billing cycle reflects the fee of the consumed computing resources. The computing resource fee in a billing cycle is calculated by using the following formula:
Sample configurations of a Spark job:
The configurations show that 3 CPU cores and 6 GB of memory are consumed per minute. The usage of computing resources in an hour is calculated by using the following formula:
|
Billing cycle | Bills are generated on an hourly basis at the top of every hour (UTC+8). The new billing cycle starts after the bills are settled. After each billing cycle ends, the system generates a bill and deducts the fees from your account. Bill details may be generated with a delay. |
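The sample configuration in the billing rules (3 CPU cores and 6 GB of memory) can be converted to hourly usage with the CU definition from the measurement method. The following Python sketch is an illustration, not the formula from the billing rules: `job_cu_hours` is a hypothetical name, the memory value is treated as GiB, and the resource footprint is assumed to stay constant while the job runs.

```python
def job_cu_hours(cpu_cores: float, memory_gb: float, runtime_minutes: float) -> float:
    """Return the CU-hours consumed by a job with a fixed resource footprint."""
    # CUs in use at any moment: the larger of cores/1 and memory/4.
    cus = max(cpu_cores / 1, memory_gb / 4)
    # Metering is per minute, so convert CU-minutes to CU-hours.
    return cus * runtime_minutes / 60


# Sample configuration from the billing rules: 3 CPU cores and 6 GB of memory.
print(job_cu_hours(3, 6, 60))   # job runs for the whole billing cycle
print(job_cu_hours(3, 6, 20))   # job runs for 20 minutes only
```

In this configuration the CPU dimension dominates: max(3/1, 6/4) = max(3, 1.5) = 3 CUs. Multiply the resulting CU-hours by the unit price of the region to obtain the fee for the billing cycle.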
Unit prices in different regions
The following table describes the unit prices of different types of workspaces in different regions.
You can obtain the actual price on the buy page.
Region | Unit price (USD per CU per hour) |
Indonesia (Jakarta) | 0.067106 |
Germany (Frankfurt) | 0.064792 |
Singapore | 0.067106 |
US (Virginia) | 0.053801 |
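To compare regions, the unit prices in the table can be placed in a lookup table. The following Python sketch uses a hypothetical `hourly_cost` helper. The prices are copied from the table as a snapshot and may differ from the actual prices on the buy page.

```python
# Unit prices in USD per CU per hour, copied from the table above.
# They are a snapshot; the buy page has the authoritative values.
UNIT_PRICE_USD = {
    "Indonesia (Jakarta)": 0.067106,
    "Germany (Frankfurt)": 0.064792,
    "Singapore": 0.067106,
    "US (Virginia)": 0.053801,
}


def hourly_cost(region: str, cus: float) -> float:
    """Return the cost in USD of running `cus` CUs for one hour in a region."""
    return cus * UNIT_PRICE_USD[region]


# List the regions from the lowest to the highest unit price.
for region in sorted(UNIT_PRICE_USD, key=UNIT_PRICE_USD.get):
    print(f"{region}: {hourly_cost(region, 10):.5f} USD per hour for 10 CUs")
```

Unit price is only one factor in choosing a region. Data location and latency usually matter as much as the price difference.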
Precautions
EMR Serverless Spark resources are billed by using the pay-as-you-go billing method. During peak hours, tasks in an EMR Serverless Spark workspace may compete for resources, and timely allocation of resources cannot be guaranteed.