This topic describes the atomicity, consistency, isolation, and durability (ACID) semantics for concurrent jobs in MaxCompute, and the ACID semantics for transactional tables.
Terms
- Operation: a single job submitted in MaxCompute.
- Data object: an object that stores data, such as a non-partitioned table or a partition.
- INTO job: an SQL job that contains the INTO keyword, such as INSERT INTO or DYNAMIC INSERT INTO.
- OVERWRITE job: an SQL job that contains the OVERWRITE keyword, such as INSERT OVERWRITE or DYNAMIC INSERT OVERWRITE.
- Data upload by using Tunnel: classified as either an INTO job or an OVERWRITE job.
Description of ACID semantics
- Atomicity: An operation is either fully completed or not performed at all. That is, an operation is never partially performed.
- Consistency: The integrity of data objects is maintained when an operation is performed.
- Isolation: An operation is performed independently of other concurrent operations.
- Durability: After an operation is complete, modified data is permanently valid and is not lost even if a system failure occurs.
ACID semantics for concurrent write jobs in MaxCompute
- Atomicity
  - If multiple jobs conflict with each other, MaxCompute ensures that only one job succeeds.
  - The atomicity of the CREATE, OVERWRITE, and DROP operations on a single table or partition can be ensured.
  - The atomicity of cross-table operations such as MULTI-INSERT cannot be ensured.
  - In extreme cases, the following operations may not be atomic:
    - A DYNAMIC INSERT OVERWRITE operation that is performed on more than 10,000 partitions.
    - An INTO operation. The atomicity of INTO operations cannot be ensured because data cleanup may fail during a transaction rollback. However, a cleanup failure does not cause loss of the original data.
- Consistency
  - Consistency can be ensured for OVERWRITE jobs.
  - If an INTO job fails due to a conflict, data from the failed job may remain.
- Isolation
  - For non-INTO operations, MaxCompute ensures read committed isolation.
  - For INTO operations, read uncommitted scenarios may occur. That is, a reader may see data that has not been committed.
- Durability
  - MaxCompute ensures data durability.
ACID semantics for transactional tables
- For INTO operations, MaxCompute ensures read committed isolation. If an INTO job fails due to a conflict, data from the failed job does not remain.
- The atomicity of the UPDATE, DELETE, and small file MERGE operations on a non-partitioned table or a partition can be ensured.
For example, if two UPDATE operations are performed on a partition at the same time, only one UPDATE operation succeeds. Neither of the following outcomes can occur: an UPDATE operation is only partially performed, or both UPDATE operations succeed.
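The two-UPDATE example can be illustrated with a small toy model. This is only a sketch of a version-checked, all-or-nothing commit, not a description of how MaxCompute is implemented. The names `Partition`, `UpdateJob`, and `ConflictError` are hypothetical and exist only for this illustration.

```python
class ConflictError(Exception):
    """Raised when the partition changed after the job started."""


class Partition:
    """Toy stand-in for one partition of a transactional table (hypothetical)."""

    def __init__(self, rows):
        self.rows = dict(rows)  # row id -> value
        self.version = 0        # bumped on every successful commit


class UpdateJob:
    """Hypothetical UPDATE job: stage changes, then commit all or nothing."""

    def __init__(self, partition, changes):
        self.partition = partition
        self.base_version = partition.version  # version seen at job start
        self.staged = dict(changes)            # not visible until commit

    def commit(self):
        # Conflict check: fail if another job committed since this one started.
        if self.partition.version != self.base_version:
            raise ConflictError("partition changed since job start")
        # Apply every staged change together with the version bump.
        self.partition.rows.update(self.staged)
        self.partition.version += 1


part = Partition({1: "a", 2: "b"})
job1 = UpdateJob(part, {1: "x", 2: "x"})  # both jobs start from version 0
job2 = UpdateJob(part, {1: "y", 2: "y"})

job1.commit()  # ends earlier, succeeds
try:
    job2.commit()  # ends later, detects the change and fails
    job2_succeeded = True
except ConflictError:
    job2_succeeded = False

print(part.rows, job2_succeeded)  # {1: 'x', 2: 'x'} False
```

In this model the partition ends up holding only the changes of the job that committed first, and none of the changes staged by the failed job.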
Conflict of concurrent operations
When multiple jobs run concurrently against the same destination table, a conflict may occur. In the event of a conflict, the job that ends earlier succeeds, and the job that ends later may fail due to the conflict.
The following table describes the results of jobs that run at the same time on a non-partitioned table or a partition.
Job type | INSERT OVERWRITE or TRUNCATE job that ends later | INSERT INTO job that ends later | UPDATE or DELETE job that ends later | Small file MERGE job that ends later |
---|---|---|---|---|
INSERT OVERWRITE or TRUNCATE job that ends earlier | | | | |
INSERT INTO job that ends earlier | | | | |
UPDATE or DELETE job that ends earlier | | | | |
Small file MERGE job that ends earlier | | | | |
- INSERT operations do not report conflict errors, even if data in the destination non-partitioned table or partition changes.
- The UPDATE, DELETE, and small file MERGE operations report conflict errors when data in the destination non-partitioned table or partition changes.
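The two rules above can be written down as a small lookup. This is a minimal sketch that encodes only what the two rules state. The function name `reports_conflict` and the job-type strings are hypothetical labels for this illustration, not MaxCompute identifiers. Job types that the rules do not name, such as TRUNCATE, are deliberately rejected instead of guessed.

```python
# Job types named by the two rules above (labels are illustrative only).
INSERT_JOBS = {"INSERT INTO", "INSERT OVERWRITE"}
CHECKED_JOBS = {"UPDATE", "DELETE", "MERGE"}


def reports_conflict(job_type, destination_changed):
    """Return True if the job reports a conflict error under the two rules.

    job_type: one of the labels in INSERT_JOBS or CHECKED_JOBS.
    destination_changed: whether data in the destination non-partitioned
        table or partition changed while the job was running.
    """
    if job_type in INSERT_JOBS:
        # INSERT operations do not report conflict errors when data changes.
        return False
    if job_type in CHECKED_JOBS:
        # UPDATE, DELETE, and small file MERGE report an error on change.
        return destination_changed
    raise ValueError(f"job type not covered by these rules: {job_type!r}")


for job in ("INSERT INTO", "UPDATE", "MERGE"):
    print(job, reports_conflict(job, destination_changed=True))
```

Whether a particular pair of concurrent jobs actually conflicts also depends on which job ends earlier, as described at the start of this section. The sketch covers only the error-reporting rule.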