By Hebo
Distributed systems face a common problem: while executing a piece of core business logic, a service must also call multiple downstream services, and the core operation and all downstream operations must succeed or fail together so that the system never ends up with only some of them applied. In a message queue, transactions mainly solve this data consistency problem between message producers and consumers. This article introduces the scenarios, basic principles, implementation details, and practical use of RocketMQ transactional messages to help you understand and use them.
When a user places an order in an e-commerce scenario, downstream systems are triggered to make changes accordingly. For example, the logistics system must initiate a shipment, the credit system must update the user's credit points, and the shopping cart system must clear the user's shopping cart. The processing therefore has four branches: the order system (the core branch), the logistics system, the credit system, and the shopping cart system.
A distributed system call of this kind executes one piece of core business logic and invokes multiple downstream services. The central challenge for distributed transactions is therefore keeping the execution results of the core business and the downstream businesses consistent.
The typical way to keep the results of the branches consistent is a distributed transaction system based on the XA protocol, which wraps the four call branches in one large transaction that contains four independent transaction branches. The XA-based solution guarantees correct business results. Its biggest disadvantage is that it locks a wide range of resources, so concurrency is low in a multi-branch environment, and system performance degrades as the number of downstream branches grows.
The XA-based solution can be simplified. The order system change is performed as a local transaction, and the changes in the remaining systems are triggered downstream by normal messages. The transaction branches are thus reduced to normal messages plus an order table transaction, which makes full use of the asynchronous nature of messages to shorten call chains and improve concurrency.
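However, this solution can easily leave the core transaction and the transaction branches with inconsistent results. For example, the message may be sent while the order transaction fails, or the order transaction may commit while the message fails to send.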
In the normal message solution, messages and order transactions cannot be kept consistent because a normal message cannot be committed, rolled back, or coordinated the way a stand-alone database transaction can.
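The inconsistency can be made concrete with a small simulation. The sketch below is a hypothetical model rather than RocketMQ code: `place_order`, the in-memory `orders` set, and the `queue` list are all names invented for illustration, and a flag injects a failure at a chosen step.

```python
def place_order(orders, queue, order_id, fail_at=None):
    """Commit the local order, then send a normal message (no coordination)."""
    if fail_at == "db":
        raise RuntimeError("order transaction failed")
    orders.add(order_id)                 # local transaction commits
    if fail_at == "send":
        raise RuntimeError("message send failed")
    queue.append(order_id)               # downstream systems would consume this


orders, queue = set(), []
try:
    place_order(orders, queue, "order-1", fail_at="send")
except RuntimeError:
    pass

# The order is committed, but no downstream system will ever hear about it.
inconsistent = ("order-1" in orders) and ("order-1" not in queue)
```

Sending the message first and writing the order second only moves the problem: the message can then go out for an order that never commits.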
The distributed transactional message feature of Message Queue for Apache RocketMQ adds two-phase commit on top of normal messages. The two-phase commit is bound to a local transaction so that the global commit result is consistent.
The solution of transactional messages in Message Queue for Apache RocketMQ provides the advantages of high performance, scalability, and simple business development.
The interaction process of transactional messages is shown in the following figure:
1. The producer sends the message to the RocketMQ server.
2. After the RocketMQ server persists the message, it returns an ACK to the producer to confirm that the message has been sent. At this point, the message is marked as temporarily undeliverable. A message in this state is called a half message.
3. The producer executes the local transaction.
4. The producer submits the second confirmation (Commit or Rollback) to the server based on the local transaction result. The server handles the confirmation as follows: on Commit, it marks the half message as deliverable and delivers it to consumers; on Rollback, it rolls back the transaction and does not deliver the half message to consumers.
5. If the network is disconnected or the producer application is restarted, and the server does not receive a second confirmation (or the status of the half message is Unknown), the server waits for a period of time and then sends a request to a producer in the producer cluster to query the status of the half message.
6. After the producer receives the request, the producer checks the eventual status of the local transaction that corresponds to the message.
7. The producer submits the second confirmation based on the eventual status of the checked local transaction. The server then processes the half message as in step 4.
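The steps above can be sketched as a small in-memory model. This is an illustration only, not the RocketMQ client or server API: `ToyBroker`, `send_in_transaction`, and `committed_orders` are hypothetical names, and the lost second confirmation of step 5 is modelled simply as an `UNKNOWN` result.

```python
from enum import Enum


class State(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"
    UNKNOWN = "unknown"


class ToyBroker:
    """Hypothetical in-memory stand-in for the RocketMQ server."""

    def __init__(self):
        self.half = {}        # msg_id -> body, stored but not deliverable
        self.delivered = []   # bodies visible to consumers

    def send_half(self, msg_id, body):
        self.half[msg_id] = body          # steps 1-2: persist, then ACK
        return "ACK"

    def end_transaction(self, msg_id, state):
        if state is State.UNKNOWN:
            return                        # stays a half message; checked later
        body = self.half.pop(msg_id)
        if state is State.COMMIT:
            self.delivered.append(body)   # now deliverable to consumers
        # ROLLBACK: the half message is dropped and never delivered

    def check_pending(self, checker):
        # steps 5-7: ask the producer for the final state of each half message
        for msg_id in list(self.half):
            self.end_transaction(msg_id, checker(msg_id))


def send_in_transaction(broker, msg_id, body, local_tx):
    broker.send_half(msg_id, body)        # steps 1-2
    try:
        state = local_tx()                # step 3
    except Exception:
        state = State.ROLLBACK
    broker.end_transaction(msg_id, state) # step 4


broker = ToyBroker()
committed_orders = {"m2"}  # pretend local DB: the order behind m2 was committed

send_in_transaction(broker, "m1", "order-1 paid", lambda: State.COMMIT)
# m2: the second confirmation is lost, modelled here as UNKNOWN
send_in_transaction(broker, "m2", "order-2 paid", lambda: State.UNKNOWN)
send_in_transaction(broker, "m3", "order-3 paid", lambda: State.ROLLBACK)

before_check = list(broker.delivered)
broker.check_pending(
    lambda msg_id: State.COMMIT if msg_id in committed_orders else State.ROLLBACK
)
after_check = list(broker.delivered)
```

In this model, the message for order 2 becomes visible to consumers only after the check resolves its state, while the rolled-back message for order 3 is never delivered.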
According to the needs of the basic process of sending transactional messages, the implementation is divided into three main processes: receiving and processing Half messages, Commit or Rollback command processing, and transactional message check.
After the sender sends a Half message to the Broker in the first phase, the Broker processes the Half message. Please see the following figure for the Broker process:
The Broker changes the message topic to RMQ_SYS_TRANS_HALF_TOPIC and then writes the rest of the message content to the Half queue. For the implementation, see the processing logic in SendMessageProcessor.
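A minimal sketch of this topic rewrite, assuming messages are plain dictionaries: the original topic and queue are stashed in the message properties so that they can be restored on commit. The property keys, the fixed queue ID of 0, and both function names are assumptions made for illustration and should be checked against the RocketMQ source before being relied on.

```python
HALF_TOPIC = "RMQ_SYS_TRANS_HALF_TOPIC"


def to_half_message(msg):
    """Return a copy of msg redirected to the half topic.

    The original destination is kept in the properties so it can be
    restored on commit. The property keys are illustrative assumptions.
    """
    props = dict(msg.get("properties", {}))
    props["REAL_TOPIC"] = msg["topic"]
    props["REAL_QID"] = str(msg["queue_id"])
    return {
        "topic": HALF_TOPIC,
        "queue_id": 0,          # assumed: a single queue for half messages
        "body": msg["body"],
        "properties": props,
    }


def restore_original(half_msg):
    """Rebuild the user-topic message from a half message (used on commit)."""
    props = dict(half_msg["properties"])
    topic = props.pop("REAL_TOPIC")
    queue_id = int(props.pop("REAL_QID"))
    return {"topic": topic, "queue_id": queue_id,
            "body": half_msg["body"], "properties": props}


order_msg = {"topic": "OrderPaid", "queue_id": 3, "body": b"order-1", "properties": {}}
half = to_half_message(order_msg)
restored = restore_original(half)
```

Because consumers subscribe to the user topic and not to the half topic, a message stored this way stays invisible to them until it is restored.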
After the sender completes the local transaction, it sends a Commit or Rollback to the Broker. Since the transaction is now complete, the Broker needs to delete the original Half message. Because RocketMQ storage is append-only, the Broker cannot remove the message in place and instead writes an OP message that marks the Half message as deleted. Please see the following figure for the Broker process:
The specific implementation is in EndTransactionProcessor.
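The idea of marking deletion on append-only storage can be sketched as follows. This is a simplified model with hypothetical names (`half_queue`, `op_queue`, `user_topic`, `put_half`, `end_transaction`), not the logic of EndTransactionProcessor itself; it assumes an OP entry is written for both Commit and Rollback.

```python
half_queue = []   # append-only: entries are never modified or removed
op_queue = []     # append-only: each entry records a finished half offset
user_topic = []   # what consumers can see


def put_half(body):
    half_queue.append(body)
    return len(half_queue) - 1            # queueOffset of the half message


def end_transaction(half_offset, commit):
    if commit:
        user_topic.append(half_queue[half_offset])   # re-publish to the real topic
    op_queue.append(half_offset)          # deletion marker for commit and rollback alike


def pending_half_offsets():
    """Half messages with no OP marker, i.e. still undecided."""
    done = set(op_queue)
    return [off for off in range(len(half_queue)) if off not in done]


a = put_half("order-1")
b = put_half("order-2")
c = put_half("order-3")
end_transaction(a, commit=True)
end_transaction(b, commit=False)
pending = pending_half_offsets()
```

Nothing is ever removed from `half_queue`; a Half message counts as deleted only because a matching entry exists in `op_queue`.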
If the sender sends the UNKNOWN command, or the Broker or the sender is restarted, the OP message that process 2 writes to mark the deletion may be missing. A transactional message check process is therefore added. It runs periodically on an asynchronous thread (transactionCheckInterval is 30s by default) and checks the status of Half messages that have no OP message. Please see the following figure for more information:
The transactional message check process scans the current OP message queue and reads the queueOffset of each Half message that has been marked for deletion. If a Half message has no corresponding OP message and has timed out (transactionTimeOut is six seconds by default), the Half message is read and rewritten to the Half queue, and a check command is sent to the original sender to query the transaction status. If the Half message has not timed out, the process waits and then reads the OP message queue again to obtain new OP messages.
If the time elapsed since the bornTime of a Half message exceeds the maximum retention time (transactionCheckMaxTimeInMs is 12 hours by default), the message is automatically skipped and no longer checked, so that a sender exception cannot leave the transaction status undetermined for a long time.
For the implementation, see the TransactionalMessageServiceImpl#check method.
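The decision made for each Half message during a check pass can be summarized in a few lines. The sketch below is a simplified reading of the description above, not the actual check method: the function name `decide` and the returned labels are hypothetical, and the two constants use the default values quoted in the text.

```python
TRANSACTION_TIMEOUT_MS = 6 * 1000            # transactionTimeOut default per the text
CHECK_MAX_TIME_MS = 12 * 60 * 60 * 1000      # transactionCheckMaxTimeInMs default per the text


def decide(half_offset, born_ms, now_ms, op_offsets):
    """Classify one half message during a check pass."""
    if half_offset in op_offsets:
        return "done"          # already committed or rolled back
    age = now_ms - born_ms
    if age > CHECK_MAX_TIME_MS:
        return "skip"          # too old: stop checking
    if age > TRANSACTION_TIMEOUT_MS:
        return "check"         # rewrite to the half queue and query the sender
    return "wait"              # not timed out yet; look for new OP messages later


now = 100_000_000
results = [
    decide(0, now - 1_000, now, {0}),                   # has an OP message
    decide(1, now - 1_000, now, set()),                 # 1 s old
    decide(2, now - 10_000, now, set()),                # 10 s old
    decide(3, now - 13 * 3600 * 1000, now, set()),      # 13 h old
]
```

The real implementation has more cases than this model covers, so treat the function as a reading aid rather than a specification.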
Now that we understand the principles of RocketMQ transactional messages, let's take a look at how to use them. First, we need to create a topic of the transactional message type, either in the console or with CLI commands.
Sending transactional messages is different from sending normal messages in the following aspects:
After a transactional message is committed, it is delivered to the user topic as a normal message. For consumers, consuming it is no different from consuming a normal message.
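On the producer side, the two pieces of business code are the local transaction and the callback that answers transaction checks. The sketch below shows one common way to write them so that the check derives its answer from the database. It does not use the RocketMQ client: the `orders` table, the function names, and the assumption that each message carries its order ID are all illustrative.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT)")


def local_transaction(order_id, fail=False):
    """Write the order row atomically and report the outcome for the broker."""
    try:
        with db:  # commits on success, rolls back on exception
            db.execute("INSERT INTO orders VALUES (?, 'PAID')", (order_id,))
            if fail:
                raise RuntimeError("simulated failure before commit")
        return "COMMIT"
    except RuntimeError:
        return "ROLLBACK"


def check_transaction(order_id):
    """Transaction check callback: derive the answer from the database."""
    row = db.execute(
        "SELECT 1 FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    return "COMMIT" if row else "ROLLBACK"


first = local_transaction("order-1")
second = local_transaction("order-2", fail=True)
checked_first = check_transaction("order-1")
checked_second = check_transaction("order-2")
```

Because the check reads the order table instead of remembering what was sent, any producer instance in the cluster can answer it after a restart. If a local transaction may still be in progress when the check arrives, returning an unknown status is safer than returning a rollback.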
Note:
Today, I introduced the transactional messages of RocketMQ to offer a deeper understanding of its principles and applications. I hope the transactional messages of RocketMQ can help you solve your business problems effectively.