This topic describes how to use Serverless workflow to guarantee that distributed transactions are reliably processed in a complex flow, helping you focus on your business logic.
Overview
In complex scenarios that involve order management, such as e-commerce websites, hotel booking, and flight reservations, applications need to access multiple remote services and have strict requirements on transactional semantics. In other words, either all steps succeed or all steps fail, with no intermediate states. In low-traffic applications with centralized data storage, the atomicity, consistency, isolation, and durability (ACID) properties of a relational database can guarantee that transactions are reliably processed. In high-traffic scenarios, however, distributed microservices are usually used for high availability and scalability. To guarantee reliable processing of multi-step transactions, service providers usually need to introduce queues, persistent messages, and flow status display into the distributed architecture. This brings additional development and O&M costs. To resolve these problems, Serverless workflow guarantees reliable processing of distributed transactions in complex flows.
Scenarios
Assume that an application provides a train ticket, flight, and hotel booking feature and must ensure that the three-step transaction is reliably processed. Three remote calls are required to implement this feature (for example, the application must call the 12306 API to book a train ticket). The order succeeds only if all three calls succeed. However, any of the three remote calls may fail. Therefore, the application must have compensation logic for each failure scenario to roll back the operations that have already completed. The following figure shows the details.
- If BuyTrainTicket is successful but ReserveFlight fails, the application calls CancelTrainTicket and notifies the user that the order failed.
- If both BuyTrainTicket and ReserveFlight are successful but ReserveHotel fails, the application calls CancelFlight and CancelTrainTicket and notifies the user that the order failed.
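The compensation logic described above can be sketched in plain Python. This is only an illustration of the idea, assuming in-process functions in place of the real remote services. The names `run_saga`, `make_step`, and `outcomes` are hypothetical and are not part of Serverless workflow.

```python
from typing import Callable, Dict, List, Tuple

# Each step pairs an action with the compensation that undoes it.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append("%s failed" % name)
            # Roll back in reverse order of completion.
            for done_name, _, done_compensate in reversed(completed):
                done_compensate()
                log.append("compensated %s" % done_name)
            return False
        log.append("%s succeeded" % name)
        completed.append((name, action, compensate))
    return True


def make_step(name: str, outcomes: Dict[str, bool]) -> Step:
    """Build a simulated step whose result is taken from `outcomes`."""
    def action() -> None:
        if not outcomes[name]:
            raise RuntimeError("%s: remote call failed" % name)

    def compensate() -> None:
        # A real compensation would call the matching cancel API here.
        pass

    return (name, action, compensate)


# Simulate the second scenario: the hotel reservation fails.
outcomes = {"BuyTrainTicket": True, "ReserveFlight": True, "ReserveHotel": False}
log: List[str] = []
order_ok = run_saga([make_step(name, outcomes) for name in outcomes], log)
print(order_ok, log)
```

In this sketch the completed steps live only in process memory, so a crash between two steps would lose the information needed to compensate. That gap is what the persisted flow state in Serverless workflow is meant to close.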
Implementation in Serverless workflow
In the following example, a function deployed in Function Compute is orchestrated into a flow in Serverless workflow to implement a reliable multi-step flow. The implementation takes three steps:
Step 1: Create a function in Function Compute to simulate the BuyTrainTicket, ReserveFlight, and ReserveHotel operations
- Service: fnf-demo
- Function: Operation
The Operation function simulates operations such as BuyTrainTicket, ReserveFlight, and ReserveHotel. The result of the operation (success or failure) is determined by the input.
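import json
import logging
import uuid


def handler(event, context):
    # The event is a JSON string built from the inputMappings of the step.
    evt = json.loads(event)
    logger = logging.getLogger()
    txn_id = uuid.uuid4()
    # Name of the simulated operation, for example "reserve_hotel".
    op = evt.get('operation', 'operation')
    # The field named after the operation carries the simulated result.
    result = evt.get(op)
    if result is False or result == "fail":
        logger.info("%s failed" % op)
        # Exit the process on purpose: the unexpected exit makes the task
        # step fail with the FC.Unknown error that the flow retries and catches.
        exit()
    logger.info("%s succeeded, id %s" % (op, txn_id))
    return json.dumps({op: "success", "%s_txnID" % op: str(txn_id)})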
Step 2: Create a flow
In the Serverless workflow console, perform the following steps to create a flow:
- Configure a Resource Access Management (RAM) role for the flow.
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "fnf.aliyuncs.com"
        ]
      }
    }
  ],
  "Version": "1"
}
- Define the flow.
version: v1
type: flow
steps:
  - type: task
    resourceArn: acs:fc:{region}:{accountID}:services/fnf-demo/functions/Operation
    name: BuyTrainTicket
    inputMappings:
      - target: operation
        source: buy_train_ticket
      - target: buy_train_ticket
        source: $input.buy_train_ticket_result
    catch:
      - errors:
          - FC.Unknown
        goto: OrderFailed
  - type: task
    resourceArn: acs:fc:{region}:{accountID}:services/fnf-demo/functions/Operation
    name: ReserveFlight
    inputMappings:
      - target: operation
        source: reserve_flight
      - target: reserve_flight
        source: $input.reserve_flight_result
    catch:
      # When the FC.Unknown error thrown by the ReserveFlight task is captured,
      # Serverless Workflow jumps to the CancelTrainTicket task.
      - errors:
          - FC.Unknown
        goto: CancelTrainTicket
  - type: task
    resourceArn: acs:fc:{region}:{accountID}:services/fnf-demo/functions/Operation
    name: ReserveHotel
    inputMappings:
      - target: operation
        source: reserve_hotel
      - target: reserve_hotel
        source: $input.reserve_hotel_result
    retry:
      # Serverless Workflow retries the task step up to three times in the
      # exponential backoff mode upon an FC.Unknown error. The initial retry
      # interval is 1s, and each subsequent retry interval is twice the
      # previous one.
      - errors:
          - FC.Unknown
        intervalSeconds: 1
        maxAttempts: 3
        multiplier: 2
    catch:
      # When the FC.Unknown error thrown by the ReserveHotel task is captured,
      # Serverless Workflow jumps to the CancelFlight task.
      - errors:
          - FC.Unknown
        goto: CancelFlight
  - type: succeed
    name: OrderSucceeded
  - type: task
    resourceArn: acs:fc:{region}:{accountID}:services/fnf-demo/functions/Operation
    name: CancelFlight
    inputMappings:
      - target: operation
        source: cancel_flight
      - target: reserve_flight_txnID
        source: $local.reserve_flight_txnID
  - type: task
    resourceArn: acs:fc:{region}:{accountID}:services/fnf-demo/functions/Operation
    name: CancelTrainTicket
    inputMappings:
      - target: operation
        source: cancel_train_ticket
      - target: buy_train_ticket_txnID
        source: $local.buy_train_ticket_txnID
  - type: fail
    name: OrderFailed
Step 3: Execute the flow and view the result
Execute the flow that you created in the console. The input of the StartExecution operation must be in JSON format. The following JSON object simulates the success or failure of each step. For example, "reserve_hotel_result":"fail" indicates a failure to reserve a hotel. StartExecution is an asynchronous operation. After the operation is called, Serverless workflow returns an execution name that you can use to query the flow execution status.
{
  "buy_train_ticket_result":"success",
  "reserve_flight_result":"success",
  "reserve_hotel_result":"fail"
}
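To try the other failure scenarios, the input can be generated rather than written by hand. The following is a minimal sketch. The helper `build_execution_input` and the `STEPS` tuple are hypothetical names introduced here for illustration. The field names come from the flow definition above.

```python
import json
from typing import Optional

# Step names whose "<step>_result" fields the flow reads from $input.
STEPS = ("buy_train_ticket", "reserve_flight", "reserve_hotel")


def build_execution_input(failing_step: Optional[str] = None) -> str:
    """Return the JSON input, marking `failing_step` (if any) as "fail"."""
    if failing_step is not None and failing_step not in STEPS:
        raise ValueError("unknown step: %s" % failing_step)
    payload = {
        "%s_result" % step: "fail" if step == failing_step else "success"
        for step in STEPS
    }
    return json.dumps(payload)


# Same scenario as the JSON object above: only the hotel reservation fails.
hotel_failure_input = build_execution_input("reserve_hotel")
print(hotel_failure_input)
```

The resulting string is what would be supplied as the execution input, whether the flow is started from the console, the SDK, or the CLI.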
After the flow execution starts, click the target execution name in the Serverless workflow console. On the page that appears, view the execution process and results in the Definition and Visual Workflow section. As shown in the following figure, because the input contains "reserve_hotel_result":"fail", ReserveHotel fails, and Serverless workflow calls CancelFlight and CancelTrainTicket in sequence based on the flow definition. In Serverless workflow, the state of each step is persisted. Therefore, failures such as network interruptions or unexpected process exits do not affect the transactions in the flow.
Execution events are generated for each flow execution. You can view the execution events in the console, or call the GetExecutionHistory operation by using the SDK or command-line interface (CLI) to query them.
Error handling and retries
- In the preceding example, the remote calls made by ReserveFlight and ReserveHotel may fail due to network or service errors. Retrying upon transient errors can improve the success rate of the ordering flow, and Serverless workflow can automatically retry task steps. For example, the following definition of the ReserveHotel step retries the step in exponential backoff mode after an FC.Unknown error is captured. If ReserveHotel still fails after the maximum number of retries, Serverless workflow follows the catch definition of the step: it captures the FC.Unknown error thrown by the ReserveHotel task, jumps to the CancelFlight step, and runs the defined compensation logic.
  - type: task
    resourceArn: acs:fc:{region}:{accountID}:services/fnf-demo/functions/Operation
    name: ReserveHotel
    inputMappings:
      - target: operation
        source: reserve_hotel
    retry:
      # Serverless Workflow retries the task step up to three times in the
      # exponential backoff mode upon an FC.Unknown error. The initial retry
      # interval is 1s, and each subsequent retry interval is twice the
      # previous one.
      - errors:
          - FC.Unknown
        intervalSeconds: 1
        maxAttempts: 3
        multiplier: 2
    catch:
      # When the FC.Unknown error thrown by the ReserveHotel task is captured,
      # Serverless Workflow jumps to the CancelFlight task.
      - errors:
          - FC.Unknown
        goto: CancelFlight
- The following figure shows that, after the retry parameter is defined, the ReserveHotel task step is retried the specified maximum number of times.
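The retry schedule implied by these parameters can be worked out with a few lines of Python. This is a sketch of the arithmetic only. It assumes that maxAttempts counts retries (which matches the "up to three times" comment in the definition) and that the service applies no jitter or upper bound to the interval. The function name `retry_intervals` is hypothetical.

```python
from typing import List


def retry_intervals(interval_seconds: float, max_attempts: int, multiplier: float) -> List[float]:
    """Wait before retry n (0-based) is interval_seconds * multiplier ** n."""
    return [interval_seconds * multiplier ** n for n in range(max_attempts)]


# Values taken from the ReserveHotel retry definition.
intervals = retry_intervals(interval_seconds=1, max_attempts=3, multiplier=2)
print(intervals)       # [1, 2, 4]
print(sum(intervals))  # 7
```

Under these assumptions, the catch rule for ReserveHotel would take effect only after about 7 seconds of waiting in total, plus the run time of each attempt.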
Data transfer between steps
- After ReserveHotel fails, CancelFlight and CancelTrainTicket are called. To cancel the two bookings, the transaction IDs (txnID) returned by ReserveFlight and BuyTrainTicket are required. The following definition uses the inputMappings object to pass the output of a previous step to the CancelFlight step.
  - type: task
    resourceArn: acs:fc:{region}:{accountID}:services/fnf-demo/functions/Operation
    name: CancelFlight
    inputMappings:
      - target: operation
        source: cancel_flight
      - target: reserve_flight_txnID
        source: $local.reserve_flight_txnID
- The outputs of each step of the flow are stored in the local object of EventDetail in the StepExited event.
  {
    "input":{
      "operation":"reserve_hotel",
      "reserve_hotel_result":"fail"
    },
    "local":{
      "buy_train_ticket":"success",
      "buy_train_ticket_txnID":"d37412b3-bb68-4d04-9d90-c8c15643d45e",
      "reserve_flight_result":"success",
      "reserve_flight_txnID":"024caecf-cfa3-43a6-b561-9b6fe0571b55"
    },
    "resourceArn":"acs:fc:{region}:{accountID}:services/fnf-demo/functions/Operation",
    "cause":"{\"errorMessage\":\"Process exited unexpectedly before completing request (duration: 12ms, maxMemoryUsage: 9.18MB)\"}",
    "error":"FC.Unknown",
    "retryCount":3,
    "goto":"CancelFlight"
  }
- Based on EventDetail and inputMappings, the input of the CancelFlight step is converted into the following JSON object. In this way, the input of the CancelFlight function contains the reserve_flight_txnID field.
  "input":{
    "operation":"cancel_flight",
    "reserve_flight_txnID":"024caecf-cfa3-43a6-b561-9b6fe0571b55"
  }
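The mapping from EventDetail to the step input can be illustrated with a small Python sketch. It is a simplified model, not the service's implementation: it assumes that a source of the form `$input.key` or `$local.key` is a flat lookup and that any other source is a constant. The function name `apply_input_mappings` is hypothetical.

```python
from typing import Any, Dict, List


def apply_input_mappings(
    mappings: List[Dict[str, str]],
    flow_input: Dict[str, Any],
    local: Dict[str, Any],
) -> Dict[str, Any]:
    """Resolve each mapping: `$input.x` and `$local.x` are lookups, the rest are constants."""
    scopes = {"$input": flow_input, "$local": local}
    resolved: Dict[str, Any] = {}
    for mapping in mappings:
        source = mapping["source"]
        scope_name, _, key = source.partition(".")
        if scope_name in scopes and key:
            resolved[mapping["target"]] = scopes[scope_name].get(key)
        else:
            # No recognized scope prefix: treat the source as a constant.
            resolved[mapping["target"]] = source
    return resolved


# inputMappings of the CancelFlight step, as in the flow definition.
cancel_flight_mappings = [
    {"target": "operation", "source": "cancel_flight"},
    {"target": "reserve_flight_txnID", "source": "$local.reserve_flight_txnID"},
]
# Subset of the local object from the EventDetail shown above.
local_state = {
    "buy_train_ticket_txnID": "d37412b3-bb68-4d04-9d90-c8c15643d45e",
    "reserve_flight_txnID": "024caecf-cfa3-43a6-b561-9b6fe0571b55",
}
cancel_flight_input = apply_input_mappings(cancel_flight_mappings, {}, local_state)
print(cancel_flight_input)
```

Only the fields named in inputMappings reach the function, so the rest of the local object stays out of the CancelFlight input.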