API Gateway provides the circuit breaker mechanism to protect systems at times of backend performance deterioration. This topic describes the configuration rules for circuit breaker plug-ins.
Limits
Circuit breaker plug-ins take effect only for APIs in dedicated instances.
Each conditional expression can contain a maximum of 512 characters.
Each plug-in can contain a maximum of 50 KB of metadata.
1. Overview
API Gateway provides a circuit breaker for each API to protect the API in the event of abnormal backend performance. By default, if timeout occurs 1,000 times at the backend of an API within 30 seconds, the circuit breaker trips. The circuit breaker stays open for 90 seconds, during which the following error is returned for all API requests: Status=503, X-Ca-Error-Code=D503CB. After 90 seconds, the circuit breaker allows a limited number of concurrent API requests to pass through. If these requests are successful, the circuit breaker closes and API requests can be handled as expected again.
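The default behavior described above can be pictured as a small state machine. The following Python sketch is illustrative only and is not API Gateway's implementation: the class and method names are hypothetical, the limit on concurrent probe requests in the half-open state is not modeled, and re-opening the breaker when a probe request times out is an assumption that the text does not state.

```python
import enum


class State(enum.Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class DefaultBreaker:
    """Illustrative model of the default circuit breaker behavior."""

    def __init__(self, threshold=1000, window=30.0, open_for=90.0):
        self.threshold = threshold  # timeouts needed to trip
        self.window = window        # seconds in which timeouts are counted
        self.open_for = open_for    # seconds the breaker stays open
        self.state = State.CLOSED
        self.timeouts = []          # timestamps of recent backend timeouts
        self.opened_at = None

    def allow(self, now):
        """Return True if a request may be forwarded to the backend at time `now`."""
        if self.state is State.OPEN and now - self.opened_at >= self.open_for:
            self.state = State.HALF_OPEN  # start probing the backend
        return self.state is not State.OPEN

    def record(self, now, timed_out):
        """Record the outcome of a request that was forwarded to the backend."""
        if self.state is State.HALF_OPEN:
            if timed_out:
                self._trip(now)  # assumption: a failed probe re-opens the breaker
            else:
                self.state = State.CLOSED
                self.timeouts.clear()
            return
        if timed_out:
            # Keep only the timeouts that are still inside the window.
            self.timeouts = [t for t in self.timeouts if now - t < self.window]
            self.timeouts.append(now)
            if len(self.timeouts) >= self.threshold:
                self._trip(now)

    def _trip(self, now):
        self.state = State.OPEN
        self.opened_at = now
        self.timeouts.clear()
```

While `allow` returns False, a caller would answer with status 503 and the D503CB error code instead of contacting the backend.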
You can also bind a plug-in of the Circuit Breaker type to an API to customize the configurations of its circuit breaker. Take note that circuit breaker plug-ins take effect only for APIs in dedicated instances. You can customize the following configurations of a circuit breaker:
The condition under which the circuit breaker trips. You can specify that the circuit breaker trips after the number of occurrences of timeout, the number of occurrences of a long response time, or the number of occurrences of a specified error, at the backend reaches a threshold within a specified period of time.
The time window during which the number of occurrences of timeout, the number of occurrences of a long response time, or the number of occurrences of a specified error, at the backend is checked by the circuit breaker to determine whether to trip.
The period of time during which the circuit breaker stays open after it trips.
The backend to which API requests are directed when the circuit breaker is open.
2. Configurations
Circuit breaker plug-ins take effect only for APIs in dedicated instances. If you bind a circuit breaker plug-in to an API in a shared instance, the circuit breaker still uses the default configurations.
2.1 Specify that the circuit breaker trips after the number of occurrences of timeout at the backend reaches a threshold
When you configure a circuit breaker plug-in, you can specify that the circuit breaker trips after the number of occurrences of timeout at the backend reaches a threshold within a specified period of time. If the backend timeout threshold specified for an API is 10 seconds and no response is received from the backend within 10 seconds, one occurrence of timeout is counted.
timeoutThreshold: 15 # The threshold of the number of occurrences of timeout at the backend.
windowInSeconds: 30 # The time window during which the number of occurrences of timeout at the backend is checked by the circuit breaker to determine whether to trip.
openTimeoutSeconds: 15 # The period of time during which the circuit breaker stays open after it trips.
downgradeBackend: # The backend to which API requests are directed when the circuit breaker is open.
  type: mock
  statusCode: 418
In the preceding code snippet, you can specify the following parameters:
timeoutThreshold: the threshold of the number of occurrences of timeout at the backend. If this threshold is reached, the circuit breaker trips. The maximum value of this parameter is 5000. We recommend that you specify an appropriate value. If the value is too small, the circuit breaker trips frequently after only a few timeouts.
windowInSeconds: the time window during which the number of occurrences of timeout at the backend is checked by the circuit breaker to determine whether to trip. Valid values: 10 to 90. Unit: seconds.
openTimeoutSeconds: the period of time during which the circuit breaker stays open after it trips. Valid values: 15 to 300. Unit: seconds.
downgradeBackend: optional. The backend to which API requests are directed when the circuit breaker is open.
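The value ranges listed above can be checked before a plug-in configuration is saved. The following Python sketch shows one way to do that for an already parsed configuration. The function name and message wording are hypothetical, and the lower bound of 1 for timeoutThreshold is an assumption because the text states only the maximum.

```python
# Ranges taken from the parameter descriptions. Unit: seconds.
RANGES = {
    "windowInSeconds": (10, 90),
    "openTimeoutSeconds": (15, 300),
}


def validate_timeout_rule(config):
    """Return a list of problems found in a timeout-based circuit breaker rule."""
    problems = []
    threshold = config.get("timeoutThreshold")
    # Assumption: the lower bound is 1. Only the maximum of 5000 is documented.
    if not isinstance(threshold, int) or not 1 <= threshold <= 5000:
        problems.append("timeoutThreshold must be an integer from 1 to 5000")
    for key, (low, high) in RANGES.items():
        value = config.get(key)
        if not isinstance(value, int) or not low <= value <= high:
            problems.append(f"{key} must be an integer from {low} to {high}")
    return problems
```

An empty list means that the three numeric fields are within the documented ranges. The optional downgradeBackend field is not checked in this sketch.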
2.2 Specify that the circuit breaker trips after the number of occurrences of long response at the backend reaches a threshold
When you configure a circuit breaker plug-in, you can specify that the circuit breaker trips after the number of occurrences of long response at the backend reaches a threshold within a specified period of time. The backend response time is the duration between when API Gateway sends a request to the backend and when API Gateway receives a response from the backend.
errorThreshold: 10 # The threshold of the number of occurrences of long response at the backend.
windowInSeconds: 60 # The time window during which the number of occurrences of long response at the backend is checked by the circuit breaker to determine whether to trip.
openTimeoutSeconds: 120 # The period of time during which the circuit breaker stays open after it trips.
errorCondition: "$LatencyMilliSeconds > 500" # The conditional expression that is used to determine whether the backend response is counted as a long response. In this example, if the backend response time exceeds 500 ms, the response is considered as a long response.
downgradeBackend: # The backend to which API requests are directed when the circuit breaker is open.
  type: mock
  statusCode: 403
In the preceding code snippet, you can specify the following parameters:
errorThreshold: the threshold of the number of occurrences of long response at the backend.
windowInSeconds: the time window during which the number of occurrences of long response at the backend is checked by the circuit breaker to determine whether to trip. Valid values: 10 to 90. Unit: seconds.
openTimeoutSeconds: the period of time during which the circuit breaker stays open after it trips. Valid values: 15 to 300. Unit: seconds.
errorCondition: the conditional expression that is used to determine whether the backend response is counted as a long response. You can use the $LatencyMilliSeconds and $LatencySeconds variables. The unit of $LatencyMilliSeconds is milliseconds. The unit of $LatencySeconds is seconds.
downgradeBackend: optional. The backend to which API requests are directed when the circuit breaker is open.
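To make the interaction of errorCondition, errorThreshold, and windowInSeconds concrete, the following Python sketch counts long responses and reports when the threshold is reached. It assumes a sliding time window, which the text does not confirm, and the class and method names are hypothetical.

```python
from collections import deque


class LongResponseCounter:
    """Counts long responses inside a sliding time window (an assumed window model)."""

    def __init__(self, latency_limit_ms=500, error_threshold=10, window_seconds=60):
        self.latency_limit_ms = latency_limit_ms  # mirrors "$LatencyMilliSeconds > 500"
        self.error_threshold = error_threshold
        self.window_seconds = window_seconds
        self.long_responses = deque()  # timestamps of long responses

    def observe(self, now, latency_ms):
        """Record one backend response and return True if the breaker should trip."""
        # Drop long responses that have left the window.
        while self.long_responses and now - self.long_responses[0] >= self.window_seconds:
            self.long_responses.popleft()
        if latency_ms > self.latency_limit_ms:
            self.long_responses.append(now)
        return len(self.long_responses) >= self.error_threshold
```

With the default arguments, the sketch mirrors the sample configuration: 10 responses slower than 500 ms within 60 seconds would trip the breaker.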
2.3 Specify that the circuit breaker trips after the number of occurrences of a specified error at the backend reaches a threshold
When you configure a circuit breaker plug-in, you can specify that the circuit breaker trips after the number of occurrences of a specified error at the backend reaches a threshold within a specified period of time.
errorCondition: "$StatusCode == 503" # The conditional expression that specifies the error whose number of occurrences is checked by the circuit breaker to determine whether to trip.
errorThreshold: 1000 # The threshold of the number of occurrences of the specified error.
windowInSeconds: 30 # The time window during which the number of occurrences of the specified error at the backend is checked by the circuit breaker to determine whether to trip.
openTimeoutSeconds: 15 # The period of time during which the circuit breaker stays open after it trips.
downgradeBackend: # The backend to which API requests are directed when the circuit breaker is open.
  type: "HTTP"
  address: "http://api.foo.com"
  path: "/system-busy.json"
  method: GET
errorCondition: the conditional expression that specifies the error whose number of occurrences is checked by the circuit breaker to determine whether to trip. You can use the $StatusCode and $LatencySeconds variables. If you specify the conditional expression $StatusCode = 503 or $StatusCode = 504, the circuit breaker checks the total number of occurrences of HTTP status code 503 or 504. If you specify the conditional expression $LatencySeconds > 30, the circuit breaker checks the total number of occurrences of a backend response time longer than 30 seconds.
errorThreshold: the threshold of the number of occurrences of the specified error.
windowInSeconds: the time window during which the number of occurrences of the specified error at the backend is checked by the circuit breaker to determine whether to trip. Valid values: 10 to 90. Unit: seconds.
openTimeoutSeconds: the period of time during which the circuit breaker stays open after it trips. Valid values: 15 to 300. Unit: seconds.
downgradeBackend: optional. The backend to which API requests are directed when the circuit breaker is open.
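The following Python sketch illustrates how a conditional expression such as the ones above could be applied to a single backend response. It is a deliberately small interpreter written for illustration and is not the expression parser that API Gateway uses. It assumes that clauses are joined only by `or`, that each clause compares one variable with an integer, and that `=` and `==` both mean equality.

```python
import operator
import re

# Comparison operators supported by this sketch (an assumed subset).
OPS = {
    "=": operator.eq,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}
# Two-character operators are listed first so that ">=" is not read as ">".
CLAUSE = re.compile(r"^\$(\w+)\s*(==|>=|<=|=|>|<)\s*(\d+)$")


def matches(condition, variables):
    """Return True if any 'or'-joined clause of `condition` holds for `variables`."""
    for clause in condition.split(" or "):
        found = CLAUSE.match(clause.strip())
        if found is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        name, op, number = found.groups()
        if OPS[op](variables[name], int(number)):
            return True
    return False
```

For example, `matches("$StatusCode = 503 or $StatusCode = 504", {"StatusCode": 504})` evaluates to True, so that response would count toward errorThreshold.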
2.4 Accurate status control
The API Gateway service is deployed on multiple nodes in a cluster to ensure high availability and performance. By default, each service node calculates and saves the circuit breaker status independently. As a result, the circuit breaker status may be inaccurate from a global perspective. If you require accurate circuit breaker status, you can add the useGlobalState field to the plug-in configuration. Example:
---
timeoutThreshold: 15 # The threshold of the number of occurrences of timeout at the backend.
windowInSeconds: 30 # The time window during which the number of occurrences of timeout at the backend is checked by the circuit breaker to determine whether to trip.
openTimeoutSeconds: 15 # The period of time during which the circuit breaker stays open after it trips.
useGlobalState: true # Accurate status control is enabled.
downgradeBackend: # The backend to which API requests are directed when the circuit breaker is open.
  type: mock
  statusCode: 302
  body: |
    <result>
      <errorCode>I'm a teapot</errorCode>
    </result>
The default value of useGlobalState is false. If you set it to true, the circuit breaker status is accurate, at the cost of some service performance. This performance loss does not compromise the promised queries per second (QPS) and service level agreement (SLA) metrics of the current instance.
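The difference between per-node status and global status can be seen with a small calculation. The following Python sketch assumes that global status is based on the sum of the counts from all nodes. This is an interpretation for illustration only, because the text does not describe how API Gateway aggregates the counts. The function names and node names are hypothetical.

```python
def tripped_nodes(per_node_timeouts, threshold):
    """Nodes whose own timeout count reaches the threshold (independent status)."""
    return [node for node, count in per_node_timeouts.items() if count >= threshold]


def tripped_globally(per_node_timeouts, threshold):
    """Whether the cluster-wide timeout count reaches the threshold (shared status)."""
    return sum(per_node_timeouts.values()) >= threshold


# Hypothetical timeout counts observed by three nodes in one time window.
counts = {"node-a": 6, "node-b": 5, "node-c": 7}
independent = tripped_nodes(counts, threshold=15)  # no single node reaches 15
shared = tripped_globally(counts, threshold=15)    # the cluster total of 18 does
```

In this example, no node trips on its own even though the backend timed out 18 times in total, which is the kind of inaccuracy that useGlobalState is meant to remove.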
2.5 Specify that the circuit breaker trips after the percentage of requests in which a specific error occurs to all requests within a specified period of time reaches a threshold
API Gateway trips a circuit breaker when one of the following thresholds is reached:
errorThreshold: the threshold of the number of occurrences of the specific error at the backend. This field is used in combination with a conditional expression.
timeoutThreshold: the threshold of the number of occurrences of timeout at the backend.
errorThresholdByPercent: the threshold of the percentage of requests in which the specific error occurs among all requests in a time window.
timeoutThresholdByPercent: the threshold of the percentage of requests in which timeout occurs among all requests in a time window.
The following code shows an example:
---
windowInSeconds: 3 # The time window during which the circuit breaker determines whether to trip. Valid values: 10 to 90. Unit: seconds.
openTimeoutSeconds: 3
errorThreshold: 90 # The threshold of the number of occurrences of the specified error.
timeoutThreshold: 90 # The threshold of the number of occurrences of timeout.
errorThresholdByPercent: 20 # The threshold of the percentage of requests in which the specified error occurs to the total number of requests.
timeoutThresholdByPercent: 20 # The threshold of the percentage of requests in which timeout occurs to the total number of requests.
errorCondition: "$StatusCode = 500" # The error condition.
downgradeBackend:
  type: mock
  statusCode: 418
  body: |
    <result>
      <errorCode>I'm a teapot</errorCode>
    </result>
If you use a percentage threshold, the number of requests in a time window must be at least 100. Otherwise, the rule does not take effect.
In this example, the errorThreshold: 90 and timeoutThreshold: 90 configuration specifies that the circuit breaker trips if the number of occurrences of the specified error or timeout exceeds 90 in a time window.
In this example, the errorThresholdByPercent: 20 and timeoutThresholdByPercent: 20 configuration specifies that the circuit breaker trips if the specified error or timeout occurs in more than 20% of the requests in a time window.
This feature is supported by versions released in and after June 2023.
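The way the count thresholds and percentage thresholds combine can be sketched as follows. This Python function is an illustration under stated assumptions, not API Gateway's logic: the function and parameter names are hypothetical, and "exceeds" is read as a strict comparison.

```python
def should_trip(total, errors, timeouts,
                error_threshold=90, timeout_threshold=90,
                error_percent=20, timeout_percent=20,
                min_requests_for_percent=100):
    """Decide whether the breaker trips for the counts seen in one time window."""
    # Count-based rules apply regardless of the traffic volume.
    if errors > error_threshold or timeouts > timeout_threshold:
        return True
    # Percentage-based rules take effect only with at least 100 requests.
    if total < min_requests_for_percent:
        return False
    # Integer arithmetic avoids rounding issues when comparing percentages.
    return (errors * 100 > error_percent * total
            or timeouts * 100 > timeout_percent * total)
```

With 50 requests and 20 errors in a window, the function returns False because the percentage rule is inactive below 100 requests. With 200 requests and 41 errors, it returns True because 41 is more than 20% of 200.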
2.6 Throttle requests when the circuit breaker trips
Once the circuit breaker trips, a temporary throttling configuration is added to the API, and all traffic is throttled based on this configuration when the circuit breaker is open or half open. Example:
---
windowInSeconds: 1 # The time window in which the circuit breaker checks the number of occurrences of the specified error at the backend.
openTimeoutSeconds: 15 # The period of time during which the circuit breaker stays open after it trips.
errorThreshold: 3
errorCondition: "$LatencyMilliSeconds > 1"
downgradeTrafficLimit: # The throttling configuration that is applied when the circuit breaker is open or half open.
  limit: 2
  period: MINUTE
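The effect of a limit of 2 per MINUTE can be pictured with a simple rate limiter. The following Python sketch uses a fixed-window counter, which is an assumption because the text does not say which throttling algorithm API Gateway applies. The class and method names are hypothetical.

```python
class DowngradeLimiter:
    """Fixed-window limiter that mirrors limit: 2 and period: MINUTE."""

    def __init__(self, limit=2, period_seconds=60):
        self.limit = limit
        self.period_seconds = period_seconds
        self.window_start = None  # start time of the current window
        self.count = 0            # requests admitted in the current window

    def allow(self, now):
        """Return True if a request at time `now` is within the limit."""
        if self.window_start is None or now - self.window_start >= self.period_seconds:
            # Start a new window and reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

In this model, requests at 0, 10, and 20 seconds are admitted, admitted, and rejected. A request at 61 seconds is admitted again because a new window has started.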
3. Configure the backend to which API requests are directed when the circuit breaker is open
You can set the downgradeBackend parameter to specify a backend to which API requests are directed when the circuit breaker is open. The configurations of the backend must be consistent with the API specification files that are imported to API Gateway. For more information, see Import Swagger files to create APIs with API Gateway extensions. You can configure the following types of backends by using the samples:
HTTP
---
backend:
  type: HTTP
  address: "http://10.10.100.2:8000"
  path: "/users/{userId}"
  method: GET
  timeout: 7000
HTTP-VPC
---
backend:
  type: HTTP-VPC
  vpcAccessName: vpcAccess1
  path: "/users/{userId}"
  method: GET
  timeout: 10000
Function Compute
---
backend:
  type: FC
  fcRegion: cn-shanghai
  serviceName: fcService
  functionName: fcFunction
  arn: "acs:ram::111111111:role/aliyunapigatewayaccessingfcrole"
MOCK
---
backend:
  type: MOCK
  mockResult: "mock result sample"
  mockStatusCode: 200
  mockHeaders:
    - name: Content-Type
      value: text-plain
    - name: Content-Language
      value: zhCN
4. Error codes
Error code | HTTP status code | Message | Description |
D503BB | 503 | Backend circuit breaker busy | The error message returned because the API is protected by its circuit breaker. |
D503CB | 503 | Backend circuit breaker open, ${Reason} | The error message returned because the circuit breaker of the API is open. Test API calls after you check the backend performance of the API. |
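A client can use these error codes to tell a circuit breaker rejection apart from other 503 responses. The following Python sketch shows one way to do that. It assumes that the error code arrives in the X-Ca-Error-Code response header, and the function name is hypothetical.

```python
# Error codes and messages from the table above.
BREAKER_CODES = {
    "D503BB": "Backend circuit breaker busy",
    "D503CB": "Backend circuit breaker open",
}


def circuit_breaker_error(status, headers):
    """Return the circuit breaker error code of a response, or None."""
    if status != 503:
        return None
    # HTTP header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    code = normalized.get("x-ca-error-code")
    return code if code in BREAKER_CODES else None
```

A caller might back off and retry later when the function returns D503CB, because the circuit breaker stays open for the configured period of time.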