Simple Message Queue (formerly MNS) throttles requests that exceed the threshold to prevent excessive pressure on underlying resources.
Throttling behavior
When traffic approaches or reaches the throttling threshold, the server automatically and elastically adjusts the threshold based on real-time resource usage. In most scenarios, this dynamic adjustment supports higher concurrency without affecting users. If burst traffic or a cluster resource bottleneck triggers temporary throttling, the system scales out automatically, then resumes processing traffic and raises the throttling threshold.
When throttling is triggered, the system applies backpressure: the server holds each request that exceeds the threshold for about 500 ms before returning an error. This keeps an overload from degrading overall performance and stability.
Error code
When the throttling policy is triggered, the Simple Message Queue (formerly MNS) server returns the following error code.
| HTTP status code | Error code | Error message |
| 429 | TooManyRequests | The request is denied by cluster flow limiter for too many requests. |
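A client can treat this error as retryable. The following is a minimal sketch of retrying with exponential backoff and jitter, not the SDK's own retry logic. `ThrottledError` and `call_with_backoff` are hypothetical names introduced here for illustration; in real code, the exception would be whatever your client raises for an HTTP 429 response with the error code TooManyRequests. Because the server already holds a throttled request for about 500 ms, the sketch assumes a base delay of 0.5 seconds is a reasonable starting point.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical exception standing in for an HTTP 429 / TooManyRequests reply."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call operation(); on ThrottledError, wait and retry with backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error to the caller
            # Double the upper bound each attempt, capped at max_delay,
            # then pick a random wait in [0, bound] to spread out retries.
            bound = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, bound))
```

The `sleep` parameter exists only so that the wait can be replaced in tests. Randomizing the wait keeps many throttled clients from retrying at the same instant.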
Throttling threshold details
Throttling policy for abnormal behavior in the queue consumption pattern
In a standard queue consumption pattern, the client should delete a message after processing it. If a client repeatedly receives messages but does not send delete requests, the system flags this as abnormal behavior. This behavior triggers throttling to protect system stability, which significantly reduces the rate at which the client can receive messages.
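The receive, process, delete pattern described above can be sketched as follows. `InMemoryQueue` is a hypothetical in-memory stand-in written for this example, not the real SDK client. Its method names only mirror the idea of receiving a message with a receipt handle and deleting it by that handle.

```python
import uuid


class InMemoryQueue:
    """Hypothetical in-memory stand-in for a queue; not the real SDK client."""

    def __init__(self, bodies):
        self._pending = list(bodies)
        self._in_flight = {}  # receipt handle -> body of a received, undeleted message

    def receive_message(self):
        """Return (receipt_handle, body), or None when the queue is empty."""
        if not self._pending:
            return None
        handle = uuid.uuid4().hex
        body = self._pending.pop(0)
        self._in_flight[handle] = body
        return handle, body

    def delete_message(self, receipt_handle):
        """Acknowledge a received message so it no longer counts as undeleted."""
        del self._in_flight[receipt_handle]

    def undeleted_count(self):
        return len(self._in_flight)


def consume_all(queue, handler):
    """Receive, process, then delete each message; return the number processed."""
    processed = 0
    while (received := queue.receive_message()) is not None:
        handle, body = received
        handler(body)                 # process the message first
        queue.delete_message(handle)  # then delete it, so nothing stays undeleted
        processed += 1
    return processed
```

If `handler` raises an exception, the message is left undeleted on purpose so that it can be processed again. A consumer that never reaches the delete step for any message shows the abnormal pattern described above.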
Throttling is triggered if any of the following conditions are met:
Duration: The abnormal behavior persists for more than 30 minutes.
Message count: The total number of messages that are received but not deleted reaches 5,000.
Rate: The instantaneous rate of receiving but not deleting messages exceeds 1,000 Transactions Per Second (TPS).
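As a client-side self-check, the three conditions above can be expressed as a small function. This is a sketch under assumptions: the type `ConsumptionStats` and the function `triggered_conditions` are hypothetical, and how the server actually measures these values is not specified here. The comparisons follow the wording above: "more than" 30 minutes, "reaches" 5,000 messages, and "exceeds" 1,000 TPS.

```python
from dataclasses import dataclass

MAX_DURATION_MINUTES = 30        # abnormal behavior persists for more than 30 minutes
MAX_UNDELETED_MESSAGES = 5_000   # received-but-not-deleted total reaches 5,000
MAX_UNDELETED_RATE_TPS = 1_000   # receive-without-delete rate exceeds 1,000 TPS


@dataclass
class ConsumptionStats:
    """Hypothetical client-side measurements of receive-without-delete behavior."""
    abnormal_minutes: float
    undeleted_messages: int
    undeleted_rate_tps: float


def triggered_conditions(stats):
    """Return the names of the conditions that are met; any one is enough to throttle."""
    reasons = []
    if stats.abnormal_minutes > MAX_DURATION_MINUTES:
        reasons.append("duration")
    if stats.undeleted_messages >= MAX_UNDELETED_MESSAGES:
        reasons.append("message count")
    if stats.undeleted_rate_tps > MAX_UNDELETED_RATE_TPS:
        reasons.append("rate")
    return reasons
```

A non-empty result means that at least one condition is met, so the consumer is at risk of being throttled.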
Throttling policy for high-traffic requests
The default throttling threshold is 20,000 TPS for each Alibaba Cloud account in each region. If you need more than 20,000 TPS, submit a ticket to request a higher threshold.
Requests are counted as follows:
Each call to an API operation is counted as one request.
TPS calculation for batch sending: When you use the BatchSendMessage API operation to send messages to a queue, the TPS is calculated as: requests per second × number of messages in each request. For example, 100 BatchSendMessage requests per second with 10 messages each count as 100 × 10 = 1,000 TPS.
TPS calculation for batch consumption: When you use the BatchReceiveMessage API operation to receive messages from a queue, the TPS is calculated in the same way: requests per second × number of messages in each request. For example, 100 BatchReceiveMessage requests per second with 10 messages each count as 100 × 10 = 1,000 TPS.
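The counting rules above come down to one multiplication, which the following sketch uses to estimate how much of the default threshold a workload consumes. The function names are hypothetical. Summing several workloads against one threshold is an assumption based on the threshold applying to each account in each region.

```python
DEFAULT_THRESHOLD_TPS = 20_000  # default per account per region


def counted_tps(requests_per_second, messages_per_request=1):
    """TPS as counted for throttling: requests per second x messages per request."""
    return requests_per_second * messages_per_request


def headroom(workloads, threshold=DEFAULT_THRESHOLD_TPS):
    """Return the threshold minus the summed TPS of the given workloads.

    Each workload is a (requests_per_second, messages_per_request) pair.
    Assumption: all workloads count against the same account-level threshold.
    """
    total = sum(counted_tps(rps, n) for rps, n in workloads)
    return threshold - total


counted_tps(100, 10)  # 1000, matching the batch examples above
```

A negative result from `headroom` means that the planned traffic exceeds the threshold.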
Avoid the impact of throttling
To avoid the impact of throttling on your business, consider the following:
Plan your traffic and communicate peak traffic in advance: If you expect a large increase in traffic, submit a ticket to contact us. We will reserve more resources for you to prevent throttling.
Monitoring and alerting: Use the monitoring tools for Simple Message Queue (formerly MNS) to retrieve real-time information about your traffic and throttling status. This lets you take timely action.