Simple Message Queue (formerly MNS): Throttling policy

Last Updated: Jan 13, 2026

Simple Message Queue (formerly MNS) throttles requests that exceed the threshold to prevent excessive pressure on underlying resources.

Throttling behavior

When traffic approaches or reaches the throttling threshold, the server automatically and elastically adjusts the threshold based on real-time resource usage. In most scenarios, this dynamic adjustment supports higher concurrency without affecting users. If burst traffic or a cluster resource bottleneck triggers temporary throttling, the system scales out automatically, then resumes normal traffic processing and raises the throttling threshold.

When a throttling error is triggered, the system activates a backpressure mechanism: the server holds each request that exceeds the threshold for about 500 ms before returning an error. This slows down clients that are over the threshold and prevents an overload from affecting overall performance and stability.

Error code

When the throttling policy is triggered, the Simple Message Queue (formerly MNS) server returns the following error code.

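| HTTP status code | Error code      | Error message                                                        |
|------------------|-----------------|----------------------------------------------------------------------|
| 429              | TooManyRequests | The request is denied by cluster flow limiter for too many requests. |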

Throttling threshold details

Throttling policy for abnormal behavior in the queue consumption pattern

In a standard queue consumption pattern, the client deletes each message after processing it. If a client repeatedly receives messages but does not send delete requests, the system flags this as abnormal behavior. To protect system stability, the system then throttles the client, which significantly reduces the rate at which it can receive messages.

Throttling is triggered if any of the following conditions are met:

  • Duration: The abnormal behavior persists for more than 30 minutes.

  • Message count: The total number of messages that are received but not deleted reaches 5,000.

  • Rate: The instantaneous rate of receiving but not deleting messages exceeds 1,000 Transactions Per Second (TPS).
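The three conditions above can be expressed as a simple client-side check, for example to raise an alert before the service starts throttling a consumer. This is a minimal sketch: the thresholds come from the list above, but `ConsumptionStats`, its fields, and `triggered_conditions` are hypothetical names, and how the service itself measures these values is not documented here.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the conditions listed above.
MAX_ABNORMAL_SECONDS = 30 * 60    # persists for more than 30 minutes
MAX_UNDELETED_MESSAGES = 5_000    # received-but-not-deleted count reaches 5,000
MAX_UNDELETED_RATE_TPS = 1_000    # receive-without-delete rate exceeds 1,000 TPS


@dataclass
class ConsumptionStats:
    """Hypothetical client-side counters; the application must collect these."""
    abnormal_seconds: float      # how long receives have gone without deletes
    undeleted_messages: int      # messages received but not yet deleted
    undeleted_rate_tps: float    # current rate of receiving without deleting


def triggered_conditions(stats: ConsumptionStats) -> List[str]:
    """Return the names of the conditions that the given stats would meet."""
    hits = []
    if stats.abnormal_seconds > MAX_ABNORMAL_SECONDS:
        hits.append("duration")
    if stats.undeleted_messages >= MAX_UNDELETED_MESSAGES:
        hits.append("message_count")
    if stats.undeleted_rate_tps > MAX_UNDELETED_RATE_TPS:
        hits.append("rate")
    return hits
```

Because any single condition is enough to trigger throttling, a non-empty result should be treated as a warning that the consumer is not deleting messages as expected.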

Throttling policy for high-traffic requests

The default throttling threshold is 20,000 TPS for each Alibaba Cloud account in each region. If your traffic exceeds 20,000 TPS, you can submit a ticket to increase the default threshold.

Requests are counted as follows:

  • Each call to an API operation is counted as one request.

  • TPS calculation for batch sending: When you use the BatchSendMessage API operation to send messages to a queue, the TPS for BatchSendMessage is calculated as: actual requests per second × number of messages in the request. For example, if you make 100 BatchSendMessage requests per second and each request contains 10 messages, the TPS for that queue is 100 × 10 = 1,000.

  • TPS calculation for batch consumption: When you use the BatchReceiveMessage API operation to receive messages from a queue, the TPS for BatchReceiveMessage is calculated as: actual requests per second × number of messages in the request. For example, if you make 100 BatchReceiveMessage requests per second and each request contains 10 messages, the TPS for that queue is 100 × 10 = 1,000.
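The batch counting rule can be checked with a few lines of arithmetic. The sketch below reproduces the example figures above and, assuming that batch traffic counts against the 20,000 TPS default in the same way, estimates how many batch requests per second fit under a threshold. The function names are illustrative and not part of any SDK.

```python
DEFAULT_TPS_THRESHOLD = 20_000  # default per account, per region (from this page)


def batch_tps(requests_per_second: int, messages_per_request: int) -> int:
    """TPS counted for batch calls: requests per second x messages per request."""
    return requests_per_second * messages_per_request


def max_batch_requests_per_second(messages_per_request: int,
                                  tps_threshold: int = DEFAULT_TPS_THRESHOLD) -> int:
    """Largest whole number of batch requests per second that stays within the threshold."""
    if messages_per_request < 1:
        raise ValueError("messages_per_request must be at least 1")
    return tps_threshold // messages_per_request


print(batch_tps(100, 10))                 # 1000, matching the example above
print(max_batch_requests_per_second(10))  # 2000
```

The practical consequence is that batching reduces the number of HTTP calls but does not reduce the TPS counted against the threshold.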

Avoid the impact of throttling

To avoid the impact of throttling on your business, consider the following:

  • Plan your traffic and communicate peak traffic in advance: If you expect a large increase in traffic, submit a ticket to contact us. We will reserve more resources for you to prevent throttling.

  • Monitoring and alerting: Use the monitoring tools for Simple Message Queue (formerly MNS) to retrieve real-time information about your traffic and throttling status. This lets you take timely action.

FAQ

Is my service limited to 20,000 TPS?

No, it is not. 20,000 TPS is the default guaranteed value. The actual supported TPS can be higher, depending on the cluster load and elastic capacity.

Why do I sometimes get errors when exceeding 20,000 TPS, but not always?

This depends on the real-time load of the cluster. When resources are sufficient, the system elastically supports a higher TPS. When resources are constrained, throttling may be triggered.

Will throttling errors affect my business?

Throttling errors are part of a system protection mechanism. They prevent cluster overloads from causing widespread service interruptions. The system scales out as soon as possible to restore the service. If you receive a throttling error, wait for a short period and then retry the request.
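The "wait and retry" advice can be sketched as exponential backoff with jitter. This is an illustration under stated assumptions, not SDK code: `do_request` is a hypothetical callable that performs one API call and returns an `(http_status, error_code)` tuple, and the delay values are arbitrary starting points. Only the 429 status and the TooManyRequests code come from this page.

```python
import random
import time
from typing import Callable, Tuple

THROTTLE_STATUS = 429
THROTTLE_CODE = "TooManyRequests"


def is_throttled(status: int, error_code: str) -> bool:
    """True if the response matches the throttling error described on this page."""
    return status == THROTTLE_STATUS and error_code == THROTTLE_CODE


def call_with_backoff(do_request: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base_delay: float = 0.2,
                      max_delay: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call do_request, retrying with jittered exponential backoff while throttled."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, error_code = do_request()
        if not is_throttled(status, error_code):
            return status, error_code
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last throttled response
        # Upper bound doubles on each attempt: 0.2 s, 0.4 s, 0.8 s, ... capped at max_delay.
        upper = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, upper))  # jitter avoids synchronized retries
    return status, error_code
```

The random jitter spreads retries from many clients over time, so they do not all return at the same moment while the cluster is still scaling out. Any client-side request timeout should also allow for the roughly 500 ms that the server holds a throttled request before responding.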

Why is throttling triggered sometimes but not at other times?

The system's elastic capacity and throttling behavior are closely related to the following factors:

  • Current cluster load: If the cluster has sufficient resources, the system can support a higher TPS.

  • Traffic fluctuations: A sudden burst of traffic can cause a temporary resource shortage, which triggers throttling.

  • Scale-out speed: Automatic scale-out takes time. During this period, temporary throttling may occur.

Therefore, whether throttling is triggered depends on the real-time status of the cluster and the traffic conditions.