Consumers in the same group consume messages at different rates across partitions -- some partitions lag behind others. This typically happens when the partition count is not evenly divisible by the number of consumers, causing an unbalanced workload distribution.
Diagnose the issue
Log on to the ApsaraMQ for Kafka console.
On the Groups page, click the name of the target group.
On the Group Details page, click the Consumer Status tab.
In the Actions column of the target topic, click Consumer Details.
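Compare the Maximum Offset of each partition with the group's committed consumer offset for that partition. The Maximum Offset alone only shows how many messages the producer has written to the partition. Consumption progress is uneven when the gap between the two values is much larger on some partitions than on others.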
Root cause
ApsaraMQ for Kafka assigns partitions to consumers within a group as evenly as possible. When the partition count divides evenly by the consumer count, each consumer handles the same number of partitions. When it does not, some consumers are assigned one more partition than the others and, given equal processing capacity, take longer to work through their share, so their partitions lag.
Example: A topic has 12 partitions and a consumer group has 5 consumers. The partitions are distributed as follows:
| Consumer | Assigned partitions |
|---|---|
| Consumer 1 | 3 |
| Consumer 2 | 3 |
| Consumer 3 | 2 |
| Consumer 4 | 2 |
| Consumer 5 | 2 |
If all five consumers have the same processing capacity, Consumer 1 and Consumer 2 fall behind because they handle 50% more partitions than the others.
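The split in the table above can be reproduced with a little arithmetic. The following is a minimal sketch, not the broker's actual assignment logic: the helper name `distribute` is hypothetical, and it only models how many partitions each consumer receives under an as-even-as-possible split. Which consumers receive the extra partitions depends on the assignment strategy in use, so giving the extras to the first consumers is an assumption made for illustration.

```python
def distribute(partition_count: int, consumer_count: int) -> list[int]:
    """Return the number of partitions per consumer for an as-even-as-possible split.

    Hypothetical helper for illustration; it models counts only, not the
    real assignor's choice of which partition goes to which consumer.
    """
    if consumer_count <= 0:
        raise ValueError("consumer_count must be positive")
    base, extra = divmod(partition_count, consumer_count)
    # Assumption: the first `extra` consumers each take one additional partition.
    return [base + 1 if i < extra else base for i in range(consumer_count)]


counts = distribute(12, 5)
for i, n in enumerate(counts, start=1):
    print(f"Consumer {i}: {n} partitions")

# Ratio between the heaviest and lightest load (1.5 means 50% more partitions).
imbalance = max(counts) / min(counts)
print(f"Imbalance ratio: {imbalance}")
```

With 12 partitions and 5 consumers, `divmod(12, 5)` yields a base of 2 and a remainder of 2, so two consumers carry 3 partitions and three carry 2.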
The default partition count is 12 for subscription and pay-as-you-go ApsaraMQ for Kafka instances, and 3 for serverless instances.
Solution
Align the partition count with the consumer count
Make sure the partition count is evenly divisible by the consumer count.
| Partition count | Consumer count options (even distribution) |
|---|---|
| 12 (default) | 1, 2, 3, 4, 6, or 12 |
| 3 (serverless default) | 1 or 3 |
For example, with 12 partitions, use 3, 4, 6, or 12 consumers -- not 5, 7, or 8.
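For partition counts other than the defaults, the valid consumer counts are simply the divisors of the partition count. A minimal sketch, using a hypothetical helper named `balanced_consumer_counts`:

```python
def balanced_consumer_counts(partition_count: int) -> list[int]:
    """Return every consumer count that divides the partition count evenly.

    Hypothetical helper for illustration only.
    """
    return [c for c in range(1, partition_count + 1) if partition_count % c == 0]


print(balanced_consumer_counts(12))  # [1, 2, 3, 4, 6, 12]
print(balanced_consumer_counts(3))   # [1, 3]
```

Any consumer count outside the returned list leaves a remainder, so at least one consumer ends up with an extra partition.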
Additional considerations
Unequal processing capacity: Even with an evenly divisible partition-to-consumer ratio, lag can appear if individual consumers differ in processing speed -- for example, due to network latency, resource constraints, or slow downstream dependencies. Monitor per-consumer throughput to identify bottlenecks.
Monitor consumer group lag: Track the consumer group lag metric (the difference between the latest produced offset and the committed consumer offset for each partition) to detect imbalances proactively.
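The lag definition above translates directly into a per-partition subtraction. The sketch below assumes you have already collected the latest produced offset and the committed consumer offset for each partition from your client library or monitoring system. The function name `partition_lag` and all offset values are hypothetical and exist only to illustrate the calculation.

```python
def partition_lag(end_offsets: dict[int, int], committed: dict[int, int]) -> dict[int, int]:
    """Compute lag per partition: latest produced offset minus committed offset.

    Hypothetical helper. Assumption: a partition with no committed offset
    is treated as committed at 0.
    """
    return {
        partition: max(end - committed.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


# Illustrative values only, not real measurements.
end_offsets = {0: 1000, 1: 1000, 2: 1000}
committed = {0: 990, 1: 640, 2: 985}

lag = partition_lag(end_offsets, committed)
worst = max(lag, key=lag.get)

print(lag)  # {0: 10, 1: 360, 2: 15}
print(f"Partition with the largest lag: {worst}")
```

A partition whose lag is far above the others, such as partition 1 in this made-up data, points to the consumer that owns it as the likely bottleneck.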