If your consumer client triggers rebalances too often, the root cause is typically slow message processing, misconfigured consumer parameters, or an outdated client version. This topic explains the causes by client version and provides solutions.
Symptom
Rebalances occur frequently on your consumer client when you use ApsaraMQ for Kafka.
Cause
The cause depends on your client version.
Clients earlier than version 0.10.2: The consumer does not have a separate heartbeat thread. Heartbeats are sent through the poll() method. If message processing takes too long, the heartbeat request times out and a rebalance is triggered.
Clients of version 0.10.2 or later: A separate heartbeat thread exists. However, if no messages are pulled after the time specified by max.poll.interval.ms elapses, the client leaves the consumer group and a rebalance is triggered. The default value of max.poll.interval.ms is 5 minutes.
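To make the second case concrete, the time needed to work through one batch can be estimated as max.poll.records / (messages_per_thread_per_second * number_of_threads) and compared with max.poll.interval.ms. The following is a minimal sketch of that arithmetic. The class PollBudget and its methods are hypothetical helpers, not part of any Kafka client, and the input numbers are illustrative assumptions.

```java
// Hypothetical helper for illustration; not part of any Kafka client API.
final class PollBudget {

    /** Estimated time, in milliseconds, to process one batch returned by poll(). */
    static long batchMillis(int maxPollRecords, double msgsPerThreadPerSecond, int threads) {
        double seconds = maxPollRecords / (msgsPerThreadPerSecond * threads);
        return (long) Math.ceil(seconds * 1000.0);
    }

    /** True if processing one batch would take at least as long as the poll interval. */
    static boolean exceedsPollInterval(long batchMillis, long maxPollIntervalMs) {
        return batchMillis >= maxPollIntervalMs;
    }

    public static void main(String[] args) {
        long maxPollIntervalMs = 300_000L; // the 5-minute default mentioned above

        // Assumed workload: 500 records per poll, 1 message per second per thread.
        long oneThread = batchMillis(500, 1.0, 1);
        long fourThreads = batchMillis(500, 1.0, 4);

        System.out.println("1 thread:  " + oneThread + " ms, at risk: "
                + exceedsPollInterval(oneThread, maxPollIntervalMs));
        System.out.println("4 threads: " + fourThreads + " ms, at risk: "
                + exceedsPollInterval(fourThreads, maxPollIntervalMs));
    }
}
```

Under these assumed numbers, a single processing thread cannot finish a batch within the default interval, while four threads can. Real throughput varies, so the estimate should be treated as a rough bound rather than a guarantee.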
Key parameters
The following parameters control rebalance behavior:
| Parameter | Applicable versions | Description |
|---|---|---|
| session.timeout.ms | All | Session timeout. If no heartbeat is received within this period, the broker removes the consumer from the consumer group. |
| max.poll.records | All | The maximum number of messages returned per call to the poll() method. |
| max.poll.interval.ms | 0.10.2 or later | The maximum interval between two consecutive calls to the poll() method. If this interval is exceeded, the consumer leaves the consumer group and a rebalance is triggered. |
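As a rough illustration, these three parameters can be collected in a java.util.Properties object before the consumer is created. The sketch below uses only the JDK. The class ConsumerTuning and the values passed in main are assumptions chosen for illustration, not recommendations for every workload.

```java
import java.util.Properties;

// Hypothetical helper for illustration; only the three property keys come from the table above.
final class ConsumerTuning {

    /** Builds the rebalance-related consumer properties from the given values. */
    static Properties rebalanceProps(long sessionTimeoutMs, int maxPollRecords, long maxPollIntervalMs) {
        Properties props = new Properties();
        props.setProperty("session.timeout.ms", Long.toString(sessionTimeoutMs));
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        // Only recognized by clients of version 0.10.2 or later, per the table above.
        props.setProperty("max.poll.interval.ms", Long.toString(maxPollIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values: 10 s session timeout, 100 records per poll, 5 min poll interval.
        Properties props = rebalanceProps(10_000L, 100, 300_000L);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real application, the same Properties object would also carry the connection settings, such as bootstrap.servers, group.id, and the deserializers, before it is passed to the consumer constructor.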
Solutions
Tune parameter values
Configure the following parameters based on your client version:
session.timeout.ms
| Client version | Recommended value |
|---|---|
| Earlier than 0.10.2 | Larger than the time to process a batch of messages, but no greater than 30 seconds. 25 seconds is recommended. |
| 0.10.2 or later | Keep the default value of 10 seconds. |
max.poll.records
Set this value to be far smaller than the result of the following formula:
max.poll.records << messages_per_thread_per_second * number_of_threads * max.poll.interval.ms
max.poll.interval.ms (version 0.10.2 or later only)
Set this value to be larger than the result of the following formula:
max.poll.interval.ms > max.poll.records / (messages_per_thread_per_second * number_of_threads)
Both formulas use a per-second processing rate, so convert max.poll.interval.ms to seconds before you compare the values.
Improve consumption speed and separate processing threads
Improve your message processing speed by allocating a separate thread for consumption logic. This prevents slow processing from blocking heartbeats or exceeding the poll interval.
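One way to picture this separation is a polling thread that hands each batch to a worker pool and returns immediately. The sketch below shows only the handoff pattern with JDK classes. Plain strings stand in for Kafka records, and HandoffSketch, process, handOff, and processBatch are hypothetical names. With a real client, the consumer object is not thread-safe, so only the polling thread should call poll() and commit offsets, and offsets should be committed only after the workers finish the corresponding records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: strings stand in for Kafka records and no Kafka client is used.
final class HandoffSketch {

    // Stand-in for slow business logic (hypothetical).
    static String process(String record) throws InterruptedException {
        Thread.sleep(20);
        return record.toUpperCase(Locale.ROOT);
    }

    // Called by the polling thread: queues every record and returns without waiting.
    static List<Future<String>> handOff(ExecutorService workers, List<String> batch) {
        List<Future<String>> pending = new ArrayList<>();
        for (String record : batch) {
            Callable<String> task = () -> process(record);
            pending.add(workers.submit(task));
        }
        return pending;
    }

    // Runs one batch end to end and returns the results in batch order.
    static List<String> processBatch(List<String> batch, int threads) {
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        try {
            List<String> results = new ArrayList<>();
            for (Future<String> future : handOff(workers, batch)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("record processing failed", e.getCause());
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processBatch(List.of("a", "b", "c", "d"), 4));
    }
}
```

Because handOff returns as soon as the tasks are queued, the polling thread is free to call poll() again while the workers are still busy. An unbounded backlog is a risk with this pattern, so a production design would typically add backpressure, for example by pausing partitions until the workers catch up.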
Reduce topics per consumer group
Reduce the number of topics that each consumer group subscribes to. Subscribe to no more than five topics per consumer group. For optimal stability, subscribe to one topic per consumer group.
Upgrade to version 0.10.2 or later
If you are using a client version earlier than 0.10.2, upgrade to version 0.10.2 or later. Later versions use a separate heartbeat thread, which prevents processing delays from causing heartbeat timeouts.