Hologres supports the Rebalance function in V2.0.21 and later. This function triggers a shard rebalancing operation. This topic describes how to use the Rebalance function to rebalance shards.
Background information
During normal runtime, Hologres worker nodes load shard metadata evenly. In some cases, such as after a fast recovery, shards can become unevenly distributed. You can trigger a rebalancing operation to redistribute the shards evenly across all worker nodes.
Limits
The Rebalance function is supported in Hologres V2.0.21 and later. If your instance runs an earlier version, you can use the self-service upgrade feature or join the Hologres DingTalk group to request an instance upgrade. For more information, see "How do I get more online support?"
Command syntax
The syntax to trigger a shard rebalancing operation varies by instance type.
General-purpose and read-only replica instances
The Rebalance function triggers shard redistribution on the worker nodes of a general-purpose or read-only replica instance. The syntax is as follows:
SELECT hg_rebalance_instance();
Return values:
true: The rebalancing operation was successfully triggered. The system starts the operation.
false: Rebalancing is not required.
Error: The rebalancing operation failed to trigger. For example, an error is returned if you attempt to trigger the operation while a pod is faulty.
During the rebalancing process, the system determines whether any shards need to move to reach a balanced state. An instance is balanced when the numbers of shards loaded by any two workers differ by no more than 1. For example:
If there are 2 workers and 2 shards, each worker is assigned 1 shard.
If there are 2 workers and 3 shards, one worker is assigned 1 shard, and the other is assigned 2 shards.
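The balance rule above can be modeled in a few lines. The following is a minimal Python sketch for illustration only, not Hologres code: the function names `balanced_shard_counts` and `is_balanced` are hypothetical, and the sketch reproduces only the "difference of no more than 1" rule, not how Hologres chooses which worker receives which shard.

```python
def balanced_shard_counts(num_workers: int, num_shards: int) -> list[int]:
    """Return one balanced assignment as shard counts per worker, largest first."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    base, extra = divmod(num_shards, num_workers)
    # `extra` workers carry one additional shard; the rest carry `base` shards.
    return [base + 1] * extra + [base] * (num_workers - extra)


def is_balanced(shard_counts: list[int]) -> bool:
    """True when no two workers differ by more than one shard."""
    if not shard_counts:
        return True
    return max(shard_counts) - min(shard_counts) <= 1
```

With 2 workers and 3 shards, `balanced_shard_counts(2, 3)` yields one worker with 2 shards and one with 1, which matches the second example above.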
The rebalancing operation usually takes 2 to 3 minutes. The actual duration depends on the number of table groups in the instance. The more table groups there are, the longer the rebalancing takes. During this process, write operations are interrupted for approximately 15 seconds.
Because rebalancing is an asynchronous operation, you can run the following SQL statement to check its progress:
SELECT hg_get_rebalance_instance_status();
Return values:
DOING: The rebalancing operation is in progress.
DONE: The rebalancing operation is complete.
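Because the operation is asynchronous, a client typically polls the status until it reports DONE. The following Python sketch shows one way to structure such a loop. It makes several assumptions: `fetch_status` is a hypothetical callable that you supply (for example, a function that runs `SELECT hg_get_rebalance_instance_status();` through your PostgreSQL-compatible client and returns the single value as a string), and the polling interval and timeout defaults are illustrative values, not Hologres recommendations.

```python
import time
from typing import Callable


def wait_for_rebalance(
    fetch_status: Callable[[], str],
    poll_interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> bool:
    """Poll until the status is DONE. Return False if the timeout expires first."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == "DONE":
            return True
        if status != "DOING":
            # Only DOING and DONE are documented; treat anything else as an error.
            raise RuntimeError(f"unexpected rebalance status: {status!r}")
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

The status source is injected as a callable so that the loop does not depend on a particular database driver and can be exercised with a stub.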
FAQ
How can I tell if shards are unevenly distributed?
When a query is running, the load should be relatively balanced across all worker nodes. If a worker node has no shards, its load is much lower than the others.
Example of balanced shard distribution: The following figure shows that when query and write operations are running on an instance with 10 worker nodes, the CPU utilization of all 10 workers is similar, as shown in the monitoring data.

Example of unbalanced shard distribution: The following figure shows that some worker nodes have no shards. The monitoring data shows that the load on these worker nodes is much lower than on others.

You can run the following SQL statement to check which workers have loaded shard metadata:
SELECT DISTINCT worker_id FROM hologres.hg_worker_info;
Return value:

Result analysis: Only 9 workers have loaded shard metadata. One worker has not loaded any shard metadata.
Action: Run the rebalancing operation. After the operation is complete, the monitoring data shows that the CPU utilization of the previously underutilized worker has increased significantly and is now similar to the other workers.

You can check which workers have loaded shard metadata again. You will find that all 10 workers have now loaded shard metadata.
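The before-and-after check described in this FAQ amounts to comparing the distinct worker IDs returned by the query with the number of workers in the instance. The following Python sketch illustrates that comparison. The function name `idle_worker_count` and the worker ID strings in the usage note are hypothetical, and the total worker count is assumed to be known from your instance specifications.

```python
from typing import Iterable


def idle_worker_count(loaded_worker_ids: Iterable[str], total_workers: int) -> int:
    """Return how many workers have not loaded any shard metadata."""
    loaded = set(loaded_worker_ids)  # duplicates are ignored, like DISTINCT
    if len(loaded) > total_workers:
        raise ValueError("more distinct worker IDs than workers in the instance")
    return total_workers - len(loaded)
```

For the example above, passing 9 distinct IDs with `total_workers=10` returns 1 before rebalancing, and passing 10 distinct IDs returns 0 afterward.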
