
Shard-level replication for a Hologres instance

Last Updated: Aug 01, 2024

This topic describes how to use the shard-level replication feature in Hologres.

Overview

In Hologres V1.1 and later, you can specify the replica count of each shard for a table group to improve the concurrency and availability of queries on the table group. To enable the shard-level replication feature, you can explicitly specify the replica count when you create a table group or modify the replica count for an existing table group. The new replicas are runtime replicas that occupy memory and do not increase your storage costs.

If you want to specify or modify the replica count of each shard, take note of the following items:

  • Data on a Hologres instance is distributed by shard. Each shard manages different data. The data is not duplicated between different shards. The data in all shards forms a complete set of data.

  • By default, each shard has only one replica. In this case, the value of the replica_count parameter is 1. This replica serves as the leader shard. You can modify the replica_count parameter to increase the number of replicas for each shard. Shard replicas other than the leader shard serve as follower shards.

  • The leader shard is responsible for all write requests. Read requests are evenly distributed among the leader shard and follower shards. If a query is routed to a follower shard, the query may have a latency of 10 to 20 ms.

  • The default value of the replica_count parameter is 1, which specifies that the shard-level replication feature is disabled. A value greater than 1 enables the feature, and a larger value requires more resources. Because replicas are anti-affine, only one replica of a shard can be deployed on each compute node. Therefore, the value of the replica_count parameter must be less than or equal to the number of compute nodes. In Hologres V1.3.53 and later, the maximum value of the replica_count parameter is the number of worker nodes, and an error is reported if this limit is exceeded. For more information about the number of compute nodes for different instance specifications, see Instance specifications.

  • To balance the computing power of compute nodes, you must decrease the number of shards when you increase the number of replicas for each shard. To achieve optimal performance, make sure that the following equation is met: Value of the shard_count parameter × Value of the replica_count parameter = Recommended shard count of your instance.

  • Hologres V1.3.45 and later support queries in high-availability mode.

  • The CPU utilization issue shown in the following figures may occur on your Hologres instance. You can view the CPU utilization information of your Hologres instance on the monitoring information page of the Hologres console.

    (Figures: instance-level CPU utilization and worker-level CPU utilization.)

    The preceding figures indicate that the CPU utilization of the instance is at a medium level, but the CPU utilization of one worker node is excessively low and the CPU utilization of the other worker node is excessively high. This may be caused by the uneven distribution of queries on worker nodes. Most queries are routed only to a few shards. In this case, you can increase the number of shard replicas to allow more worker nodes to have shard replicas. This can effectively improve resource usage and queries per second (QPS).

    Note
    • Metadata synchronization between the leader shard and follower shards consumes resources. A large number of shard replicas consume a large amount of resources for data synchronization. We recommend that you improve QPS by increasing the number of shard replicas only if the uneven CPU utilization between worker nodes is caused by the uneven distribution of queries.

    • Data queries on the leader shard or follower shards have a millisecond-level latency.

    After you increase the number of shard replicas, the resources of each worker node are fully utilized, as shown in the following figure (balanced worker resource utilization).
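To make the sizing rule above concrete, the following sketch assumes an instance whose recommended shard count is 40 (an illustrative number; check Instance specifications for your instance) and a hypothetical table group named tg_sized. The hg_create_table_group function is not covered in this topic; the statements are a sketch, not a prescription. With replica_count set to 2, the table group should use 20 shards, because 20 × 2 = 40:

    -- Illustrative sizing: recommended shard count = 40 (assumption for this example).
    -- Create a table group with 20 shards so that 20 (shard_count) x 2 (replica_count) = 40.
    call hg_create_table_group ('tg_sized', shard_count => 20);

    -- Enable shard-level replication with two replicas per shard.
    call hg_set_table_group_property ('tg_sized', 'replica_count', '2');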

Limits

  • Only Hologres V1.1 and later support the shard-level replication feature.

    Note

    You can check the version of your Hologres instance on the instance details page in the Hologres console. If the version of your Hologres instance is V0.10 or earlier, you can manually upgrade your instance or join the Hologres DingTalk group to ask engineers to upgrade your instance. For more information, see the "Manual upgrade" section in Instance upgrades and Obtain online support for Hologres.

  • The value of the replica_count parameter must be less than or equal to the number of compute nodes. You can check the number of compute nodes on your Hologres instance on the instance details page of the Hologres console.

Syntax

  • Query the table groups in a database

    To query the table groups in the current database, execute the following statement:

    select * from hologres.hg_table_group_properties;
  • Query the replica count of each shard for a table group

    • Sample statement

      select property_value from hologres.hg_table_group_properties where tablegroup_name = 'table_group_name' and property_key = 'replica_count';
    • Parameters

      • table_group_name: the name of the table group that you want to query.

      • replica_count: the property that you want to query. Set the value to replica_count.

  • Enable the shard-level replication feature

    You can enable the shard-level replication feature by modifying the replica count of each shard for a table group.

    -- Modify the replica count of each shard for a table group.
    call hg_set_table_group_property ('<table_group_name>', 'replica_count', '<replica_count>');

    replica_count: the replica count of each shard for the table group. The value of this parameter must be less than or equal to the number of compute nodes.

  • Disable the shard-level replication feature

    • Sample statement

      -- Modify the replica_count parameter to disable the shard-level replication feature.
      call hg_set_table_group_property ('table_group_name', 'replica_count', '1');
    • Parameters

      The hg_set_table_group_property function modifies the replica count of each shard for a table group. The function takes the following parameters:

      • table_group_name: the name of the table group that you want to manage.

      • replica_count: the replica count of each shard for the table group.

      • If you set the replica_count parameter to the default value 1, the shard-level replication feature is disabled. A value greater than 1 specifies that the shard-level replication feature is enabled.

  • Check the shard metadata that is loaded to each worker node

    After you configure multiple replicas for each shard, you can check the shard metadata that is loaded to each worker node by executing the following SQL statement:

    SELECT * FROM hologres.hg_worker_info;
    Note

    Before the metadata of a shard is loaded to a worker node, the worker_id value for that shard may be empty.

    The returned result shows the worker node to which the metadata of each shard is loaded.

    In this example, data in tables of the olap_replica_2 table group is distributed to two shards named Shard 0 and Shard 1. Metadata of Shard 0 and Shard 1 is loaded to the 7tn8k and 9c8sl worker nodes.
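Putting the preceding statements together, a typical enable-and-verify sequence for a hypothetical table group named tg_demo looks like this:

    -- 1. Check the current replica count of the table group.
    select property_value from hologres.hg_table_group_properties
    where tablegroup_name = 'tg_demo' and property_key = 'replica_count';

    -- 2. Enable shard-level replication with two replicas per shard.
    call hg_set_table_group_property ('tg_demo', 'replica_count', '2');

    -- 3. Verify that the shard replicas are loaded to multiple worker nodes.
    SELECT * FROM hologres.hg_worker_info WHERE table_group_name = 'tg_demo';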

Configure shard policies to achieve high availability and high throughput of queries

Description

  • In this example, multiple replicas are configured for a shard. The shard replicas are loaded to multiple worker nodes. Queries are randomly routed to a shard replica on a worker node, as shown in the following figure.

  • If point queries time out, the system can route the point queries to the shard replica of another worker node. This ensures that results are returned for point queries. Point queries are processed by using fixed plans.

Parameters

  • hg_experimental_query_replica_mode: the shard policy that is used to route queries.

    Scenario: all queries

    Default value: leader_follower

    Value type: TEXT

    Valid values:

    • leader_follower: Queries are routed to the leader shard and follower shards based on their weights. This is the default value.

    • leader_only: Queries are routed only to the leader shard. In this case, even if the value of the replica_count parameter is greater than 1, the throughput and availability cannot be improved.

    • follower_only: Queries are routed only to follower shards. In this case, you must set the replica_count parameter to a value of at least 3 so that two or more follower shards exist and the throughput and availability can be improved.

    Example:

    -- Session-level configuration
    SET hg_experimental_query_replica_mode = leader_follower;

    -- Database-level configuration
    ALTER DATABASE <database_name>
    SET hg_experimental_query_replica_mode = leader_follower;

  • hg_experimental_query_replica_leader_weight: the weight of the leader shard based on which queries are routed.

    Scenario: all queries

    Default value: 100

    Value type: INT

    Valid values: 1 to 10000.

    Example:

    -- Session-level configuration
    SET hg_experimental_query_replica_leader_weight = 100;

    -- Database-level configuration
    ALTER DATABASE <database_name>
    SET hg_experimental_query_replica_leader_weight = 100;

    For online analytical processing (OLAP) point queries, if you set the replica_count parameter to a value greater than 1 for the table group of the table from which you want to query data, queries are routed to the leader shard and follower shards based on the values of the hg_experimental_query_replica_mode and hg_experimental_query_replica_leader_weight parameters. Examples:

    • Example 1: If you set the replica_count parameter to a value greater than 1 for the table group of the table from which you want to query data, and add the hg_experimental_query_replica_mode=leader_follower configuration, the system routes queries to the leader shard and follower shards based on the value of the hg_experimental_query_replica_leader_weight parameter. By default, the weight of the leader shard is 100, and the weight of each follower shard is also 100. If you set the replica_count parameter to 4, each shard has one leader replica and three follower replicas, so the leader shard and each follower shard each have a 25% probability of serving a query.

    • Example 2: If you set the replica_count parameter to a value greater than 1 for a table group of the table from which you want to query data, and set the hg_experimental_query_replica_mode parameter to leader_only, the system routes queries only to the leader shard.

    • Example 3: If you set the replica_count parameter to a value greater than 1 for the table group of the table from which you want to query data, and set the hg_experimental_query_replica_mode parameter to follower_only, the system routes queries only to follower shards. By default, the weight of each follower shard is 100. If you set the replica_count parameter to 4, each shard has one leader replica and three follower replicas. Queries are routed to the three follower shards, and each follower shard has a 1/3 probability of serving a query.

  • hg_experimental_query_replica_fixed_plan_ha_mode: the high-availability policy that is used to route point queries. Point queries are processed by using fixed plans.

    Scenario: point queries (processed by using fixed plans)

    Default value: any

    Value type: TEXT

    Valid values:

    • any: Queries are routed to shard replicas based on the values of the hg_experimental_query_replica_mode and hg_experimental_query_replica_leader_weight parameters. This is the default value.

    • leader_first: This value takes effect only if the hg_experimental_query_replica_mode parameter is set to leader_follower. Queries are preferentially routed to the leader shard. If the leader shard is unavailable due to reasons such as timeouts, queries are routed to follower shards.

    • off: Queries are routed only once and are not rerouted to another shard replica if the first attempt times out.

    Example:

    -- Session-level configuration
    SET hg_experimental_query_replica_fixed_plan_ha_mode = any;

    -- Database-level configuration
    ALTER DATABASE <database_name>
    SET hg_experimental_query_replica_fixed_plan_ha_mode = any;

  • hg_experimental_query_replica_fixed_plan_first_query_timeout_ms: the threshold that determines the timeouts of first-time point queries in high-availability scenarios. After the timeout threshold is reached or exceeded, point queries are routed to another available shard. Point queries are processed by using fixed plans. For example, hg_experimental_query_replica_fixed_plan_first_query_timeout_ms=60 specifies that if no result is returned for a query within 60 ms, the system routes the query to the shard on another worker node.

    Scenario: all queries

    Default value: 60

    Value type: INT

    Valid values: 0 to 10000.

    Example:

    -- Session-level configuration
    SET hg_experimental_query_replica_fixed_plan_first_query_timeout_ms = 60;

    -- Database-level configuration
    ALTER DATABASE <database_name>
    SET hg_experimental_query_replica_fixed_plan_first_query_timeout_ms = 60;
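As an example of combining these parameters, the following database-level configuration (the database name is a placeholder) preferentially routes point queries to the leader shard and reroutes them to a follower shard if no result is returned within 60 ms:

    -- Route queries to both the leader shard and follower shards.
    ALTER DATABASE <database_name>
    SET hg_experimental_query_replica_mode = leader_follower;

    -- For point queries, prefer the leader shard and fail over on timeout.
    ALTER DATABASE <database_name>
    SET hg_experimental_query_replica_fixed_plan_ha_mode = leader_first;

    -- Reroute a first-time point query if no result is returned within 60 ms.
    ALTER DATABASE <database_name>
    SET hg_experimental_query_replica_fixed_plan_first_query_timeout_ms = 60;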
    

Suggestions for different scenarios

Scenario 1: Multi-replica for high throughput

  • Scenario description: The monitoring information indicates that the CPU utilization of the instance is at a medium level, but the CPU utilization of one worker node is excessively low and the CPU utilization of the other worker node is excessively high. This may be caused by the uneven distribution of queries on worker nodes. Most queries are routed only to a few shards. In this case, you can increase the number of shard replicas to allow more worker nodes to have shard replicas. This can effectively improve resource utilization and QPS.

  • Procedure

    1. Increase the number of shard replicas:

      In this example, a table group named tg_replica exists in a database. Execute the following SQL statement to set the number of replicas to 2.

      -- Set the number of replicas for tables in the tg_replica table group to 2.
      call hg_set_table_group_property ('tg_replica', 'replica_count', '2');

      The system has the following default settings:

      • hg_experimental_query_replica_mode=leader_follower

      • hg_experimental_query_replica_leader_weight=100

      After you increase the number of replicas, the system randomly routes queries to the worker nodes that correspond to the leader shard and follower shards. This way, QPS can be increased even for hot data queries.

    2. Check whether a shard replica is loaded to each worker node:

      Execute the following statement to check whether a shard replica is loaded to each worker node:

      SELECT * FROM hologres.hg_worker_info WHERE table_group_name = 'tg_replica';

      The returned result indicates that shard replicas are loaded to multiple worker nodes. The setting takes effect.

Scenario 2: Multi-replica for high availability

  • Scenario description: You want to resolve the issue that queries fail when the shard on which the queries are performed fails.

  • Procedure:

    1. Increase the number of shard replicas:

      In this example, a table group named tg_replica exists in a database. Execute the following SQL statement to set the number of replicas to 2.

      -- Set the number of replicas for tables in the tg_replica table group to 2.
      call hg_set_table_group_property ('tg_replica', 'replica_count', '2');

      The system has the following default settings:

      • hg_experimental_query_replica_mode=leader_follower

      • hg_experimental_query_replica_fixed_plan_ha_mode=any

      • hg_experimental_query_replica_fixed_plan_first_query_timeout_ms=60

      After you increase the number of replicas, the system performs different operations in different scenarios:

      • In OLAP scenarios, the system randomly routes queries to the worker nodes that correspond to the leader shard and follower shards. When you query data, the master node periodically checks whether each shard replica is available. The system removes unavailable shard replicas and routes queries to available shard replicas. When the unavailable shard replicas become available again, the system can route queries to the shard replicas. The system requires 5 seconds to detect whether a shard replica is available and 10 seconds to remove the FE node that corresponds to the worker node to which an unavailable shard replica is loaded. In total, the system requires 15 seconds to detect the issue and resume the query. After 15 seconds, the query can be performed as expected.

      • In fixed plan scenarios, the system provides a retry mechanism. If a worker node fails, queries can be routed to the shard replicas of another worker node. The queries can still complete, but the response time increases.

      • In some fixed plan scenarios in which data must be queried immediately after it is written, you can set the hg_experimental_query_replica_fixed_plan_ha_mode parameter to leader_first to avoid the synchronization latency of follower shards. In this configuration, queries are preferentially routed to the leader shard. If the queries on the leader shard time out, the queries are routed to follower shards.

        Note

        In this case, QPS cannot be increased for hot data queries.

    2. Check whether a shard replica is loaded to each worker node.

      Execute the following statement to check whether a shard replica is loaded to each worker node:

      SELECT * FROM hologres.hg_worker_info WHERE table_group_name = 'tg_replica';

      The returned result indicates that shard replicas are loaded to multiple worker nodes. The setting takes effect.

FAQ

  • Problem description: After I configure parameters as described in Scenario 1, queries are not routed to follower shards. The monitoring information in the Hologres console indicates that the worker node with a high load still has a high load after the configuration.

  • Cause: For versions earlier than Hologres V1.3, the Grand Unified Configuration (GUC) parameter hg_experimental_enable_read_replica is provided to specify whether queries can be routed to follower shards. The default value is off. You can execute the following SQL statement to check whether the GUC parameter is set to on. The value on specifies that queries can be routed to follower shards. The value off specifies that queries cannot be routed to follower shards.

    SHOW hg_experimental_enable_read_replica;
  • Solution: If the value of the hg_experimental_enable_read_replica parameter is off, you can execute the following SQL statement to set the GUC parameter to on at the database level.

    ALTER DATABASE <database_name> SET hg_experimental_enable_read_replica = on;

    Replace database_name with your database name.
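The complete check-and-fix sequence for this FAQ is as follows. Note that a database-level setting typically takes effect only for new connections, so reconnect before you verify:

    -- 1. Check whether queries can be routed to follower shards.
    SHOW hg_experimental_enable_read_replica;

    -- 2. If the value is off, enable follower-shard routing at the database level.
    ALTER DATABASE <database_name> SET hg_experimental_enable_read_replica = on;

    -- 3. Reconnect to the database, then confirm the new value.
    SHOW hg_experimental_enable_read_replica;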