In scenarios such as real-time data analytics or log collection scenarios that require real-time data processing or large-scale data collection, a large number of concurrent read and write requests may exist. In this case, you can specify an appropriate table group and an appropriate shard count to implement data distribution and load balancing. This helps optimize the query performance and improve the data storage and computing efficiency.
Recommended settings about table groups and shard counts
In Hologres, you can specify table groups and shard counts based on your business requirements. Compared with similar services, Hologres allows you to more flexibly and easily create custom schemas based on specific business scenarios. With a good understanding of your own business, you can make full use of the high performance of Hologres. When you specify table groups, we recommend that you comply with the following rules:
If unnecessary, do not create a table group. We recommend that you use the default table group. For more information about specifications of Hologres instances, see Instance specifications.
For an instance that has more than 256 CPU cores, we recommend that you specify multiple table groups based on your business requirements. This way, business workloads can be balanced.
If the data amount is large, you can create an independent table group with a large shard count.
You can create an independent table group with a small shard count for a group of tables with a small data amount. This reduces the query startup overhead.
You must add the tables involved in local join operations to the same table group.
Do not configure a table group for each table.
When you specify a shard count, you must consider the number of worker nodes and subsequent scale-out. We recommend that you set the shard count to a multiple of the number of worker nodes. This way, computing resources can be evenly allocated.
The shard count of an existing table group cannot be changed. If you want to change the shard count, create a table group.
We recommend that you do not set a shard count to a large value. Otherwise, computing resources might not be allocated to some shards and additional overheads are generated. In Hologres V2.0 and later, an upper limit on the shard count for a table group and an upper limit on the shard count for an instance are configured by default based on the following rules. This prevents uneven allocation of computing resources due to excessive shards. If an upper limit is exceeded, the error message too many shards in this instance is reported.
Maximum shard count for a table group = Default shard count for a table group × 2
Maximum shard count for an instance = Default shard count for a table group × 8
You can set the Grand Unified Configuration (GUC) parameter to off to cancel the upper limits. However, if the upper limits are canceled and you set the shard counts to larger values, computing resources might not be allocated to some shards. Therefore, we recommend that you do not cancel the upper limits.
set hg_experimental_enable_shard_count_cap =off;
The upper limits on the shard counts for a table group and an instance vary based on the instance specifications.
Instance specification
Default number of compute nodes
Default shard count (for Hologres V0.10.31 and later)
Upper limit on the shard count for a table group (for Hologres V2.0 and later)
Upper limit on the shard count for an instance (for Hologres V2.0 and later)
32 Core
2
20
40 (20 × 2)
160 (20 × 8)
64 Core
4
40
80 (40 × 2)
320 (40 × 8)
96 Core
6
60
120 (60 × 2)
480 (60 × 8)
128 Core
8
80
160 (80 × 2)
640 (80 × 8)
160 Core
10
80
160 (80 × 2)
640 (80 × 8)
192 Core
12
80
160 (80 × 2)
640 (80 × 8)
256 Core
16
120
240 (120 × 2)
960 (120 × 8)
384 Core
24
160
320 (160 × 2)
1,280 (160 × 8)
512 Core
32
160
320 (160 × 2)
1,280 (160 × 8)
......
......
M
M*2
M*8
Permissions
If you want to create a table group, change the default table group, drop a table group, or use the resharding feature to migrate tables to a new table group, you must be assigned the superuser role of a Hologres instance. As a superuser, you can execute the following statement to assign the superuser role to a regular user:
-- Replace Account ID with your Alibaba Cloud account ID. If a RAM user is used, prefix the account ID with p4_. ALTER USER "Account ID" SUPERUSER;
If you want to create a table and add it to a specified table group, you must be granted the permission to create a table.
Query the metadata of table groups
Query the default table group.
SELECT * FROM hologres.hg_table_group_properties WHERE tablegroup_name IN ( SELECT tablegroup_name FROM hologres.hg_table_group_properties WHERE property_key = 'is_default_tg' AND property_value = '1' );
The following result is returned:
NoteIn the returned result, the is_default_tg property indicates whether the table group is the default table group, and the shard_count property indicates the shard count of the table group.
tablegroup_name | property_key | property_value -----------------+------------------+---------------- test_tg_default | tg_version | 1 test_tg_default | table_num | 1 test_tg_default | is_default_tg | 1 test_tg_default | shard_count | 3 test_tg_default | replica_count | 1 test_tg_default | created_manually | 0 (6 rows)
Query table groups in the current database.
SELECT tablegroup_name FROM hologres.hg_table_group_properties GROUP BY tablegroup_name;
The following result is returned:
tablegroup_name ----------------- test_tg_default (1 row)
Query the shard count of a table group.
SELECT property_value AS shard_count FROM hologres.hg_table_group_properties WHERE property_key = 'shard_count' AND tablegroup_name ='<tg_name>';
The following result is returned:
shard_count ------------- 3 (1 row)
Query tables in a table group.
SELECT table_namespace AS schema_name, table_name FROM hologres.hg_table_properties WHERE property_key = 'table_group' AND property_value = '<tg_name>';
The following result is returned:
schema_name | table_name -------------+------------ public | a (1 row)
Query the table group to which a table belongs.
SELECT property_value AS table_group_name FROM hologres.hg_table_properties WHERE property_key = 'table_group' AND table_name = '<table_name>';
The following result is returned:
table_group_name ------------------ test_tg_default (1 row)
Create a table group
If you have new business or you want to change the shard count of an existing table group, you can create a table group by using the following syntax.
After a table group is created, the tables and data in the original table group are not automatically migrated to the new table group.
The original table group becomes ineffective only after its tables and data are migrated to the new table group or are dropped.
CALL HG_CREATE_TABLE_GROUP ('<new_tg_name>', <shard_count>);
The following table describes the parameters in the preceding syntax.
Parameter | Data type | Description |
new_tg_name | Text | The name of the table group that you want to create. |
shard_count | INT4 | The shard count for the table group. |
Examples:
-- Create a table group that is named tg_8 and has eight shards.
CALL HG_CREATE_TABLE_GROUP ('tg_8', 8);
Change the default table group
After you create a database, a default table group with a default shard count is created for the database in a Hologres instance. For more information, see Instance types. If the database contains multiple table groups, you can execute the following statement to change the default table group to store subsequently created tables in the new default table group.
Hologres V0.9 and later allow you to execute the following statement to change the default table group. If the version of your Hologres instance is earlier than V0.9, upgrade your instance first.
CALL HG_UPDATE_DATABASE_PROPERTY ('default_table_group', '<tg_name>');
The following table describes the parameter in the preceding syntax.
Parameter | Data type | Description |
tg_name | TEXT | The name of the new default table group. After the default table group is changed, the shard count of the new default table group is the number of shards in the table group. |
Examples:
-- In Hologres V0.9 and later, you can configure a newly created table group as the default table group. After the setting takes effect, subsequently created tables are stored in the new default table group.
CALL HG_UPDATE_DATABASE_PROPERTY ('default_table_group', 'tg_8');
Create a table and add it to a specified table group
You can execute the following statements to create a table and explicitly add it to a specified table group:
BEGIN;
CREATE TABLE table_name (
col type,
....
);
CALL SET_TABLE_PROPERTY('table_name', 'table_group', '<tg_name>');
COMMIT;
The following table describes the parameters in the preceding syntax.
Parameter | Data type | Description |
table_name | TEXT | The name of the table that you want to create. |
tg_name | TEXT | The name of the specified table group. The shard count of the table group is the number of shards in the table group. |
Examples:
-- Create a table named tbl1 and add it to the table group named tg_8.
BEGIN;
CREATE TABLE tbl1 (
col1 text
);
CALL SET_TABLE_PROPERTY('tbl1', 'table_group', 'tg_8');
COMMIT;
Use the resharding feature to migrate tables to a new table group
In Hologres, shards are used to improve the parallelism of data processing. You can set the shard count to an appropriate value to improve the query or write performance. In most cases, the default shard count for a Hologres instance can meet your requirements. You do not need to change the shard count.
For example, after a Hologres instance is scaled out from 32 CPU cores to 128 CPU cores, the shard count for existing databases in the Hologres instance does not change. If you want to improve the performance of a database, we recommend that you increase the shard count for the database. If you create a database in the Hologres instance, the default shard count for the current instance specifications applies. For more information about instance specifications and shards, see Instance specifications.
After a Hologres instance is scaled in or out, the shard count for existing databases does not change. You can execute statements to change the shard count and import data again. Hologres provides the resharding feature. After you change the shard count, you can use the resharding feature to rebalance the existing tables and data among shards. You do not need to create tables and import data again. This simplifies the procedure and achieves optimal performance.
Limits
The resharding feature is supported only in Hologres V0.10 and later. You can view the version of your Hologres instance on the instance details page in the Hologres console. If the version of your Hologres instance is earlier than V0.10, manually upgrade your Hologres instance in the Hologres console or join a Hologres DingTalk group to apply for an instance upgrade. For more information about how to manually upgrade a Hologres instance, see Instance upgrades. For more information about how to join a Hologres DingTalk group, see Obtain online support for Hologres.
The resharding operation is performed on a single table at a time. When you perform the resharding operation on a table, data writes by using Realtime Compute for Apache Flink or Data Integration must be stopped. You can still query data in the table. In Hologres V1.1 and later, you can specify
set table readonly
to implement automatic failover for real-time data writes.Resharding consumes CPU resources and additional storage resources. We recommend that you perform the resharding operation during off-peak hours.
If binary logging is enabled for a table, you must disable binary logging before you perform the resharding operation on the table. You can enable binary logging again after resharding is complete. For more information, see Subscribe to Hologres binary logs.
You cannot perform the resharding operation on a table if the table contains an auto-increment field of the SERIAL data type or the table is created with the DEFAULT constraint. If the table contains an auto-increment field of the SERIAL data type, an error occurs during resharding. If the table is created with the DEFAULT constraint, the DEFAULT constraint becomes ineffective after resharding is complete.
You cannot perform the resharding operation on a table that has dependent objects, such as a view. You must remove the relevant dependent objects before you perform the resharding operation. Otherwise, the error message
ERROR: resharding table xxx can not executed because other objects depend on it
is reported.You can perform the resharding operation in only databases that use the simple permission model (SPM). Databases that use the standard PostgreSQL authorization model do not support resharding. For more information about Hologres permission models, see Overview.
The dynamic partitioning property is not retained after the resharding operation.
In Hologres V2.0 and later, column comments are retained after the resharding operation. If the version of your Hologres instance is earlier than V2.0, back up the column comments before the resharding operation and manually add the comments after the resharding operation is complete.
Syntax
If the version of your Hologres instance is V2.0.24 or later, we recommend that you perform the resharding operation in the HoloWeb console. For more information, see Table resharding.
If the version of your Hologres instance is earlier than V2.0.24, you can execute the following statement to migrate some business tables to a new table group. If the version of your Hologres instance is earlier than V0.10, upgrade your instance.
NoteYou must create a table group before you can migrate tables to the table group. For more information about how to create a table group, see Create a table group.
When you migrate tables, you must stop writing data to the tables. You can still query data from these tables.
After all tables in the original table group are migrated, the original table group is automatically dropped. If you want to create multiple table groups based on your business requirements, we recommend that you set the shard count to an appropriate value for each table group.
When you migrate partitioned tables, you need to migrate only parent tables.
-- Syntax supported by Hologres V1.1 and later: CALL HG_MOVE_TABLE_TO_TABLE_GROUP('<table_name>','<new_table_group_name>'); -- Syntax supported by Hologres V0.10: CALL HG_UPDATE_TABLE_SHARD_COUNT('<table_name>','<new_table_group_name>');
Parameters
Parameter
Description
Example
table_name
The name of the table that you want to migrate. You can execute one of the preceding statements to migrate a single table at a time. If you want to migrate multiple tables, execute the statement multiple times. If the table that you want to migrate is a partitioned table, you need to migrate only the parent table.
new_table
table_group_name
The name of the table group to which you want to migrate a table.
new_tg
Exception handling
During table resharding, the table is configured as read-only and then migrated to the new table group by using a temporary table. Resharding tasks involve data import and are time-consuming. During the process, exceptions, such as out of memory (OOM) and manual termination of tasks, may occur. When the exceptions occur, the original table is still in the read-only state, and a temporary table named <initial_table_name>_xxxxxxxx exists in the database. To handle the exceptions, follow the following instructions:
If the version of your Hologres instance is earlier than V2.0.24, upgrade the instance or submit a ticket. For more information about how to upgrade an instance, see Instance upgrades.
If the version of your Hologres instance is V2.0.24 or later, comply with the following rules:
If you perform the resharding operation in the HoloWeb console, click Continue or Cancel as required. For more information, see Table resharding.
If you perform the resharding operation by executing the preceding statement, perform one of the following operations to continue or cancel the resharding operation:
If you want to continue the resharding operation, troubleshoot the issue and execute the CALL HG_MOVE_TABLE_TO_TABLE_GROUP statement on the table again.
If you want to cancel the resharding operation and restore to the original state, execute the following statements in sequence:
-- Disable the read-only property of the original table. CALL set_table_property('<schema_name>.<table_name>','readonly','false'); -- Clear the temporary table. -- Obtain the name of the temporary partitioned table. SELECT schema_name,target_temp_table_name FROM hologres.hg_resharding_properties WHERE reshard_table_name = <schema_name>.<table_name> AND is_parent_table IS TRUE; -- Obtain the name of the temporary non-partitioned table. SELECT schema_name,target_temp_table_name FROM hologres.hg_resharding_properties WHERE reshard_table_name = <schema_name>.<table_name> AND is_parent_table IS FALSE AND is_sub_table IS FALSE; -- Clear the temporary table. DROP TABLE IF EXISTS <schema_name>.<target_temp_table_name>; -- Clear the system table. The system table records the resharding progress of the current table. If you want to cancel the resharding task, you must clear the data in the system table. CALL hologres.hg_internal_clear_resharding_properties('<schema_name>.<table_name>');
Drop a table group
You can execute the following statement to drop an empty table group. If a table exists in a table group, you cannot drop the table group.
CALL HG_DROP_TABLE_GROUP('<tg_name>');
Examples:
-- Drop the table group named tg_8.
CALL HG_DROP_TABLE_GROUP('tg_8');
Query the allocation relationship between shards and worker nodes
Shards need to be evenly allocated to workers. For more information, see the Relationship between shard counts and worker compute nodes section in the "Basic concepts" topic. If shards are not evenly allocated to workers, a resource skew may occur, which causes inefficient use of resources. Hologres V1.3 allows you to query the allocation relationship among worker nodes, table groups, and shard counts in the current database in the worker_info system view. This helps resolve the issue of uneven resource allocation. For more information about how to query the shard allocation among worker nodes, see Query the shard allocation among workers.
Best practices
In most cases, we recommend that you do not create a table group or change the shard count of a table group. If you want to create table groups based on your business requirements, you can plan the table groups based on the best practices. For more information, see Best practices for specifying table groups.
Troubleshooting
Resharding is a complex process that involves multiple steps, such as the creation of temporary tables, changes in the read-only state of source tables, data writes to destination tables, changes in table names, and recording of the synchronization status. If an exception occurs when you perform one or more steps, the system may enter an unknown state. In this case, you can use the following solution for troubleshooting.
If the error message internal error: Get rundown is not allowed in recovering state
is returned, the table that is being updated is in the read-only state. In this case, you cannot perform an operation such as the INSERT, UPDATE, or DELETE operation on the table unless the table is not in the read-only state.
Execute the following statement to query the table that is in the read-only state:
select * from hologres.hg_table_properties where property_key ='readonly' and property_value='true';
Execute the following statement to disable the read-only state for the table:
call set_table_property('<table_name>','readonly','false');
table_name specifies the name of the table for which you want to disable the read-only state.