This topic describes the syntax, clauses, and parameters of the CREATE TABLE statement that is used to create a partitioned table, and partitioning methods that can be used in the CREATE TABLE statement. The syntax is applicable only to databases in AUTO mode.
Usage notes
Before you use the syntax that is used to create a partitioned table, make sure that you set the mode parameter to auto when you create the current logical database. The value auto indicates the automatic partitioning mode. If you do not set this parameter to auto, you cannot use the syntax. You can execute the
SHOW CREATE DATBASE db_name
statement to query the partitioning mode of the current logical database. Example:CREATE DATABASE part_db mode='auto'; Query OK, 1 row affected (4.29 sec) SHOW CREATE DATABASE part_db; +----------+-----------------------------------------------+ | DATABASE | CREATE DATABASE | +----------+-----------------------------------------------+ | part_db | CREATE DATABASE `part_db` /* MODE = 'auto' */ | +----------+-----------------------------------------------+ 1 row in set (0.18 sec)
For more information about the syntax that is used to create a database, see CREATE DATABASE.
If the primary key of a partitioned table does not contain a partition key and is not an auto-increment primary key, make sure that the primary key is unique.
If features related to level-2 partitions are required when a partitioned table is being created, the version of your PolarDB-X instance is 5.4.17-16952556 or later.
Syntax
CREATE [PARTITION] TABLE [IF NOT EXISTS] tbl_name
(create_definition, ...)
[table_options]
[table_partition_definition]
[local_partition_definition]
create_definition:
col_name column_definition
| mysql_create_definition
| [UNIQUE] GLOBAL INDEX index_name [index_type] (index_sharding_col_name,...)
[global_secondary_index_option]
[index_option] ...
index_sharding_col_name:
col_name [(length)] [ASC | DESC]
index_option:
KEY_BLOCK_SIZE [=] value
| index_type
| WITH PARSER parser_name
| COMMENT 'string'
index_type:
USING {BTREE | HASH}
# Create a global secondary index.
global_secondary_index_option:
[COVERING (col_name,...)]
[partition_options]
[VISIBLE|INVISIBLE]
table_options:
table_option [[,] table_option] ...
table_option: {
# Specify the table group to which the partitioned table belongs.
TABLEGROUP [=] value,...,}
# Specify the type of the partitioned table.
table_partition_definition:
single
| broadcast
| partition_options
# Specify the partition.
partition_options:
partition_columns_definition
[subpartition_columns_definition]
[subpartition_specs_definition]/* Specify the templated level-2 partition.*/
partition_specs_definition
# Specify the partition key column of the level-1 partition.
partition_columns_definition:
PARTITION BY
HASH({column_name | partition_func(column_name)}) partitions_count
| KEY(column_list) partitions_count
| RANGE ({column_name | partition_func(column_name)})
| RANGE COLUMNS(column_list)
| LIST ({column_name | partition_func(column_name)})
| LIST COLUMNS(column_list)
| CO_HASH({column_expr_list}) partitions_count
# Specify the partition key column of the level-2 partition.
subpartition_columns_definition:
SUBPARTITION BY
HASH({column_name | partition_func(column_name)}) subpartitions_count
| KEY(column_list) subpartitions_count
| RANGE ({column_name | partition_func(column_name)})
| RANGE COLUMNS(column_list)
| LIST ({column_name | partition_func(column_name)})
| LIST COLUMNS(column_list)
| CO_HASH({column_expr_list}) partitions_count
column_expr_list:
{column_name | partition_func(column_name)},{column_name | partition_func(column_name)}[,{column_name | partition_func(column_name)},...]
partitions_count:
PARTITIONS partition_count
subpartitions_count:
SUBPARTITIONS partition_count
# Specify the partitioning function.
partition_func:
YEAR
| TO_DAYS
| TO_MONTHS
| TO_WEEKS
| TO_SECOND
| UNIX_TIMESTAMP
| MONTH
| DAYOFWEEK
| DAYOFMONTH
| DAYOFYEAR
| SUBSTR
| SUBSTRING
| RIGHT
| LEFT
# Specify three types of level-1 partitions.
partition_specs_definition:
hash_partition_list
| range_partition_list
| list_partition_list
# Specify three types of level-2 partitions.
subpartition_specs_definition:
hash_subpartition_list
| range_subpartition_list
| list_subpartition_list
# Specify the HASH or KEY subpartition of the level-1 partition.
hash_partition_list:
/* All subpartitions in the level-1 partition are of the HASH partitioning type.*/
| ( hash_partition [, hash_partition, ...] )
hash_partition:
PARTITION partition_name [partition_spec_options] /* Specify the pure level-1 partition or templated subpartition.*/
| PARTITION partition_name subpartitions_count [subpartition_specs_definition] /* Specify the non-templated subpartition under the level-1 partition.*/
# Specify the HASH or KEY subpartition of the level-2 partition.
hash_subpartition_list:
| empty
| ( hash_subpartition [, hash_subpartition, ...] )
hash_subpartition:
SUBPARTITION subpartition_name [partition_spec_options]
# Specify the RANGE or RANGE COLUMNS subpartition of the level-1 partition.
range_partition_list:
( range_partition [, range_partition, ... ] )
range_partition:
PARTITION partition_name VALUES LESS THAN (range_bound_value) [partition_spec_options] /* Specify the pure level-1 partition or templated subpartition.*/
| PARTITION partition_name VALUES LESS THAN (range_bound_value) [[subpartitions_count] [subpartition_specs_definition]] /* Specify the non-templated subpartition under the level-1 partition.*/
# Specify the RANGE or RANGE COLUMNS subpartition of the level-2 partition.
range_subpartition_list:
( range_subpartition [, range_subpartition, ... ] )
range_subpartition:
SUBPARTITION subpartition_name VALUES LESS THAN (range_bound_value) [partition_spec_options]
range_bound_value:
maxvalue /* Specify the maximum number of RANGE partitions.*/
| expr /* Specify the range boundary value for a single partition key column.*/
| value_list /* Specify the range boundary values for multiple partition key columns.*/
# Specify the LIST or LIST COLUMNS subpartition of the level-1 partition.
list_partition_list:
(list_partition [, list_partition ...])
list_partition:
PARTITION partition_name VALUES LESS THAN (range_bound_value) [partition_spec_options] /* Specify the pure level-1 partition or templated subpartition.*/
| PARTITION partition_name VALUES IN (list_bound_value) [[subpartitions_count] [subpartition_specs_definition]] /* Specify the non-templated subpartition under the level-1 partition.*/
# Specify the LIST or LIST COLUMNS subpartition of the level-2 partition.
list_subpartition_list:
(list_subpartition [, list_subpartition ...])
list_subpartition:
SUBPARTITION subpartition_name VALUES IN (list_bound_value) [partition_spec_options]
list_bound_value:
default /* Specify the default LIST partition.*/
| value_set
value_set:
value_list /* Specify a set of values for a single partition key column.*/
| (value_list) [, (value_list), ...] /* Specify a set of values for multiple partition key columns.*/
value_list:
value [, value, ...]
partition_spec_options:
[[STORAGE] ENGINE [=] engine_name]
[COMMENT [=] 'string']
[LOCALITY [=] locality_option]
table_option:
[[STORAGE] ENGINE [=] engine_name]
[COMMENT [=] 'string']
[{CHARSET | CHARACTER SET} [=] charset]
[COLLATE [=] collation]
[TABLEGROUP [=] table_group_id]
[LOCALITY [=] locality_option]
locality_option:
'dn=storage_inst_id_list'
storage_inst_id_list:
storage_inst_id[,storage_inst_id_list]
local_partition_definition:
LOCAL PARTITION BY RANGE (column_name)
[STARTWITH 'yyyy-MM-dd']
INTERVAL interval_count [YEAR|MONTH|DAY]
[EXPIRE AFTER expire_after_count]
[PRE ALLOCATE pre_allocate_count]
[PIVOTDATE pivotdate_func]
[DISABLE SCHEDULE]
pivotdate_func:
NOW()
| DATE_ADD(...)
| DATE_SUB(...)
The syntaxes of PolarDB-X DDL statements are developed based on that of MySQL. The preceding content indicates the differences between the syntaxes of PolarDB-X and MySQL. For more information about the detailed syntaxes, see MySQL documentation.
Terms
Partition key: one or more columns based on which a table is horizontally partitioned.
Partition key column: a column based on which PolarDB-X routes data to the partitions of the table. A partition key column is a part of the partition key of the table. A partition key can contain one or more partition key columns.
Vector partition key: a partition key that contains one or more partition key columns.
Single-column partition key: a partition key that contains only one partition key column.
Prefix partition key column: If a vector partition key contains N partition key columns, the first K partition key columns are defined as a prefix partition key column of the vector partition key. N is a number that is greater than 1, and K is a number that is greater than or equal to 1 and smaller than or equal to N.
Partitioning function: the function that uses partition key columns as input parameters and returns results based on which PolarDB-X routes data to the partitions of the table.
Partition pruning: a feature that is used to optimize queries by filtering out the partitions that do not need to be scanned based on the partitioning configurations and query conditions.
Hot partition splitting: a feature that is used to balance the workload of partitions. When hot data exists in a prefix partition key column of the vector partition key, you can split hot data into multiple partitions based on another partition key column.
Physical partition: the partition in a data node that corresponds to a physical table shard. A physical partition corresponds to a physical table shard.
Logical partition: a virtual partition that corresponds to one or more physical partitions. Logical partition represents a logical concept. For example, when you create a partitioned table that contains level-2 partitions, the level-1 partitions of the level-2 partitions are logical partitions.
Parameters
Component | description |
CHARSET | CHARACTER SET | Specifies the default character set that is used for columns in the partitioned table. Valid values:
|
COLLATE | Specifies the default string that is used for columns in the partitioned table. Valid values:
|
TABLEGROUP | Specifies the table group to which the partitioned table belongs. If this parameter is not specified, PolarDB-X automatically assigns the partitioned table to an existing table group or creates a table group. In a table group, all tables must be partitioned by using the same partitioning method. |
LOCALITY | Specifies the data nodes on which the partitioned table is deployed. |
Non-partitioned tables
PolarDB-X allows you to specify the SINGLE keyword to create a non-partitioned table. Sample statement:
CREATE TABLE single_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
primary key(id)
) SINGLE;
Broadcast tables
PolarDB-X allows you to specify the BROADCAST keyword to create a broadcast table. When you create a broadcast table in a PolarDB-X instance, the broadcast table is replicated on each data node of the PolarDB-X instance. Sample statement:
CREATE TABLE broadcast_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
primary key(id)
) BROADCAST;
Partitioned tables
Partitioning types
PolarDB-X allows you to configure the syntax in the clause that is used to create a partition to create a partitioned table that suits your business requirements. PolarDB-X provides the following partitioning types:
HASH partitioning: This partitioning type uses the built-in consistent hashing to calculate the hash value of a specified expression that contains a partitioning function or partition key column and routes data to partitions.contains a partitioning function. The HASH partitioning type includes KEY partitioning and HASH partitioning based on whether expressions containing partitioning functions are used or partition key columns are used as partition keys.
RANGE partitioning: This partitioning type compares and calculates the value of a specified partition key column or the value returned by a specified expression that contains a partitioning function to determine the range of predefined partitions in which data is distributed and routes the data to partitions. The RANGE partitioning type includes RANGE COLUMNS partitioning and RANGE partitioning based on whether expressions containing partitioning functions are used or partition key columns are used as partition keys.
LIST partitioning: This partitioning type compares and calculates the value of a specified partition key column or the value returned by a specified expression that contains a partitioning function to determine the value collection of predefined partitions in which data is distributed and routes the data to partitions. This type of partitioning method is similar to the RANGE partitioning method. The LIST partitioning type includes LIST COLUMNS partitioning and LIST partitioning based on whether multiple partition key columns are used as partition keys and the usage method.
COHASH partitioning: As a new partitioning type provided by PolarDB-X, it is suitable for scenarios where a table is partitioned based on different partition key columns whose values are similar.
HASH partitioning
The HASH partitioning type in PolarDB-X includes HASH partitioning and KEY partitioning. HASH partitioning and KEY partitioning are supported by the native MySQL syntax. PolarDB-X is syntactically compatible with the syntaxes of HASH partitioning and KEY partitioning that are used to create partitioned tables in MySQL to provide flexible and powerful partition management capabilities such as splitting, merging, and migration and support hot partition splitting in vector partition keys. However, PolarDB-X routes data to partitions in a different manner from MySQL. PolarDB-X re-specifies data routing in HASH partitioning and KEY partitioning. For more information about the syntaxes of different partitioning types, see Partitioning Types. The following table lists the differences between KEY partitioning and HASH partitioning.
Partitioning type | Supported partition key | Whether support partitioning functions | Statement syntax | Limits | Routing policy (point query) |
KEY partitioning (default) | Single-column partition key | Disabled | PARTITION BY KEY(c1) |
|
|
Vector partition key | Disabled | PARTITION BY KEY(c1,c2,...,cn) |
|
| |
Hash | Single-column partition key | Disabled | PARTITION BY HASH(c1) |
| The routing strategy of PARTITION BY HASH(c1) is the same as that of PARTITION BY KEY(c1). |
Enabled | PARTITION BY HASH(YEAR(c1)) |
| |||
Vector partition key | Disabled | PARTITIONBY HASH(c1,c2,...,cn) |
|
|
Example 1-1:KEY partitioning
KEY partitioning is the default partitioning type for PolarDB-X. KEY partitioning supports vector partition keys. If you want to use the name and id columns as the partition key of a table and set the number of partitions to 8, you can execute the following statement to create the table:
CREATE TABLE key_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
birthday datetime not null,
primary key(id)
)
PARTITION BY KEY(name, id)
PARTITIONS 8;
In this example, a vector partition key that consists of the name and id columns is used to partition the table. By default, data is routed to partitions based on only values in the name column. Therefore, to optimize an SQL statement by using partition pruning, you need to specify a condition only based on the name column in the WHERE clause of the SQL statement.
## The following SQL statement meets the partition pruning condition and scans data only in one partition.
SELECT id from key_tbl where name='Jack';
If data in the name column is not evenly distributed or specific data is frequently accessed, you can use another partition key column in the vector, such as the id column, to split partitions for performance optimization. For more information, see ALTER TABLEGROUP.
If a vector partition key consists of N partition key columns and data is routed to the first K (1 ≤ K ≤ N) partition key columns, you need to specify a condition for partition pruning only based on the first K columns of the vector partition key in the WHERE clause of the SQL statement.
Example 1-2: HASH partitioning
If you want to use the id column as the partition key to horizontally partition a table and set the number of partitions to 8, you can execute the following statement to create the table by using HASH partitioning:
CREATE TABLE hash_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
birthday datetime not null,
primary key(id)
)
partition by hash(id)
partitions 8;
In HASH partitioning, you can use partitioning functions, such as YEAR and TO_DAYS, to convert timestamps to integers. If you want to use the birthday column as the partition key of a table and set the number of partitions to 8, you can execute the following statement to create the table:
CREATE TABLE hash_tbl_todays(
id bigint not null auto_increment,
bid int,
name varchar(30),
birthday datetime not null,
primary key(id)
)
PARTITION BY HASH(TO_DAYS(birthday))
PARTITIONS 8;
The following partitioning functions are supported by PolarDB-X:
YEAR
MONTH
DAYOFMONTH
DAYOFWEEK
DAYOFYEAR
TO_DAYS
TO_MONTHS
TO_WEEKS
TO_SECONDS
UNIX_TIMESTAMP
SUBSTR/SUBSTRING
Therefore, except that the data types of partition key columns of the SUBSTR or SUBSTRING function must be STRING, the data types of partition key columns of other partitioning functions must be the following types: DATE, DATETIME, and TIMESTAMP.
Example 1-3: Extended HASH partitioning
PolarDB-X extends the syntax of HASH partitioning and allows you to use vector partition keys that are supported by the native MySQL syntax. For information about the syntaxes of different partitioning types, see Partitioning Types. Sample statement:
CREATE TABLE hash_tbl2(
id bigint not null auto_increment,
bid int,
name varchar(30),
birthday datetime not null,
primary key(id)
)
PARTITION BY HASH(name, birthday)
PARTITIONS 8;
If you use the HASH partitioning type to partition a table, a vector partition key is used. Data in the table is routed to partitions based on the hash values of values in all partition key columns. You must specify a condition for partition pruning based on all partition key columns in the WHERE clause of an SQL statement. In the following examples, SQL1 meets the partition pruning condition of the hash_tbl2 table, and SQL2 does not meet the partition pruning condition of the hash_tbl2 table.
## The following SQL statement SQL1 meets the partition pruning condition. When the statement is executed, only the specified partition is scanned.
SELECT id from hash_tbl2 where name='Jack' and birthday='1990-11-11';
## The following SQL statement SQL2 does not meet the partition pruning condition. When the statement is executed, all partitions are scanned.
SELECT id from hash_tbl2 where name='Jack';
In HASH partitioning, data is routed based on the hash values of values in all partition key columns. Therefore, data is more evenly distributed in HASH partitioning than in KEY partitioning in which data is routed based on the hash values of values in only specific columns of the vector partition key. HASH partitioning does not support hot partition splitting because all partition key columns are used for data partitioning and no partition key columns are available for hot partition splitting.
Limits
Limits on data types
Integer types: BIGINT, BIGINT UNSINGED, INT UNSINGED, INT, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
Date and time types: DATETIME, DATE, and TIMESTAMP.
String types: CHAR and VARCHAR.
Limits on syntaxes
If the partition key of a table is a single-column partition key and the data type of the partition key column is DATETIME, DATE, or TIMESTAMP, specific partitioning functions are supported.
If the partition key of a table is a vector partition key, partitioning functions and hot partition splitting are not supported.
By default, a partitioned table can contain up to 8,192 partitions.
By default, a partition key can consist of up to five partition key columns.
Data balancing
KEY partitioning and HASH partitioning are implemented based on the built-in consistent hashing algorithm MurmurHash3. This algorithm is widely tested in the industry and is proven to have low data collision and high performance.
When you use KEY partitioning or HASH partitioning, data distribution across different partitions becomes balanced when the number of different values of the partition key is larger than 3,000 based on the MurmurHash3 algorithm. Data is distributed in a more balanced manner when the partition key includes more different values.
RANGE partitioning
The RANGE partitioning type in PolarDB-X includes RANGE partitioning and RANGE COLUMNS partitioning. RANGE partitioning and RANGE COLUMNS partitioning are supported by the native MySQL syntax. For more information about the syntaxes of different partitioning types, see Partitioning Types. The following table lists the differences between RANGE partitioning and RANGE COLUMNS partitioning.
Partitioning type | Supported partition key | Whether support partitioning functions | Statement syntax | Limits | Routing policy (point query) |
Range Columns | Single-column partition key and vector partition key | Disabled | PARTITION BY RANGE COLUMNS (c1,c2,...,cn) ( PARTITION p1 VALUES LESS THAN (1,10,...,1000), PARTITION p2 VALUES LESS THAN (2,20,...,2000) ...) | Hot partition splitting is supported. If a large number of rows contains the same value such as 88 in the c1 partition key column, you can split the hot data based on values in the c2 partition key column. |
|
Range | Single-column partition key | Enabled | PARTITION BY RANGE(YEAR(c1)) ( PARTITION p1 VALUES LESS THAN (2019), PARTITION p2 VALUES LESS THAN (2021) ...) |
|
|
Example 2-1: RANGE COLUMNS partitioning
RANGE COLUMNS partitioning supports vector partition keys but does not support partitioning functions. For example, you can execute the following statement to create a table that is partitioned based on the specified ranges of values in the order_id and order_time columns:
CREATE TABLE orders(
order_id int,
order_time datetime not null)
PARTITION BY range columns(order_id,order_time)
(
PARTITION p1 VALUES LESS THAN (10000,'2021-01-01'),
PARTITION p2 VALUES LESS THAN (20000,'2021-01-01'),
PARTITION p3 VALUES LESS THAN (30000,'2021-01-01'),
PARTITION p4 VALUES LESS THAN (40000,'2021-01-01'),
PARTITION p5 VALUES LESS THAN (50000,'2021-01-01'),
PARTITION p6 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);
RANGE COLUMNS partitioning does not support partition keys that contain TIMESTAMP or TIME data.
Example 2-2: RANGE partitioning
RANGE partitioning supports only single-column partition keys. If you want to use a column that contains date and time values as the partition key, you must use an expression that contains a partitioning function such as YEAR, TO_DAYS, TO_SECONDS, or MONTH to convert the date and time values in the column to integer values.
Columns that contain strings cannot be used as partition key columns in RANGE partitioning.
For example, you can execute the following statement to create a table that is partitioned based on the specified ranges of values in the order_time column and store data in different partitions by quarter:
CREATE TABLE orders_todays(
id int,
order_time datetime not null)
PARTITION BY RANGE(to_days(order_time))
(
PARTITION p1 VALUES LESS THAN (to_days('2021-01-01')),
PARTITION p2 VALUES LESS THAN (to_days('2021-04-01')),
PARTITION p3 VALUES LESS THAN (to_days('2021-07-01')),
PARTITION p4 VALUES LESS THAN (to_days('2021-10-01')),
PARTITION p5 VALUES LESS THAN (to_days('2022-01-01')),
PARTITION p6 VALUES LESS THAN (MAXVALUE)
);
In RANGE partitioning, only columns that contain integer values can be used as partition key columns. In RANGE partitioning, a partition key can contain only one partition key column.
Limits
Limits on data types
Integer types: BIGINT, BIGINT UNSINGEDINT, INT, INT UNSINGED, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
Date and time types: DATETIME and DATE.
String types: CHAR and VARCHAR.
Limits on syntaxes
RANGE COLUMNS partitioning and RANGE partitioning do not support NULL as the boundary of a value range.
RANGE COLUMNS partitioning does not support the columns of the TIMESTAMP type as partition key columns.
In RANGE partitioning, only columns that contain integer values can be used as partition key columns. To use a column of the TIMESTAMP type as a partition key column, you must use the UNIX_TIMSTAMP partitioning function to convert the timestamp values to timestamps in the same time zone.
RANGE partitioning does not support hot partition splitting.
If a NULL value is specified in an SQL request, the NULL value is considered as the minimum value during data routing.
By default, a partitioned table can contain up to 8,192 partitions.
By default, a partition key can consist of up to five partition key columns.
LIST partitioning
The LIST partitioning type in PolarDB-X includes LIST partitioning and LIST COLUMNS partitioning. LIST partitioning and LIST COLUMNS partitioning are supported by the native MySQL syntax. For more information about the syntaxes of different partitioning types, see Partitioning types. LIST partitioning and LIST COLUMNS partitioning in PolarDB-X also support DEFAULT partitioning. The following table lists the differences between LIST partitioning and LIST COLUMNS partitioning.
Partitioning type | Supported partition key | Whether support partitioning functions | Statement syntax | Limits | Routing policy (point query) |
List Columns | Single-column partition key and vector partition key | Disabled | PARTITION BY LIST COLUMNS (c1,c2,...,cn) ( PARTITION p1 VALUES IN ((1,10,...,1000),(2,20,...,2000) ), PARTITION p2 VALUES IN ((3,30,...,3000),(3,30,...,3000) ), ...) | Hot partition splitting is not supported. |
|
List | Single-column partition key | Enabled | PARTITION BY LIST(YEAR(c1)) ( PARTITION p1 VALUES IN (2018,2019), PARTITION p2 VALUES IN (2020,2021) ...) | Hot partition splitting is not supported. |
Example 3-1: LIST COLUMNS partitioning
LIST COLUMNS partitioning supports vector partition keys. For example, you can execute the following statement to create a table that is partitioned based on values in the country and city columns:
CREATE TABLE orders_region(
id int,
country varchar(64),
city varchar(64),
order_time datetime not null)
PARTITION BY LIST COLUMNS(country,city)
(
PARTITION p1 VALUES IN (('China','Hangzhou'), ('China','Beijing')),
PARTITION p2 VALUES IN (('United States','NewYork'),('United States','Chicago')),
PARTITION p3 VALUES IN (('Russian','Moscow'))
);
LIST COLUMNS partitioning does not support partition keys that contain TIMESTAMP or TIME data.
Example 3-2: LIST partitioning
LIST partitioning supports only single-column partition keys. If you want to use a column that contains date and time values as the partition key column, you must use a partitioning function such as YEAR, MONTH, DAYOFMONTH, TO_DAYS, or TO_SECONDS to convert timestamp values in the partition key column to integers.
For example, you can execute the following statement to create a table that is partitioned based on values in the order_time column and store data in different partitions by year:
CREATE TABLE orders_years(
id int,
country varchar(64),
city varchar(64),
order_time datetime not null)
PARTITION BY LIST(YEAR(order_time))
(
PARTITION p1 VALUES IN (1990,1991,1992,1993,1994,1995,1996,1997,1998,1999),
PARTITION p2 VALUES IN (2000,2001,2002,2003,2004,2005,2006,2007,2008,2009),
PARTITION p3 VALUES IN (2010,2011,2012,2013,2014,2015,2016,2017,2018,2019)
);
In LIST partitioning, only columns that contain integer values can be used as partition key columns. Columns that contain strings cannot be used as partition key columns.
Example 3-3: LIST COLUMNS partitioning or LIST partitioning with a DEFAULT partition
PolarDB-X allows you to create a LIST COLUMNS partitioning or LIST partitioning with a DEFAULT partition. Data that is not defined in common partitions is routed to the DEFAULT partition.
Only one DEFAULT partition can be specified, and it can appear only as the last partition.
CREATE TABLE orders_region(
id int,
country varchar(64),
city varchar(64),
order_time datetime not null)
PARTITION BY LIST COLUMNS(country,city)
(
PARTITION p1 VALUES IN (('China','Hangzhou'), ('China','Beijing')),
PARTITION p2 VALUES IN (('United States','NewYork'),('United States','Chicago')),
PARTITION p3 VALUES IN (('Russian','Moscow')),
PARTITION pd VALUES IN (DEFAULT)
);
CREATE TABLE orders_years(
id int,
country varchar(64),
city varchar(64),
order_time datetime not null)
PARTITION BY LIST(YEAR(order_time))
(
PARTITION p1 VALUES IN (1990,1991,1992,1993,1994,1995,1996,1997,1998,1999),
PARTITION p2 VALUES IN (2000,2001,2002,2003,2004,2005,2006,2007,2008,2009),
PARTITION p3 VALUES IN (2010,2011,2012,2013,2014,2015,2016,2017,2018,2019),
PARTITION pd VALUES IN (DEFAULT)
);
Limits
Limits on data types
Integer types: BIGINT, BIGINT UNSINGEDINT, INT, INT UNSINGED, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
Date and time types: DATETIME and DATE.
String types: CHAR and VARCHAR.
Limits on syntaxes
In LIST COLUMNS partitioning, the columns of the TIMESTAMP type cannot be used as partition key columns.
In LIST partitioning, only columns that contain integer values can be used as partition key columns.
LIST COLUMNS partitioning and LIST partitioning do not support hot partition splitting.
By default, a partitioned table can contain up to 8,192 partitions.
By default, a partition key can consist of up to five partition key columns.
COHASH partitioning
The COHASH partitioning type is exclusive to PolarDB-X.
Supported versions
The instance version must be 5.4.18-17047709 or later.
Scenarios
The COHASH partitioning type is suitable for the following scenarios:
A business table contains one or more similar columns. For example, the table contains a c1 and a c2 columns in which the last four characters of the column values are always the same. You want to partition the table by the columns and include the equivalent query conditions of the c1 or c2 column in queries. You also want to route the conditions to the same partition.
To use this partitioning type in this case, you must maintain the similarity between the values of different partition key columns in a partitioned table.
Example 4-1: Use partitioning functions to define the similarity between the values of different partition key columns in a COHASH partitioned table
Assume that the order_id and buyer_id columns of an orders table contain the same last six characters in the same row. If you want to partition the orders table by the last six characters in the order_id and buyer_id columns, and want the equivalent query conditions of the columns in the same row to be routed to the same partition, execute the following statements:
CREATE TABLE t_orders(
id bigint not null auto_increment,
order_id bigint,
buyer_id bigint,
order_time datetime not null,
primary key(id)
)
PARTITION BY CO_HASH(
RIGHT('order_id',6) /* Partition the table by the last six characters of the c1 column values.*/,
RIGHT('buyer_id',6) /* Partition the table by the last six characters of the c2 column values.*/
)
PARTITIONS 8;
Example 4-2: Use the RANGE_HASH syntactic sugar to migrate a PolarDB-X 1.0 instance to a PolarDB-X 2.0 instance
You want to migrate a PolarDB-X 1.0 instance to a PolarDB-X 2.0 instance. The source instance has an orders table that uses the range_hash sharding schema. The table is defined by using the DBPARTIITION BY RANGE_HASH(`order_id`,`buyer_id`, 6)
syntax. In this case, the partitioned table corresponding to the table in the database of destination instance in AUTO mode is defined by using the following statement:
CREATE TABLE orders(
id bigint not null auto_increment,
buyer_id bigint,
order_id bigint,
...
primary key(id)
)
PARTITION BY RANGE_HASH(order_id, buyer_Id,6)
PARTITIONS 8;
PolarDB-X automatically converts the RANGE_HASH syntax to the corresponding COHASH partitioning definition. For example, PolarDB-X automatically converts the RANGE_HASH(order_id, buyer_Id,6) clause in the preceding statements to the following COHASH partitioning definition:
CREATE TABLE orders(
id bigint not null auto_increment,
buyer_id bigint,
order_id bigint,
...
primary key(id)
)
PARTITION BY CO_HASH(
RIGHT('order_id',6) /* Partition the table by the last six characters of the c1 column values.*/,
RIGHT('buyer_id',6) /* Partition the table by the last six characters of the c2 column values.*/
)
PARTITIONS 8;
Comparison between COHASH partitioning, HASH partitioning, and KEY partitioning
The COHASH partitioning type is similar to the HASH and KEY partitioning types. The following table compares the types.
Difference | CO_HASH | KEY | Hash |
Syntax | PARTITION BY CO_HASH(c1, c2) PARTITOINS 8 | PARTITION BY KEY(c1, c2) PARTITOINS 8 | PARTITION BY HASH(c1, c2) PARTITOINS 8 |
Single-column partition key | Not supported | Supported | Supported |
Vector partition key | Supported | Supported | Supported |
Partitioning functions on vector partition key columns | Supported. Example: PARTITION BY CO_HASH( /* Partition the table by the last four characters of the c1 column values.*/ RIGHT(c1, 4), /* Partition the table by the last four characters of the c2 column values.*/ RIGHT(c2, 4) ) PARTITOINS 8 | Not supported | Not supported |
Relationship between partition key columns | Values in the columns are similar. You need to maintain the similarity between the values of different partition key columns in a partitioned table. Examples:
| Similar to the prefix of a federated index. | Similar to the prefix of a federated index. |
Equivalent query for partition key columns with prefixes, partition pruning, and examples | Supported. Examples:
| Supported. Examples:
| Not supported. Partition pruning is supported only when equivalent conditions for all partition key columns are included. Examples:
|
Equivalent query for partition key columns without prefixes, partition pruning, and examples | Supported. Equivalent conditions for all partition key columns support partition pruning,. Examples:
| Not supported. Equivalent conditions for partitions without prefixes must result in a scan for all partitions. Examples:
| Not supported. Equivalent conditions for partitions without prefixes must result in a scan for all partitions. Examples:
|
Range query | Not supported. All partitions are scanned. | Not supported. All partitions are scanned. | Not supported. All partitions are scanned. |
Routing policy (point query) |
| Refer to the preceding content related to KEY partitioning and HASH partitioning. | Refer to the preceding content related to KEY partitioning and HASH partitioning. |
Hot partition splitting | Not supported. Further hot partition splitting cannot be performed on a hot key value such as c1='88'. | Supported | Not supported |
Partition management such as partition splitting, merging, and migration | Supported | Supported | Supported |
Level-2 partition | Supported | Supported | Supported |
Usage notes
You must ensure the similarity between the values of different partition key columns. Partition tables only check routing results. The values of different partition key columns in a COHASH partitioned table has similarity maintained by you. Therefore, the values of different partition key columns in the table in each row must be routed to the same partition. However, the similarity between the values of different partition key columns may be destroyed. You must ensure the similarity between the values of different partition key columns. PolarDB-X only check routing results but not the similarity.
For example, you specify that the last four characters of the c1 and c2 column values are the same. If c1=1001234 and c2=1320 are routed to the shard 0,
insert (c1,c2) values (100234,1320)
is supported. However, the last four characters of the column values are different.
Limits are imposed when DLM statements are used to modify partition key columns. The values of different partition key columns in a COHASH partitioned table are similar. Therefore, to prevent incorrect data distribution, PolarDB-X imposes the following limits when you use DML statements to modify the values of partition key columns in the table:
If the values of different partition key columns in the same row in the VALUES clause of the INSERT or REPLCAE statement are routed to different partitions, the values cannot be inserted and an error is reported.
If you want to use the SET clause of the UPDATE or UPSERT statement to modify the values of multiple partition key columns, you must modify the values of all partition key columns. For example, the c1 and c2 columns are partition key columns. You can use the
UPDATE t1 SET c1='xx',c2='yy' WHERE id=1
statement to modify the values of the columns. After you use the SET clause to modify the values of multiple partition key columns, if the new values of the columns are routed to different partitions, the UPDATE or UPSERT statement is prohibited and an error is reported.After you use the COHASH partitioning type for GSI tables, if all DML operations performed on the primary table, such as INSERT or UPDATE operations, cause that the values of different partition key columns in the GSI tables of the primary table in the same row are routed to different partitions, the operations are also prohibited and an error is reported.
The prefix of partition key columns in a COHASH partitioned table is 0. The values of different partition key columns in a COHASH partitioned table are similar. Therefore, the columns need to be defined by partitioning functions such as SUBSTR, LEFT, or RIGHT. After some digits of the integer type are truncated, a prefix 0 generates. For example, you specify that the last four characters of the c1 and c2 column values are the same. If c1=1000034 and c2=34, the last four characters of the c1 column values are '0034'. After all original values of partition key columns in a COHASH partitioned table that use the integer type are truncated, the original values are automatically converted to the integer corresponding to the new values and then are routed. Therefore, the string '0034' is converted to the string '34'. Then, the hash of the string '34' is calculated and routed to partitions. This way, the prefix of the string '0034' is automatically processed.
Limits on partitioning functions
RIGHT
LEFT
SUBSTR
Limits on data types
Integer types: BIGINT, BIGINT UNSINGEDINT, INT, INT UNSINGED, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
Fixed-point type: DECIMAL. The decimal part of this data type must be zero.
Date and time types: Not supported.
String types: CHAR and VARCHR.
Limits on syntaxes
If a partitioning function is used for a partition key column, multiple nested partitioning functions are unavailable for the column. Example:
SUBSTR(SUBSTR(c1,-6),-4)
.
When you use the RANGE_HASH syntactic sugar, the last length value cannot be a negative.
The types of all partition key columns must be identical in the following terms:
Charsets and collation.
Length and precision definition.
By default, a partitioned table can contain up to 8,192 partitions.
By default, a partition key can consist of up to five partition key columns.
Level-2 partitions
Similar to MySQL, PolaDB-X allows you to use the syntax of level-2 partitions to create a partitioned table that contains level-2 partitions. A level-2 partition is a subpartition of all level-1 partitions based on a specified partition key column and partition policy.
Each level-1 partition of the level-2 partition actually becomes one logical partition and corresponds to a set of level-2 partitions.
Each subpartition of the level-2 partition actually becomes one physical partition and corresponds to one physical table shard on data nodes.
Templated and non-templated syntaxes
In PolarDB-X, level-2 partitions are divided into two types of partitions: templated partitions and non-templated partitions.
Templated level-2 partitions: The number of level-2 partitions under each level-1 partition is always consistent with the boundary value of each level-1 partition.
Non-templated level-2 partitions: The number of level-2 partitions under each level-1 partition can be inconsistent with the boundary value of each level-1 partition.
Limits on syntaxes
In a partitioned table that contains level-2 partitions, the number of physical partitions cannot exceed 8,192 by default.
In a partitioned table that contains non-templated level-1 partitions, the name of each level-2 partition must be unique and cannot be duplicate with the names of all level-1 partitions.
In a partitioned table that contains templated level-1 partitions, the name of each templated level-2 partition must be unique and cannot be duplicate with the names of all level-1 partitions.
After level-2 partitions are used, the number of level-2 partitions in a partitioned table is the total number of level-2 partitions under all level-1 partitions. Therefore, the number of partitions in the partitioned table increases exponentially. To avoid negative effects caused by excessive partitioning operations or errors caused by the failure to maintain the total number of partitions, we recommend that you carefully control the number of shards in level-1 partitions and level-2 partitions.
Example 5-1: Template level partitions
/*
* Specify templated subpartitions based on LIST partitioning and KEY partitioning.
* The Level-1 partition is divided into three level-2 partitions based on LIST COLUMNS partitioning.
* Each level-2 partition is divided into four subpartitions based on KEY partitioning.
* Therefore, the total number of physical partitions is 12.
*/
CREATE TABLE sp_tbl_list_key_tp(
id int,
country varchar(64),
city varchar(64),
order_time datetime not null,
PRIMARY KEY(id)
)
PARTITION BY LIST COLUMNS(country,city)
SUBPARTITION BY KEY(id) SUBPARTITIONS 4
(
PARTITION p1 VALUES IN (('China','Hangzhou')),
PARTITION p2 VALUES IN (('Russian','Moscow')),
PARTITION pd VALUES IN (DEFAULT)
);
Example 5-2: Non-templated level-2 partitions
/*
* Specify non-templated subpartitions based on LIST partitioning and KEY partitioning.
* The Level-1 partition is divided into three level-2 partitions based on LIST COLUMNS partitioning.
* Each level-2 partition is divided into one or more subpartitions based on KEY partitioning.
* The specified numbers of subpartitions in each level-2 partition are 2, 3, and 4.
* Therefore, the total number of physical partitions is 9.
*/
CREATE TABLE sp_tbl_list_key_ntp(
id int,
country varchar(64),
city varchar(64),
order_time datetime not null,
PRIMARY KEY(id)
)
PARTITION BY LIST COLUMNS(country,city)
SUBPARTITION BY KEY(id)
(
PARTITION p1 VALUES IN (('China','Hangzhou')) SUBPARTITIONS 2,
PARTITION p2 VALUES IN (('Russian','Moscow')) SUBPARTITIONS 3,
PARTITION pd VALUES IN (DEFAULT) SUBPARTITIONS 4
);
Default automatic partitioning
If you do not specify a partition key in the CREATE TABLE statement, PolarDB-X performs default automatic partitioning by using KEY partitioning based on the primary key to create level-1 partitioned tables. If you do not specify the primary key, PolarDB-X uses the implicit primary key as the partition key.
The number of default partitions into which a table is partitioned by using the default automatic partitioning method can be calculated by using the following formula: Number of logical nodes in the PolarDB-X instance x 8. For example, if the number of logical nodes in a PolarDB-X instance is 2, the number of default partitions is 16.
By default, the primary table is partitioned based on the primary key, and all indexes of the primary table are automatically partitioned based on the index key column and primary key column.
The following example shows the standard syntax of the CREATE TABLE statement. In the example, the primary key column is the id column, and the index key column is the name column.
CREATE TABLE auto_part_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
primary key(id),
index idx_name (name)
);
If you execute the SHOW CREATE TABLE
statement to query information about the CREATE TABLE statement, the standard syntax of the CREATE TABLE statement is queried and all partition information is not returned.
show create table auto_part_tbl;
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| auto_part_tbl | CREATE TABLE `auto_part_tbl` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`bid` int(11) DEFAULT NULL,
`name` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `idx_name` (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8 |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)
If you execute the SHOW FULL CREATE TABLE
statement to query information about the CREATE TABLE statement, all partition information of the preceding primary table and index table are returned.
show full create table auto_part_tbl;
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| auto_part_tbl | CREATE PARTITION TABLE `auto_part_tbl` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`bid` int(11) DEFAULT NULL,
`name` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
GLOBAL INDEX /* idx_name_$a870 */ `idx_name` (`name`) PARTITION BY KEY (`name`, `id`) PARTITIONS 16,
LOCAL KEY `_local_idx_name` (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8
PARTITION BY KEY(`id`)
PARTITIONS 16
/* tablegroup = `tg108` */ |
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.03 sec)
The preceding execution result shows that:
The primary table named auto_part_tbl is automatically partitioned based on the id column into 16 partitions by using KEY partitioning.
The index of the primary table is idx_name that is a global index, and the index table is partitioned into 16 partitions based on the 'name' and 'id' partition key columns.
Manual partitioning
If you specify a partition key column, partitioning function, and partitioning type in the CREATE TABLE statement, a table that is manually partitioned is created after the statement is executed. For more information, see Manually create a partitioned table (AUTO mode).
Supported data types
Table 4 Data types of partition key columns in different partitioning types
Data type | HASH partitioning | RANGE partitioning | LIST partitioning | |||||
HASH | Key | Range | Range Columns | List | List Columns | |||
Single partition key column | Multiple partition key columns | |||||||
Integer | TINYINT | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
TINYINT UNSIGNED | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
SMALLINT | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
SMALLINT UNSIGNED | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
MEDIUMINT | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
MEDIUMINT UNSIGNED | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
INT | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
INT UNSIGNED | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
BIGINT | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
BIGINT UNSIGNED | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
Fixed-point | DECIMAL | Supported (Partitioning functions are not supported by partition key columns that contain the values of this data type.) | Supported | Supported | Supported | Supported (Columns that use this data type only accept values whose decimal part is zero.) | Not supported | Supported (Columns that use this data type only accept values whose decimal part is zero.) |
Date and time | DATE | Supported (Partitioning functions are supported by partition key columns that contain the values of this data type.) | Supported | Supported | Supported (Partitioning functions are necessary for partition key columns that contain the values of this data type.) | Supported | Supported (Partitioning functions are necessary for partition key columns that contain the values of this data type.) | Supported |
DATETIME | Supported (Partitioning functions are supported by partition key columns that contain the values of this data type.) | Supported | Supported | Supported (Partitioning functions are supported by partition key columns that contain the values of this data type.) | Supported | Supported (Partitioning functions are supported by partition key columns that contain the values of this data type.) | Supported | |
TIMESTAMP | Supported (Partitioning functions are supported by partition key columns that contain the values of this data type.) | Supported | Supported | Not supported | Not supported | Not supported | Not supported | |
String | CHAR | Supported (Partitioning functions are not supported by partition key columns that contain the values of this data type.) | Supported | Supported | Not supported | Supported | Not supported | Supported |
VARCHAR | Supported (Partitioning functions are not supported by partition key columns that contain the values of this data type.) | Supported | Supported | Not supported | Supported | Not supported | Supported | |
Binary | BINARY | Supported (Partitioning functions are not supported by partition key columns that contain the values of this data type.) | Supported | Supported | Not supported | Not supported | Not supported | Not supported |
VARBINARY | Supported (Partitioning functions are not supported by partition key columns that contain the values of this data type.) | Supported | Supported | Not supported | Not supported | Not supported | Not supported |
Data types of partition key columns and routing policies
The data routing strategy that is used during partitioning is determined by the data type of partition key columns, especially in HASH partitioning and KEY partitioning. If you use columns that contain different data types as partition key columns, data is routed to partitions in different manners based on different hashing or comparison algorithms. For example, comparison algorithms can be used to route data based on either case-sensitive comparison or case-insensitive comparison. The data routing strategy of MySQL is also determined by the data types of partition key columns.
In the following example, a table named tbl_int is partitioned into 1,024 partitions based on a partition key column that contains INT values, and a table named tbl_bigint is partitioned into 1,024 partitions based on a partition key column that contains BIGINT values. The data types of the partition key columns of the two tables are different. Therefore, PolarDB-X routes data to different partitions when the same value 12345678 is queried in the two tables.
show create table tbl_int;
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_int | CREATE TABLE `tbl_int` (
`a` int(11) NOT NULL,
KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 1024 |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)
show create table tbl_bigint;
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_bigint | CREATE TABLE `tbl_bigint` (
`a` bigint(20) NOT NULL,
KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 1024 |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.10 sec)mysql> create table if not exists tbl_bigint(a bigint not null)
-> partition by key(a) partitions 1024;
Query OK, 0 rows affected (28.41 sec)
explain select * from tbl_int where a=12345678;
+---------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+---------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_int[p260]", sql="SELECT `a` FROM `tbl_int` AS `tbl_int` WHERE (`a` = ?)") |
| HitCache:false |
| Source:PLAN_CACHE |
| TemplateId: c90af636 |
+---------------------------------------------------------------------------------------------------+
4 rows in set (0.45 sec)
explain select * from tbl_bigint where a=12345678;
+------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_bigint[p477]", sql="SELECT `a` FROM `tbl_bigint` AS `tbl_bigint` WHERE (`a` = ?)") |
| HitCache:false |
| Source:PLAN_CACHE |
| TemplateId: 9b2fa47c |
+------------------------------------------------------------------------------------------------------------+
4 rows in set (0.02 sec)
The case-sensitiveness, character set, and collation of partition key columns
The character set and collation of partition key columns of a partitioned table determine the data routing strategy that is used to route data in the table. For example, the character set and collation determine whether case-sensitive comparison is required. If the collation of the partitioned table is case-sensitive, case-sensitive comparison is performed during data routing based on a hashing or comparison algorithm. If the collation of the partitioned table is case-insensitive, case-sensitive comparison is not performed during data routing. By default, partition key columns that contain strings use the UTF-8 character set and the utf8_general_ci collation that does not require case-sensitive comparison.
Example 1
If you want to perform data routing on a partitioned table in a case-sensitive manner, set the collation of the partitioned table to a case-sensitive collation, such as utf8_bin, when you create the table. In the following sample statement, CHARACTER SET utf8 COLLATE utf8_bin is specified for the tbl_varchar_cs partitioned table. In this case, the system routes the 'AbcD' and 'abcd' strings to different partitions of the table.
show create table tbl_varchar_cs;
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_varchar_cs | CREATE TABLE `tbl_varchar_cs` (
`a` varchar(64) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8
PARTITION BY KEY(`a`)
PARTITIONS 64 |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.07 sec)
explain select a from tbl_varchar_cs where a in ('AbcD');
+-------------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+-------------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_varchar_cs[p29]", sql="SELECT `a` FROM `tbl_varchar_cs` AS `tbl_varchar_cs` WHERE (`a` IN(?))") |
| HitCache:false |
| Source:PLAN_CACHE |
| TemplateId: 2c49c244 |
+-------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.11 sec)
explain select a from tbl_varchar_cs where a in ('abcd');
+-------------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+-------------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_varchar_cs[p11]", sql="SELECT `a` FROM `tbl_varchar_cs` AS `tbl_varchar_cs` WHERE (`a` IN(?))") |
| HitCache:true |
| Source:PLAN_CACHE |
| TemplateId: 2c49c244 |
+-------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.02 sec)
Example 2
If you want to perform data routing on a partitioned table in a case-insensitive manner, set the collation of the partitioned table to a case-insensitive collation, such as utf8_general_ci, when you create the table. In the following sample statement, CHARACTER SET utf8 COLLATE utf8_general_ci is specified for the tbl_varchar_ci partitioned table. In this case, the system routes the 'AbcD' and 'abcd' strings to the same partition.
show create table tbl_varchar_ci;
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_varchar_ci | CREATE TABLE `tbl_varchar_ci` (
`a` varchar(64) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8
PARTITION BY KEY(`a`)
PARTITIONS 64 |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)
explain select a from tbl_varchar_ci where a in ('AbcD');
+------------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+------------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_varchar_ci[p4]", sql="SELECT `a` FROM `tbl_varchar_ci` AS `tbl_varchar_ci` WHERE (`a` IN(?))") |
| HitCache:false |
| Source:PLAN_CACHE |
| TemplateId: 5c97178e |
+------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.15 sec)
explain select a from tbl_varchar_ci where a in ('abcd');
+------------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+------------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_varchar_ci[p4]", sql="SELECT `a` FROM `tbl_varchar_ci` AS `tbl_varchar_ci` WHERE (`a` IN(?))") |
| HitCache:true |
| Source:PLAN_CACHE |
| TemplateId: 5c97178e |
+------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.02 sec)
Change the character set and collation of partition key columns
The data routing strategy of a partitioned table is determined by the data type of partition key columns of the table. After you change the character set and collation of the partition key columns, PolarDB-X redistributes all data on partitions. Proceed with caution when you change the character set and collation of partition key columns of a table.
Data type truncation and conversion of partition key columns
Truncate values in partition key columns
When you execute an SQL statement to query or insert data, if a partition key value that is specified by a constant expression exceeds the valid range of the data type of the partition key columns, PolarDB-X truncates the specified value based on the data type of the partition key columns and then uses the truncated value to calculate the routing strategy.
For example, the data type of the partition key column of a table named tbl_smallint is SMALLINT. The range of valid SMALLINT values is [-32768, 32767]. If you insert a value that is not in the range of valid SMALLINT values, such as 12745678 or -12345678, the value is truncated to 32767 or -32768. The following example shows how partition key values are truncated.
show create table tbl_smallint;
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_smallint | CREATE TABLE `tbl_smallint` (
`a` smallint(6) NOT NULL,
KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 128 |
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)
set sql_mode='';
Query OK, 0 rows affected (0.00 sec)
insert into tbl_smallint values (12345678),(-12345678);
Query OK, 2 rows affected (0.07 sec)
select * from tbl_smallint;
+--------+
| a |
+--------+
| -32768 |
| 32767 |
+--------+
2 rows in set (3.51 sec)
explain select * from tbl_smallint where a=12345678;
+------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_smallint[p117]", sql="SELECT `a` FROM `tbl_smallint` AS `tbl_smallint` WHERE (`a` = ?)") |
| HitCache:false |
| Source:PLAN_CACHE |
| TemplateId: afb464d5 |
+------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.16 sec)
explain select * from tbl_smallint where a=32767;
+------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_smallint[p117]", sql="SELECT `a` FROM `tbl_smallint` AS `tbl_smallint` WHERE (`a` = ?)") |
| HitCache:true |
| Source:PLAN_CACHE |
| TemplateId: afb464d5 |
+------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.03 sec)
If you specify a primary key value that is not in the range of valid SMALLINT values in a query request, PolarDB-X also truncates the value, and then routes data based on the primary key value that is truncated. When you run a query on the tbl_smallint table, PolarDB-X returns the same result when you set the value of the partition key column to 12345678 or 32767 in query requests.
Convert the data type of values in partition key columns
When you execute an SQL statement to query or insert data, if the data type of a partition key column that is specified by a constant expression is different from the actual data type of the column, PolarDB-X implicitly converts the data type and then performs routing based on the value of the new data type. However, the conversion of data types may fail. For example, the abc
string cannot be converted to an integer.
When PolarDB-X converts the data types of partition key columns, different operations are performed for DQL, DML, and DDL statements.
DQL statements in which the data types of partition key columns in the WHERE clause are converted
If the data types are converted, data is routed to partitions based on the new data types.
If the data types fail to be converted, the conditions that are specified based on the partition key columns are ignored, and all data in the table is queried.
DML statements such as INSERT and REPLACE
If the data types are converted, data is routed to partitions based on the new data types.
If the data types fail to be converted, an error is returned and the statement is not executed.
DDL statements related to partitioned tables such as CREATE TABLE and ALTER TABLE
If the data types are converted, an error is returned and the statement is not executed because data type conversion is not supported in DDL operations.
If the data types fail to be converted, an error is returned and the statement is not executed.
Differences between the syntaxes of the CREATE TABLE statement in MySQL and PolarDB-X
Difference | MySQL | PolarDB-X |
Whether the partition key is required to contain the primary key | Required. | Not required. |
KEY partitioning | Data is routed based on modulo operations by the number of partitions. | Data is routed by using the consistent hashing algorithm. |
HASH partitioning |
|
|
Partitioning function | Supported. In the PARTITION BY HASH(expr(col)) statement, expr can be a common expression, such as YEAR(col) + 1. | Supported when specific requirements are met. In the PARTITION BY HASH(expr(col)) statement, expr can be only one of the following functions. Other operators such as the plus operator (+), minus operator (-), multiplication operator (*), and division operator (/) cannot be used in the expression.
|
Data type of partition key column | In KEY partitioning, columns of all data types can be used as partition key columns. | In KEY partitioning, only columns that contain the values of the integer, data and time, and string types can be used as partition key columns. |
Character set of partition key column | All common character sets are supported. | Only the following character sets are supported:
|
Level-2 partition | Supported | Supported |