This topic describes how to use the HASH function.
Description
- If the HASH function uses different shard keys for database sharding and table sharding,
the values of the database shard key are divided by the number of database shards
to obtain the remainders into which data is partitioned. If the key values are strings,
the strings are first converted into hash values and then used for route calculation.
For example, HASH(8) is equivalent to 8%D. D specifies the number of database shards. HASH("ABC") is equivalent to hashcode("ABC").abs()%D. D specifies the number of database shards into which data is partitioned.
- If the HASH function uses the same shard key for database sharding and table sharding,
the values of the shard key are divided by the total number of table shards to obtain
the remainders.
For example, two database shards are created. Each database shard has four table shards. Database shard 0 stores Table shards 0 to 3 and Database shard 1 stores Table shards 4 to 7. If a key value is 15, the record for the key value 15 is distributed to Table shard 7 in Database shard 1 based on the following equation: 15 % (2 * 4) = 7.
Precautions
The output of the HASH function can be evenly distributed only when the values in a shard key column are evenly distributed.
Limits
- A shard key must be of an INTEGER or STRING type.
- The shard key that is of the STRING type is case-sensitive when data routing is performed.
Common scenarios
- Database sharding is performed based on user IDs or order IDs.
- Values of shard keys are strings.
If you want to use the HASH function only for database sharding based on the values of the ID column, you can execute the following DDL statement:
create table test_hash_tb (
id int,
name varchar(30) DEFAULT NULL,
create_time datetime DEFAULT NULL,
primary key(id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 dbpartition by HASH(ID);