All Products
Search
Document Center

PolarDB:Generated columns

Last Updated:Aug 18, 2023

A generated column is a column that contains values representing the calculation result of an expression. This topic describes how to create generated columns and how to create indexes on generated columns.

Create a generated column

Syntax

col_name data_type [GENERATED ALWAYS] AS (expr)
  [VIRTUAL | STORED | LOGICAL] [NOT NULL | NULL]
  [UNIQUE [KEY]] [[PRIMARY] KEY]
  [COMMENT 'string']

You can create the following types of generated columns:

  • VITRUAL: The value of the generated column is not stored and does not occupy storage space. The value is calculated by the data node each time the column is read.

    Note

    If you do not specify the type, a generated column of the VITRUAL type is created by default.

  • STORED: The value of the generated column is calculated by the data node when the data row is inserted or updated. The result is stored on the data node and occupies storage space.

  • LOGICAL: Similar to the STORED type, the value of the generated column is calculated when the data row is inserted or updated. However, the value is calculated by the compute node and then stored on the data node as a common column. The generated column of this type can be used as a partition key.

Considerations

Only an instance of PolarDB-X Enterprise Edition V5.4.17 or later supports this feature.

Similar to the columns in MySQL, generated columns has the following limits compared with ordinary columns:

  • Same limits as those in MySQL

    • You cannot specify a default value for a generated column.

    • You cannot set the AUTO_INCREMENT attribute for a generated column, or reference a column that contains the AUTO_INCREMENT attribute for a generated column.

    • You cannot use deterministic functions, such as UUID(), CONNECTION_ID(), and NOW(), in the expression of a generated column.

    • You cannot use variables in the expression of a generated column.

    • You cannot use subqueries in the expression of a generated column.

    • You cannot explicitly specify the values of generated columns in INSERT and UPDATE statements. The values of generated columns can only be automatically calculated by the database.

  • Different limits from those in MySQL

    • You cannot add generated columns to tables for which cold data archiving is enabled.

    • Generated columns of the VIRTUAL or STORED type cannot be used as partition keys, primary keys, or unique keys.

    • You cannot reference stored functions in generated columns of the VIRTUAL or STORED type.

    • If a global secondary index includes generated columns of the VIRTUAL or STORED type, the global secondary index must also include all columns that are referenced in the expressions of the generated column.

    • You cannot reference a generated column of the VIRTUAL or STORED type in the expression of a generated column of the LOGICAL type.

    • You cannot modify the type of a generated column of the LOGICAL type and the type of a referenced column in the expression of a generated column.

    • Only the following types can be used for a generated column of the LOGICAL type and for the referenced columns in its expressions:

      • Integer types (BIGINT, INT, MEDUMINT, SMALLINT, TINYINT);

      • Date types (DATETIME, DATE, and TIMESTAMP). Date type columns that contain the ON UPDATE CURRENT_TIMESTAMP attribute are not supported.

      • String types (CHAR and VARCHAR).

Examples: generated columns of the VIRTUAL or STORED type

Example 1

Use a generated column to automatically calculate the hypotenuse length of a right-angled triangle (the hypotenuse length is equal to the square root of the sum of the squares of the two right-angled sides) :

CREATE TABLE triangle (
  sidea DOUBLE,
  sideb DOUBLE,
  sidec DOUBLE AS (SQRT(sidea * sidea + sideb * sideb))
);
INSERT INTO triangle (sidea, sideb) VALUES(1,1),(3,4),(6,8);

Insert data to the hive_hbase_table table:

INSERT INTO triangle (sidea, sideb) VALUES(1,1),(3,4),(6,8);
select * from triangle;
+-------+-------+--------------------+
| sidea | sideb | sidec              |
+-------+-------+--------------------+
|   1.0 |   1.0 | 1.4142135623730951 |
|   3.0 |   4.0 |                5.0 |
|   6.0 |   8.0 |               10.0 |
+-------+-------+--------------------+

Example 2

Create partition table t1 that contains the generated column b:

CREATE TABLE `t1` (
	`a` int(11) NOT NULL,
	`b` int(11) GENERATED ALWAYS AS (`a` + 1),
	PRIMARY KEY (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4  dbpartition by hash(`a`)

Insert data to the hive_hbase_table table:

INSERT INTO t1(a) VALUES (1);
SELECT * FROM t1;
+---+---+
| a | b |
+---+---+
| 1 | 2 |
+---+---+

Examples: generated columns of the LOGICAL type

Generated columns of the LOGICAL type can be used as partition keys to implement more flexible partitioning strategies. You can also reference User-defined functions in generated columns of the LOGICAL type.

Example 1: Select the last two characters of a string field as the partition key.

CREATE TABLE `t2` (
	`a` int(11) NOT NULL,
	`b` varchar(32) DEFAULT NULL,
	`c` varchar(2) GENERATED ALWAYS AS (SUBSTR(`b`, -2)) LOGICAL,
	PRIMARY KEY (`a`),
	KEY `auto_shard_key_c` USING BTREE (`c`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4  dbpartition by hash(`c`)

Example 2: Create a generated column of the LOGICAL type that references a User-defined function.

Create a user-defined function my_abs:

DELIMITER &&
CREATE FUNCTION my_abs (
	a INT
)
RETURNS INT
BEGIN
	IF a < 0 THEN
		RETURN -a;
	ELSE
		RETURN a;
	END IF;
END&&
DELIMITER ;

Create table t3 that contains a generated column b. The expression of the column contains the user-defined function my_abs:

CREATE TABLE `t3` (
	`a` int(11) NOT NULL,
	`b` int(11) GENERATED ALWAYS AS (MY_ABS(`a`)) LOGICAL,
	PRIMARY KEY (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4  dbpartition by hash(`b`);

INSERT INTO t3 (a) VALUES(1),(-1);

EXPLAIN SELECT * FROM t3 WHERE b = 1;
+-----------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN                                                                                     |
+-----------------------------------------------------------------------------------------------------------+
| LogicalView(tables="TEST_000002_GROUP.t3_WHHZ", sql="SELECT `a`, `b` FROM `t3` AS `t3` WHERE (`b` = ?)")  |
+-----------------------------------------------------------------------------------------------------------+

SELECT * FROM t3 WHERE b = 1;
+----+------+
| a  | b    |
+----+------+
| -1 |    1 |
|  1 |    1 |
+----+------+

Create an index on a generated column

PolarDB-X supports creating indexes on generated columns.

Considerations

  • Only an instance of PolarDB-X Enterprise Edition V5.4.17 or later supports this feature.

  • You can create local indexes on all types of generated columns.

  • You can create a global index on a generated column of the LOGICAL type.

  • You cannot create global indexes on generated columns of the VIRTUAL and STORED type.

Examples

Example 1: Create a local index on a generated column of the VIRTUAL or STORED type.

CREATE TABLE t4 (
    a BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    c JSON,
    g INT AS (c->"$.id") VIRTUAL
) DBPARTITION BY HASH(a);

CREATE INDEX `i` ON `t4`(`g`);

INSERT INTO t4 (c) VALUES
  ('{"id": "1", "name": "Fred"}'),
  ('{"id": "2", "name": "Wilma"}'),
  ('{"id": "3", "name": "Barney"}'),
  ('{"id": "4", "name": "Betty"}');

EXPLAIN EXECUTE SELECT c->>"$.name" AS name FROM t4 WHERE g > 2;
+------+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id   | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+------+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| 1    | SIMPLE      | t4    | NULL       | range | i             | i    | 5       | NULL | 1    | 100      | Using where |
+------+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+

Example 2: Create a local index on a generated column of the LOGICAL type.

CREATE TABLE t5 (
    a BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    c varchar(32),
    g char(2) AS (substr(`c`, 2)) LOGICAL
) DBPARTITION BY HASH(a);

CREATE INDEX `i` ON `t5`(`g`);

INSERT INTO t5 (c) VALUES
  ('1111'),
  ('1112'),
  ('1211'),
  ('1311');

EXPLAIN EXECUTE SELECT c AS name FROM t5 WHERE g = '11';
+------+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------+
| id   | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                    |
+------+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------+
| 1    | SIMPLE      | t5    | NULL       | ref  | i             | i    | 8       | NULL | 4    | 100.00   | Using XPlan, Using where |
+------+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------+

Example 3: Create a global index on a generated column of the LOGICAL type.

CREATE TABLE t6 (
    a BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    c varchar(32),
    g char(2) AS (substr(`c`, 2)) LOGICAL
) DBPARTITION BY HASH(a);

CREATE GLOBAL INDEX `g_i` ON `t6`(`g`) COVERING(`c`) DBPARTITION BY HASH(`g`);

INSERT INTO t6 (c) VALUES
  ('1111'),
  ('1112'),
  ('1211'),
  ('1311');

EXPLAIN SELECT c AS name FROM t6 WHERE g = '11';
+---------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN                                                                                               |
+---------------------------------------------------------------------------------------------------------------------+
| IndexScan(tables="TEST_DRDS_000000_GROUP.g_i_J1MT", sql="SELECT `c` AS `name` FROM `g_i` AS `g_i` WHERE (`g` = ?)") |
+---------------------------------------------------------------------------------------------------------------------+

Expression index

When PolarDB-X is creating an index, if an index entry is an expression instead of a column, PolarDB-X automatically converts the expression into a generated column of the VIRTUAL type and add the column to the table. After all index entries are processed, PolarDB-X continues to create the index based on your definitions. The expression in the index definition is replaced with the corresponding generated column.

Considerations

  • Only an instance of PolarDB-X Enterprise Edition V5.4.17 or later supports this feature.

  • By default, the expression index feature is disabled. You can enable this feature by turning on the switch ENABLE_CREATE_EXPRESSION_INDEX.

    SET GLOBAL ENABLE_CREATE_EXPRESSION_INDEX=TRUE;
  • Global indexes are not supported.

  • Unique indexes are not supported.

  • You cannot create an expression index by executing a table creation statement. After the table is created, you can execute the ALTER TABLE or CREATE INDEX statement to create an expression index.

  • By default, if you use a DROP INDEX statement to delete an expression index, the generated columns that are automatically created when you create the expression index are not deleted. You must execute the ALTER TABLE statement in DRDS or AUTO mode to manually delete the generated columns. For more information, see ALTER TABLE (DRDS mode) or ALTER TABLE (AUTO mode) operation

Examples

Example 1: Create an expression index.

  1. Create the table t7.

    CREATE TABLE t7 (
        a BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        c varchar(32)
    ) DBPARTITION BY HASH(a);
  2. Create the expression index i.

    CREATE INDEX `i` ON `t7`(substr(`c`, 2));
  3. After the expression index is created, the table has the following structure:

    CREATE TABLE `t7` (
    	`a` bigint(20) NOT NULL AUTO_INCREMENT BY GROUP,
    	`c` varchar(32) DEFAULT NULL,
    	`i$0` varchar(32) GENERATED ALWAYS AS (substr(`c`, 2)) VIRTUAL,
    	PRIMARY KEY (`a`),
    	KEY `i` (`i$0`)
    ) ENGINE = InnoDB dbpartition by hash(`a`)

Because the index entry of index i is an expression, a generated column i$0 is added to the table. The expression of this generated column is the same as the expression of the index entry. After that, index i is created. The index entry is replaced with the corresponding generated column.

After the expression index is created, the performance of the following SQL statement is improved:

EXPLAIN EXECUTE SELECT * FROM t7 WHERE substr(`c`, 2) = '11';
+------+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id   | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+------+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| 1    | SIMPLE      | t7    | NULL       | ref  | i             | i    | 131     | const | 1    | 100      | NULL  |
+------+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+

Execute the EXPLAIN EXECUTE statement to obtain the execution plan on the data node. The result indicates that index i is selected:

EXPLAIN EXECUTE SELECT * FROM t7 WHERE substr(`c`, 2) = '11';
+------+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id   | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+------+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| 1    | SIMPLE      | t7    | NULL       | ref  | i             | i    | 131     | const | 1    | 100      | NULL  |
+------+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+

Example 2: Use multiple expressions in an index.

Create the table t8:

CREATE INDEX idx ON t8(
  a + 1, 
  b, 
  SUBSTR(c, 2)
);

After the expression index is created, the table has the following structure:

CREATE TABLE `t8` (
	`a` int(11) NOT NULL,
	`b` int(11) DEFAULT NULL,
	`c` varchar(32) DEFAULT NULL,
	`idx$0` bigint(20) GENERATED ALWAYS AS (`a` + 1) VIRTUAL,
	`idx$2` varchar(32) GENERATED ALWAYS AS (substr(`c`, 2)) VIRTUAL,
	PRIMARY KEY (`a`),
	KEY `idx` (`idx$0`, `b`, `idx$2`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4  dbpartition by hash(`a`)

For index idx, the first and third index entries are expressions. Therefore, two generated columns idx$0 and idx$2 are added to the table. The expressions of these two generated columns are the same as the expressions of the first and third index entries. After that, index idx is created. The first and third index entries are replaced with the corresponding generated columns.

After the expression index is created, the performance of the following SQL statement is improved:

SELECT * FROM t8 WHERE a+1=10 AND b=20 AND SUBSTR(c,2)='ab';

Execute the EXPLAIN EXECUTE statement to obtain the execution plan on the data node. The result indicates that index idx is selected:

EXPLAIN EXECUTE SELECT * FROM t4 WHERE a+1=10 AND b=20 AND SUBSTR(c,2)='ab';
+------+-------------+-------+------------+------+---------------+------+---------+-------------------+------+----------+-------+
| id   | select_type | table | partitions | type | possible_keys | key  | key_len | ref               | rows | filtered | Extra |
+------+-------------+-------+------------+------+---------------+------+---------+-------------------+------+----------+-------+
| 1    | SIMPLE      | t8    | NULL       | ref  | idx           | idx  | 145     | const,const,const | 1    | 100      | NULL  |
+------+-------------+-------+------------+------+---------------+------+---------+-------------------+------+----------+-------+