All Products
Search
Document Center

Hologres:CREATE TABLE

最終更新日:Jul 22, 2024

Data is stored and organized in tables. You can improve the data processing efficiency and data analytics capabilities by configuring table indexes and properties when you create a table. This topic describes the syntax of the CREATE TABLE statement and how to configure indexes when you create a table in Hologres. This topic also describes how to create a table in the HoloWeb console.

Syntax of the CREATE TABLE statement

  • Syntax

    The syntax of the CREATE TABLE statement in Hologres is compatible with that in PostgreSQL.

    Note
    • In all Hologres versions, you can execute multiple data definition language (DDL) statements in a transaction. In Hologres V2.0 and later, you can execute multiple data manipulation language (DML) statements in a transaction.

    • Hologres V2.1 and later support the create table with property syntax to simplify the configuration of table properties.

    We recommend that you preferentially use the CREATE TABLE WITH syntax. This syntax significantly improves DDL statement performance in specific scenarios, such as scenarios in which the CREATE TABLE IF NOT EXISTS statement is frequently executed to create tables with the same names as existing tables and configure table properties.

    • Syntax supported in Hologres V2.1 and later:

      BEGIN;
      CREATE TABLE [ IF NOT EXISTS] [schema_name.] table_name ([
          { 
          column_name column_type [column_constraints, [...]]
          | table_constraints
          [,...]
          }
      ])
      [WITH (
          property = 'value',
          [, ...]
      )]
      ;
      COMMENT ON COLUMN < tablename.column > IS 'value';
      COMMENT ON TABLE < tablename > IS 'value';
      COMMIT;
    • Syntax supported in all Hologres versions:

      begin;
      create table [if not exists] [schema_name.]table_name ([
        {
         column_name column_type [column_constraints, [...]]
         | table_constraints
         [, ...]
        }
      ]);
      
      call set_table_property('<table_name>', property, value);
      comment on column <tablename.column> is 'value';
      comment on table <tablename> is 'value';
      commit;
  • Parameters

    • column_type: the data type of a column. For more information about the data types that are supported by Hologres, see Data types.

    • set_table_property or create table with (supported by Hologres V2.1 and later): used to configure table properties. For more information, see Configure properties for a table in this topic.

Introduction to indexes

Hologres is compatible with the PostgreSQL ecosystem. The syntax that is used to create a table in Hologres is the same as that in PostgreSQL. However, the indexes that are supported by Hologres differ from the indexes that are supported by PostgreSQL. For more information about the indexes supported by Hologres, see Configure properties for a table. If you configure appropriate indexes when you create a table, the data that you want to query can be quickly located after an SQL statement is executed. This reduces I/O resource consumption and accelerates queries with fewer computing resources. The following figure shows the execution process from executing an SQL statement to obtaining data. The following figure can help you understand each index used in the execution process and configure appropriate indexes based on your business requirements.建表索引总结

  1. In this example, you perform an SQL query on a partitioned table and locate the partition in which the desired data resides by using the partition pruning feature.

  2. You can use a distribution key to locate the shard in which the desired data resides.

  3. You can use an event time column to locate the file in which the desired data resides. An event time column is previously referred to as a segment key.

  4. You can use a clustering key to sort data in a file and locate the file block in which the desired data resides.

  5. You can use a bitmap index to locate the row in which the desired data resides. A bitmap index is an index in files.

The following table describes the scenarios for different indexes.

Index

Description

Sample query statement

Distribution key

Columns on which the GROUP BY operation is frequently performed or columns that are used to join multiple tables are specified to constitute a distribution key. This reduces data shuffling and implements local joins.

select * from tbl1 join tbl2 on tbl1.a=tbl2.c;

Clustering key

Range query columns or filter query columns are specified to constitute a clustering key. Index-based data filtering complies with the leftmost matching principle. We recommend that you specify no more than two columns to constitute a clustering key.

select sum(a) from tb1 where a > 100 and a < 200;

Bitmap

Equivalent query columns are specified to constitute a bitmap index.

select * from tb1 where a =100;

Event time column (segment key)

Columns that are strongly correlated with time, such as log and traffic columns, are specified to constitute an event time column.

select sum(a) from tb1 where ts > '2020-01-01' and a < '2020-03-02';

Configure properties for a table

Hologres provides set_table_property and create table with for you to configure table properties. Appropriate settings of table properties help Hologres sort and query data in an efficient manner. You must execute the parameters related to data storage layout together with the CREATE TABLE statement.

  • Syntax

    -- Syntax supported in Hologres V2.1 and later:
    create table <table_name> (...)
    with (property = 'value'[, ...]);
    
    -- Syntax supported in all Hologres versions:
    call set_table_property('<table_name>', property, 'value');
    Note

    The set_table_property function and the CREATE TABLE statement must be executed in the same transaction.

    Hologres allows you to execute SQL statements for configuring table properties that include but are not limited to the following properties:

    • Syntax supported in Hologres V2.1 and later:

      CREATE TABLE <table_name> (...)
      WITH (
          orientation = '[column | row | row,column]',
          table_group = '[tableGroupName]',
          distribution_key = 'columnName[,...]]',
          clustering_key = '[columnName{:asc]} [,...]]',
          event_time_column = '[columnName [,...]]',
          bitmap_columns = '[columnName [,...]]',
          dictionary_encoding_columns = '[columnName [,...]]',
          time_to_live_in_seconds = '<non_negative_literal>'
          [,storage_mode, ...]
      );
    • Syntax supported in all Hologres versions:

      BEGIN;
      CREATE TABLE <table_name> (...);
      call set_table_property('table_name', 'orientation', '[column | row]');
      call set_table_property('table_name', 'table_group', '[tableGroupName]');
      call set_table_property('table_name', 'distribution_key', '[columnName[,...]]');
      call set_table_property('table_name', 'clustering_key', '[columnName{:asc]} [,...]]');
      call set_table_property('table_name', 'event_time_column', '[columnName [,...]]');
      call set_table_property('table_name', 'bitmap_columns', '[columnName [,...]]');
      call set_table_property('table_name', 'dictionary_encoding_columns', '[columnName [,...]]');
      call set_table_property('table_name', 'time_to_live_in_seconds', '<non_negative_literal>');
      COMMIT;
  • Properties

    The following table describes the properties and their settings in different types of tables.

    Property

    Description

    Column-oriented table

    Row-oriented table

    Row-column hybrid table

    Recommended setting

    Modifiable after a table is created

    orientation

    The storage mode of the table.

    Default value: column.

    row

    row,column

    column

    No. If you want to modify this property, create another table.

    table_group

    The table group to which the table belongs.

    By default, the default table group is used.

    By default, the default table group is used.

    By default, the default table group is used.

    Use the default value.

    No. If you want to modify this property, create another table or reshard the existing table.

    distribution_key

    The distribution key of the table.

    By default, the primary key is configured as the distribution key. You can modify the distribution key based on your business requirements.

    By default, the primary key is configured as the distribution key.

    By default, the primary key is configured as the distribution key.

    We recommend that you specify only one of the primary key columns as the distribution key.

    No. If you want to modify this property, create another table.

    clustering_key

    The clustering key of the table.

    By default, the property is empty.

    By default, the primary key is used as the clustering key.

    By default, the property is empty.

    We recommend that you specify only one column as the clustering key. The values in the column can be sorted only in ascending order.

    No. If you want to modify this property, create another table.

    event_time_column

    The event time column of the table.

    By default, the first non-null timestamp column is used as the event time column.

    Not supported.

    By default, the first non-null timestamp column is used as the event time column.

    We recommend that you specify a timestamp column.

    No. If you want to modify this property, create another table.

    bitmap_columns

    The bitmap index of the table.

    Configure this property based on your business requirements.

    Not supported.

    Configure this property based on your business requirements.

    We recommend that you specify less than 10 columns that are used for equality comparisons.

    Yes. For more information, see ALTER TABLE.

    dictionary_encoding_columns

    The dictionary encoding on the columns of the table.

    Configure this property based on your business requirements.

    Not supported.

    Configure this property based on your business requirements.

    We recommend that you specify less than 10 columns with low cardinality.

    Yes. For more information, see ALTER TABLE.

    time_to_live_in_seconds

    The time-to-live (TTL) of the table.

    Configure this property based on your business requirements.

    Configure this property based on your business requirements.

    Configure this property based on your business requirements.

    Use the default value.

    Yes. For more information, see ALTER TABLE.

    • orientation

      This property specifies whether a database table uses the column-oriented or row-oriented storage mode in Hologres. Hologres V1.1 and later support the row-column hybrid storage mode. The following syntax is used when you specify this property. Different storage modes apply to different query scenarios. When you create a table, the column-oriented storage mode is used by default. If you want to use another storage mode, explicitly specify the storage mode when you create a table. For more information, see Storage models of tables: row-oriented storage, column-oriented storage, and row-column hybrid storage.

      call set_table_property('table_name', 'orientation', '[column | row |row,column]');
    • table_group

      In Hologres, a table group is a unique concept of logical storage and is used to manage shard counts. A table group corresponds to a group of shards. If you do not create a table group when you create a database, a default table group named in the <db>_tg_default format is automatically created when you create the first table in the database. If no table group is specified when you create other tables, the tables are automatically created in the default table group. In most cases, you do not need to specify a table group. You can use the default table group. If your Hologres instance has more than 256 CPU cores, we recommend that you configure different table groups and specify the shard count for each table group based on your business requirements. This improves performance. For more information, see User guide of table groups and shard counts.

      call set_table_property('table_name', 'table_group', '[tableGroupName]');
    • distribution_key

      This property specifies a distribution key. The distribution_key property is used to specify the distribution policy of data in tables. Data is distributed to each shard based on the distribution key. After this property is configured, the entries with the same distribution key value are distributed to the same shard. A distribution key is important for distributed computing, and can help improve query performance and increase queries per second (QPS). For more information, see Distribution key.

      call set_table_property('table_name', 'distribution_key', '[columnName[,...]]');
    • clustering_key

      This property specifies a clustering key. You can use the following syntax to configure a clustering key. Hologres sorts data in a file based on the clustering key. By default, the data is sorted in ascending order. A clustering key can help accelerate range queries and filter queries on index columns and improve query performance. For more information, see Clustering key.

      call set_table_property('table_name', 'clustering_key', '[columnName{:asc} [,...]]');
    • event_time_column

      This property specifies an event time column. You can use the following syntax to configure an event time column. Files are sorted based on the event time column. When the event time column is hit, Hologres can quickly locate the file where the required data resides. We recommend that you set the event_time_column property to columns whose data monotonically increases or decreases, such as timestamp columns. The event_time_column property is applicable to time-related columns, such as log and traffic columns. Appropriate settings of this property can improve query performance. For more information, see Event time column (segment key).

      call set_table_property('table_name', 'event_time_column', '[columnName [,...]]');
    • bitmap_columns

      This property specifies a bitmap index. You can use the following syntax to configure a bitmap index. The bitmap index can be used to quickly locate the row where data that meets conditions resides. You can specify columns that meet equivalent query conditions as bitmap index columns. By default, all columns of the TEXT type in a column-oriented table are implicitly specified as bitmap index columns. For more information, see Bitmap index.

      call set_table_property('table_name', 'bitmap_columns', '[columnName{:[on|off]}[,...]]');
    • dictionary_encoding_columns

      This property specifies a dictionary encoding column. You can use the following syntax to configure a dictionary encoding column. Dictionary encoding specifies whether to build dictionary mappings for the values of specific columns. Dictionary mappings can be used to convert string comparisons into numeric comparisons to accelerate queries, such as the queries that involve the GROUP BY or FILTER operation. By default, all columns of the TEXT type in a column-oriented table are specified as dictionary encoding columns. Hologres V0.9 and later automatically determine whether to build dictionary mappings based on data characteristics.

      call set_table_property('table_name', 'dictionary_encoding_columns', '[columnName{:[on|off|auto]}[,...]]');
    • time_to_live_in_seconds (not recommended)

      This property specifies the TTL of the data in a table. Unit: seconds. You can use the following syntax to configure the TTL.

      • The system counts the start of the TTL from the time when data is first written to the table, not the time when the data is updated. If you do not configure the TTL, the data in the table is retained for 100 years by default. In Hologres V1.3.24 and later, the minimum TTL can be set to one day, which is equivalent to 86,400 seconds. For more information about TTL, see Other PostgreSQL statements.

      • The TTL does not specify an accurate time. If the TTL expires, the data in the table is deleted within a period of time instead of at a specified point in time. The data is deleted, but the table where the data resides still exists. Therefore, duplicate primary keys may appear. We recommend that you do not use the TTL to manage the lifecycle of data in the production environment. We recommend that you manage the lifecycle of data based on partitions. For more information, see CREATE PARTITION TABLE.

      call set_table_property('table_name', 'time_to_live_in_seconds', '<non_negative_literal>');

Limits

  • You can specify multiple columns to constitute a primary key. The values of a column or multiple columns that are used to constitute a primary key must be unique and cannot be null. You can specify multiple columns to constitute a primary key of a table in only one statement. Columns of the following data types cannot be specified to constitute a primary key: FLOAT, DOUBLE, NUMERIC, ARRAY, JSON, DATE, and other complex data types. Primary key columns must be not nullable. You cannot modify a primary key. To modify a primary key, create another table. The following sample code shows how to specify the id and ds columns to constitute the primary key of a table:

    BEGIN;
    CREATE TABLE public.test (
     "id" text NOT NULL,
     "ds" text NOT NULL,
    PRIMARY KEY (id,ds)
    );
    CALL SET_TABLE_PROPERTY('public.test', 'orientation', 'column');
    COMMIT;
  • The following table describes the parameters that can be configured as the column_constraints or table_constraints properties.

    Parameter

    column_constraints

    table_constraints

    primary key

    Supported

    Supported

    not null

    Supported

    -

    null

    Supported

    -

    unique

    Not supported

    Not supported

    check

    Not supported

    Not supported

    default

    Supported

    Not supported

  • Keywords, reserved parameters, system parameters, special characters, and uppercase and lowercase letters must be enclosed in double quotation marks (") for escaping.

    • Hologres imposes the following limits on keywords: Column names cannot start with hg_, and schema names cannot start with holo_ or hg_pg_. You must also comply with the specifications that are imposed by PostgreSQL on keywords, reserved parameters, and system parameters. For more information, see SQL Key Words and System Columns. If you use PostgreSQL keywords as column names, you must enclose the keywords in double quotation marks (") for escaping.

    • Table names and column names are not case-sensitive. To define a table name or column name that is in uppercase, contains special characters, or starts with a digit, you can enclose the name in double quotation marks (") for escaping.

    • If a custom field has the same name as a system field, such as ctid, an error message may be reported.

    • Hologres V2.0 and later improve the syntax for configuring properties for a table whose column names need to be escaped. If you want to configure a table property for columns whose names need to be escaped, you need to use the new syntax. If you still want to use the original syntax, you need to enable the original syntax by using a Grand Unified Configuration (GUC) parameter.

      -- Enable the original syntax at the session level.
      set hg_disable_parse_holo_property = on;
      
      -- Enable the original syntax at the database level.
      alter database <db_name> set hg_disable_parse_holo_property = on;

      The following statements provide examples.

      Note

      When you query a column whose name starts with a digit, you must also enclose the name in double quotation marks (") to escape the name. Otherwise, the column name fails to be parsed.

      create table "TBL" (a int);
      select relname from pg_class where relname = 'TBL';
      insert into "TBL" values (-1977);
      select * from "TBL";
      ------------------------------------------------------------------
      -- Syntax that is used to configure properties for columns whose names need to be escaped for Hologres V2.0 and later
      begin;
      create table tbl (c1 int not null);
      call set_table_property('tbl', 'clustering_key', '"c1":asc'); 
      commit;
      -- Syntax that is used to configure properties for columns whose names need to be escaped for versions earlier than Hologres V2.0
      begin;
      create table tbl (c1 int not null);
      call set_table_property('tbl', 'clustering_key', '"c1:asc"'); 
      commit;
      ------------------------------------------------------------------
      -- Syntax that is used to configure properties for multiple columns whose names contain uppercase letters for Hologres V2.1 and later
      begin;
      create table tbl ("C1" int not null, c2 text not null) with (clustering_key = '"C1",c2');
      commit;
      -- Syntax that is used to configure properties for multiple columns whose names contain uppercase letters for Hologres V2.0 and later
      begin;
      create table tbl ("C1" int not null, c2 text not null);
      call set_table_property('tbl', 'clustering_key', '"C1",c2'); 
      commit;
      -- Syntax that is used to configure properties for multiple columns whose names contain uppercase letters for versions earlier than Hologres V2.0
      begin;
      create table tbl ("C1" int not null, c2 text not null);
      call set_table_property('tbl', 'clustering_key', '"C1,c2"'); 
      commit;
      ------------------------------------------------------------------
      create table "Tab_$A%*" (a int);
      select relname from pg_class where relname = 'Tab_$A%*';
      insert into "Tab_$A%*" values (-1977);
      select * from "Tab_$A%*";
      ------------------------------------------------------------------
      create table tbl ("2c" int not null);
      insert into tbl values (3), (4);
      select "2c" from tbl;
  • When you create a table, if no table with the same name exists and the semantics is correct, the table can be created successfully. If the IF NOT EXISTS parameter is not specified and a table with the same name exists, an error is returned. If the IF NOT EXISTS parameter is specified and a table with the same name exists, Hologres displays a notice, skips the steps to create the table, and returns SUCCEED. The following table describes the rules.

    Scenario

    Response when IF NOT EXISTS is specified

    Response when IF NOT EXISTS is not specified

    A table with the same name exists.

    NOTICE: relation “xx“already exists, skippingSUCCEED

    ERROR: relation is already exists.

    No table with the same name exists.

    SUCCEED

    SUCCEED

  • A table name cannot exceed 64 bytes in length. Otherwise, the table name is truncated.

  • You cannot modify the data types of a table after the table is created. If you need to modify the data types, create another table.

  • You must specify a primary key for a row-oriented table or a row-column hybrid table. A primary key is optional for a column-oriented table.

  • You cannot change the column order after the table is created. If you need to change the column order, create another table.

  • You cannot modify the orientation, distribution_key, clustering_key, or event_time_column property after a table is created. This is because these properties determine the storage layout of data after the data is written. If you want to modify these properties, create another table. The bitmap_columns and dictionary_encoding_columns properties can be modified after a table is created. These properties do not affect the storage layout of data.

  • You cannot change the not null columns in an existing table to the nullable columns and cannot change the nullable columns to the not null columns. If you want to change these columns, create another table.

Query a table schema

You can execute the following statements to query a table schema:

create extension hg_toolkit; -- This statement takes effect at the database level. You need to execute the statement in a database only once.
select hg_dump_script('[<schema_name>.]<table_name>');
Note
  • You can also go to the Metadata Management tab in the HoloWeb console to query a table schema by using DDL statements.

  • In Hologres V2.2 and later, table properties returned by hg_dump_script is displayed in the WITH syntax instead of the CALL syntax. This helps improve the convenience and readability of the CREATE TABLE statement.

Examples

  • Create a column-oriented table and configure the primary key.

    Note

    The distribution key must be a subset of the columns that constitute the primary key.

    begin;
    CREATE TABLE tbl (
     "id" bigint NOT NULL,
     "name" text NOT NULL,
     "age" bigint,
     "class" text NOT NULL,
     "reg_timestamp" timestamptz NOT NULL,
    PRIMARY KEY (id,age)
    );
    call set_table_property('tbl', 'orientation', 'column');
    call set_table_property('tbl', 'distribution_key', 'id');
    call set_table_property('tbl', 'clustering_key', 'age');
    call set_table_property('tbl', 'event_time_column', 'reg_timestamp');
    call set_table_property('tbl', 'bitmap_columns', 'name,class');
    call set_table_property('tbl', 'dictionary_encoding_columns', 'class:auto');
    commit;
  • Create a partitioned table and configure the primary key.

    Note

    If you configure a primary key for a partitioned table, you must include the partition key column in the primary key.

    begin;
    CREATE TABLE www (
     name text NOT NULL,
     ds text NOT NULL,
     age text NOT NULL,
    PRIMARY KEY (name,ds)
    )
    PARTITION BY LIST(ds);
    CALL SET_TABLE_PROPERTY('www', 'orientation', 'column');
    commit;

    You need to create child tables for a partitioned table. For more information about how to create a child table, see CREATE PARTITION TABLE.

  • Create a table and specify default values for the columns.

    begin;
    CREATE TABLE tbl_default (    
      smallint_col smallint DEFAULT 0,    
      int_col int DEFAULT 0,    
      bigint_col bigint DEFAULT 0,    
      boolean_col boolean DEFAULT FALSE,    
      float_col real DEFAULT 0.0,    
      double_col double precision DEFAULT 0.0,    
      decimal_col decimal(2, 1) DEFAULT 0.0,    
      text_col text DEFAULT 'N',    
      char_col char(2) DEFAULT 'N',    
      varchar_col varchar(200) DEFAULT 'N',    
      timestamptz_col timestamptz DEFAULT now(),    
      date_col date DEFAULT now(),    
      timestamp_col timestamp DEFAULT now()
    );
    commit;

Create an internal table in the HoloWeb console

You can use HoloWeb to create an internal table in a visualized manner, without the need to write SQL statements. To create an internal table in the HoloWeb console, perform the following steps:

  1. Log on to the HoloWeb console. For more information, see Connect to HoloWeb.

  2. In the top navigation bar of the HoloWeb console, click Metadata Management. Then, click Tables.

    You can also click Instances Connected in the left-side navigation pane of the Metadata Management tab. Click the desired instance and click the database that you want to manage. Right-click the schema that you want to manage and select Create Internal Table.

  3. On the Create Internal Table page, configure the parameters based on your business requirements. The following table describes the parameters.

    Parameter description (show more)

    Section

    Parameter

    Description

    Basic Information

    Mode

    The name of the schema.

    You can select the default schema named public or a custom schema.

    Table Name

    The name of the Hologres internal table that you want to create.

    Description

    The description of the Hologres internal table that you want to create.

    Fields

    Field Name

    The name of each column in the internal table.

    Data Type

    The data type of each column.

    Primary Key

    Specifies whether to configure the column as the primary key for the internal table.

    Nullable

    Specifies whether the column can be null.

    Array

    Specifies whether the column is an ordered array of elements.

    Description

    The description of the column.

    Actions

    The operations that you can perform on the column. You can delete the column, move the column up, or move the column down.

    Attributes

    Storage Mode

    The storage mode of the table. Valid values: Column-oriented storage, Row-oriented storage, and Row-column Storage.

    Default value: Column-oriented storage.

    Data Lifecycle

    The TTL of the data in the table. The system counts the start of the TTL from the time when data is first written to the table. If the TTL expires, the data of the table is deleted in a period of time instead of at a specified point in time.

    Default value: Maximum.

    Binlog

    Specifies whether to enable binary logging for the table. For more information, see Subscribe to Hologres binary logs.

    Lifecycle of Binary Logs

    The TTL of binary logs. For more information, see Subscribe to Hologres binary logs. Default value: Maximum.

    Distribution Column

    The distribution key of the table. For more information, see Distribution key.

    Event Time Column

    The event time column of the table. For more information, see Event time column (segment key).

    Clustering Key

    The clustering key of the table. For more information, see Clustering key.

    Dictionary Encoding Columns

    The dictionary encoding column of the table. For more information, see Dictionary encoding.

    Bitmap Column

    The bitmap column of the table. For more information, see Bitmap index.

    Partitioned Tables

    N/A

    Select partition key columns of the internal table.

  4. In the upper-right corner, click Submit. After you click Submit, you can refresh the left-side instance list. The created internal table is displayed under the schema that you selected.

More operations:

  • Edit an internal table

    1. Go to the Metadata Management page and double-click the internal table that you want to edit in Instances Connected.

    2. On the table information tab, click Edit Table to modify properties of the table, such as adding columns and changing the TTL of the data in the table.

    3. Click Submit.

  • Delete an internal table

    1. Go to the Metadata Management page, right-click the internal table that you want to delete in Instances Connected, and then select Delete Table.删除内部表

    2. In the Delete table message, click OK.

  • View data in a table

    1. In Instances Connected, double-click the internal table whose data you want to view.

    2. On the table information tab, click Data preview to preview the data in the table.数据预览

  • View DDL statements

    On the table details tab, click DDL statement to view the DDL statements of the table.DDL语句

What to do next

After you create a table, you can import data into the table by executing the INSERT statement, synchronizing data, or migrating data. For more information, see INSERT, Overview, and Data migration.

References

  • For more information about how to create a table by copying the structure and data of an existing table, see CREATE TABLE AS. For more information about how to create a table by copying the schema and properties of an existing table, see CREATE TABLE LIKE.

  • For more information about how to create a partitioned table, see CREATE PARTITION TABLE.

  • For more information about how to create a foreign table, see CREATE FOREIGN TABLE.

  • For more information about how to modify table properties, see ALTER TABLE.