All Products
Search
Document Center

Realtime Compute for Apache Flink:Manage Hologres catalogs

Last Updated:Dec 24, 2024

Hologres catalogs provide metadata, such as databases, tables, and partitions. The metadata also includes functions and information that are required to access data stored in a database or other external systems. After you create a Hologres catalog, you can directly read Hologres metadata in the development console of Realtime Compute for Apache Flink without the need to manually register Hologres tables. Hologres catalogs improve the efficiency of draft development and ensure data accuracy. This topic describes how to create, view, use, and drop a Hologres catalog in the development console of Realtime Compute for Apache Flink.

Limits

  • Only Flink 1.13 and later support the configuration of Hologres catalogs.

  • You cannot modify catalogs.

  • Only Realtime Compute for Apache Flink whose engine version is vvr-6.0.1-flink-1.15 or later allows you to configure table attributes when you create a table.

  • Only Realtime Compute for Apache Flink whose engine version is vvr-6.0.3-flink-1.15 or later allows you to configure the default parameters and table attributes of a Hologres table when you create a Hologres catalog.

  • Realtime Compute for Apache Flink allows you to read data from and write data to only Hologres internal tables. Therefore, only Hologres exclusive instances can be used when you create a Hologres catalog. Hologres Shared Cluster instances are not supported.

Create a Hologres catalog

You can create a Hologres catalog on the UI or by executing an SQL statement. We recommend that you create a Hologres catalog on the UI.

Create a Hologres catalog on the UI (recommended)

  1. Go to the Catalogs page.

    1. Log on to the management console of Realtime Compute for Apache Flink. Find the workspace that you want to manage and click Console in the Actions column.

    2. In the left-side navigation pane, click Catalogs.

  2. On the Catalog List page, click Create Catalog. In the Create Catalog dialog box, click Hologres on the Built-in Catalog tab in the Choose Catalog Type step and click Next.

  3. Configure the parameters in the Configure Catalog step.

    Important

    After you create a Hologres catalog, the parameter configuration cannot be modified. If you want to modify the parameter configuration, you must delete the Hologres catalog that you created and create a Hologres catalog again.

    HiveUI

    Parameter

    Description

    Required

    Remarks

    catalogname

    The name of the Hologres catalog.

    Yes

    The name can contain only lowercase letters and digits. It cannot contain uppercase letters or special characters such as hyphens (-) and underscores (_).

    endpoint

    The endpoint of the Hologres instance.

    Yes

    For more information, see Instance configurations.

    username

    The AccessKey ID of the Alibaba Cloud account.

    Yes

    For more information, see Console operations.

    Note
    • To protect your AccessKey pair, we recommend that you configure the AccessKey secret by using variables. For more information, see Manage variables.

    • The Alibaba Cloud account to which the AccessKey ID belongs must be able to access all Hologres databases. For more information about Hologres database permissions, see Hologres permission models.

    password

    The AccessKey secret of the Alibaba Cloud account.

    Yes

    dbname

    The name of the default Hologres database that you want to access.

    Yes

    N/A.

  4. Click Confirm.

    Note

    You can view the catalog that you create in the Catalogs pane on the left side of the Catalog List page.

Create a Hologres catalog by executing an SQL statement

  1. In the code editor of the Scripts tab on the SQL Editor page, enter the following statement to create a Hologres catalog:

    CREATE CATALOG <catalogname> WITH (
      'type' = 'hologres',
      'endpoint' = '<endpoint>', 
      'username' = '<AccessKey>',
      'password' = '<AccessSecret>',
      'dbname' = '<dbname>'
    );

    The following table describes the parameters.

    Parameter

    Description

    Required

    Remarks

    catalogname

    The name of the Hologres catalog.

    Yes

    The name can contain only lowercase letters and digits. It cannot contain uppercase letters or special characters such as hyphens (-) and underscores (_).

    type

    The type of the catalog.

    Yes

    Set the value to hologres.

    endpoint

    The endpoint of the Hologres instance.

    Yes

    For more information, see Instance configurations.

    username

    The AccessKey ID of the Alibaba Cloud account.

    Yes

    For more information, see Console operations.

    Note
    • To protect your AccessKey pair, we recommend that you configure the AccessKey secret by using variables. For more information, see Manage variables.

    • The Alibaba Cloud account to which the AccessKey ID belongs must be able to access all Hologres databases. For more information about Hologres database permissions, see Hologres permission models.

    password

    The AccessKey secret of the Alibaba Cloud account.

    Yes

    dbname

    The name of the default Hologres database that you want to access.

    Yes

    N/A.

    ignore-non-persisted-options

    Specifies whether to ignore non-persistence options when you use a Hologres catalog to create a table that uses non-persistence options.

    No

    Valid values:

    • true: The table can be created and all non-persistence options are ignored. This is the default value.

    • false: An error that indicates that the table fails to be created is returned.

    Note

    The persistence of Hologres catalog table options allows the system to reacquire the consistent information that is specified in the DDL statement when the system reads the table information from the catalog again. Only the following persistence options are supported: endpoint, username, password, and dbname.

    Other parameters supported by the Hologres connector

    When you create a Hologres catalog, you can configure the parameters, such as the parameters in the WITH clause of a DDL statement that is used to create a Hologres table. If you configure the parameters for the catalog, the parameters are automatically configured for the table in the Hologres catalog.

    No

    Only Realtime Compute for Apache Flink whose engine version is vvr-6.0.3-flink-1.15 or later supports this parameter.

    Note
    • If you want to configure the parameters, you must set the ignore-non-persisted-options parameter to true.

    • In Realtime Compute for Apache Flink whose engine version is earlier than vvr-6.0.3-flink-1.15, you can only use SQL hints to configure the parameters for each table.

  2. Select the code that is used to create a catalog and click Run that appears on the left side of the code.

    image..png

View a Hologres catalog

After you create a Hologres catalog, you can perform the following operations to view the metadata in the Hologres catalog.

  1. Go to the Catalogs page.

    1. Log on to the management console of Realtime Compute for Apache Flink.

    2. Find the workspace that you want to manage and click Console in the Actions column.

    3. In the left-side navigation pane, click Catalogs.

  2. On the Catalog List page, find the desired catalog and view the Catalog Name and Type columns of the catalog.

    Note
    • If you want to view the databases and tables in the catalog, click View in the Actions column.

    • If the public schema is used, the schema prefix is not added to a table name. In this case, only the table name is used.

Use a Hologres catalog

Note
  • If the public schema is used, ${table_name} is used instead of ${schema_name.table_name}. This is because the prefix of the schema name is not added to the table name.

  • The update data in the table that is read by using a Hologres catalog can be consumed. By default, the ignoredelete property of the table is set to false and the mutatetype property of the table is set to insertorupdate. For more information about the ignoredelete and the mutatetype properties, see Merge data into a wide table.

Create a Hologres table

Create a Hologres table on the UI

  1. Go to the Catalogs page.

    1. Log on to the management console of Realtime Compute for Apache Flink.

    2. Find the workspace that you want to manage and click Console in the Actions column.

    3. In the left-side navigation pane, click Catalogs.

  2. On the Catalog List page, find the desired catalog and click View in the Actions column.

  3. On the page that appears, find the desired database and click View in the Actions column.

  4. On the page that appears, click Create Table.

  5. On the Built-in tab of the Create Table dialog box, click Connection Type and select a table type.

  6. Click Next.

  7. Enter the table creation statement and configure related parameters. Sample code:

    CREATE TABLE `${catalog_name}`.`${db_name}`.`${table_name}` (
     id INT,
     name STRING
    ) WITH (
     'connector' = 'hologres'
    );
  8. Click Confirm.

Create a Hologres table by executing an SQL statement

  1. In the code editor of the Scripts tab on the SQL Editor page, enter the following statement:

    You can use one of the following methods to create a Hologres table:

    • Execute the USE CATALOG HoloName statement to directly reference the information of the Hologres service.

      USE CATALOG ${catalog_name};
      CREATE TABLE `${db_name}`.`${schema_name.table_name}`(
        ...
       ) WITH (
         'connector' = 'hologres'
       );

      For more information about the USE syntax, see USE statements.

    • Directly reference the information about the Hologres service in DDL statements.

      CREATE TABLE `${catalog_name}`.`${db_name}`.`${schema_name.table_name}` (
        ...
      ) WITH (
        'connector' = 'hologres'
      );
    • Configure physical table properties in DDL statements.

      CREATE TABLE `${catalog_name}`.`${db_name}`.`${schema_name.table_name}` (
        ...
      ) WITH (
        'connector' = 'hologres',
        'table_property.orientation' = 'column',
        'table_property.distribution_key' = 'a',
        'table_property.clustering_key' = 'b:desc',
        'table_property.bitmap_columns' = 'a,b',
        'table_property.segment_key' = 'c',
        'table_property.time_to_live_in_seconds' = '86400',
        'table_property.binlog.level' = 'replica',
        'table_property.binlog.ttl' = '86400'
      );
      Note
      • When you create a table for a registered Hologres service, you must set the connector parameter to hologres in the WITH clause. Other parameters such as endpoint are optional.

      • When you create a Hologres table, specific physical table attributes of the table that you specify in the WITH clause cannot be modified.

      • Realtime Compute for Apache Flink whose engine version is earlier than vvr-6.0.3-flink-1.15 does not allow you to modify the physical table properties that are specified in the WITH clause when you create a physical table. You can log on to the Hologres console to modify the physical table properties. Only Realtime Compute for Apache Flink whose engine version is vvr-6.0.3-flink-1.15 or later allows you to modify physical table properties.

      • You cannot add or modify parameters in the WITH clause in a DDL statement that is used to create a Hologres table. You can use SQL hints to add or modify these parameters in the INSERT statement.

      • Realtime Compute for Apache Flink whose engine version is vvr-6.0.5-flink-1.15 or later allows you to configure column descriptions when you create a Hologres table. This way, the related physical table inherits the column descriptions of the Hologres table.

  2. Select the table creation statement and click Run that appears on the left side of the code.

    image

When you create a Hologres table, you can configure physical table properties in the WITH clause. You can configure table property settings based on your business requirements to sort and query data in an efficient manner. The following table describes the supported property parameters and examples. The parameters whose names start with the table_property. prefix indicate physical table properties.

Parameter

Example

Allow modification

table_property.orientation

'table_property.orientation' = 'row,column'

No

table_property.table_group

'table_property.table_group' = 'table_group_xxx'

table_property.distribution_key

'table_property.distribution_key' = 'a,b'

table_property.clustering_key

'table_property.clustering_key' = 'a,b:desc'

table_property.segment_key

'table_property.segment_key' = 'c,d'

table_property.bitmap_columns

'table_property.bitmap_columns' = 'a:on,b:off'

Yes

table_property.dictionary_encoding_columns

'table_property.dictionary_encoding_columns' = 'a:on,b:off,c:auto'

table_property.time_to_live_in_seconds

'table_property.time_to_live_in_seconds' = '864000'

table_property.binlog.level

'table_property.binlog.level' = 'replica'

table_property.binlog.ttl

'table_property.binlog.ttl' = '86400'

enableTypeNormalization

'enableTypeNormalization' = 'true'

Specifies whether to enable the type normalization mode when you use a Hologres catalog to create a table. Valid values:

  • true: The type normalization mode is enabled. When you create a Hologres physical table, data types with higher precision after type normalization are used. The type normalization mode has the following rules:

    • The TINYINT, SMALLINT, INT, and BIGINT data types are converted into the BIGINT data type.

    • The CHAR, VARCHAR, and STRING data types are converted into the STRING data type.

    • The FLOAT and DOUBLE data types are converted into the DOUBLE data type.

    • Other data types are converted based on the data type mappings between Hologres and Realtime Compute for Apache Flink fields. For more information, see Data type mappings.

      Note

      If a data type is changed, the type normalization mode complies with the following rules:

      • When the type normalization mode is enabled, the SMALLINT and INT data types are converted into BIGINT. If you change the SMALLINT data type to the INT data type, the change is considered successful. Therefore, the deployment that executes the CREATE TABLE AS statement runs as expected.

      • When the type normalization mode is enabled, the FLOAT data type is converted into DOUBLE and the BIGINT data type is converted into BIGINT. If you change the FLOAT data type to the BIGINT data type, a data type incompatibility error is returned.

  • false: The type normalization mode is disabled. A Hologres physical table is created based on the data type mappings between Hologres and Realtime Compute for Apache Flink fields. This is the default value.

Note
  • In specific scenarios such as CREATE TABLE AS, you may need to adjust the data type precision of fields in the source table, such as from VARCHAR(10) to VARCHAR(20).

  • For Realtime Compute for Apache Flink whose engine version is earlier than vvr-6.0.5-flink-1.15, if you change the data type of a field in the source table of a deployment that executes the CREATE TABLE AS statement, the deployment may fail. As a result, you must recreate the result table.

  • Realtime Compute for Apache Flink whose engine version is vvr-6.0.5-flink-1.15 or later supports the type normalization mode. In this case, the data type change in the source table does not lead to a failure of the deployment only if the data types before and after the change can be converted to the same data type based on type normalization rules.

  • We recommend that you enable the type normalization mode the first time you start a deployment that executes the CREATE TABLE AS statement. If you do not enable the type normalization mode the first time you start the deployment, the type normalization mode does not take effect. In this case, you must delete the downstream table and restart the deployment without using the state data.

Note

The table properties supported by Hologres catalogs are the same as the table properties supported by the Hologres service. The table_property. prefix is added to the names of the properties for Hologres catalogs to distinguish between the table properties of the Hologres service and Hologres catalogs. For more information about the parameters, see CREATE TABLE and Subscribe to Hologres binary logs.

Modify a Hologres table

Note

Only Realtime Compute for Apache Flink whose engine version is vvr-6.0.5-flink-1.15 or later allows you to modify a Hologres table.

The following examples show the modifications that you can make on a Hologres table.

  • Modify table properties

    ALTER TABLE `${catalog_name}`.`${db_name}`.`${schema_name.table_name}` SET (
      'table_property.binlog.level' = 'replica',
      'table_property.binlog.ttl' = '64700'
    );
    Note

    You can modify only some table properties. For more information, see Create a Hologres table.

  • Rename a table

    ALTER TABLE `${catalog_name}`.`${db_name}`.`${schema_name.table_name}` 
    RENAME TO `${catalog_name}`.`${db_name}`.`${schema_name.new_table_name}`;
  • Add columns to a table

    ALTER TABLE `${catalog_name}`.`${db_name}`.`${schema_name.table_name}` 
    ADD <column_name> <column_datatype> COMMENT '<column_comment>';
  • Modify the name of a column

    ALTER TABLE `${catalog_name}`.`${db_name}`.`${schema_name.table_name}` 
    RENAME <old_column_name> TO <new_column_name>;
  • Modify the comment of a column

    ALTER TABLE `${catalog_name}`.`${db_name}`.`${schema_name.table_name}` 
    MODIFY <column_name> <original_column_type> COMMENT '<new_column_comment>';

Read data from a Hologres table

INSERT INTO ${other_sink_table}
SELECT ...
FROM `${catalog_name}`.`${db_name}`.`${schema_name.table_name}`

Insert result data into a Hologres table

INSERT INTO `${catalog_name}`.`${db_name}`.`${schema_name.table_name}`
SELECT ... 
FROM ${other_source_table}

Use the Hologres catalog that you created as the catalog of the destination store that is used in the CREATE TABLE AS statement

CREATE TABLE IF NOT EXISTS `${catalog_name}`.`${db_name}`.`${schema_name.table_name}`
WITH (
  'connector' = 'hologres'
) AS TABLE ${other_source_table};

The CREATE TABLE AS statement allows you to configure physical table attributes in the WITH clause. When you create a destination table, you can configure these attributes for the table. For more information about the table properties supported by Hologres catalogs, see Create a Hologres table.

To ensure that data can be written to Hologres when data is synchronized from the source table, the Hologres catalog is forced to modify the schema of the result table in the following scenarios:

  • A column whose data type is DECIMAL is used as the primary key in the schema of the source table.

    Hologres does not support the primary keys of the DECIMAL type. Therefore, Hologres changes the data type of the column to BIGINT. If the change does not meet your business requirements, you can use the CREATE TABLE AS statement to convert the data type of the referenced column to the STRING type and recreate a primary key.

  • The schema of the source table contains values of the TIME, TIMESTAMP, or TIMESTAMP_LTZ type. The precision of these data types is greater than 6.

    The precision of the time types supported by Hologres is 6. To ensure that data can be written to Hologres, Realtime Compute for Apache Flink implicitly discards the parts that are higher than the highest precision supported by Hologres.

Use the Hologres catalog that you created as the catalog of the destination store that is used in the CREATE DATABASE AS statement

CREATE DATABASE IF NOT EXISTS `${catalog_name}`.`${db_name}`
WITH (
  'sink.parallelism' = '5' -- Specify the degree of parallelism for each result table. 
) AS DATABASE ${other_source_database};

During data synchronization, you can declare the parameters of the result table in the WITH clause. When the job starts, these parameters take effect on the result tables to which you want to synchronize data. For more information about the parameters that can be modified, see Create a Hologres result table.

The CREATE DATABASE AS statement allows you to specify the schemaname parameter together with the parameters of the Hologres result table in the WITH clause. This way, data can be synchronized to the specified schema of the destination Hologres database. The following table describes the schemaname parameter.

Parameter

Description

Required

Remarks

schemaname

The name of the schema.

No

Default value: public.

Note

You may need to configure different table properties for each destination table. You cannot configure properties for each table in the WITH clause. Therefore, you cannot execute the CREATE DATABASE AS statement to configure physical table properties. If you want to configure table properties, you must manually create a destination table before you start a CREATE DATABASE AS deployment. For more information about the table properties supported by Hologres catalogs, see the "Create a Hologres table" section.

Drop a Hologres catalog

Drop a Hologres catalog on the UI

  1. Go to the Catalogs page.

    1. Log on to the management console of Realtime Compute for Apache Flink.

    2. Find the workspace that you want to manage and click Console in the Actions column.

    3. In the left-side navigation pane, click Catalogs.

  2. On the Catalog List page, find the desired catalog and click Delete in the Actions column.

  3. In the message that appears, click Delete.

    Note

    After you drop the catalog, you can view the Catalogs pane on the left side of the Catalog List page to check whether the catalog is dropped.

Drop a Hologres catalog by executing an SQL statement

  1. In the code editor of the Scripts tab on the SQL Editor page, enter the following statement:

    DROP CATALOG ${catalog_name}

    catalog_name is the name of the Hologres catalog that you want to drop from the development console of Realtime Compute for Apache Flink.

    Warning

    The drop operation does not affect the deployments that are running. However, deployments that are not published or deployments that you want to suspend and then resume are affected. Proceed with caution.

  2. Right-click the statement that is used to drop the catalog and select Run from the shortcut menu.

  3. View the Catalogs pane on the left side of the Catalog List page to check whether the catalog is dropped.