All Products
Search
Document Center

ApsaraDB for SelectDB:Migrate data from a StarRocks database

Last Updated:Sep 10, 2024

This topic describes how to use a catalog to migrate offline data from a StarRocks database to an ApsaraDB for SelectDB instance.

Prerequisites

  • The IP address of the StarRocks data source is added to the IP address whitelist of the SelectDB instance. For more information, see Configure an IP address whitelist.

  • You understand what a catalog is and the operations that you can perform on catalogs. For more information, see Data lakehouse.

Sample environment

In this example, data is migrated from the SR_t table in the StarRocks database named starRocks_db to the test_SR2SelectDB table in the SelectDB database named test_db. You can modify the corresponding parameters based on the actual cases when you migrate data. Sample environment:

  • Destination database: test_db

  • Destination table: test_SR2SelectDB

  • Source database: starRocks_db

  • Source data table: SR_t

Prepare the sample source data

Log on to your StarRocks data source and perform the following steps:

  1. Create a database.

    CREATE DATABASE starRocks_db;
  2. Create a table.

    CREATE TABLE SR_t
    (
        id int,
        name string,
        age int
    )
    DISTRIBUTED BY HASH(id) BUCKETS 4
    PROPERTIES("replication_num" = "1");
  3. Insert data into the table.

    INSERT INTO SR_t VALUES
    (1, 'Alice', 25),
    (2, 'Bob', 30),
    (3, 'Charlie', 35),
    (4, 'David', 40),
    (5, 'Eve', 45);

Procedure

  1. Connect to the SelectDB instance. For more information, see Connect to an instance.

    Note

    If you log on to the ApsaraDB for SelectDB instance by using Data Management (DMS), the SWITCH command becomes invalid. We recommend that you connect to the ApsaraDB for SelectDB instance by using a MySQL client.

  2. Create a Java Database Connectivity (JDBC) catalog for the StarRocks database. For more information, see JDBC data source.

    CREATE CATALOG starrocks_catalog PROPERTIES (
        "type"="jdbc",
        "user"="root",
        "password"="123456",
        "jdbc_url" = "jdbc:mysql://127.0.0.1:3306/demo",
        "driver_url" = "mysql-connector-java-8.0.25.jar",
        "driver_class" = "com.mysql.cj.jdbc.Driver",
        "checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a"
    )

    Parameters

    Parameter

    Required

    Default value

    Description

    user

    Yes

    No default value

    The account that is used to log on to the StarRocks database.

    password

    Yes

    No default value

    The password that is used to log on to the StarRocks database.

    jdbc_url

    Yes

    No default value

    The JDBC URL that is used to connect to the StarRocks database. The JDBC URL must include the endpoint of the StarRocks database.

    driver_url

    Yes

    No default value

    The name of the JAR package of the JDBC driver.

    Note
    • We recommend that you use the mysql-connector-java-8.0.25.jar package.

    • If you want to use another JAR package, submit a ticket for consultation.

    driver_class

    Yes

    No default value

    The class name of the JDBC driver.

    We recommend that you set this parameter to com.mysql.cj.jdbc.Driver.

    lower_case_table_names

    (This parameter is renamed lower_case_meta_names for ApsaraDB for SelectDB instances that run SelectDB Core V4.0.)

    No

    "false"

    Specifies whether to migrate the names of the databases and tables of the external JDBC data source in lowercase. Valid values:

    true: maintains the mapping of lowercase names to actual names in the remote system and allows you to query the databases and tables whose names are not in lowercase. In this case, the names of databases, tables, and columns are converted to lowercase names.

    false: does not allow you to query the databases and tables whose names are not in lowercase.

    Important
    • For ApsaraDB for SelectDB instances that run SelectDB Core V3.0, take note of the following items:

      • If the lower_case_table_names parameter of the frontend (FE) is set to 1 or 2, you must set the lower_case_table_names parameter of the catalog to true.

      • If the lower_case_table_names parameter of the FE is set to 0, you can set the lower_case_table_names parameter of the catalog to true or false.

    • For ApsaraDB for SelectDB instances that run SelectDB Core V4.0, take note of the following items:

      • If the lower_case_table_names parameter of the FE is set to 0 or 2, the names of databases, tables, and columns are not converted to lowercase names.

      • If the lower_case_table_names parameter of the FE is set to 1, the names of tables are converted to lowercase names, and the names of databases and columns remain unchanged.

    only_specified_database

    No

    "false"

    Specifies whether to migrate data only from specific databases. Valid values:

    true: migrates data from specific databases of the data source that is specified in the JDBC URL.

    false: migrates data from all the databases of the data source that is specified in the JDBC URL.

    include_database_list

    No

    ""

    The names of the databases from which you want to migrate data. If you set the only_specified_database parameter to true and want to migrate data from multiple databases, separate multiple database names with commas (,). Database names are case-sensitive.

    exclude_database_list

    No

    ""

    The names of the databases from which you do not want to migrate data. If you set the only_specified_database parameter to true and want to exclude multiple databases, separate multiple database names with commas (,). Database names are case-sensitive.

    meta_names_mapping

    No

    ""

    If the external data source has databases whose names differ only in case, such as DORIS and doris, an error is reported due to ambiguity when ApsaraDB for SelectDB queries the catalog. To resolve the name conflict, you must specify the meta_names_mapping parameter.

    For more information, see Lowercase name synchronization in Apache Doris documentation.

    Important

    This parameter is applicable only to SelectDB instances that run SelectDB Core V4.0.

  3. Query the catalog.

    SHOW CATALOGS; -- Check whether the catalog is created.

    The following output is returned:

    +--------------+--------------+----------+-----------+-------------------------+---------------------+------------------------+
    | CatalogId    | CatalogName  | Type     | IsCurrent | CreateTime              | LastUpdateTime      | Comment                |
    +--------------+--------------+----------+-----------+-------------------------+---------------------+------------------------+
    | 436009309195 | SR_catalog | jdbc      |           | 2024-08-06 17:09:08.058 | 2024-07-19 18:04:37 |                        |
    |            0 | internal     | internal | yes       | UNRECORDED              | NULL                | Doris internal catalog |
    +--------------+--------------+----------+-----------+-------------------------+---------------------+------------------------+
  4. Optional. Switch to the SR_catalog directory of the external catalog.

    You can view and access data in the SR_catalog directory of the external catalog in the same way as you view and access data in an internal catalog.

    Note

    ApsaraDB for SelectDB allows you only to read data in an external catalog.

    SWITCH SR_catalog;
  5. Optional. Switch to the internal directory of the internal catalog.

    If you do not perform Step 4, skip this step.

    SWITCH internal;
  6. Optional. Create data.

    If you have created the destination database, skip this step.

    CREATE database test_db;
  7. Switch to the destination database.

    USE test_db;
  8. Create a table.

    If you have created the destination table, check whether the data types of the columns in the destination table are mapped to those in the source table of the StarRocks database.

    If you have not created the destination table, make sure that the data types of the columns in the destination table are mapped to those in the source table of the StarRocks database when you create the table.

    For more information about mappings between the data types of columns, see the "Data type mappings" section of the JDBC data source topic.

    CREATE TABLE test_SR2SelectDB
    (
        id int,
        name string,
        age int
    )
    DISTRIBUTED BY HASH(id) BUCKETS 4
    PROPERTIES("replication_num" = "1");
  9. Migrate data.

    INSERT INTO test_SR2SelectDB SELECT *  FROM doris_catalog.SR_db.SR_t;
  10. Query the imported data in the destination table.

    SELECT *  FROM test_SR2SelectDB;

Migrate incremental data

In actual production environments, StarRocks data is mainly divided into offline data and incremental data. In most cases, you migrate data from StarRocks to SelectDB because you need to replicate data to a data warehouse to accelerate queries. To migrate incremental data, you can use one of the following methods:

  • Replicate and write StarRocks data to SelectDB when the StarRocks data is generated.

  • Read data from StarRocks partitions by using periodic jobs and write the data to SelectDB.