This topic describes how to use a catalog to migrate offline data from a StarRocks database to an ApsaraDB for SelectDB instance.
Prerequisites
The IP address of the StarRocks data source is added to the IP address whitelist of the SelectDB instance. For more information, see Configure an IP address whitelist.
You understand what a catalog is and the operations that you can perform on catalogs. For more information, see Data lakehouse.
Sample environment
In this example, data is migrated from the SR_t table in the StarRocks database named starRocks_db to the test_SR2SelectDB table in the SelectDB database named test_db. You can modify the corresponding parameters based on the actual cases when you migrate data. Sample environment:
Destination database: test_db
Destination table: test_SR2SelectDB
Source database: starRocks_db
Source data table: SR_t
Prepare the sample source data
Log on to your StarRocks data source and perform the following steps:
Create a database.
CREATE DATABASE starRocks_db;
Create a table.
CREATE TABLE SR_t ( id int, name string, age int ) DISTRIBUTED BY HASH(id) BUCKETS 4 PROPERTIES("replication_num" = "1");
Insert data into the table.
INSERT INTO SR_t VALUES (1, 'Alice', 25), (2, 'Bob', 30), (3, 'Charlie', 35), (4, 'David', 40), (5, 'Eve', 45);
Procedure
Connect to the SelectDB instance. For more information, see Connect to an instance.
NoteIf you log on to the ApsaraDB for SelectDB instance by using Data Management (DMS), the
SWITCH
command becomes invalid. We recommend that you connect to the ApsaraDB for SelectDB instance by using a MySQL client.Create a Java Database Connectivity (JDBC) catalog for the StarRocks database. For more information, see JDBC data source.
CREATE CATALOG starrocks_catalog PROPERTIES ( "type"="jdbc", "user"="root", "password"="123456", "jdbc_url" = "jdbc:mysql://127.0.0.1:3306/demo", "driver_url" = "mysql-connector-java-8.0.25.jar", "driver_class" = "com.mysql.cj.jdbc.Driver", "checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a" )
Parameters
Parameter
Required
Default value
Description
user
Yes
No default value
The account that is used to log on to the StarRocks database.
password
Yes
No default value
The password that is used to log on to the StarRocks database.
jdbc_url
Yes
No default value
The JDBC URL that is used to connect to the StarRocks database. The JDBC URL must include the endpoint of the StarRocks database.
driver_url
Yes
No default value
The name of the JAR package of the JDBC driver.
NoteWe recommend that you use the
mysql-connector-java-8.0.25.jar
package.If you want to use another JAR package, submit a ticket for consultation.
driver_class
Yes
No default value
The class name of the JDBC driver.
We recommend that you set this parameter to
com.mysql.cj.jdbc.Driver
.lower_case_table_names
(This parameter is renamed lower_case_meta_names for ApsaraDB for SelectDB instances that run SelectDB Core V4.0.)
No
"false"
Specifies whether to migrate the names of the databases and tables of the external JDBC data source in lowercase. Valid values:
true: maintains the mapping of lowercase names to actual names in the remote system and allows you to query the databases and tables whose names are not in lowercase. In this case, the names of databases, tables, and columns are converted to lowercase names.
false: does not allow you to query the databases and tables whose names are not in lowercase.
ImportantFor ApsaraDB for SelectDB instances that run SelectDB Core V3.0, take note of the following items:
If the
lower_case_table_names
parameter of the frontend (FE) is set to1
or2
, you must set thelower_case_table_names
parameter of the catalog totrue
.If the
lower_case_table_names
parameter of the FE is set to0
, you can set the lower_case_table_names parameter of the catalog totrue
orfalse
.
For ApsaraDB for SelectDB instances that run SelectDB Core V4.0, take note of the following items:
If the
lower_case_table_names
parameter of the FE is set to0
or2
, the names of databases, tables, and columns are not converted to lowercase names.If the
lower_case_table_names
parameter of the FE is set to1
, the names of tables are converted to lowercase names, and the names of databases and columns remain unchanged.
only_specified_database
No
"false"
Specifies whether to migrate data only from specific databases. Valid values:
true: migrates data from specific databases of the data source that is specified in the JDBC URL.
false: migrates data from all the databases of the data source that is specified in the JDBC URL.
include_database_list
No
""
The names of the databases from which you want to migrate data. If you set the
only_specified_database
parameter to true and want to migrate data from multiple databases, separate multiple database names with commas (,). Database names are case-sensitive.exclude_database_list
No
""
The names of the databases from which you do not want to migrate data. If you set the
only_specified_database
parameter to true and want to exclude multiple databases, separate multiple database names with commas (,). Database names are case-sensitive.meta_names_mapping
No
""
If the external data source has databases whose names differ only in case, such as DORIS and doris, an error is reported due to ambiguity when ApsaraDB for SelectDB queries the catalog. To resolve the name conflict, you must specify the
meta_names_mapping
parameter.For more information, see Lowercase name synchronization in Apache Doris documentation.
ImportantThis parameter is applicable only to SelectDB instances that run SelectDB Core V4.0.
Query the catalog.
SHOW CATALOGS; -- Check whether the catalog is created.
The following output is returned:
+--------------+--------------+----------+-----------+-------------------------+---------------------+------------------------+ | CatalogId | CatalogName | Type | IsCurrent | CreateTime | LastUpdateTime | Comment | +--------------+--------------+----------+-----------+-------------------------+---------------------+------------------------+ | 436009309195 | SR_catalog | jdbc | | 2024-08-06 17:09:08.058 | 2024-07-19 18:04:37 | | | 0 | internal | internal | yes | UNRECORDED | NULL | Doris internal catalog | +--------------+--------------+----------+-----------+-------------------------+---------------------+------------------------+
Optional. Switch to the SR_catalog directory of the external catalog.
You can view and access data in the SR_catalog directory of the external catalog in the same way as you view and access data in an internal catalog.
NoteApsaraDB for SelectDB allows you only to read data in an external catalog.
SWITCH SR_catalog;
Optional. Switch to the internal directory of the internal catalog.
If you do not perform Step 4, skip this step.
SWITCH internal;
Optional. Create data.
If you have created the destination database, skip this step.
CREATE database test_db;
Switch to the destination database.
USE test_db;
Create a table.
If you have created the destination table, check whether the data types of the columns in the destination table are mapped to those in the source table of the StarRocks database.
If you have not created the destination table, make sure that the data types of the columns in the destination table are mapped to those in the source table of the StarRocks database when you create the table.
For more information about mappings between the data types of columns, see the "Data type mappings" section of the JDBC data source topic.
CREATE TABLE test_SR2SelectDB ( id int, name string, age int ) DISTRIBUTED BY HASH(id) BUCKETS 4 PROPERTIES("replication_num" = "1");
Migrate data.
INSERT INTO test_SR2SelectDB SELECT * FROM doris_catalog.SR_db.SR_t;
Query the imported data in the destination table.
SELECT * FROM test_SR2SelectDB;
Migrate incremental data
In actual production environments, StarRocks data is mainly divided into offline data and incremental data. In most cases, you migrate data from StarRocks to SelectDB because you need to replicate data to a data warehouse to accelerate queries. To migrate incremental data, you can use one of the following methods:
Replicate and write StarRocks data to SelectDB when the StarRocks data is generated.
Read data from StarRocks partitions by using periodic jobs and write the data to SelectDB.