This topic describes how to connect a Java Database Connectivity (JDBC) data source to ApsaraDB for SelectDB and how to use the JDBC data source in ApsaraDB for SelectDB. This helps you perform federated analytics on data sources that are compatible with the JDBC protocol.
Overview
A JDBC catalog allows you to connect an external data source to ApsaraDB for SelectDB over the standard JDBC protocol. After the external data source is connected to ApsaraDB for SelectDB, ApsaraDB for SelectDB synchronizes the metadata of the databases and tables from the external data source. This accelerates access to data in the external data source.
You can select MySQL, PostgreSQL, Oracle, SQL Server, ClickHouse, Apache Doris, SAP High-performance ANalytic Appliance (HANA), Trino, Presto, or OceanBase as a JDBC data source.
Syntax
CREATE CATALOG <catalog_name>
PROPERTIES ("key"="value", ...)
Parameter description
Parameter | Required | Default value | Description |
user | Yes | N/A | The account that is used to connect to the database. |
password | Yes | N/A | The password of the account that is used to connect to the database. |
jdbc_url | Yes | N/A | The JDBC URL that is used to connect to the database. |
driver_url | Yes | N/A | The name of the JAR package of the JDBC driver. |
driver_class | Yes | N/A | The name of the JDBC driver class. |
lower_case_table_names | No | "false" | Specifies whether to synchronize the names of the databases and tables of the external JDBC data source in lowercase. |
only_specified_database | No | "false" | Specifies whether to synchronize data from only the specified database. |
include_database_list | No | "" | The names of the databases from which data is synchronized. This parameter takes effect only when you set the |
exclude_database_list | No | "" | The names of the databases that are excluded. This parameter takes effect only when you set the |
Specify the driver package path
To create a JDBC catalog, you must specify the driver package path of the corresponding database. You can specify the driver package path by using one of the following methods to set the driver_url
parameter:
Set the driver_url parameter to the package name. Example:
mysql-connector-java-8.0.25.jar
. In this case, ApsaraDB for SelectDB automatically searches for JAR packages in the localjdbc_drivers/
directory. By default, the directory includes themysql-connector-java-8.0.25.jar
,postgresql-42.5.1.jar
,mssql-jdbc-11.2.3.jre8.jar
, andojdbc8.jar
JAR packages. You can use the JAR packages based on your business requirements.Set the driver_url parameter to the HTTP URL of the package. Example:
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar
. The system downloads the driver file from this HTTP URL. Only HTTP URLs without authentication are supported.
Synchronize lowercase table names
If the lower_case_table_names
parameter is set to true
, ApsaraDB for SelectDB can query data from databases and tables whose names are not in lowercase by maintaining mappings of lowercase names to actual names in the external data sources.
ApsaraDB for SelectDB 2.x supports the feature only for Oracle databases. When you query data from an Oracle database, ApsaraDB for SelectDB automatically converts the names of all databases and tables from lowercase to uppercase in the query statement. For example,
an Oracle database has a TEST table in the TEST space. If you set the
lower_case_table_names
parameter to true when you create a catalog in ApsaraDB for SelectDB, ApsaraDB for SelectDB can query the TEST table by executing theSELECT * FROM oracle_catalog.test.test
statement. ApsaraDB for SelectDB automatically convertstest.test
intoTEST.TEST
and then queries the Oracle database by using the converted names. This is a default behavior. In this case, you cannot query tables whose names are in lowercase in this Oracle database.When you query data from other types of databases, you must specify the actual names of databases and tables in the query statement.
ApsaraDB for SelectDB 3.x and later support the feature for all types of databases. ApsaraDB for SelectDB converts the names of all databases and tables in the query statement into the actual database and table names when you query data from the databases and tables. If you upgrade ApsaraDB for SelectDB from an earlier version to 3.x, you need to execute the
REFRESH <catalog_name>
statement to make ApsaraDB for SelectDB 3.x take effect.However, if the database or table names differ only in case, such as
SelectDB
andselectdb
, ApsaraDB for SelectDB cannot query the databases or tables due to name ambiguity.If you set the
lower_case_table_names
frontend (FE) parameter to 1 or 2, you must set thelower_case_table_names
parameter of the JDBC catalog to true. If you set thelower_case_table_names
FE parameter to 0, you can set the lower_case_table_names parameter of the JDBC catalog to true or false. By default, the lower_case_table_names parameter of the JDBC catalog is set to false. This ensures the consistency and predictability of configurations of ApsaraDB for SelectDB for processing the internal and external tables.
Specify the databases from which data is synchronized
The following parameters are used to specify the databases from which data is synchronized:
only_specified_database
: specifies whether to synchronize data from only the specified database. For example, you can specify the database to be connected by setting thejdbc_url
parameter in MySQL and the schema to be connected by specifying thejdbc_url
parameter in PostgreSQL.include_database_list
: the names of the databases from which data is synchronized. This parameter takes effect only when you set theonly_specified_database
parameter to true. Separate multiple database names with commas (,). Database names are case-sensitive.exclude_database_list
: the names of the databases that are excluded. This parameter takes effect only when you set theonly_specified_database
parameter to true. Separate multiple database names with commas (,). Database names are case-sensitive.
If the value of the
include_database_list
parameter is the same as that of theexclude_database_list
parameter, the value of theexclude_database_list
parameter takes precedence.If you use the preceding parameters to connect to an Oracle database, you must use a JAR package of JDBC driver whose version is later than 8.
Examples
MySQL
Create a MySQL data source
CREATE CATALOG jdbc_mysql PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:mysql://127.0.0.1:3306/demo",
"driver_url" = "mysql-connector-java-8.0.25.jar",
"driver_class" = "com.mysql.cj.jdbc.Driver",
"checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a"
)
If a compute driver checksum error occurs, you can temporarily add the following property to block the checksum detection mechanism:
"checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a"
.
Level mapping
ApsaraDB for SelectDB | MySQL |
Catalog | MySQL server |
Database | Database |
Table | Table |
Data type mappings
MySQL | ApsaraDB for SelectDB | Remarks |
BOOLEAN | TINYINT | N/A |
TINYINT | TINYINT | N/A |
SMALLINT | SMALLINT | N/A |
MEDIUMINT | INT | N/A |
INT | INT | N/A |
BIGINT | BIGINT | N/A |
UNSIGNED TINYINT | SMALLINT | ApsaraDB for SelectDB does not support the UNSIGNED data type. Therefore, ApsaraDB for SelectDB expands the data type by an order of magnitude. |
UNSIGNED MEDIUMINT | INT | ApsaraDB for SelectDB does not support the UNSIGNED data type. Therefore, ApsaraDB for SelectDB expands the data type by an order of magnitude. |
UNSIGNED INT | BIGINT | ApsaraDB for SelectDB does not support the UNSIGNED data type. Therefore, ApsaraDB for SelectDB expands the data type by an order of magnitude. |
UNSIGNED BIGINT | LARGEINT | N/A |
FLOAT | FLOAT | N/A |
DOUBLE | DOUBLE | N/A |
DECIMAL | DECIMAL | N/A |
UNSIGNED DECIMAL(p,s) | DECIMAL(p+1,s) / STRING | If the value of |
DATE | DATE | N/A |
TIMESTAMP | DATETIME | N/A |
DATETIME | DATETIME | N/A |
YEAR | SMALLINT | N/A |
TIME | STRING | N/A |
CHAR | CHAR | N/A |
VARCHAR | VARCHAR | N/A |
JSON | JSON | N/A |
SET | STRING | N/A |
BIT | BOOLEAN/STRING | BIT(1) is mapped to BOOLEAN, and other BIT data types are mapped to STRING. |
TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT | STRING | N/A |
BLOB, MEDIUMBLOB, LONGBLOB, and TINYBLOB | STRING | N/A |
TINYSTRING, STRING, MEDIUMSTRING, and LONGSTRING | STRING | N/A |
BINARY and VARBINARY | STRING | N/A |
Other | UNSUPPORTED | N/A |
PostgreSQL
Create a PostgreSQL data source
CREATE CATALOG jdbc_postgresql PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:postgresql://127.0.0.1:5432/demo",
"driver_url" = "postgresql-42.5.1.jar",
"driver_class" = "org.postgresql.Driver",
"checksum" = "20c8228267b6c9ce620fddb39467d3eb"
);
Level mapping
When you map a PostgreSQL data source to an ApsaraDB for SelectDB instance, a database in ApsaraDB for SelectDB is mapped to a schema in a specific catalog in PostgreSQL. In the preceding example, a schema is specified in demo
in the jdbc_url
parameter. The tables of the database in ApsaraDB for SelectDB are mapped to the tables of the schema in PostgreSQL. The following table describes the mapping relationships:
ApsaraDB for SelectDB | PostgreSQL |
Catalog | Database |
Database | Schema |
Table | Table |
ApsaraDB for SelectDB executes the SELECT nspname FROM pg_namespace WHERE has_schema_privilege('<UserName>', nspname, 'USAGE');
statement to query all schemas that you can access and maps the schemas to the databases in ApsaraDB for SelectDB.
Data type mappings
PostgreSQL | ApsaraDB for SelectDB | Remarks |
BOOLEAN | BOOLEAN | N/A |
SMALLINT and INT2 | SMALLINT | N/A |
INTEGER and INT4 | INT | N/A |
BIGINT and INT8 | BIGINT | N/A |
DECIMAL and NUMERIC | DECIMAL | N/A |
REAL and FLOAT4 | FLOAT | N/A |
DOUBLE PRECISION | DOUBLE | N/A |
SMALLSERIAL | SMALLINT | N/A |
SERIAL | INT | N/A |
BIGSERIAL | BIGINT | N/A |
CHAR | CHAR | N/A |
VARCHAR and TEXT | STRING | N/A |
TIMESTAMP | DATETIME | N/A |
DATE | DATE | N/A |
JSON and JSONB | JSON | N/A |
TIME | STRING | N/A |
INTERVAL | STRING | N/A |
POINT, LINE, ISEG, BOX, PATH, POLYGON, and CIRCLE | STRING | N/A |
CIDR, INET, and MAC ADDR | STRING | N/A |
BIT | BOOLEAN and STRING | BIT(1) is mapped to BOOLEAN, and other BIT data types are mapped to STRING. |
UUID | STRING | N/A |
Other | UNSUPPORTED | N/A |
Oracle
Create an Oracle data source
CREATE CATALOG jdbc_oracle PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:oracle:thin:@127.0.0.1:1521:helowin",
"driver_url" = "ojdbc8.jar",
"driver_class" = "oracle.jdbc.driver.OracleDriver",
"checksum" = "621a393d7be9ff0f2fec6fbba2c8f9b6"
);
Level mapping
When you map an Oracle data source to an ApsaraDB for SelectDB instance, a database in ApsaraDB for SelectDB is mapped to a user in the Oracle database. Tables of the database in ApsaraDB for SelectDB is mapped to the tables that the user has permissions to access in the Oracle data source. The following table describes the mapping relationships:
ApsaraDB for SelectDB | Oracle |
Catalog | Database |
Database | User |
Table | Table |
You cannot synchronize data from synonym tables.
Data type mappings
Oracle | ApsaraDB for SelectDB | Remarks |
NUMBER(p) and NUMBER(p,0) | TINYINT, SMALLINT, INT, BIGINT, and LARGEINT | ApsaraDB for SelectDB selects the data type based on the value of p:
|
NUMBER(p,s),[ if(s>0 && p>s) ] | DECIMAL(p,s) | N/A |
NUMBER(p,s),[ if(s>0 && p < s) ] | DECIMAL(s,s) | N/A |
NUMBER(p,s),[ if(s<0) ] | TINYINT, SMALLINT, INT, BIGINT, and LARGEINT | If |
NUMBER | N/A | ApsaraDB for SelectDB does not support the NUMBER data type without p and s specified in Oracle. |
DECIMAL | DECIMAL | N/A |
FLOAT and REAL | DOUBLE | N/A |
DATE | DATETIME | N/A |
TIMESTAMP | DATETIME | N/A |
CHAR and NCHAR | STRING | N/A |
VARCHAR2 and NVARCHAR2 | STRING | N/A |
LONG, RAW, LONG RAW, and INTERVAL | STRING | N/A |
Other | UNSUPPORTED | N/A |
SQL Server
Create an SQL Server data source
If you use ApsaraDB for SelectDB 3.0.8 or later, you must include encrypt=false
in the JDBC URL to ensure that ApsaraDB for SelectDB can access the data in the SQL Server data source.
CREATE CATALOG jdbc_sqlserver PROPERTIES (
"type"="jdbc",
"user"="SA",
"password"="SelectDB123456",
"jdbc_url" = "jdbc:sqlserver://localhost:1433;DataBaseName=SelectDB_test;encrypt=false",
"driver_url" = "mssql-jdbc-11.2.3.jre8.jar",
"driver_class" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
"checksum" = "b204274eb02a848ac405961e6f43e7bd"
);
Level mapping
When you map an SQL Server data source to an ApsaraDB for SelectDB instance, a database in ApsaraDB for SelectDB is mapped to a schema in SQL Server. In the preceding example, a schema mapped to the SelectDB_test database is specified in the jdbc_url
parameter. Tables of the database in ApsaraDB for SelectDB are mapped to the tables of the schema in SQL Server. The following table describes the mapping relationships:
ApsaraDB for SelectDB | SQL Server |
Catalog | Database |
Database | Schema |
Table | Table |
Data type mappings
SQL Server | ApsaraDB for SelectDB |
BIT | BOOLEAN |
TINYINT | SMALLINT |
SMALLINT | SMALLINT |
INT | INT |
BIGINT | BIGINT |
REAL | FLOAT |
FLOAT | DOUBLE |
MONEY | DECIMAL(19,4) |
SMALLMONEY | DECIMAL(10,4) |
DECIMAL and NUMERIC | DECIMAL |
DATE | DATE |
DATETIME, DATETIME2, and SMALLDATETIME | DATETIMEV2 |
CHAR, VARCHAR, TEXT, NCHAR, NVARCHAR, and NTEXT | STRING |
BINARY and VARBINARY | STRING |
TIME and DATETIMEOFFSET | STRING |
Other | UNSUPPORTED |
Apache Doris
JDBC catalogs also allow you to connect ApsaraDB for SelectDB to Apache Doris data sources.
Create an Apache Doris data source
CREATE CATALOG jdbc_doris PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:mysql://127.0.0.1:9030?useSSL=false",
"driver_url" = "mysql-connector-java-8.0.25.jar",
"driver_class" = "com.mysql.cj.jdbc.Driver",
"checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a"
)
Data type mappings
Apache Doris | ApsaraDB for SelectDB | Remarks |
BOOLEAN | BOOLEAN | N/A |
TINYINT | TINYINT | N/A |
SMALLINT | SMALLINT | N/A |
INT | INT | N/A |
BIGINT | BIGINT | N/A |
LARGEINT | LARGEINT | N/A |
FLOAT | FLOAT | N/A |
DOUBLE | DOUBLE | N/A |
DECIMALV3 | DECIMALV3 and STRING | ApsaraDB for SelectDB selects the data type based on the precision and scale of the DECIMAL field. |
DATE | DATE | N/A |
DATETIME | DATETIME | N/A |
CHAR | CHAR | N/A |
VARCHAR | VARCHAR | N/A |
STRING | STRING | N/A |
TEXT | STRING | N/A |
HLL | HLL | To query data of the HyperLogLog (HLL) type, you need to configure the |
ARRAY | ARRAY | To map the ARRAY data type, refer to the preceding types. Nested complex ARRAY types are not supported. |
BITMAP | BITMAP | To query data of the BITMAP type, you need to configure the |
Other | UNSUPPORTED | N/A |
ClickHouse
Create a ClickHouse data source
CREATE CATALOG jdbc_clickhouse PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:clickhouse://127.0.0.1:8123/demo",
"driver_url" = "clickhouse-jdbc-0.4.2-all.jar",
"driver_class" = "com.clickhouse.jdbc.ClickHouseDriver",
"checksum" = "9be22a93267dc4b066e0a3aefc2dd024"
);
Level mapping
ApsaraDB for SelectDB | ClickHouse |
Catalog | ClickHouse server |
Database | Database |
Table | Table |
Data type mappings
ClickHouse | ApsaraDB for SelectDB |
BOOL | BOOLEAN |
STRING | STRING |
DATE and DATE32 | DATE |
DATETIME and DATETIME64 | DATETIME |
FLOAT32 | FLOAT |
FLOAT64 | DOUBLE |
INT8 | TINYINT |
INT16 and UINT8 | SMALLINT |
INT32 and UINT16 | INT |
INT64 and UINT32 | BIGINT |
INT128 and UINT64 | LARGEINT |
INT256, UINT128, and UINT256 | STRING |
DECIMAL | DECIMALV3 and STRING |
ENUM, IPv4, IPv6, and UUID | STRING |
ARRAY | ARRAY |
Other | UNSUPPORTED |
SPA HANA
Create an SPA HANA data source
CREATE CATALOG jdbc_hana PROPERTIES (
"type"="jdbc",
"user"="SYSTEM",
"password"="SAPHANA",
"jdbc_url" = "jdbc:sap://localhost:31515/TEST",
"driver_url" = "ngdbc.jar",
"driver_class" = "com.sap.db.jdbc.Driver"
)
Level mapping
ApsaraDB for SelectDB | SAP HANA |
Catalog | Database |
Database | Schema |
Table | Table |
Data type mappings
SAP HANA | ApsaraDB for SelectDB |
BOOLEAN | BOOLEAN |
TINYINT | TINYINT |
SMALLINT | SMALLINT |
INTEGER | INT |
BIGINT | BIGINT |
SMALLDECIMAL | DECIMALV3 |
DECIMAL | DECIMALV3 and STRING |
REAL | FLOAT |
DOUBLE | DOUBLE |
DATE | DATE |
TIME | STRING |
TIMESTAMP | DATETIME |
SECONDDATE | DATETIME |
VARCHAR | STRING |
NVARCHAR | STRING |
ALPHANUM | STRING |
SHORTTEXT | STRING |
CHAR | CHAR |
NCHAR | CHAR |
Trino or Presto
Create a Trino or Presto data source
Trino
CREATE CATALOG jdbc_trino PROPERTIES ( "type"="jdbc", "user"="hadoop", "password"="", "jdbc_url" = "jdbc:trino://localhost:9000/hive", "driver_url" = "trino-jdbc-389.jar", "driver_class" = "io.trino.jdbc.TrinoDriver" );
Presto
CREATE CATALOG jdbc_presto PROPERTIES ( "type"="jdbc", "user"="hadoop", "password"="", "jdbc_url" = "jdbc:presto://localhost:9000/hive", "driver_url" = "presto-jdbc-0.280.jar", "driver_class" = "com.facebook.presto.jdbc.PrestoDriver" );
Level mapping
When you map a Trino or Presto data source to an ApsaraDB for SelectDB instance, a database in ApsaraDB for SelectDB is mapped to a schema in the specified catalog in Trino or Presto. Tables of the database in ApsaraDB for SelectDB are mapped to the tables of the schema in Trino or Presto. The following table describes the mapping relationships:
ApsaraDB for SelectDB | Trino or Presto |
Catalog | Catalog |
Database | Schema |
Table | Table |
Data type mappings
Trino or Presto | ApsaraDB for SelectDB |
BOOLEAN | BOOLEAN |
TINYINT | TINYINT |
SMALLINT | SMALLINT |
INTEGER | INT |
BIGINT | BIGINT |
DECIMAL | DECIMAL, DECIMALV3, and STRING |
REAL | FLOAT |
DOUBLE | DOUBLE |
DATE | DATE |
TIMESTAMP | DATETIME |
VARCHAR | TEXT |
CHAR | CHAR |
ARRAY | ARRAY |
Others | UNSUPPORTED |
OceanBase
Create an OceanBase data source
CREATE CATALOG jdbc_oceanbase PROPERTIES (
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:oceanbase://127.0.0.1:2881/demo",
"driver_url" = "oceanbase-client-2.4.2.jar",
"driver_class" = "com.oceanbase.jdbc.Driver"
)
When you connect an ApsaraDB for SelectDB instance to an OceanBase data source, ApsaraDB for SelectDB automatically identifies whether the OceanBase data source is in MySQL or Oracle mode. For more information about the level mapping and data type mapping, see the "MySQL" and "Oracle" sections of this topic.
Query data
Example
SELECT * FROM mysql_catalog.mysql_database.mysql_table WHERE k1 > 1000 AND k3 ='term';
The keywords in the database may be used as the column names. To ensure such a query can be performed as expected in this case, ApsaraDB for SelectDB adds escape characters to the column names and table names in SQL statements based on the rules of different databases. For example, ApsaraDB for SelectDB adds single quotation marks (``) for MySQL data sources, brackets ([]) for SQL Server data sources, and double quotation marks("") for PostgreSQL and Oracle data sources. This may require that the column names must be case-sensitive. You can query the query statements sent to the databases by executing the EXPLAIN statement.
Predicate pushdown
When you execute a query statement that contains filter conditions, such as WHERE dt = '2022-01-01'
, ApsaraDB for SelectDB can push down the filter conditions to the external data source and then directly exclude data that does not meet the filter conditions at the data source level. This reduces the amount of data to be queried and transferred. This also improves query performance and reduces the load on external data sources.
If the enable_ext_func_pred_pushdown
variable is set to true, the function conditions in the WHERE
clause are also pushed down to the external data source. Only MySQL supports the feature. If functions that are not supported by MySQL are used, you can set this parameter to false. ApsaraDB for SelectDB automatically identifies some functions that are not supported by MySQL and filters conditions to be pushed down. You can query a specific statement by executing the EXPLAIN statement.
The following functions cannot be pushed down: DATE_TRUNC
and MONEY_FORMAT
.
Line limit
If a query statement contains the LIMIT keyword, ApsaraDB for SelectDB converts the query statement into statements that meet semantic requirements of different data sources.
Write data
After you create a JDBC catalog in ApsaraDB for SelectDB, you can directly write data by executing the INSERT INTO
statement, write the query results to the JDBC catalog, or import data from one JDBC catalog to another JDBC catalog.
Example
INSERT INTO mysql_catalog.mysql_database.mysql_table VALUES(1, "doris");
INSERT INTO mysql_catalog.mysql_database.mysql_table SELECT * FROM table;
Transaction
ApsaraDB for SelectDB imports data to a JDBC catalog in batches. If the import process is interrupted, you may need to roll back the imported data. Therefore, JDBC catalogs support data import transactions. You can enable the transaction feature by setting the enable_odbc_transcation
parameter to TRUE.
SET enable_odbc_transcation = TRUE;
Transactions ensure the atomicity of data import from external JDBC tables, but reduce the data import performance. You can enable this feature based on your business requirements.