All Products
Search
Document Center

ApsaraDB for SelectDB:JDBC data source

Last Updated:Jul 17, 2024

This topic describes how to connect a Java Database Connectivity (JDBC) data source to ApsaraDB for SelectDB and how to use the JDBC data source in ApsaraDB for SelectDB. This helps you perform federated analytics on data sources that are compatible with the JDBC protocol.

Overview

A JDBC catalog allows you to connect an external data source to ApsaraDB for SelectDB over the standard JDBC protocol. After the external data source is connected to ApsaraDB for SelectDB, ApsaraDB for SelectDB synchronizes the metadata of the databases and tables from the external data source. This accelerates access to data in the external data source.

You can select MySQL, PostgreSQL, Oracle, SQL Server, ClickHouse, Apache Doris, SAP High-performance ANalytic Appliance (HANA), Trino, Presto, or OceanBase as a JDBC data source.

Syntax

CREATE CATALOG <catalog_name>
PROPERTIES ("key"="value", ...)

Parameter description

Parameter

Required

Default value

Description

user

Yes

N/A

The account that is used to connect to the database.

password

Yes

N/A

The password of the account that is used to connect to the database.

jdbc_url

Yes

N/A

The JDBC URL that is used to connect to the database.

driver_url

Yes

N/A

The name of the JAR package of the JDBC driver.

driver_class

Yes

N/A

The name of the JDBC driver class.

lower_case_table_names

No

"false"

Specifies whether to synchronize the names of the databases and tables of the external JDBC data source in lowercase.

only_specified_database

No

"false"

Specifies whether to synchronize data from only the specified database.

include_database_list

No

""

The names of the databases from which data is synchronized. This parameter takes effect only when you set the only_specified_database parameter to true. Separate multiple database names with commas (,). Database names are case-sensitive.

exclude_database_list

No

""

The names of the databases that are excluded. This parameter takes effect only when you set the only_specified_database parameter to true. Separate multiple database names with commas (,). Database names are case-sensitive.

Specify the driver package path

To create a JDBC catalog, you must specify the driver package path of the corresponding database. You can specify the driver package path by using one of the following methods to set the driver_url parameter:

  • Set the driver_url parameter to the package name. Example: mysql-connector-java-8.0.25.jar. In this case, ApsaraDB for SelectDB automatically searches for JAR packages in the local jdbc_drivers/ directory. By default, the directory includes the mysql-connector-java-8.0.25.jar, postgresql-42.5.1.jar, mssql-jdbc-11.2.3.jre8.jar, and ojdbc8.jar JAR packages. You can use the JAR packages based on your business requirements.

  • Set the driver_url parameter to the HTTP URL of the package. Example: https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar. The system downloads the driver file from this HTTP URL. Only HTTP URLs without authentication are supported.

Synchronize lowercase table names

If the lower_case_table_names parameter is set to true, ApsaraDB for SelectDB can query data from databases and tables whose names are not in lowercase by maintaining mappings of lowercase names to actual names in the external data sources.

  • ApsaraDB for SelectDB 2.x supports the feature only for Oracle databases. When you query data from an Oracle database, ApsaraDB for SelectDB automatically converts the names of all databases and tables from lowercase to uppercase in the query statement. For example,

    an Oracle database has a TEST table in the TEST space. If you set the lower_case_table_names parameter to true when you create a catalog in ApsaraDB for SelectDB, ApsaraDB for SelectDB can query the TEST table by executing the SELECT * FROM oracle_catalog.test.test statement. ApsaraDB for SelectDB automatically converts test.test into TEST.TEST and then queries the Oracle database by using the converted names. This is a default behavior. In this case, you cannot query tables whose names are in lowercase in this Oracle database.

    When you query data from other types of databases, you must specify the actual names of databases and tables in the query statement.

  • ApsaraDB for SelectDB 3.x and later support the feature for all types of databases. ApsaraDB for SelectDB converts the names of all databases and tables in the query statement into the actual database and table names when you query data from the databases and tables. If you upgrade ApsaraDB for SelectDB from an earlier version to 3.x, you need to execute the REFRESH <catalog_name> statement to make ApsaraDB for SelectDB 3.x take effect.

    However, if the database or table names differ only in case, such as SelectDB and selectdb, ApsaraDB for SelectDB cannot query the databases or tables due to name ambiguity.

  • If you set the lower_case_table_names frontend (FE) parameter to 1 or 2, you must set the lower_case_table_names parameter of the JDBC catalog to true. If you set the lower_case_table_names FE parameter to 0, you can set the lower_case_table_names parameter of the JDBC catalog to true or false. By default, the lower_case_table_names parameter of the JDBC catalog is set to false. This ensures the consistency and predictability of configurations of ApsaraDB for SelectDB for processing the internal and external tables.

Specify the databases from which data is synchronized

The following parameters are used to specify the databases from which data is synchronized:

  • only_specified_database: specifies whether to synchronize data from only the specified database. For example, you can specify the database to be connected by setting the jdbc_url parameter in MySQL and the schema to be connected by specifying the jdbc_url parameter in PostgreSQL.

  • include_database_list: the names of the databases from which data is synchronized. This parameter takes effect only when you set the only_specified_database parameter to true. Separate multiple database names with commas (,). Database names are case-sensitive.

  • exclude_database_list: the names of the databases that are excluded. This parameter takes effect only when you set the only_specified_database parameter to true. Separate multiple database names with commas (,). Database names are case-sensitive.

Note
  • If the value of the include_database_list parameter is the same as that of the exclude_database_list parameter, the value of the exclude_database_list parameter takes precedence.

  • If you use the preceding parameters to connect to an Oracle database, you must use a JAR package of JDBC driver whose version is later than 8.

Examples

MySQL

Create a MySQL data source

CREATE CATALOG jdbc_mysql PROPERTIES (
    "type"="jdbc",
    "user"="root",
    "password"="123456",
    "jdbc_url" = "jdbc:mysql://127.0.0.1:3306/demo",
    "driver_url" = "mysql-connector-java-8.0.25.jar",
    "driver_class" = "com.mysql.cj.jdbc.Driver",
    "checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a"
)
Note

If a compute driver checksum error occurs, you can temporarily add the following property to block the checksum detection mechanism:

"checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a".

Level mapping

ApsaraDB for SelectDB

MySQL

Catalog

MySQL server

Database

Database

Table

Table

Data type mappings

MySQL

ApsaraDB for SelectDB

Remarks

BOOLEAN

TINYINT

N/A

TINYINT

TINYINT

N/A

SMALLINT

SMALLINT

N/A

MEDIUMINT

INT

N/A

INT

INT

N/A

BIGINT

BIGINT

N/A

UNSIGNED TINYINT

SMALLINT

ApsaraDB for SelectDB does not support the UNSIGNED data type. Therefore, ApsaraDB for SelectDB expands the data type by an order of magnitude.

UNSIGNED MEDIUMINT

INT

ApsaraDB for SelectDB does not support the UNSIGNED data type. Therefore, ApsaraDB for SelectDB expands the data type by an order of magnitude.

UNSIGNED INT

BIGINT

ApsaraDB for SelectDB does not support the UNSIGNED data type. Therefore, ApsaraDB for SelectDB expands the data type by an order of magnitude.

UNSIGNED BIGINT

LARGEINT

N/A

FLOAT

FLOAT

N/A

DOUBLE

DOUBLE

N/A

DECIMAL

DECIMAL

N/A

UNSIGNED DECIMAL(p,s)

DECIMAL(p+1,s) / STRING

If the value of p+1 is greater than 38, ApsaraDB for SelectDB uses the STRING data type.

DATE

DATE

N/A

TIMESTAMP

DATETIME

N/A

DATETIME

DATETIME

N/A

YEAR

SMALLINT

N/A

TIME

STRING

N/A

CHAR

CHAR

N/A

VARCHAR

VARCHAR

N/A

JSON

JSON

N/A

SET

STRING

N/A

BIT

BOOLEAN/STRING

BIT(1) is mapped to BOOLEAN, and other BIT data types are mapped to STRING.

TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT

STRING

N/A

BLOB, MEDIUMBLOB, LONGBLOB, and TINYBLOB

STRING

N/A

TINYSTRING, STRING, MEDIUMSTRING, and LONGSTRING

STRING

N/A

BINARY and VARBINARY

STRING

N/A

Other

UNSUPPORTED

N/A

PostgreSQL

Create a PostgreSQL data source

CREATE CATALOG jdbc_postgresql PROPERTIES (
    "type"="jdbc",
    "user"="root",
    "password"="123456",
    "jdbc_url" = "jdbc:postgresql://127.0.0.1:5432/demo",
    "driver_url" = "postgresql-42.5.1.jar",
    "driver_class" = "org.postgresql.Driver",
    "checksum" = "20c8228267b6c9ce620fddb39467d3eb"
);

Level mapping

When you map a PostgreSQL data source to an ApsaraDB for SelectDB instance, a database in ApsaraDB for SelectDB is mapped to a schema in a specific catalog in PostgreSQL. In the preceding example, a schema is specified in demo in the jdbc_url parameter. The tables of the database in ApsaraDB for SelectDB are mapped to the tables of the schema in PostgreSQL. The following table describes the mapping relationships:

ApsaraDB for SelectDB

PostgreSQL

Catalog

Database

Database

Schema

Table

Table

Note

ApsaraDB for SelectDB executes the SELECT nspname FROM pg_namespace WHERE has_schema_privilege('<UserName>', nspname, 'USAGE'); statement to query all schemas that you can access and maps the schemas to the databases in ApsaraDB for SelectDB.

Data type mappings

PostgreSQL

ApsaraDB for SelectDB

Remarks

BOOLEAN

BOOLEAN

N/A

SMALLINT and INT2

SMALLINT

N/A

INTEGER and INT4

INT

N/A

BIGINT and INT8

BIGINT

N/A

DECIMAL and NUMERIC

DECIMAL

N/A

REAL and FLOAT4

FLOAT

N/A

DOUBLE PRECISION

DOUBLE

N/A

SMALLSERIAL

SMALLINT

N/A

SERIAL

INT

N/A

BIGSERIAL

BIGINT

N/A

CHAR

CHAR

N/A

VARCHAR and TEXT

STRING

N/A

TIMESTAMP

DATETIME

N/A

DATE

DATE

N/A

JSON and JSONB

JSON

N/A

TIME

STRING

N/A

INTERVAL

STRING

N/A

POINT, LINE, ISEG, BOX, PATH, POLYGON, and CIRCLE

STRING

N/A

CIDR, INET, and MAC ADDR

STRING

N/A

BIT

BOOLEAN and STRING

BIT(1) is mapped to BOOLEAN, and other BIT data types are mapped to STRING.

UUID

STRING

N/A

Other

UNSUPPORTED

N/A

Oracle

Create an Oracle data source

CREATE CATALOG jdbc_oracle PROPERTIES (
    "type"="jdbc",
    "user"="root",
    "password"="123456",
    "jdbc_url" = "jdbc:oracle:thin:@127.0.0.1:1521:helowin",
    "driver_url" = "ojdbc8.jar",
    "driver_class" = "oracle.jdbc.driver.OracleDriver",
    "checksum" = "621a393d7be9ff0f2fec6fbba2c8f9b6"
);

Level mapping

When you map an Oracle data source to an ApsaraDB for SelectDB instance, a database in ApsaraDB for SelectDB is mapped to a user in the Oracle database. Tables of the database in ApsaraDB for SelectDB is mapped to the tables that the user has permissions to access in the Oracle data source. The following table describes the mapping relationships:

ApsaraDB for SelectDB

Oracle

Catalog

Database

Database

User

Table

Table

Note

You cannot synchronize data from synonym tables.

Data type mappings

Oracle

ApsaraDB for SelectDB

Remarks

NUMBER(p) and NUMBER(p,0)

TINYINT, SMALLINT, INT, BIGINT, and LARGEINT

ApsaraDB for SelectDB selects the data type based on the value of p:

  • If p is smaller than 3, TINYINT is used.

  • If p is smaller than 5, SMALLINT is used.

  • If p is smaller than 10, INT is used.

  • If p is smaller than 19, BIGINT is used.

  • If p is greater than 19, LARGEINT is used.

NUMBER(p,s),[ if(s>0 && p>s) ]

DECIMAL(p,s)

N/A

NUMBER(p,s),[ if(s>0 && p < s) ]

DECIMAL(s,s)

N/A

NUMBER(p,s),[ if(s<0) ]

TINYINT, SMALLINT, INT, BIGINT, and LARGEINT

If s is smaller than 0, ApsaraDB for SelectDB sets p to p+|s| and maps the data type in the same way as it maps the NUMBER(p) and NUMBER(p,0) data types.

NUMBER

N/A

ApsaraDB for SelectDB does not support the NUMBER data type without p and s specified in Oracle.

DECIMAL

DECIMAL

N/A

FLOAT and REAL

DOUBLE

N/A

DATE

DATETIME

N/A

TIMESTAMP

DATETIME

N/A

CHAR and NCHAR

STRING

N/A

VARCHAR2 and NVARCHAR2

STRING

N/A

LONG, RAW, LONG RAW, and INTERVAL

STRING

N/A

Other

UNSUPPORTED

N/A

SQL Server

Create an SQL Server data source

Important

If you use ApsaraDB for SelectDB 3.0.8 or later, you must include encrypt=false in the JDBC URL to ensure that ApsaraDB for SelectDB can access the data in the SQL Server data source.

CREATE CATALOG jdbc_sqlserver PROPERTIES (
    "type"="jdbc",
    "user"="SA",
    "password"="SelectDB123456",
    "jdbc_url" = "jdbc:sqlserver://localhost:1433;DataBaseName=SelectDB_test;encrypt=false",
    "driver_url" = "mssql-jdbc-11.2.3.jre8.jar",
    "driver_class" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "checksum" = "b204274eb02a848ac405961e6f43e7bd"
);

Level mapping

When you map an SQL Server data source to an ApsaraDB for SelectDB instance, a database in ApsaraDB for SelectDB is mapped to a schema in SQL Server. In the preceding example, a schema mapped to the SelectDB_test database is specified in the jdbc_url parameter. Tables of the database in ApsaraDB for SelectDB are mapped to the tables of the schema in SQL Server. The following table describes the mapping relationships:

ApsaraDB for SelectDB

SQL Server

Catalog

Database

Database

Schema

Table

Table

Data type mappings

SQL Server

ApsaraDB for SelectDB

BIT

BOOLEAN

TINYINT

SMALLINT

SMALLINT

SMALLINT

INT

INT

BIGINT

BIGINT

REAL

FLOAT

FLOAT

DOUBLE

MONEY

DECIMAL(19,4)

SMALLMONEY

DECIMAL(10,4)

DECIMAL and NUMERIC

DECIMAL

DATE

DATE

DATETIME, DATETIME2, and SMALLDATETIME

DATETIMEV2

CHAR, VARCHAR, TEXT, NCHAR, NVARCHAR, and NTEXT

STRING

BINARY and VARBINARY

STRING

TIME and DATETIMEOFFSET

STRING

Other

UNSUPPORTED

Apache Doris

JDBC catalogs also allow you to connect ApsaraDB for SelectDB to Apache Doris data sources.

Create an Apache Doris data source

CREATE CATALOG jdbc_doris PROPERTIES (
    "type"="jdbc",
    "user"="root",
    "password"="123456",
    "jdbc_url" = "jdbc:mysql://127.0.0.1:9030?useSSL=false",
    "driver_url" = "mysql-connector-java-8.0.25.jar",
    "driver_class" = "com.mysql.cj.jdbc.Driver",
    "checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a"
)

Data type mappings

Apache Doris

ApsaraDB for SelectDB

Remarks

BOOLEAN

BOOLEAN

N/A

TINYINT

TINYINT

N/A

SMALLINT

SMALLINT

N/A

INT

INT

N/A

BIGINT

BIGINT

N/A

LARGEINT

LARGEINT

N/A

FLOAT

FLOAT

N/A

DOUBLE

DOUBLE

N/A

DECIMALV3

DECIMALV3 and STRING

ApsaraDB for SelectDB selects the data type based on the precision and scale of the DECIMAL field.

DATE

DATE

N/A

DATETIME

DATETIME

N/A

CHAR

CHAR

N/A

VARCHAR

VARCHAR

N/A

STRING

STRING

N/A

TEXT

STRING

N/A

HLL

HLL

To query data of the HyperLogLog (HLL) type, you need to configure the return_object_data_as_binary=true property.

ARRAY

ARRAY

To map the ARRAY data type, refer to the preceding types. Nested complex ARRAY types are not supported.

BITMAP

BITMAP

To query data of the BITMAP type, you need to configure the return_object_data_as_binary=true property.

Other

UNSUPPORTED

N/A

ClickHouse

Create a ClickHouse data source

CREATE CATALOG jdbc_clickhouse PROPERTIES (
    "type"="jdbc",
    "user"="root",
    "password"="123456",
    "jdbc_url" = "jdbc:clickhouse://127.0.0.1:8123/demo",
    "driver_url" = "clickhouse-jdbc-0.4.2-all.jar",
    "driver_class" = "com.clickhouse.jdbc.ClickHouseDriver",
    "checksum" = "9be22a93267dc4b066e0a3aefc2dd024"
);

Level mapping

ApsaraDB for SelectDB

ClickHouse

Catalog

ClickHouse server

Database

Database

Table

Table

Data type mappings

ClickHouse

ApsaraDB for SelectDB

BOOL

BOOLEAN

STRING

STRING

DATE and DATE32

DATE

DATETIME and DATETIME64

DATETIME

FLOAT32

FLOAT

FLOAT64

DOUBLE

INT8

TINYINT

INT16 and UINT8

SMALLINT

INT32 and UINT16

INT

INT64 and UINT32

BIGINT

INT128 and UINT64

LARGEINT

INT256, UINT128, and UINT256

STRING

DECIMAL

DECIMALV3 and STRING

ENUM, IPv4, IPv6, and UUID

STRING

ARRAY

ARRAY

Other

UNSUPPORTED

SPA HANA

Create an SPA HANA data source

CREATE CATALOG jdbc_hana PROPERTIES (
    "type"="jdbc",
    "user"="SYSTEM",
    "password"="SAPHANA",
    "jdbc_url" = "jdbc:sap://localhost:31515/TEST",
    "driver_url" = "ngdbc.jar",
    "driver_class" = "com.sap.db.jdbc.Driver"
)

Level mapping

ApsaraDB for SelectDB

SAP HANA

Catalog

Database

Database

Schema

Table

Table

Data type mappings

SAP HANA

ApsaraDB for SelectDB

BOOLEAN

BOOLEAN

TINYINT

TINYINT

SMALLINT

SMALLINT

INTEGER

INT

BIGINT

BIGINT

SMALLDECIMAL

DECIMALV3

DECIMAL

DECIMALV3 and STRING

REAL

FLOAT

DOUBLE

DOUBLE

DATE

DATE

TIME

STRING

TIMESTAMP

DATETIME

SECONDDATE

DATETIME

VARCHAR

STRING

NVARCHAR

STRING

ALPHANUM

STRING

SHORTTEXT

STRING

CHAR

CHAR

NCHAR

CHAR

Trino or Presto

Create a Trino or Presto data source

  • Trino

    CREATE CATALOG jdbc_trino PROPERTIES (
        "type"="jdbc",
        "user"="hadoop",
        "password"="",
        "jdbc_url" = "jdbc:trino://localhost:9000/hive",
        "driver_url" = "trino-jdbc-389.jar",
        "driver_class" = "io.trino.jdbc.TrinoDriver"
    );
  • Presto

    CREATE CATALOG jdbc_presto PROPERTIES (
        "type"="jdbc",
        "user"="hadoop",
        "password"="",
        "jdbc_url" = "jdbc:presto://localhost:9000/hive",
        "driver_url" = "presto-jdbc-0.280.jar",
        "driver_class" = "com.facebook.presto.jdbc.PrestoDriver"
    );

Level mapping

When you map a Trino or Presto data source to an ApsaraDB for SelectDB instance, a database in ApsaraDB for SelectDB is mapped to a schema in the specified catalog in Trino or Presto. Tables of the database in ApsaraDB for SelectDB are mapped to the tables of the schema in Trino or Presto. The following table describes the mapping relationships:

ApsaraDB for SelectDB

Trino or Presto

Catalog

Catalog

Database

Schema

Table

Table

Data type mappings

Trino or Presto

ApsaraDB for SelectDB

BOOLEAN

BOOLEAN

TINYINT

TINYINT

SMALLINT

SMALLINT

INTEGER

INT

BIGINT

BIGINT

DECIMAL

DECIMAL, DECIMALV3, and STRING

REAL

FLOAT

DOUBLE

DOUBLE

DATE

DATE

TIMESTAMP

DATETIME

VARCHAR

TEXT

CHAR

CHAR

ARRAY

ARRAY

Others

UNSUPPORTED

OceanBase

Create an OceanBase data source

CREATE CATALOG jdbc_oceanbase PROPERTIES (
    "type"="jdbc",
    "user"="root",
    "password"="123456",
    "jdbc_url" = "jdbc:oceanbase://127.0.0.1:2881/demo",
    "driver_url" = "oceanbase-client-2.4.2.jar",
    "driver_class" = "com.oceanbase.jdbc.Driver"
)

When you connect an ApsaraDB for SelectDB instance to an OceanBase data source, ApsaraDB for SelectDB automatically identifies whether the OceanBase data source is in MySQL or Oracle mode. For more information about the level mapping and data type mapping, see the "MySQL" and "Oracle" sections of this topic.

Query data

Example

SELECT * FROM mysql_catalog.mysql_database.mysql_table WHERE k1 > 1000  AND k3 ='term';
Note

The keywords in the database may be used as the column names. To ensure such a query can be performed as expected in this case, ApsaraDB for SelectDB adds escape characters to the column names and table names in SQL statements based on the rules of different databases. For example, ApsaraDB for SelectDB adds single quotation marks (``) for MySQL data sources, brackets ([]) for SQL Server data sources, and double quotation marks("") for PostgreSQL and Oracle data sources. This may require that the column names must be case-sensitive. You can query the query statements sent to the databases by executing the EXPLAIN statement.

Predicate pushdown

When you execute a query statement that contains filter conditions, such as WHERE dt = '2022-01-01', ApsaraDB for SelectDB can push down the filter conditions to the external data source and then directly exclude data that does not meet the filter conditions at the data source level. This reduces the amount of data to be queried and transferred. This also improves query performance and reduces the load on external data sources.

If the enable_ext_func_pred_pushdown variable is set to true, the function conditions in the WHERE clause are also pushed down to the external data source. Only MySQL supports the feature. If functions that are not supported by MySQL are used, you can set this parameter to false. ApsaraDB for SelectDB automatically identifies some functions that are not supported by MySQL and filters conditions to be pushed down. You can query a specific statement by executing the EXPLAIN statement.

The following functions cannot be pushed down: DATE_TRUNC and MONEY_FORMAT.

Line limit

If a query statement contains the LIMIT keyword, ApsaraDB for SelectDB converts the query statement into statements that meet semantic requirements of different data sources.

Write data

After you create a JDBC catalog in ApsaraDB for SelectDB, you can directly write data by executing the INSERT INTO statement, write the query results to the JDBC catalog, or import data from one JDBC catalog to another JDBC catalog.

Example

INSERT INTO mysql_catalog.mysql_database.mysql_table VALUES(1, "doris");
INSERT INTO mysql_catalog.mysql_database.mysql_table SELECT * FROM table;

Transaction

ApsaraDB for SelectDB imports data to a JDBC catalog in batches. If the import process is interrupted, you may need to roll back the imported data. Therefore, JDBC catalogs support data import transactions. You can enable the transaction feature by setting the enable_odbc_transcation parameter to TRUE.

SET enable_odbc_transcation = TRUE; 

Transactions ensure the atomicity of data import from external JDBC tables, but reduce the data import performance. You can enable this feature based on your business requirements.