This topic describes how to download the JAR package of the MaxCompute JDBC driver and use the driver to connect to MaxCompute. This topic also provides sample code.
Usage notes
To execute SQL statements and obtain execution results by using the MaxCompute JDBC driver, you must meet the following requirements:
You are a member of a project.
You have the CREATE INSTANCE permission on the project.
You have the SELECT and DOWNLOAD permissions on the table that you want to use.
NoteWhen you use MaxCompute JDBC V1.9 or earlier, a temporary table is automatically created for each query. You can use Tunnel commands to obtain query results from the temporary table. To use these versions, you must have the CREATE TABLE permission.
When you use MaxCompute JDBC V2.2 or later, no temporary table is automatically created for each query. You can call the InstanceTunnel interface to obtain query results, regardless of whether you have the CREATE TABLE permission.
For more information about MaxCompute permissions, see MaxCompute permissions.
MaxCompute provides the data protection feature. If the data protection feature is enabled for a project, you cannot move data out of the project. If you use MaxCompute JDBC of a version earlier than V2.4, no
result sets
can be obtained. If you use MaxCompute JDBC V2.4 or later, the number of result rows that you obtain cannot exceed the value of the READ_TABLE_MAX_ROW parameter. For more information about this parameter, see Project operations. For more information about the data protection feature, see Project data protection.The MaxCompute V2.0 data type edition supports more data types, such as TINYINT, SMALLINT, DATETIME, TIMESTAMP, ARRAY, MAP, and STRUCT. To use these new data types, you must run the following command to enable the MaxCompute V2.0 data type edition. For more information, see Data type editions.
set odps.sql.type.system.odps2=true
Download the JAR package of the MaxCompute JDBC driver
You can download the JAR packages of different versions of the MaxCompute JDBC driver from OSS, GitHub, or Maven. We recommend that you download the JAR package whose name includes jar-with-dependencies.
The following code describes the dependency in the Project Object Model (POM) file of the MaxCompute JDBC driver downloaded from Maven:
<dependency>
<groupId>com.aliyun.odps</groupId>
<artifactId>odps-jdbc</artifactId>
<version>3.3.6</version>
<classifier>jar-with-dependencies</classifier>
</dependency>
The MaxCompute JDBC driver is an open source code project that is named aliyun-odps-jdbc.
You are welcome to participate in the development and improvement of the MaxCompute JDBC driver. You can report issues on the Issues tab and optimize source code on the Pull requests tab. Your operations on the Issues and Pull requests tabs must follow the template requirements for open source projects.
JDBC parameters
You can configure the URL and property parameters to use a JDBC driver. The property parameters have a higher priority than the URL parameters.
If a URL parameter is odps_config=config_file
, config_file
is read as a property parameter.
Basic parameters
URL key
Property key
Required
Description
project
project_name
Yes
The name of the MaxCompute project.
accessId
access_id
Yes
The AccessKey ID of your Alibaba Cloud account.
You can obtain the AccessKey ID from the AccessKey Pair page.
accessKey
access_key
Yes
The AccessKey secret of your Alibaba Cloud account.
You can obtain the AccessKey secret on the AccessKey Pair page.
logview
logview_host
No
The URL of the MaxCompute LogView. Set the value to
http://logview.odps.aliyun.com
.tunnelEndpoint
tunnel_endpoint
No
The endpoint of MaxCompute Tunnel.
For more information about the Tunnel endpoints that correspond to different regions and network connection types, see Endpoints.
Log configuration parameters
URL key
Property key
Required
Description
enableOdpsLogger
enable_odps_logger
No
Specifies whether to enable the MaxCompute JDBC logger. Valid values:
False: The MaxCompute JDBC logger is disabled. Logs are not recorded in a file. This is the default value.
True: The MaxCompute JDBC logger is enabled. Logs are recorded in the file
jdbc.log
of the directory in which the JAR package of the MaxCompute JDBC driver is stored.
logConfFile
log_conf_file
No
The configuration file for Simple Logging Facade for Java (SLF4J). You can specify this configuration file to flexibly configure the log output. For example, you can specify the output file and configure the logLevel parameter to specify the log level. To specify this configuration file, you can add the following dependencies to the file
pom.xml
in the project:<dependency> <groupId>ch.qos.logback</groupId> <artifactId>logback-core</artifactId> <version>1.2.3</version> </dependency> <dependency> <groupId>ch.qos.logback</groupId> <artifactId>logback-classic</artifactId> <version>1.2.3</version> </dependency>
For more information about the configuration, see Example.
logLevel
log_level
No
The log level. Default value: INFO.
Other parameters
URL key
Property key
Required
Description
stsToken
sts_token
No
The Alibaba Cloud STS token.
charset
charset
No
The character set for the input and output. Default value: UTF-8.
useProjectTimeZone
use_project_time_zone
No
Specifies whether to use the time zone that is specified by the
odps.sql.timezone
parameter of the MaxCompute project. Valid values:False: The time zone of the MaxCompute project is not used. This is the default value.
True: The time zone of the MaxCompute project is used.
NoteYou can also configure
set odps.sql.timezone=xxx
in a statement to use the time zone of the MaxCompute project.The time zone specified in a statement takes precedence over the time zone specified in a project. The time zone specified in a project takes precedence over the time zones that you do not specify.
disableConnectinosSetting
disable_connection_setting
No
Specifies whether the SQL parameters of a connection can be configured. Valid values:
False: The SQL parameters of a connection cannot be configured. This is the default value.
True: The SQL parameters of a connection can be configured.
If you set this parameter to True, when you run the
set
command in a statement, the SQL parameters of the statement and the connection are configured at the same time. If you set this parameter to False, when you run the set xxx command in a statement, only the SQL parameters of the statement are configured.settings
settings
No
The default global
SQL settings
. The value of this parameter is passed in the JSON format, such as{"key":"value"}
.tableList
table_list
No
The name of the table in MaxCompute. The name of the table is in the
projectname.tablename,projectname1.tablename1
format.connectTimeout
connect_timeout
No
The timeout period for establishing a connection over the underlying network. Default value: 10. Unit: seconds.
readTimeout
read_timeout
No
The timeout period for reading data over an underlying network connection. Default value: 120. Unit: seconds.
NoteThe timeout period of a connection in each RESTful API request is the sum of the values of connectTimeout and readTimeout. The default timeout period of each RESTful API request is 130 seconds. By default, the maximum number of retries for establishing a connection in a RESTful API request is 3.
If you want to adjust the timeout period of a RESTful API request, change the value of the readTimeout parameter.
enableCommandApi
enable_command_api
No
Specifies whether to use the commandAPI. Valid values:
False: The commandAPI is not used. This is the default value.
True: The commandAPI is used.
If you set this parameter to True, you can run specific commands that can be run only on the MaxCompute client (odpscmd) when you use a JDBC driver.
httpsCheck
https_check
No
Specifies whether to perform HTTPS-based certificate verification. Valid values:
False: HTTPS-based certificate verification is not performed. This is the default value.
True: HTTPS-based certificate verification is performed.
tunnelConnectTimeout
tunnel_connect_timeout
No
The timeout period for a tunnel connection when you run Tunnel commands to download data. Default value: 180. Unit: seconds.
tunnelReadTimeout
tunnel_read_timeout
No
The timeout period for reading data when you run Tunnel commands to download data. Default value: 300. Unit: seconds.
Parameters that are not related to MCQA (in offline mode)
URL key
Property key
Required
Description
autoLimitFallback
auto_limit_fallback
No
Specifies whether to enable auto limit fallback. Valid values:
False: Auto limit fallback is disabled. This is the default value.
True: Auto limit fallback is enabled. In offline mode, if this parameter is set to True, the maximum number of data records that can be downloaded is 10,000 when the error message
no download permission
appears.
MCQA-related parameters (in MCQA mode)
Basic parameters
False: MCQA is disabled. This is the default value.
True: MCQA is enabled.
Limit-related parameters
URL key
Property key
Required
Description
instanceTunnelMaxRecord
instance_tunnel_max_record
No
The maximum number of records in the result set.
NoteThis parameter takes effect only when the enableLimit parameter is set to False.
instanceTunnelMaxSize
instance_tunnel_max_size
No
The maximum size of the result set. Unit: bytes.
autoSelectLimit
auto_select_limit
No
The maximum number of rows of data in a query.
In the elastic computing environment of Alibaba Cloud, a maximum of 1 million rows of data can be queried by default. If you want to query more than 1 million rows of data, you can configure this parameter.
NoteThis parameter takes effect only when the enableLimit parameter is set to False.
When you use a JDBC driver of V3.2.29 or later, if you configure the autoSelectLimit parameter, the enableLimit parameter is automatically set to False.
You can add
odps.sql.select.auto.limit
to the SQL settings of MaxCompute. By default, this parameter takes effect when you run an SQL job.
enableLimit
enable_limit
No
Specifies whether to enable the limit on the maximum number of data records in a query. Valid values:
False: The limit on the maximum number of data records in a query is disabled.
True: The limit on the maximum number of data records in a query is enabled. This is the default value.
After the limit on the maximum number of data records in a query is enabled, the download control is not checked and the maximum number of result records is 10,000.
Rollback-related parameters
URL key
Property key
Required
Description
fallbackForUnknownError
fallback_for_unknownerror
No
Specifies whether to roll back the MCQA job to the offline mode when an unknown error occurs. Valid values:
False: The MCQA job is not rolled back to the offline mode.
True: The MCQA job is rolled back to the offline mode. This is the default value.
fallbackForResourceNotEnough
fallback_for_resourcenotenough
No
Specifies whether to roll back the MCQA job to the offline mode when resources are insufficient. Valid values:
False: The MCQA job is not rolled back to the offline mode.
True: The MCQA job is rolled back to the offline mode. This is the default value.
fallbackForUpgrading
fallback_for_upgrading
No
Specifies whether to roll back the MCQA job to the offline mode during an upgrade. Valid values:
False: The MCQA job is not rolled back to the offline mode.
True: The MCQA job is rolled back to the offline mode. This is the default value.
fallbackForRunningTimeout
fallback_for_runningtimeout
No
Specifies whether to roll back the MCQA job to the offline mode if a connection times out when you run a command. Valid values:
False: The MCQA job is not rolled back to the offline mode.
True: The MCQA job is rolled back to the offline mode. This is the default value.
fallbackForUnsupportedFeature
fallbackForUnsupportedFeature
No
Specifies whether to roll back the MCQA job to the offline mode in scenarios in which MCQA is not supported. Valid values:
False: The MCQA job is not rolled back to the offline mode.
True: The MCQA job is rolled back to the offline mode. This is the default value.
alwaysFallback
always_fallback
No
Specifies whether to roll back the MCQA job to the offline mode in all the preceding scenarios. Valid values:
False: Auto limit fallback is disabled. This is the default value.
True: The MCQA job is rolled back to the offline mode.
NoteThis parameter takes effect only when you use a MaxCompute JDBC driver of V3.2.3 or later.
disableFallback
disable_fallback
No
Specifies whether not to roll back the MCQA job to the offline mode in all the preceding scenarios. Valid values:
False: The MCQA job is rolled back to the offline mode. This is the default value.
True: The MCQA job is not rolled back to the offline mode.
fallbackQuota
fallback_quota
No
The name of the quota that is selected when the MCQA job is rolled back. If you do not configure this parameter, the default quota of the MaxCompute project is used.
URL key
Property key
Required
Description
interactiveMode
interactive_mode
No
Specifies whether to enable MaxCompute Query Acceleration (MCQA). Valid values:
executeProject
execute_project_name
No
The name of the MaxCompute project in which the SQL task is executed.
tunnelRetryTime
tunnel_retry_time
No
The maximum number of retries for Tunnel commands specified by SQLExecutor. Default value: 6.
attachTimeout
attach_timeout
No
The timeout period for establishing a connection when MCQA is enabled. Default value: 60. Unit: seconds.
fallbackQuota
fallback_quota
No
The name of the quota that is selected when the MCQA job is rolled back. If you do not configure this parameter, the default quota of the MaxCompute project is used.
Connect to MaxCompute
Load the MaxCompute JDBC driver.
Class.forName("com.aliyun.odps.jdbc.OdpsDriver");
Use DriverManager to establish a connection to MaxCompute.
Connection cnct = DriverManager.getConnection(url, accessId, accessKey);
url: A URL is in the
jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project_name>[&useProjectTimeZone={true|false}]
format. Parameters:<maxcompute_endpoint>: the endpoint of the MaxCompute service in a specific region. For example, the public endpoint of MaxCompute in the China (Hangzhou) region is
http://service.cn-hangzhou.maxcompute.aliyun.com/api
. For more information about the configuration information of endpoints, see Endpoints.<maxcompute_project_name>: the name of your MaxCompute project.
useProjectTimeZone: specifies whether to use the time zone of the MaxCompute project.
Sample command:
jdbc:odps:http://service.cn-hangzhou.maxcompute.aliyun.com/api?project=test_project&useProjectTimeZone=true;
accessId: the AccessKey ID of the account used to create the project.
accessKey: the AccessKey secret that corresponds to the AccessKey ID of the account used to create the project.
NoteFor more information about how to create and view an AccessKey ID and AccessKey secret, see Create an Alibaba Cloud account.
Perform a data query.
Statement stmt = cnct.createStatement(); ResultSet rset = stmt.executeQuery("SELECT foo FROM bar"); while (rset.next()) { // process the results } rset.close(); stmt.close(); cnct.close();
Sample code
Delete a table, create a table, and obtain metadata
NoteIf JDBC dependencies are used in the project, you do not need to specify SDK dependencies. After you configure JDBC dependencies, the related SDK is automatically used. If you specify SDK dependencies, an error may occur due to version inconsistency when you run the code.
import java.sql.Connection; import java.sql.DatabaseMetaData; import java.sql.DriverManager; import java.sql.ResultSet; import java.sql.SQLException; import java.sql.Statement; public class Main { private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver"; // The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. // In this example, the AccessKey ID and AccessKey secret are saved in the environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements. // To prevent AccessKey pair leaks, we recommend that you do not directly specify the AccessKey ID and AccessKey secret in the code. private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"); private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"); public static void main(String[] args) throws SQLException { try { Class.forName(DRIVER_NAME); } catch (ClassNotFoundException e) { e.printStackTrace(); System.exit(1); } Connection conn = DriverManager.getConnection( "jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project>", Main.accessId, Main.accessKey); // create a table Statement stmt = conn.createStatement(); final String tableName = "jdbc_test"; stmt.execute("DROP TABLE IF EXISTS " + tableName); stmt.execute("CREATE TABLE " + tableName + " (key BIGINT, value STRING)"); // get meta data DatabaseMetaData metaData = conn.getMetaData(); System.out.println("product = " + metaData.getDatabaseProductName()); System.out.println("jdbc version = " + metaData.getDriverMajorVersion() + ", " + metaData.getDriverMinorVersion()); ResultSet tables = metaData.getTables(null, null, tableName, null); while (tables.next()) { String name = tables.getString("TABLE_NAME"); System.out.println("inspecting table: " + name); ResultSet columns = metaData.getColumns(null, null, name, null); while (columns.next()) { System.out.println( columns.getString("COLUMN_NAME") + "\t" + columns.getString("TYPE_NAME") + "(" + columns.getInt("DATA_TYPE") + ")"); } columns.close(); } tables.close(); stmt.close(); conn.close(); } }
The following result is returned:
product = MaxCompute/ODPS jdbc version = 3, 0 inspecting table: jdbc_test key BIGINT(-5) value STRING(12)
Update a table
import java.sql.Connection; import java.sql.DriverManager; import java.sql.SQLException; import java.sql.Statement; public class Main { private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver"; // The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use these credentials to perform operations, security risks may arise. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. // In this example, the AccessKey ID and AccessKey secret are saved in the environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements. // To prevent AccessKey pair leaks, we recommend that you do not directly specify the AccessKey ID and AccessKey secret in the code. private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"); private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"); public static void main(String[] args) throws SQLException { try { Class.forName(DRIVER_NAME); } catch (ClassNotFoundException e) { e.printStackTrace(); System.exit(1); } Connection conn = DriverManager.getConnection( "jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project>", Main.accessId, Main.accessKey); Statement stmt = conn.createStatement(); // The following DML also works //String dml = "INSERT INTO jdbc_test SELECT 1, \"foo\""; String dml = "INSERT INTO jdbc_test VALUES(1, \"foo\")"; int ret = stmt.executeUpdate(dml); assert ret == 1; stmt.close(); conn.close(); } }
Update multiple tables at the same time
import java.sql.Connection; import java.sql.DriverManager; import java.sql.PreparedStatement; import java.sql.SQLException; public class Main { private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver"; // The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. // In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements. // We recommend that you do not hard-code the AccessKey ID and AccessKey secret in your code. Otherwise, the AccessKey pair may be leaked. private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"); private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"); public static void main(String[] args) throws SQLException { try { Class.forName(DRIVER_NAME); } catch (ClassNotFoundException e) { e.printStackTrace(); System.exit(1); } Connection conn = DriverManager.getConnection( "jdbc:odps:<maxcompute endpoint>?project=<maxcompute project>", Main.accessId, Main.accessKey); PreparedStatement pstmt = conn.prepareStatement("INSERT INTO jdbc_test VALUES(?, ?)"); pstmt.setLong(1, 1L); pstmt.setString(2, "foo"); pstmt.addBatch(); pstmt.setLong(1, 2L); pstmt.setString(2, "bar"); pstmt.addBatch(); int[] ret = pstmt.executeBatch(); assert ret[0] == 1; assert ret[1] == 1; pstmt.close(); conn.close(); } }
NoteThe executeBatch method is not suitable for batch data writing to clustered tables. For example, you cannot use this method to write a batch of data to a Transaction Table 2.0 table at a time.
If you write a batch of data to a standard partitioned table at a time, you must specify partitions to which data is written in the INSERT INTO statement. The following code shows an example:
-- Execute the following statement to create the partitioned table sale_detail. create table if not exists sale_detail ( shop_name string, customer_id string, total_price double ) partitioned by (sale_date string, region string); -- If the partitioned table contains partitions sale_date='20240219' and region='hangzhou', execute the following INSERT INTO statement to write a batch of data to the partitioned table at a time: INSERT INTO sale_detail PARTITION(sale_date='20240219', region='hangzhou') VALUES(?, ?, ?)
Query data from a table
import java.sql.Connection; import java.sql.DriverManager; import java.sql.ResultSet; import java.sql.SQLException; import java.sql.Statement; public class Main { private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver"; // The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. // In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements. // We recommend that you do not hard-code the AccessKey ID and AccessKey secret in your code. Otherwise, the AccessKey pair may be leaked. private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"); private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"); public static void main(String[] args) throws SQLException { try { Class.forName(DRIVER_NAME); } catch (ClassNotFoundException e) { e.printStackTrace(); System.exit(1); } Connection conn = DriverManager.getConnection( "jdbc:odps:<maxcompute endpoint>?project=<maxcompute project>", Main.accessId, Main.accessKey); ResultSet rs; Statement stmt = conn.createStatement(); String sql = "SELECT * FROM JDBC_TEST"; stmt.executeQuery(sql); ResultSet rset = stmt.getResultSet(); while (rset.next()) { System.out.println(String.valueOf(rset.getInt(1)) + "\t" + rset.getString(2)); } } }
NoteOdpsStatement supports the following methods:
execute(sql)
,executeQuery(sql)
, andexecuteUpdate(sql)
. Theexecute(sql)
andexecuteQuery(sql)
methods support the following common commands:desc table
,show tables
, andshow partitions
.