Download the JAR package of the MaxCompute JDBC driver and connect to MaxCompute - MaxCompute

This topic describes how to download the JAR package of the MaxCompute JDBC driver and use the driver to connect to MaxCompute. This topic also provides sample code.

Usage notes

To execute SQL statements and obtain execution results by using the MaxCompute JDBC driver, you must meet the following requirements:
- You are a member of a project.
- You have the CREATE INSTANCE permission on the project.
- You have the SELECT and DOWNLOAD permissions on the table that you want to use.
  Note
  - When you use MaxCompute JDBC V1.9 or earlier, a temporary table is automatically created for each query. You can use Tunnel commands to obtain query results from the temporary table. To use these versions, you must have the CREATE TABLE permission.
  - When you use MaxCompute JDBC V2.2 or later, no temporary table is automatically created for each query. You can call the InstanceTunnel interface to obtain query results, regardless of whether you have the CREATE TABLE permission.
  For more information about MaxCompute permissions, see MaxCompute permissions.
MaxCompute provides the data protection feature. If the data protection feature is enabled for a project, you cannot move data out of the project. If you use MaxCompute JDBC of a version earlier than V2.4, no result sets can be obtained. If you use MaxCompute JDBC V2.4 or later, the number of result rows that you obtain cannot exceed the value of the READ_TABLE_MAX_ROW parameter. For more information about this parameter, see Project operations. For more information about the data protection feature, see Project data protection.
The MaxCompute V2.0 data type edition supports more data types, such as TINYINT, SMALLINT, DATETIME, TIMESTAMP, ARRAY, MAP, and STRUCT. To use these new data types, you must run the following command to enable the MaxCompute V2.0 data type edition. For more information, see Data type editions.
```
set odps.sql.type.system.odps2=true
```

Download the JAR package of the MaxCompute JDBC driver

You can download the JAR packages of different versions of the MaxCompute JDBC driver from OSS, GitHub, or Maven. We recommend that you download the JAR package whose name includes jar-with-dependencies.

The following code describes the dependency in the Project Object Model (POM) file of the MaxCompute JDBC driver downloaded from Maven:

<dependency>
  <groupId>com.aliyun.odps</groupId>
  <artifactId>odps-jdbc</artifactId>
  <version>3.3.6</version>
  <classifier>jar-with-dependencies</classifier>
</dependency>

Note

The MaxCompute JDBC driver is an open source code project that is named aliyun-odps-jdbc.

You are welcome to participate in the development and improvement of the MaxCompute JDBC driver. You can report issues on the Issues tab and optimize source code on the Pull requests tab. Your operations on the Issues and Pull requests tabs must follow the template requirements for open source projects.

JDBC parameters

You can configure the URL and property parameters to use a JDBC driver. The property parameters have a higher priority than the URL parameters.

Note

If a URL parameter is odps_config=config_file, config_file is read as a property parameter.

Basic parameters

URL key	Property key	Required	Description
project	project_name	Yes	The name of the MaxCompute project.
accessId	access_id	Yes	The AccessKey ID of your Alibaba Cloud account. You can obtain the AccessKey ID from the AccessKey Pair page.
accessKey	access_key	Yes	The AccessKey secret of your Alibaba Cloud account. You can obtain the AccessKey secret on the AccessKey Pair page.
logview	logview_host	No	The URL of the MaxCompute LogView. Set the value to `http://logview.odps.aliyun.com`.
tunnelEndpoint	tunnel_endpoint	No	The endpoint of MaxCompute Tunnel. For more information about the Tunnel endpoints that correspond to different regions and network connection types, see Endpoints.

Log configuration parameters

URL key	Property key	Required	Description
enableOdpsLogger	enable_odps_logger	No	Specifies whether to enable the MaxCompute JDBC logger. Valid values: False: The MaxCompute JDBC logger is disabled. Logs are not recorded in a file. This is the default value. True: The MaxCompute JDBC logger is enabled. Logs are recorded in the file `jdbc.log` of the directory in which the JAR package of the MaxCompute JDBC driver is stored.
logConfFile	log_conf_file	No	The configuration file for Simple Logging Facade for Java (SLF4J). You can specify this configuration file to flexibly configure the log output. For example, you can specify the output file and configure the logLevel parameter to specify the log level. To specify this configuration file, you can add the following dependencies to the file `pom.xml` in the project: `<dependency> <groupId>ch.qos.logback</groupId> <artifactId>logback-core</artifactId> <version>1.2.3</version> </dependency> <dependency> <groupId>ch.qos.logback</groupId> <artifactId>logback-classic</artifactId> <version>1.2.3</version> </dependency>` For more information about the configuration, see Example.
logLevel	log_level	No	The log level. Default value: INFO.

Other parameters

URL key	Property key	Required	Description
stsToken	sts_token	No	The Alibaba Cloud STS token.
charset	charset	No	The character set for the input and output. Default value: UTF-8.
useProjectTimeZone	use_project_time_zone	No	Specifies whether to use the time zone that is specified by the `odps.sql.timezone` parameter of the MaxCompute project. Valid values: False: The time zone of the MaxCompute project is not used. This is the default value. True: The time zone of the MaxCompute project is used. Note You can also configure `set odps.sql.timezone=xxx` in a statement to use the time zone of the MaxCompute project. The time zone specified in a statement takes precedence over the time zone specified in a project. The time zone specified in a project takes precedence over the time zones that you do not specify.
disableConnectinosSetting	disable_connection_setting	No	Specifies whether the SQL parameters of a connection can be configured. Valid values: False: The SQL parameters of a connection cannot be configured. This is the default value. True: The SQL parameters of a connection can be configured. If you set this parameter to True, when you run the `set` command in a statement, the SQL parameters of the statement and the connection are configured at the same time. If you set this parameter to False, when you run the set xxx command in a statement, only the SQL parameters of the statement are configured.
settings	settings	No	The default global `SQL settings`. The value of this parameter is passed in the JSON format, such as `{"key":"value"}`.
tableList	table_list	No	The name of the table in MaxCompute. The name of the table is in the `projectname.tablename,projectname1.tablename1` format.
connectTimeout	connect_timeout	No	The timeout period for establishing a connection over the underlying network. Default value: 10. Unit: seconds.
readTimeout	read_timeout	No	The timeout period for reading data over an underlying network connection. Default value: 120. Unit: seconds. Note The timeout period of a connection in each RESTful API request is the sum of the values of connectTimeout and readTimeout. The default timeout period of each RESTful API request is 130 seconds. By default, the maximum number of retries for establishing a connection in a RESTful API request is 3. If you want to adjust the timeout period of a RESTful API request, change the value of the readTimeout parameter.
enableCommandApi	enable_command_api	No	Specifies whether to use the commandAPI. Valid values: False: The commandAPI is not used. This is the default value. True: The commandAPI is used. If you set this parameter to True, you can run specific commands that can be run only on the MaxCompute client (odpscmd) when you use a JDBC driver.
httpsCheck	https_check	No	Specifies whether to perform HTTPS-based certificate verification. Valid values: False: HTTPS-based certificate verification is not performed. This is the default value. True: HTTPS-based certificate verification is performed.
tunnelConnectTimeout	tunnel_connect_timeout	No	The timeout period for a tunnel connection when you run Tunnel commands to download data. Default value: 180. Unit: seconds.
tunnelReadTimeout	tunnel_read_timeout	No	The timeout period for reading data when you run Tunnel commands to download data. Default value: 300. Unit: seconds.

Parameters that are not related to MCQA (in offline mode)

URL key

Property key

Required

Description

autoLimitFallback

auto_limit_fallback

Specifies whether to enable auto limit fallback. Valid values:

False: Auto limit fallback is disabled. This is the default value.
True: Auto limit fallback is enabled. In offline mode, if this parameter is set to True, the maximum number of data records that can be downloaded is 10,000 when the error message no download permission appears.

MCQA-related parameters (in MCQA mode)

Basic parameters

URL key	Property key	Required	Description
interactiveMode	interactive_mode	No	Specifies whether to enable MaxCompute Query Acceleration (MCQA). Valid values: False: MCQA is disabled. This is the default value. True: MCQA is enabled.
executeProject	execute_project_name	No	The name of the MaxCompute project in which the SQL task is executed.
tunnelRetryTime	tunnel_retry_time	No	The maximum number of retries for Tunnel commands specified by SQLExecutor. Default value: 6.
attachTimeout	attach_timeout	No	The timeout period for establishing a connection when MCQA is enabled. Default value: 60. Unit: seconds.
fallbackQuota	fallback_quota	No	The name of the quota that is selected when the MCQA job is rolled back. If you do not configure this parameter, the default quota of the MaxCompute project is used.

Limit-related parameters

URL key	Property key	Required	Description
instanceTunnelMaxRecord	instance_tunnel_max_record	No	The maximum number of records in the result set. Note This parameter takes effect only when the enableLimit parameter is set to False.
instanceTunnelMaxSize	instance_tunnel_max_size	No	The maximum size of the result set. Unit: bytes.
autoSelectLimit	auto_select_limit	No	The maximum number of rows of data in a query. In the elastic computing environment of Alibaba Cloud, a maximum of 1 million rows of data can be queried by default. If you want to query more than 1 million rows of data, you can configure this parameter. Note This parameter takes effect only when the enableLimit parameter is set to False. When you use a JDBC driver of V3.2.29 or later, if you configure the autoSelectLimit parameter, the enableLimit parameter is automatically set to False. You can add `odps.sql.select.auto.limit` to the SQL settings of MaxCompute. By default, this parameter takes effect when you run an SQL job.
enableLimit	enable_limit	No	Specifies whether to enable the limit on the maximum number of data records in a query. Valid values: False: The limit on the maximum number of data records in a query is disabled. True: The limit on the maximum number of data records in a query is enabled. This is the default value. After the limit on the maximum number of data records in a query is enabled, the download control is not checked and the maximum number of result records is 10,000.

Rollback-related parameters

URL key	Property key	Required	Description
fallbackForUnknownError	fallback_for_unknownerror	No	Specifies whether to roll back the MCQA job to the offline mode when an unknown error occurs. Valid values: False: The MCQA job is not rolled back to the offline mode. True: The MCQA job is rolled back to the offline mode. This is the default value.
fallbackForResourceNotEnough	fallback_for_resourcenotenough	No	Specifies whether to roll back the MCQA job to the offline mode when resources are insufficient. Valid values: False: The MCQA job is not rolled back to the offline mode. True: The MCQA job is rolled back to the offline mode. This is the default value.
fallbackForUpgrading	fallback_for_upgrading	No	Specifies whether to roll back the MCQA job to the offline mode during an upgrade. Valid values: False: The MCQA job is not rolled back to the offline mode. True: The MCQA job is rolled back to the offline mode. This is the default value.
fallbackForRunningTimeout	fallback_for_runningtimeout	No	Specifies whether to roll back the MCQA job to the offline mode if a connection times out when you run a command. Valid values: False: The MCQA job is not rolled back to the offline mode. True: The MCQA job is rolled back to the offline mode. This is the default value.
fallbackForUnsupportedFeature	fallbackForUnsupportedFeature	No	Specifies whether to roll back the MCQA job to the offline mode in scenarios in which MCQA is not supported. Valid values: False: The MCQA job is not rolled back to the offline mode. True: The MCQA job is rolled back to the offline mode. This is the default value.
alwaysFallback	always_fallback	No	Specifies whether to roll back the MCQA job to the offline mode in all the preceding scenarios. Valid values: False: Auto limit fallback is disabled. This is the default value. True: The MCQA job is rolled back to the offline mode. Note This parameter takes effect only when you use a MaxCompute JDBC driver of V3.2.3 or later.
disableFallback	disable_fallback	No	Specifies whether not to roll back the MCQA job to the offline mode in all the preceding scenarios. Valid values: False: The MCQA job is rolled back to the offline mode. This is the default value. True: The MCQA job is not rolled back to the offline mode.
fallbackQuota	fallback_quota	No	The name of the quota that is selected when the MCQA job is rolled back. If you do not configure this parameter, the default quota of the MaxCompute project is used.

Connect to MaxCompute

Load the MaxCompute JDBC driver.

Class.forName("com.aliyun.odps.jdbc.OdpsDriver");

Use DriverManager to establish a connection to MaxCompute.
```
Connection cnct = DriverManager.getConnection(url, accessId, accessKey);
```
- url: A URL is in the jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project_name>[&useProjectTimeZone={true|false}] format. Parameters:
  - <maxcompute_endpoint>: the endpoint of the MaxCompute service in a specific region. For example, the public endpoint of MaxCompute in the China (Hangzhou) region is http://service.cn-hangzhou.maxcompute.aliyun.com/api. For more information about the configuration information of endpoints, see Endpoints.
  - <maxcompute_project_name>: the name of your MaxCompute project.
  - useProjectTimeZone: specifies whether to use the time zone of the MaxCompute project.
  Sample command:
```
jdbc:odps:http://service.cn-hangzhou.maxcompute.aliyun.com/api?project=test_project&useProjectTimeZone=true;
```
- accessId: the AccessKey ID of the account used to create the project.
- accessKey: the AccessKey secret that corresponds to the AccessKey ID of the account used to create the project.
  Note
  For more information about how to create and view an AccessKey ID and AccessKey secret, see Create an Alibaba Cloud account.

Perform a data query.

Statement stmt = cnct.createStatement();
ResultSet rset = stmt.executeQuery("SELECT foo FROM bar");

while (rset.next()) {
    // process the results
}

rset.close();
stmt.close();
cnct.close();

Sample code

Delete a table, create a table, and obtain metadata

Note

If JDBC dependencies are used in the project, you do not need to specify SDK dependencies. After you configure JDBC dependencies, the related SDK is automatically used. If you specify SDK dependencies, an error may occur due to version inconsistency when you run the code.

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {

    private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver";
 		// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
		// In this example, the AccessKey ID and AccessKey secret are saved in the environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
		// To prevent AccessKey pair leaks, we recommend that you do not directly specify the AccessKey ID and AccessKey secret in the code.
		private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
		private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(DRIVER_NAME);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
				
        Connection conn = DriverManager.getConnection(
            "jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project>",
            Main.accessId, Main.accessKey);

        // create a table
        Statement stmt = conn.createStatement();
        final String tableName = "jdbc_test";
        stmt.execute("DROP TABLE IF EXISTS " + tableName);
        stmt.execute("CREATE TABLE " + tableName + " (key BIGINT, value STRING)");

        // get meta data
        DatabaseMetaData metaData = conn.getMetaData();
        System.out.println("product = " + metaData.getDatabaseProductName());
        System.out.println("jdbc version = "
            + metaData.getDriverMajorVersion() + ", "
            + metaData.getDriverMinorVersion());
        ResultSet tables = metaData.getTables(null, null, tableName, null);
        while (tables.next()) {
            String name = tables.getString("TABLE_NAME");
            System.out.println("inspecting table: " + name);
            ResultSet columns = metaData.getColumns(null, null, name, null);
            while (columns.next()) {
                System.out.println(
                    columns.getString("COLUMN_NAME") + "\t" +
                        columns.getString("TYPE_NAME") + "(" +
                        columns.getInt("DATA_TYPE") + ")");
            }
            columns.close();
        }

        tables.close();
        stmt.close();
        conn.close();
    }
}

The following result is returned:

product = MaxCompute/ODPS
jdbc version = 3, 0
inspecting table: jdbc_test
key    BIGINT(-5)
value    STRING(12)

Update a table

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {

    private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver";
		// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use these credentials to perform operations, security risks may arise. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
		// In this example, the AccessKey ID and AccessKey secret are saved in the environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
		// To prevent AccessKey pair leaks, we recommend that you do not directly specify the AccessKey ID and AccessKey secret in the code.
  	private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
  	private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");
  
    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(DRIVER_NAME);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
				
        Connection conn = DriverManager.getConnection(
            "jdbc:odps:<maxcompute_endpoint>?project=<maxcompute_project>",
            Main.accessId, Main.accessKey);

        Statement stmt = conn.createStatement();
        // The following DML also works
        //String dml = "INSERT INTO jdbc_test SELECT 1, \"foo\"";
        String dml = "INSERT INTO jdbc_test VALUES(1, \"foo\")";
        int ret = stmt.executeUpdate(dml);

        assert ret == 1;

        stmt.close();
        conn.close();
    }
}

Update multiple tables at the same time

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Main {

    private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver";
  	// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
		// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
		// We recommend that you do not hard-code the AccessKey ID and AccessKey secret in your code. Otherwise, the AccessKey pair may be leaked.
		private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
		private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");
  
    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(DRIVER_NAME);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
				
        Connection conn = DriverManager.getConnection(
            "jdbc:odps:<maxcompute endpoint>?project=<maxcompute project>",
            Main.accessId, Main.accessKey);

        PreparedStatement pstmt = conn.prepareStatement("INSERT INTO jdbc_test VALUES(?, ?)");

        pstmt.setLong(1, 1L);
        pstmt.setString(2, "foo");
        pstmt.addBatch();

        pstmt.setLong(1, 2L);
        pstmt.setString(2, "bar");
        pstmt.addBatch();

        int[] ret = pstmt.executeBatch();

        assert ret[0] == 1;
        assert ret[1] == 1;

        pstmt.close();
        conn.close();
    }
}

Note

The executeBatch method is not suitable for batch data writing to clustered tables. For example, you cannot use this method to write a batch of data to a Transaction Table 2.0 table at a time.

If you write a batch of data to a standard partitioned table at a time, you must specify partitions to which data is written in the INSERT INTO statement. The following code shows an example:

-- Execute the following statement to create the partitioned table sale_detail. 
create table if not exists sale_detail
(
shop_name string,
customer_id string,
total_price double
)
partitioned by (sale_date string, region string);

-- If the partitioned table contains partitions sale_date='20240219' and region='hangzhou', execute the following INSERT INTO statement to write a batch of data to the partitioned table at a time: 
INSERT INTO sale_detail PARTITION(sale_date='20240219', region='hangzhou') VALUES(?, ?, ?)

Query data from a table

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {

    private static final String DRIVER_NAME = "com.aliyun.odps.jdbc.OdpsDriver";
  	// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
		// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
		// We recommend that you do not hard-code the AccessKey ID and AccessKey secret in your code. Otherwise, the AccessKey pair may be leaked.
		private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
		private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(DRIVER_NAME);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
				
        Connection conn = DriverManager.getConnection(
            "jdbc:odps:<maxcompute endpoint>?project=<maxcompute project>",
            Main.accessId, Main.accessKey);
        ResultSet rs;

        Statement stmt = conn.createStatement();
        String sql = "SELECT * FROM JDBC_TEST";
        stmt.executeQuery(sql);

        ResultSet rset = stmt.getResultSet();
        while (rset.next()) {
            System.out.println(String.valueOf(rset.getInt(1)) + "\t" + rset.getString(2));
        }
    }
}

Note

OdpsStatement supports the following methods: execute(sql), executeQuery(sql), and executeUpdate(sql). The execute(sql) and executeQuery(sql) methods support the following common commands: desc table, show tables, and show partitions.