Use JDBC to connect to Hologres for data development - Hologres

Hologres provides the Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) interfaces that are fully compatible with PostgreSQL. You can connect an SQL client to Hologres by using these interfaces for data development. This topic describes how to use JDBC to connect to Hologres for data development.

Usage notes

To write data to Hologres by using JDBC, you must use PostgreSQL JDBC Driver 42.3.2 or later.
If you use JDBC to connect to Hologres and want to test the data write performance, we recommend that you use a virtual private cloud (VPC). The Internet cannot be used to test the data write performance.
Hologres does not support multiple data writes in a transaction. Therefore, you must set the autoCommit parameter to true to enable the auto-commit mode instead of explicitly committing statements in code. If you use PostgreSQL JDBC Driver, the auto-commit mode is enabled by default. If the ERROR:INSERT in transaction is not supported now error message is reported, you must set autoCommit to true. Sample code:
```
Connection conn = DriverManager.getConnection(url, user, password); 
conn.setAutoCommit(true);
```

Use JDBC to connect to Hologres

To use JDBC to connect to Hologres, perform the following steps:

Download and configure PostgreSQL JDBC Driver.
If the client tool that you use integrates PostgreSQL JDBC Driver by default, you can use the built-in driver of the client tool. If the client tool that you use does not integrate PostgreSQL JDBC Driver, you must download and install PostgreSQL JDBC Driver.
To download PostgreSQL JDBC Driver, visit the Maven repository website. You must download PostgreSQL JDBC Driver 42.3.2 or later. We recommend that you download the latest version. Then, you must add the driver as a dependency to your Maven repository. Sample code:
```
<dependencies>
        <dependency>
            <groupId>org.postgresql</groupId>
            <artifactId>postgresql</artifactId>
            <version>42.3.2</version>                            
        </dependency>
</dependencies>
```

Connect to a Hologres instance.

Use the following JDBC URL to connect to a Hologres instance:

jdbc:postgresql://{ENDPOINT}:{PORT}/{DBNAME}?user={ACCESS_ID}&password={ACCESS_KEY}

The following table describes the parameters in the JDBC URL.

Parameter	Description
ENDPOINT	The endpoint of the Hologres instance. You can obtain the endpoint of the Hologres instance on the Instance Details page in the Hologres console.
PORT	The port number of the Hologres instance. You can obtain the port number of the Hologres instance on the Instance Details page in the Hologres console.
DBNAME	The name of the Hologres database.
ACCESS_ID	The AccessKey ID that is used to log on to the Hologres instance. We recommend that you configure environment variables and obtain the AccessKey ID and AccessKey secret from the environment variables. This helps reduce the leak risk. For more information, see the following examples.
ACCESS_KEY	The AccecssKey secret that is used to log on to the Hologres instance. We recommend that you configure environment variables and obtain the AccessKey ID and AccessKey secret from the environment variables. This helps reduce the leak risk. For more information, see the following examples.

Take note of the following recommendations when you connect to a Hologres instance:
- We recommend that you configure the optional parameter ApplicationName in the JDBC URL. If you configure the ApplicationName parameter in the JDBC URL for connecting to your Hologres database, you can identify the application that sent requests on the Historical Slow Query page. Format of the JDBC URL:
```
jdbc:postgresql://{ENDPOINT}:{PORT}/{DBNAME}?user={ACCESS_ID}&password={ACCESS_KEY}&ApplicationName={APPLICATION_NAME}
```
- We recommend that you add reWriteBatchedInserts=true to the JDBC URL to allow multiple data entries to be written at a time. This improves the data write performance. Format of the JDBC URL:
```
jdbc:postgresql://{ENDPOINT}:{PORT}/{DBNAME}?ApplicationName={APPLICATION_NAME}&reWriteBatchedInserts=true
```
- We recommend that you read and write data by executing prepared statements to increase the throughput.
- After the feature of automatically creating foreign tables for MaxCompute is enabled for Hologres, the MaxCompute project automatically maps to the Hologres schema that has the same name as the MaxCompute project. If you want to query data by using the foreign table in the schema, we recommend that you configure the currentSchema parameter in the JDBC URL. This parameter helps identify the MaxCompute project that maps to the schema. The following example shows a complete JDBC URL:
```
jdbc:postgresql://{ENDPOINT}:{PORT}/{DBNAME}?currentSchema={SCHEMA_NAME}&user={ACCESS_ID}&password={ACCESS_KEY}&ApplicationName={APPLICATION_NAME}
```
- We recommend that you configure environment variables and obtain the AccessKey ID and AccessKey secret from the environment variables. This helps reduce the leak risk. In this example, the Linux operating system (OS) is used. You can add the following statements to the bash_profile file to configure environment variables.
```
export ALIBABA_CLOUD_USER=<ACCESS_ID>
export ALIBABA_CLOUD_PASSWORD=<ACCESS_KEY>
```

The following sample code provides an example on how to connect to Hologres:

public class HologresTest {

    private void jdbcExample() throws SQLException {
        String user= System.getenv("ALIBABA_CLOUD_USER");
        String password = System.getenv("ALIBABA_CLOUD_PASSWORD");
        String url = String.format("jdbc:postgresql://{ENDPOINT}:{PORT}/{DBNAME}?currentSchema={SCHEMA_NAME}&user=%s&password=%s", user, password);
        try (Connection conn = DriverManager.getConnection(url)) {
            try (Statement st = conn.createStatement()) {
                String sql = "SELECT * FROM table where xxx limit 100";
                try (ResultSet rs = st.executeQuery(sql)) {
                    while (rs.next()) {
                        // Query the values of the first column in the data table.
                        String c1 = rs.getString(1);
                    }
                }
            }
        }
    }

    private void jdbcPreparedStmtExample() throws SQLException {
        String user= System.getenv("ALIBABA_CLOUD_USER");
        String password = System.getenv("ALIBABA_CLOUD_PASSWORD");
        String url = String.format("jdbc:postgresql://{ENDPOINT}:{PORT}/{DBNAME}?currentSchema={SCHEMA_NAME}&user=%s&password=%s", user, password);
        try (Connection conn = DriverManager.getConnection(url)) {
            String sql = "insert into test values" +
                    "(?, ?), " +
                    "(?, ?), " +
                    "(?, ?), " +
                    "(?, ?), " +
                    "(?, ?), " +
                    "(?, ?), " +
                    "(?, ?), " +
                    "(?, ?), " +
                    "(?, ?), " +
                    "(?, ?)";
            try (PreparedStatement st = conn.prepareStatement(sql)) {
                for (int i = 0; i < 10; ++i) {
                    for (int j = 0; j < 2 * 10; ++j) {
                        st.setString(j + 1, UUID.randomUUID().toString());
                    }
                    System.out.println("affected row => " + st.executeUpdate());
                }
            }
        }
    }
 }

Use JDBC to develop data

After you connect to Hologres by using JDBC, you can use standard SQL statements to develop data in Hologres. For example, you can write data to or read data from Hologres.

Write data

You can write data to Hologres by using JDBC statements or prepared statements. In most cases, we recommend that you use prepared statements and set the number of data entries to be written at a time to 256 or a multiple of 256. This is because the client caches the SQL compilation results when prepared statements are executed. This reduces the latency of data writes and improves the throughput.

In this example, data is written by executing prepared statements.

Execute the following statements to write data in batches:

/* Write data in batches by executing prepared statements. */
/* In this example, the batchSize parameter is set to 256. */
private static void WriteBatchWithPreparedStatement(Connection conn) throws Exception {
    try (PreparedStatement stmt = conn.prepareStatement("insert into test_tb values (?,?,?,?)")) {
        int batchSize = 256;
        for (int i = 0; i < batchSize; ++i) {
            stmt.setInt( 1, 1000 + i);
            stmt.setString( 2, "1");
            SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
            Date parsedDate = dateFormat.parse("1990-11-11 00:00:00");
            stmt.setTimestamp( 3, new java.sql.Timestamp(parsedDate.getTime()));
            stmt.setDouble( 4 , 0.1 );
            stmt.addBatch();
        }
        stmt.executeBatch();
    }
}

You can also add the INSERT ON CONFLICT statement to the prepared statements to write and update the data. Sample statements:

Note

If you add the INSERT ON CONFLICT statement, a primary key must be defined for the destination table.

private static void InsertOverwrite(Connection conn) throws Exception {
    try (PreparedStatement stmt = conn.prepareStatement("insert into test_tb values (?,?,?,?), (?,?,?,?), (?,?,?,?), (?,?,?,?), (?,?,?,?), (?,?,?,?) on conflict(pk) do update set f1 = excluded.f1, f2 = excluded.f2, f3 = excluded.f3")) {
        int batchSize = 6;
        for (int i = 0; i < batchSize; ++i) {
            stmt.setInt(i * 4 + 1, i);
            stmt.setString(i * 4 + 2, "1");
            SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
            Date parsedDate = dateFormat.parse("1990-11-11 00:00:00");
            stmt.setTimestamp(i * 4 + 3, new java.sql.Timestamp(parsedDate.getTime()));
            stmt.setDouble(i * 4 + 4, 0.1);
        }
        int affected_rows = stmt.executeUpdate();
        System.out.println("affected rows => " + affected_rows);
    }
}

Query data
After data is written, you can query the written data. You can also query the data in an existing table based on your business requirements.

Use the Druid connection pool

Usage notes
- We recommend that you configure keepAlive=true to reuse connections and prevent short-lived connections.
- Druid 1.1.12 or later is required to connect to Hologres.
- For Druid versions from 1.2.12 to 1.2.21, if you do not configure the connectTimeout and socketTimeout parameters, their default values which are 10s are used. To resolve this issue, upgrade the Druid version.

Configure the Druid connection pool

Note

Configure the initialSize, minIdle, and maxActive parameters based on the size of your Hologres instance and your business requirements.

<bean id="dataSource" class="com.alibaba.druid.pool.DruidDataSource" init-method="init" destroy-method="close">
  <!-- jdbc_url: the endpoint of the Hologres instance to which you want to connect. You can obtain the endpoint on the Instance Details page in the Hologres console. -->
  <property name="url" value="${jdbc_url}" />
  <!-- jdbc_user: the AccessKey ID that is used to connect to the Hologres instance.  -->
  <property name="username" value="${jdbc_user}" />
  <!-- jdbc_password: the AccessKey secret that is used to connect to the Hologres instance.  -->
  <property name="password" value="${jdbc_password}" />
  <!-- Specify the initial size of the connection pool, and the minimum and maximum numbers of connections.  -->
  <property name="initialSize" value="5" />
  <property name="minIdle" value="10" />
  <property name="maxActive" value="20" />
  <!-- Specify the timeout period for obtaining a connection from the connection pool.  -->
  <property name="maxWait" value="60000" />
  <!-- Specify the interval at which the system detects idle connections to be closed. Unit: milliseconds.  -->
  <property name="timeBetweenEvictionRunsMillis" value="2000" />
  <!-- Specify the minimum validity period of a connection in the connection pool. Unit: milliseconds.  -->
  <property name="minEvictableIdleTimeMillis" value="600000" />
  <property name="maxEvictableIdleTimeMillis" value="900000" />
  <property name="validationQuery" value="select 1" />
  <property name="testWhileIdle" value="true" />
  <!-- Specify whether to check the validity of the connection each time you obtain a connection from the connection pool. The value true specifies that the system checks the validity of the connection, and the value false specifies that the system does not check the validity of the connection.  -->
  <property name="testOnBorrow" value="false" />
  <!-- Specify whether to check the validity of the connection each time you return a connection to the connection pool. The value true specifies that the system checks the validity of the connection, and the value false specifies that the system does not check the validity of the connection.  -->
  <property name="testOnReturn" value="false" />
  <property name="keepAlive" value="true" />
  <property name="phyMaxUseCount" value="100000" />
  <!-- Configure the filters used for monitoring statistics.  -->
  <property name="filters" value="stat" />
</bean>

Best practices for performance optimization

To achieve desired performance when you use JDBC, take note of the following items:

We recommend that you use a VPC instead of the Internet to prevent the network overhead of the Internet.
We recommend that you configure reWriteBatchedInserts=true in the JDBC URL to allow multiple data entries to be written at a time. This improves the data write performance. We recommend that you set the number of data entries to be written at a time to 256 or a multiple of 256 to achieve better performance. You can also use Holo Client, which automatically collects data in batches.
```
jdbc:postgresql://{ENDPOINT}:{PORT}/{DBNAME}?ApplicationName={APPLICATION_NAME}&reWriteBatchedInserts=true
```
The server caches the SQL compilation results when prepared statements are executed. This reduces the latency of data writes and increases the throughput.

Configure GUC parameters for JDBC

In some cases, you must configure Grand Unified Configuration (GUC) parameters at the session level. For more information about GUC parameters, see GUC parameters. We recommend that you use the following code to configure GUC parameters at the session level. In the code, both the statement_timeout and idle_in_transaction_session_timeout parameters are set to 12345, in milliseconds.

import org.postgresql.PGProperty;
import java.sql.*;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class gucdemo {
    public static void main(String[] args) {
        // Specify the endpoint of the Hologres instance.
        String hostname = "hgpostcn-cn-xxxx-cn-hangzhou.hologres.aliyuncs.com";
        // Specify the port number of the Hologres instance.
        String port = "80";
        // Specify the name of the database to which you want to connect.
        String dbname = "demo";
        String jdbcUrl = "jdbc:postgresql://" + hostname + ":" + port + "/" + dbname;
        Properties properties = new Properties();
        //Specify the account that is used to connect to the specified database.
        properties.setProperty("user", "xxxxx");
        // Specify the password of the account that is used to connect to the specified database.
        properties.setProperty("password", "xxxx");
        // Configure the GUC parameters.
        PGProperty.OPTIONS.set(properties,"--statement_timeout=12345 --idle_in_transaction_session_timeout=12345");
        try {
            Class.forName("org.postgresql.Driver");
            Connection connection = DriverManager.getConnection(jdbcUrl, properties);
            PreparedStatement preparedStatement = connection.prepareStatement("show statement_timeout" );
            ResultSet resultSet = preparedStatement.executeQuery();
            while (resultSet.next()) {
                ResultSetMetaData rsmd = resultSet.getMetaData();
                int columnCount = rsmd.getColumnCount();
                Map map = new HashMap();
                for (int i = 0; i < columnCount; i++) {
                    map.put(rsmd.getColumnName(i + 1).toLowerCase(), resultSet.getObject(i + 1));
                }
                System.out.println(map);
            }
        } catch (Exception exception) {
            exception.printStackTrace();
        }
    }
}

JDBC-based load balancing

In Hologres V1.3 and later, you can use JDBC to configure multiple read-only secondary instances to implement simple load balancing. For more information, see JDBC-based load balancing.