All Products
Search
Document Center

Hologres:Python

Last Updated:Nov 15, 2024

Psycopg is a PostgreSQL database adapter for Python. Hologres is compatible with PostgreSQL 11. You can use Psycopg to access Hologres. This topic describes how to use Psycopg 2 to access Hologres. In this example, Python 3.8 installed on CentOS 7 is used.

Install Python 3.8

You can install Python 3.8 in a Miniconda or Anaconda environment. In this example, Python 3.8 is installed on CentOS 7.

  1. Install Python 3.8.

    Download the Python and run the following commands to install Python:

    # yum install centos-release-scl
    # yum install rh-python38
    # scl enable rh-python38 bash
    # python --version
    Python 3.8.6
  2. Install the Psycopg 2 package.

    Run the following command to install the Psycopg 2 package:

     # pip install psycopg2-binary

Connect to Hologres

After you install Python 3.8 and Psycopg 2, perform the following steps to connect to Hologres:

  1. Load the Psycopg 2 package.

    To use Psycopg 2, run the import psycopg2 command to load the Psycopg 2 package that you installed.

  2. Connect to your Hologres database.

    You can use the psycopg2.connect() function to connect to your Hologres database. The following sample code provides an example on how to use this function. The following table describes the parameters of this function.

    conn = psycopg2.connect(host="<Endpoint>",
                            port=<Port>,
                            dbname="<databases>",
                            user="<Access ID>",
                            password="<Access Key>",
                            keepalives=<keepalives>,
                            keepalives_idle=<keepalives_idle>,
                            keepalives_interval=<keepalives_interval>,
                            keepalives_count=<keepalives_count>,
                            application_name="<Application Name>"
    
                           )
    

    Parameter

    Description

    host

    The endpoint of your Hologres instance.

    You can view the endpoint of your Hologres instance in the Network Information section of the Instance Details page in the Hologres console.

    port

    The port number of your Hologres instance.

    You can view the port number of your Hologres instance in the Network Information section of the Instance Details page in the Hologres console.

    dbname

    The name of the Hologres database.

    user

    The AccessKey ID of the Alibaba Cloud account.

    You can obtain the AccessKey ID from the AccessKey page.

    password

    The AccessKey secret of the Alibaba Cloud account.

    You can obtain the AccessKey secret from the AccessKey page.

    application_name

    Optional. The name of your application. This parameter is used to specify the application that initiates the SQL statement, which helps you understand the usage of the statement when you query logs.

    Note

    If you configure this parameter in the preceding code, you can identify the application that sent requests in the Application Name column on the Historical Slow Query page.

    keepalives

    Optional. The connection method. We recommend that you configure this parameter. Valid values:

    • 1: uses a persistent connection.

    • 0: uses a non-persistent connection.

    keepalives_idle

    The interval at which a keepalive message is sent when the connection to the Hologres database enters the idle state. Unit: seconds.

    keepalives_interval

    The interval at which a keepalive message is sent again if no response is returned. Unit: seconds.

    keepalives_count

    The maximum number of times a keepalive message is sent.

    Sample code:

    conn = psycopg2.connect(host="<Endpoint>",
                            port=<Port>,
                            dbname="<databases>",
                            user="<Access ID>",
                            password="<Access Key>",
                            keepalives=1, # The connection is retained.
                            keepalives_idle=130, # A keepalive message is sent at an interval of 130 seconds when the connection to the Hologres database is idle.
                            keepalives_interval=10, # A keepalive message is sent again at an interval of 10 seconds if no response is returned.
                            keepalives_count=15, # A keepalive message can be sent a maximum number of 15 times.
                            application_name="<Application Name>"
                           )
    
    

Use Hologres

After you connect to the Hologres database, you can use Psycopg 2 to develop data in the database. You can perform the following steps to create a table, insert data into the table, query the data, and then release resources. If you want to use fixed plans to perform read and write operations that require better performance, use the related Grand Unified Configuration (GUC) parameters. For more information, see Accelerate the execution of SQL statements by using fixed plans.

  1. Create a cursor.

    Before you develop data, run the cur = conn.cursor() command to create a cursor for the connection.

  2. Develop data.

    1. Create a table.

      The following sample command creates a table named holo_test and specifies that the type of data in the table is INTEGER. You can set a custom table name and specify a custom data type based on your business requirements.

      cur.execute("CREATE TABLE holo_test (num integer);")
    2. Insert data.

      The following sample command inserts integers from 1 to 1,000 to the holo_test table.

      cur.execute("INSERT INTO holo_test SELECT generate_series(%s, %s)", (1, 1000))
    3. Query the data.

      cur.execute("SELECT sum(num) FROM holo_test;")
      cur.fetchone()
  3. Commit the transaction.

    After you run the command for the data query, run the conn.commit() command. This ensures that the operation is committed as a transaction. You can also set the autocommit parameter to true to automatically commit the operation.

  4. Release resources.

    To ensure that subsequent operations are not affected, after the preceding commands are run, run the following commands to close the cursor and disconnect Psycopg 2 from your Hologres database:

    cur.close()
    conn.close()

Best practice of efficiently importing DataFrames to Hologres

In most cases where you use Python, you need to use pandas to convert data entries into a DataFrame, process the DataFrame, and then import the DataFrame to Hologres. In this case, you may want to efficiently import the DataFrame to Hologres. The to_sql function is usually used to import DataFrames to Hologres. For more information, see pandas.DataFrame.to_sql.

The version of pandas must be V1.4.2 or later. You can run the following command to forcibly install pandas whose version is V1.5.1:

# pip install Pandas==1.5.1

When you import a DataFrame to Hologres, we recommend that you use a callable of the to_sql function to execute the COPY FROM STDIN statement. Sample code in Python:

# Load a dependency.
import pandas as pd
import psycopg2

# Generate a connection string.
host="hgpostcn-cn-xxxxxx-cn-hangzhou.hologres.aliyuncs.com"
port=80
dbname="demo"
user="LTAI5xxxxx"
password="fa8Kdgxxxxx"
application_name="Python Test"
conn = "postgresql+psycopg2://{}:{}@{}:{}/{}?application_name={}".format(user, password, host, port, dbname,application_name)
print(conn)

# Convert data entries into a DataFrame.
data = [('1','1','1'),('2','2','2')]
cols = ('col1','col2','col3')
pd_data = pd.DataFrame(data, columns=cols)

# Specify a callable of the to_sql function.
import csv
from io import StringIO

def psql_insert_copy(table, conn, keys, data_iter):
    """
    Execute SQL statement inserting data

    Parameters
    ----------
    table : pandas.io.sql.SQLTable
    conn : sqlalchemy.engine.Engine or sqlalchemy.engine.Connection
    keys : list of str
        Column names
    data_iter : Iterable that iterates the values to be inserted
    """
    # gets a DBAPI connection that can provide a cursor
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        s_buf = StringIO()
        writer = csv.writer(s_buf)
        writer.writerows(data_iter)
        s_buf.seek(0)

        columns = ', '.join('"{}"'.format(k) for k in keys)
        if table.schema:
            table_name = '{}.{}'.format(table.schema, table.name)
        else:
            table_name = table.name

        sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
            table_name, columns)
        cur.copy_expert(sql=sql, file=s_buf)

# Import the DataFrame.
pd_data.to_sql(
    name="pd_data",
    con=conn,
    if_exists="append",
    index=False,
    method=psql_insert_copy
)

You can view historical queries in the HoloWeb console to check whether the DataFrame is imported to Hologres by executing the COPY FROM STDIN statement.