Psycopg is a PostgreSQL database adapter for Python. Hologres is compatible with PostgreSQL 11. You can use Psycopg to access Hologres. This topic describes how to use Psycopg 2 to access Hologres. In this example, Python 3.8 installed on CentOS 7 is used.
Install Python 3.8
You can install Python 3.8 in a Miniconda or Anaconda environment. In this example, Python 3.8 is installed on CentOS 7.
Install Python 3.8.
Download the Python and run the following commands to install Python:
# yum install centos-release-scl # yum install rh-python38 # scl enable rh-python38 bash # python --version Python 3.8.6
Install the Psycopg 2 package.
Run the following command to install the Psycopg 2 package:
# pip install psycopg2-binary
Connect to Hologres
After you install Python 3.8 and Psycopg 2, perform the following steps to connect to Hologres:
Load the Psycopg 2 package.
To use Psycopg 2, run the
import psycopg2
command to load the Psycopg 2 package that you installed.Connect to your Hologres database.
You can use the
psycopg2.connect()
function to connect to your Hologres database. The following sample code provides an example on how to use this function. The following table describes the parameters of this function.conn = psycopg2.connect(host="<Endpoint>", port=<Port>, dbname="<databases>", user="<Access ID>", password="<Access Key>", keepalives=<keepalives>, keepalives_idle=<keepalives_idle>, keepalives_interval=<keepalives_interval>, keepalives_count=<keepalives_count>, application_name="<Application Name>" )
Parameter
Description
host
The endpoint of your Hologres instance.
You can view the endpoint of your Hologres instance in the Network Information section of the Instance Details page in the Hologres console.
port
The port number of your Hologres instance.
You can view the port number of your Hologres instance in the Network Information section of the Instance Details page in the Hologres console.
dbname
The name of the Hologres database.
user
The AccessKey ID of the Alibaba Cloud account.
You can obtain the AccessKey ID from the AccessKey page.
password
The AccessKey secret of the Alibaba Cloud account.
You can obtain the AccessKey secret from the AccessKey page.
application_name
Optional. The name of your application. This parameter is used to specify the application that initiates the SQL statement, which helps you understand the usage of the statement when you query logs.
NoteIf you configure this parameter in the preceding code, you can identify the application that sent requests in the Application Name column on the Historical Slow Query page.
keepalives
Optional. The connection method. We recommend that you configure this parameter. Valid values:
1: uses a persistent connection.
0: uses a non-persistent connection.
keepalives_idle
The interval at which a keepalive message is sent when the connection to the Hologres database enters the idle state. Unit: seconds.
keepalives_interval
The interval at which a keepalive message is sent again if no response is returned. Unit: seconds.
keepalives_count
The maximum number of times a keepalive message is sent.
Sample code:
conn = psycopg2.connect(host="<Endpoint>", port=<Port>, dbname="<databases>", user="<Access ID>", password="<Access Key>", keepalives=1, # The connection is retained. keepalives_idle=130, # A keepalive message is sent at an interval of 130 seconds when the connection to the Hologres database is idle. keepalives_interval=10, # A keepalive message is sent again at an interval of 10 seconds if no response is returned. keepalives_count=15, # A keepalive message can be sent a maximum number of 15 times. application_name="<Application Name>" )
Use Hologres
After you connect to the Hologres database, you can use Psycopg 2 to develop data in the database. You can perform the following steps to create a table, insert data into the table, query the data, and then release resources. If you want to use fixed plans to perform read and write operations that require better performance, use the related Grand Unified Configuration (GUC) parameters. For more information, see Accelerate the execution of SQL statements by using fixed plans.
Create a cursor.
Before you develop data, run the
cur = conn.cursor()
command to create a cursor for the connection.Develop data.
Create a table.
The following sample command creates a table named
holo_test
and specifies that the type of data in the table is INTEGER. You can set a custom table name and specify a custom data type based on your business requirements.cur.execute("CREATE TABLE holo_test (num integer);")
Insert data.
The following sample command inserts integers from 1 to 1,000 to the
holo_test
table.cur.execute("INSERT INTO holo_test SELECT generate_series(%s, %s)", (1, 1000))
Query the data.
cur.execute("SELECT sum(num) FROM holo_test;") cur.fetchone()
Commit the transaction.
After you run the command for the data query, run the
conn.commit()
command. This ensures that the operation is committed as a transaction. You can also set the autocommit parameter to true to automatically commit the operation.Release resources.
To ensure that subsequent operations are not affected, after the preceding commands are run, run the following commands to close the cursor and disconnect Psycopg 2 from your Hologres database:
cur.close() conn.close()
Best practice of efficiently importing DataFrames to Hologres
In most cases where you use Python, you need to use pandas to convert data entries into a DataFrame, process the DataFrame, and then import the DataFrame to Hologres. In this case, you may want to efficiently import the DataFrame to Hologres. The to_sql
function is usually used to import DataFrames to Hologres. For more information, see pandas.DataFrame.to_sql.
The version of pandas must be V1.4.2 or later. You can run the following command to forcibly install pandas whose version is V1.5.1:
# pip install Pandas==1.5.1
When you import a DataFrame to Hologres, we recommend that you use a callable of the to_sql
function to execute the COPY FROM STDIN statement. Sample code in Python:
# Load a dependency.
import pandas as pd
import psycopg2
# Generate a connection string.
host="hgpostcn-cn-xxxxxx-cn-hangzhou.hologres.aliyuncs.com"
port=80
dbname="demo"
user="LTAI5xxxxx"
password="fa8Kdgxxxxx"
application_name="Python Test"
conn = "postgresql+psycopg2://{}:{}@{}:{}/{}?application_name={}".format(user, password, host, port, dbname,application_name)
print(conn)
# Convert data entries into a DataFrame.
data = [('1','1','1'),('2','2','2')]
cols = ('col1','col2','col3')
pd_data = pd.DataFrame(data, columns=cols)
# Specify a callable of the to_sql function.
import csv
from io import StringIO
def psql_insert_copy(table, conn, keys, data_iter):
"""
Execute SQL statement inserting data
Parameters
----------
table : pandas.io.sql.SQLTable
conn : sqlalchemy.engine.Engine or sqlalchemy.engine.Connection
keys : list of str
Column names
data_iter : Iterable that iterates the values to be inserted
"""
# gets a DBAPI connection that can provide a cursor
dbapi_conn = conn.connection
with dbapi_conn.cursor() as cur:
s_buf = StringIO()
writer = csv.writer(s_buf)
writer.writerows(data_iter)
s_buf.seek(0)
columns = ', '.join('"{}"'.format(k) for k in keys)
if table.schema:
table_name = '{}.{}'.format(table.schema, table.name)
else:
table_name = table.name
sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
table_name, columns)
cur.copy_expert(sql=sql, file=s_buf)
# Import the DataFrame.
pd_data.to_sql(
name="pd_data",
con=conn,
if_exists="append",
index=False,
method=psql_insert_copy
)
You can view historical queries in the HoloWeb console to check whether the DataFrame is imported to Hologres by executing the COPY FROM STDIN statement.