PyODPS is the MaxCompute SDK for Python. It provides the DataFrame framework and basic operations on MaxCompute objects so that you can analyze data in MaxCompute by using Python. You can use PyODPS in DataWorks or in an on-premises environment. This topic describes how to install PyODPS in an on-premises environment.
Prerequisites
A supported version of Python is installed. We recommend that you use Python 3.6 or later. For more information, see Install Python.
Procedure
Open the command line interface.
Run the following command to install PyODPS:
pip install pyodps
Run the following command to check whether the installation is successful. If no result is returned and no error is reported, the installation is successful.
python -c "from odps import ODPS"
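The same check can also be scripted, for example in a setup script. The following is a minimal sketch: the helper name is_installed is hypothetical, and it only tests whether the current interpreter can locate the odps package, not whether PyODPS can connect to a MaxCompute project.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# "odps" is the import name of the PyODPS package.
if is_installed("odps"):
    print("PyODPS is installed.")
else:
    print("PyODPS is not installed. Run: pip install pyodps")
```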
If the installation fails on dependencies such as NumPy or PyArrow, in most cases with C code compilation errors, the cause may be an outdated version of pip or setuptools. In this case, run the following command to upgrade both tools, and then install PyODPS again:
pip install -U pip setuptools
If the Python version that you want to use is not the default version of the system, run pip through the required interpreter so that PyODPS is installed for that version:
/home/tops/bin/python3.7 -m pip install pyodps
# /home/tops/bin/python3.7 is the path of the installed Python.
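If you trigger the installation from a Python script, the same idea applies: invoke pip as a module of the interpreter that should receive the package. The following sketch only builds the command. The helper name pip_install_command is hypothetical, and the interpreter path is the example path used above.

```python
import sys


def pip_install_command(package, python_path=None):
    """Build the argument list that installs `package` for a specific interpreter."""
    # Fall back to the interpreter that runs this script.
    interpreter = python_path or sys.executable
    return [interpreter, "-m", "pip", "install", package]


print(" ".join(pip_install_command("pyodps", "/home/tops/bin/python3.7")))
# prints: /home/tops/bin/python3.7 -m pip install pyodps
```

The returned list can be passed to subprocess.run to perform the installation.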
If the error urllib3 v2.0 only supports OpenSSL 1.1.1+ is reported during installation, your Python is built against an outdated OpenSSL version that the current urllib3 version does not support. To resolve this issue, install a urllib3 version earlier than 2.0 before you install PyODPS:
pip install "urllib3<2.0"
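To find out in advance whether your interpreter is affected, you can inspect the OpenSSL version that Python was built against. This is a minimal sketch that uses the standard ssl module. The helper name needs_urllib3_pin is hypothetical, and the check compares version numbers only, so it does not detect non-OpenSSL builds such as LibreSSL.

```python
import ssl


def needs_urllib3_pin(openssl_version_info=None):
    """Return True if the given OpenSSL version is earlier than 1.1.1."""
    if openssl_version_info is None:
        openssl_version_info = ssl.OPENSSL_VERSION_INFO
    # Compare (major, minor, fix) only; the remaining fields are patch and status.
    return tuple(openssl_version_info[:3]) < (1, 1, 1)


print(ssl.OPENSSL_VERSION)
if needs_urllib3_pin():
    print('Run: pip install "urllib3<2.0"')
```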
What to do next
Initialize the MaxCompute entry point.
import os
from odps import ODPS
# Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID to the AccessKey ID of your Alibaba Cloud account.
# Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET to the AccessKey secret of your Alibaba Cloud account.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='your-default-project',
    endpoint='your-end-point',
)
Description:
ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET: Set these two environment variables to the AccessKey ID and AccessKey secret of your Alibaba Cloud account, respectively. For information about how to set environment variables, see Configure environment variables in Linux, macOS, and Windows.
Note: We recommend that you use the environment variables rather than writing the AccessKey ID and AccessKey secret directly in your code.
your-default-project and your-end-point: Replace them with the default project name and endpoint. For more information about the endpoints of each region, see Endpoint.
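Because os.getenv returns None when a variable is not set, a missing variable may surface only later as a less obvious error. The following sketch checks both variables before the entry point is created. The helper name load_credentials and its error message are hypothetical and not part of PyODPS.

```python
import os


def load_credentials(env=None):
    """Read the AccessKey pair from environment variables and fail early if unset."""
    env = os.environ if env is None else env
    names = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    # Treat both unset and empty values as missing.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]


# Usage (requires PyODPS and valid credentials):
#   from odps import ODPS
#   access_id, access_secret = load_credentials()
#   o = ODPS(access_id, access_secret,
#            project='your-default-project', endpoint='your-end-point')
```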
After you complete the preceding configurations, you can use PyODPS in your on-premises environment. For example, you can perform basic operations on MaxCompute objects, such as list, get, exist, create, and delete. For more information about how to use PyODPS, see Overview of basic operations and Overview of DataFrame.
Unless otherwise specified, the o object in this topic is the MaxCompute entry object that is created by calling ODPS(...).
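As an illustration of the list, get, exist, create, and delete operations, the following sketch applies them to a table. It assumes that o is an initialized entry object and that your account is allowed to create and delete tables in the default project. The function name table_roundtrip, the table name, and the schema are hypothetical.

```python
from itertools import islice


def table_roundtrip(o, table_name="pyodps_demo_table"):
    """Create, check, fetch, list, and delete a table by using the entry object o."""
    # create: the schema is given as comma-separated "column type" pairs.
    o.create_table(table_name, "id bigint, name string", if_not_exists=True)
    # exist: returns True if the table is present in the default project.
    existed = o.exist_table(table_name)
    # get: returns a table object.
    table = o.get_table(table_name)
    print("Fetched table:", table.name)
    # list: look at no more than five tables to keep the example fast.
    for t in islice(o.list_tables(), 5):
        print("Found table:", t.name)
    # delete: clean up the demo table.
    o.delete_table(table_name, if_exists=True)
    return existed
```

Call table_roundtrip(o) only against a project in which creating and dropping a test table is acceptable.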