MaxCompute: Install PyODPS

Last Updated: Dec 19, 2024

PyODPS is the MaxCompute SDK for Python. It provides a DataFrame framework and basic operations on MaxCompute objects so that you can analyze MaxCompute data in Python. You can use PyODPS in DataWorks or in an on-premises environment. This topic describes how to install PyODPS in an on-premises environment.

Prerequisites

Before installing PyODPS, ensure Python 3.6 or later is installed on your system. For installation instructions, see Install Python.

Installation procedure

  1. Open the command line interface.

  2. Execute the command below to install PyODPS.

    pip install pyodps

  3. Verify the installation by running the following command. If the command prints nothing and reports no errors, the installation succeeded.

    python -c "from odps import ODPS"
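The same check can also be made from inside Python, which additionally reports the installed version. The following is a minimal sketch rather than an official tool: the helper name `installed_pyodps_version` is hypothetical, and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib import metadata
from importlib.util import find_spec


def installed_pyodps_version():
    """Return the installed PyODPS version string, or None if it is not installed."""
    # PyODPS is imported as the top-level package "odps".
    if find_spec("odps") is None:
        return None
    try:
        # "pyodps" is the distribution name used on the pip command line.
        return metadata.version("pyodps")
    except metadata.PackageNotFoundError:
        # The package is importable but no "pyodps" distribution metadata was found.
        return "unknown"


if __name__ == "__main__":
    version = installed_pyodps_version()
    if version is None:
        print("PyODPS is not installed in this interpreter")
    else:
        print("PyODPS version:", version)
```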

If the installation fails while building dependencies such as NumPy or PyArrow, the failure is usually a C compilation error, and an outdated version of pip or setuptools is a likely cause. In that case, upgrade both tools with the following command and then install PyODPS again:

pip install -U pip setuptools

If the Python interpreter that you want to use is not the system default, run pip through that interpreter so that PyODPS is installed for the intended version:

# /home/tops/bin/python3.7 is the path of the installed Python. Replace it with your own path.
/home/tops/bin/python3.7 -m pip install pyodps
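When several Python versions are installed, it can help to confirm which interpreter is actually running and to build the matching pip command from it. The following is an illustrative sketch; the function name `pip_install_command` is hypothetical and not part of PyODPS.

```python
import sys


def pip_install_command(package="pyodps", python=sys.executable):
    """Build the pip command that installs `package` for one specific interpreter."""
    # Running pip as "python -m pip" ties the installation to that interpreter
    # instead of whichever "pip" happens to be first on PATH.
    return [python, "-m", "pip", "install", package]


if __name__ == "__main__":
    print("Running under:", sys.executable)
    print("Python version:", ".".join(map(str, sys.version_info[:3])))
    print("Install with:", " ".join(pip_install_command()))
```

Run the script with the interpreter you intend to use. The printed command then targets exactly that interpreter.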

If the installation reports the error urllib3 v2.0 only supports OpenSSL 1.1.1+, your Python uses an OpenSSL version older than the one that urllib3 2.0 and later require. To resolve this issue, install an earlier version of urllib3 before you install PyODPS:

pip install "urllib3<2.0"
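To see in advance whether this error applies to your interpreter, you can inspect the OpenSSL version that Python's standard `ssl` module reports. The following sketch assumes that a simple comparison against 1.1.1, the minimum named in the error message, is a sufficient check; the helper name `needs_urllib3_pin` is hypothetical.

```python
import ssl


def needs_urllib3_pin(openssl_version_info=ssl.OPENSSL_VERSION_INFO):
    """Return True if the given OpenSSL version is older than 1.1.1."""
    # OPENSSL_VERSION_INFO is a tuple of integers: (major, minor, fix, patch, status).
    return tuple(openssl_version_info[:3]) < (1, 1, 1)


if __name__ == "__main__":
    print("Linked against:", ssl.OPENSSL_VERSION)
    if needs_urllib3_pin():
        print('Install an earlier urllib3 first: pip install "urllib3<2.0"')
    else:
        print("This OpenSSL version meets the urllib3 v2.0 requirement")
```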

What to do next

Initialize the MaxCompute entry point.

import os
from odps import ODPS
# Before you run this code, set the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID to your AccessKey ID
# and the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET to your AccessKey secret.
o = ODPS(
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
    os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
    project='your-default-project',
    endpoint='your-end-point',
)

Explanation:

  • ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET: Set these environment variables to the AccessKey ID and AccessKey secret of your Alibaba Cloud account. For details on setting environment variables, see Configure environment variables on Linux, macOS, and Windows systems.

    Note

    We recommend that you read the AccessKey ID and AccessKey secret from environment variables instead of writing them directly into your code.

  • your-default-project and your-end-point: Replace these placeholders with the name of your default project and the endpoint that you want to connect to. For the endpoints of each region, see Endpoint.
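Because `os.getenv` returns None when a variable is not set, a missing AccessKey may only surface later as an authentication error. One way to fail earlier is to validate both variables before the entry object is created. The following is a minimal sketch; the helper name `load_credentials` is hypothetical and not part of PyODPS.

```python
import os


def load_credentials(environ=os.environ):
    """Read the AccessKey pair from the environment, failing early if either is missing."""
    names = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    # Treat unset and empty variables the same way.
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(environ[name] for name in names)
```

The returned pair can be passed as the first two arguments of `ODPS(...)` in place of the two `os.getenv` calls.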

After you complete this configuration, PyODPS is ready to use in your on-premises environment. Through the entry object, you can perform basic operations on MaxCompute objects, such as listing them, retrieving them, checking whether they exist, creating them, and deleting them. For more information about PyODPS, see Overview of Basic Operations and Overview of DataFrame.
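As an illustration of these basic operations applied to tables, the following sketch creates, checks, retrieves, lists, and deletes one small table. It assumes that `o` is an initialized entry object with permission to create tables in the default project. The function name `table_roundtrip`, the table name, and the column definitions are hypothetical examples.

```python
def table_roundtrip(o, name="pyodps_install_check"):
    """Create, inspect, and drop a small table using an initialized entry object `o`."""
    # create: the schema is given as "column type" pairs in a single string
    o.create_table(name, "id bigint, label string", if_not_exists=True)
    # exist: check that the table is now present in the default project
    existed = o.exist_table(name)
    # retrieve: fetch the table object
    table = o.get_table(name)
    # list: iterate over the tables whose names start with `name`
    listed = any(t.name == name for t in o.list_tables(prefix=name))
    # delete: drop the table so that the check leaves nothing behind
    o.delete_table(name, if_exists=True)
    return existed, table.name, listed
```

Calling `table_roundtrip(o)` against a real project creates and then drops a table, so run it only in a project where that is acceptable.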

Note

Unless otherwise specified, the object o in this topic refers to the MaxCompute entry object initialized above.