
MaxCompute: Install PyODPS

Last Updated: Dec 12, 2024

PyODPS is the MaxCompute SDK for Python. It provides a DataFrame framework and basic operations on MaxCompute objects so that you can analyze MaxCompute data by using Python. You can use PyODPS in DataWorks or in an on-premises environment. This topic describes how to install PyODPS in an on-premises environment.

Prerequisites

Python is installed and its version meets the requirements. We recommend that you use Python 3.6 or later. For more information, see Install Python.

Procedure

  1. Open a terminal in which Python and pip are available.

  2. Run the following command to install PyODPS:

    pip install pyodps
  3. Run the following command to check whether the installation is successful. If no output is returned and no error is reported, the installation is successful.

    python -c "from odps import ODPS"
  4. If the Python version that you use is not the system default version, run the following command with that Python interpreter after pip is installed:

    /home/tops/bin/python3.7 -m pip install "setuptools>=3.0"
    # /home/tops/bin/python3.7 is the path of the Python interpreter that you use.
    # The quotation marks prevent the shell from treating > as output redirection.
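
The check in step 3 can also be run from a script, which is useful when several Python interpreters are installed. The following is a minimal sketch that uses only the standard library plus the `odps` module name from step 3. The helper name `pyodps_status` is hypothetical, and the sketch assumes that the package exposes a `__version__` attribute, so that attribute is read defensively.

```python
import importlib
import importlib.util
import sys


def pyodps_status():
    """Return (installed, version) for the odps package.

    pyodps_status is a hypothetical helper name used for illustration.
    """
    # find_spec returns None when the top-level module cannot be found.
    if importlib.util.find_spec("odps") is None:
        return False, None
    module = importlib.import_module("odps")
    # Assumption: the package exposes __version__; fall back to None if not.
    return True, getattr(module, "__version__", None)


if __name__ == "__main__":
    # Show which interpreter ran the check, because pip installs per interpreter.
    print("Interpreter:", sys.executable)
    installed, version = pyodps_status()
    if installed:
        print("PyODPS is installed, version:", version)
    else:
        print("PyODPS is not installed for this interpreter.")
```

If the script reports that PyODPS is missing even though `pip install pyodps` succeeded, pip may have installed the package for a different interpreter than the one printed on the first line.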

What to do next

  1. Initialize the MaxCompute entry point.

    import os
    from odps import ODPS
    # Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_ID to the AccessKey ID of your Alibaba Cloud account. 
    # Set the environment variable ALIBABA_CLOUD_ACCESS_KEY_SECRET to the AccessKey secret of your Alibaba Cloud account. 
    o = ODPS(
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID'),
        os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET'),
        project='your-default-project',
        endpoint='your-end-point',
    )

    Description:

    • ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET: Set these two environment variables to the AccessKey ID and AccessKey secret of your Alibaba Cloud account, respectively. For information about how to set environment variables, see Configure environment variables in Linux, macOS, and Windows.

      Note

      We recommend that you read the AccessKey ID and AccessKey secret from environment variables rather than writing them directly in your code.

    • your-default-project and your-end-point: Replace them with the name of your default project and the endpoint that you use. For more information about the endpoints of each region, see Endpoints.
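
If either environment variable is unset, `os.getenv` returns `None` and the problem surfaces only later, when a request fails. The following is a minimal sketch of checking the variables before the entry point is created. The helper names `missing_credentials` and `create_entry` are hypothetical. The `ODPS(...)` call mirrors the one shown in step 1.

```python
import os

# The two environment variable names used in this topic.
REQUIRED_VARS = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def create_entry(project, endpoint, env=None):
    """Create the MaxCompute entry object after validating the credentials.

    Hypothetical helper: it wraps the ODPS(...) call from step 1.
    """
    env = os.environ if env is None else env
    missing = missing_credentials(env)
    if missing:
        raise RuntimeError(
            "Set these environment variables first: " + ", ".join(missing)
        )
    # Imported here so that the check above also works before PyODPS is installed.
    from odps import ODPS

    return ODPS(
        env[REQUIRED_VARS[0]],
        env[REQUIRED_VARS[1]],
        project=project,
        endpoint=endpoint,
    )
```

For example, `o = create_entry('your-default-project', 'your-end-point')` raises a `RuntimeError` that names the missing variables instead of passing `None` to `ODPS`.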

After you complete the preceding configurations, you can use PyODPS in your on-premises environment. For example, you can perform basic operations on MaxCompute objects, such as listing, getting, creating, and deleting them, and checking whether they exist. For more information about how to use PyODPS, see Overview of basic operations and Overview of DataFrame.
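
As an illustration of the list and exist operations on tables, the following is a minimal sketch. It assumes that `o` is the entry object created in step 1 and uses the PyODPS methods `list_tables` and `exist_table`. The helper names `table_names` and `table_exists`, and the table name in the usage note, are hypothetical.

```python
def table_names(o, prefix=""):
    """Return the sorted names of tables in the default project.

    The prefix filter is applied on the client side to keep the sketch simple.
    """
    return sorted(t.name for t in o.list_tables() if t.name.startswith(prefix))


def table_exists(o, name):
    """Return True if a table with the given name exists in the default project."""
    return bool(o.exist_table(name))
```

With a configured entry object, `table_names(o, 'tmp_')` lists the tables whose names start with `tmp_`, and `table_exists(o, 'my_table')` reports whether the hypothetical table `my_table` is present. Both calls send requests to MaxCompute, so they require valid credentials and a reachable endpoint.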

Note

Unless otherwise specified, the o object in this topic is the MaxCompute entry object, which is an instance of the ODPS class.