MaxCompute: MaxCompute client (odpscmd)

Last Updated: Oct 30, 2024

If you prefer a command-line interface (CLI) or want to run tasks quickly without a graphical user interface (GUI), we recommend that you use the MaxCompute client (odpscmd) to access a MaxCompute project and run commands. You can run the MaxCompute client on your on-premises machine to manage MaxCompute projects simply and efficiently. This topic describes how to download, install, configure, and run the MaxCompute client. It also covers common client operations, startup parameters, and frequently reported startup errors.

Prerequisites

Before you install the MaxCompute client, make sure that the following conditions are met:

  • Java 8 or later is installed on the machine on which you want to install the MaxCompute client.

  • A MaxCompute project is created.

    For more information about how to create a MaxCompute project, see Create a MaxCompute project.

  • The RAM user that you want to use with the MaxCompute client is added to the DataWorks workspace to which the MaxCompute project belongs.

    For more information about how to add members to a DataWorks workspace, see Grant permissions to a RAM user.

Limits

MaxCompute client V0.28.0 and later support Java Development Kit (JDK) 1.8 and JDK 1.9. The MaxCompute client of a version that is earlier than V0.28.0 supports only JDK 1.8. You can view the client version in the CLI after you start the MaxCompute client. For more information about how to start the MaxCompute client, see Run the MaxCompute client.
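
The version limits above can be checked before you start the client. The following Python sketch parses the text that java -version prints and compares it with these limits. The function names are hypothetical, the parser assumes the usual version "1.8.0_392" or version "9.0.4" pattern, and treating "Java 8 or later" as acceptable for V0.28.0 and later is an interpretation of the Prerequisites section, because this section names only JDK 1.8 and JDK 1.9.

```python
import re
import subprocess


def java_major_version(version_output):
    """Return the Java major version found in `java -version` text, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: "1.8.0_392" means Java 8. Newer scheme: "9.0.4" means Java 9.
    if first == 1 and second is not None:
        return int(second)
    return first


def client_supports_java(client_version, java_major):
    """Apply the documented limits. client_version is a tuple such as (0, 28, 0)."""
    if client_version < (0, 28, 0):
        return java_major == 8  # earlier clients support only JDK 1.8
    # Assumption: "Java 8 or later" from the Prerequisites section.
    return java_major >= 8


def installed_java_major():
    """Run `java -version` (it writes to stderr) and parse the result."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None  # corresponds to the "no java found" case
    return java_major_version(result.stderr or result.stdout)
```

For example, client_supports_java((0, 27, 1), 9) returns False, which matches the statement that versions earlier than V0.28.0 support only JDK 1.8.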

Billing rules

You are not charged for accessing a MaxCompute project on the MaxCompute client. However, you may be charged for the operations that you perform on the MaxCompute client. For example, if you execute an SQL statement to query data or write data on the MaxCompute client, the SQL statement consumes computing resources and the data that is written occupies the storage capacity. As a result, you are charged for data computing and storage. For more information about the billing rules of MaxCompute, see Overview.

Precautions

  • The output format of the MaxCompute client may not be forward compatible. The command syntax and execution rules of the client vary based on the client version. We recommend that you do not rely on the output format of the client to parse data. For more information about client versions, see aliyun-odps-console.

  • The first time you run the Tunnel Download command on the MaxCompute client, the MaxCompute client creates a folder named session in the client installation directory plugins/dship of your on-premises machine. The folder is used to store logs. If multiple users run the Tunnel Download command on the same machine, you can use one of the following methods to ensure data security:

    • Use the folder permission management feature provided by the machine to manage the permissions on the session folder.

    • Add the -sd <Name of the new session folder> or -session-dir <Name of the new session folder> parameter to the Tunnel Download command to download data to a different session folder. For more information about the Tunnel Download command, see Download.

  • Two consecutive minus signs (--) are used to comment out a command line on the MaxCompute client.

  • By default, the MaxCompute client uses UTF-8 encoding. In an on-premises environment that does not use UTF-8 encoding, garbled characters may appear in the following scenarios:

    • You query data in a MaxCompute table by using the MaxCompute client, and the returned values contain Chinese characters.

    • You run Tunnel commands on the MaxCompute client to upload local data files to MaxCompute.
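
To see whether the encoding precaution above is likely to apply to a machine, you can check the local encoding before you run the client. The following Python sketch is only a rough indicator: it reports the encoding that Python derives from the locale, which may differ from the code page that a Windows console actually uses. The function names are hypothetical.

```python
import codecs
import locale


def is_utf8(encoding_name):
    """Return True if the given encoding name is an alias of UTF-8."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        return False  # unknown encoding names are treated as non-UTF-8


def local_encoding_is_utf8():
    """Check the preferred encoding that Python reports for the current locale."""
    return is_utf8(locale.getpreferredencoding(False))


if not local_encoding_is_utf8():
    print("Local encoding is not UTF-8; client output may contain garbled characters.")
```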

Install and configure the MaxCompute client

Note

MaxCompute client V0.27.0 and later support the MaxCompute V2.0 data type edition. We recommend that you use the MaxCompute V2.0 data type edition. For more information about the data types that are supported by the MaxCompute V2.0 data type edition, see MaxCompute V2.0 data type edition.

To install and configure the MaxCompute client, perform the following steps:

  1. Download the MaxCompute client installation package from GitHub.

    Note
    • You can click the preceding link to download the latest version of the MaxCompute client installation package odpscmd_public.zip on the page that appears.

    • If you cannot download the package after you click the preceding link, click odpscmd_public_0.48.0 to download the package. If the download still fails, search online for solutions to the download failure.

  2. Decompress the downloaded package to obtain the bin, conf, lib, and plugins folders.

  3. Open the conf folder and configure the odps_config.ini file.

    In the odps_config.ini file, lines that start with a number sign (#) are comments. The following list describes the parameters in the odps_config.ini file. Each entry states whether the parameter is required, describes the parameter, and provides an example value.

    • project_name (Required: Yes)

      The name of the MaxCompute project that you want to access.

      If you create a workspace in standard mode, take note of the differences in the project names between the production environment and development environment when you configure this parameter. The name of the project for the development environment ends with _dev. For more information, see Differences between workspaces in basic mode and workspaces in standard mode.

      To obtain the name of the MaxCompute project, perform the following steps: Log on to the MaxCompute console. In the top navigation bar, select a region. In the left-side navigation pane, choose Workspace > Projects. On the Projects page, view the name of the MaxCompute project.

      Example: doc_test_dev

    • access_id (Required: Yes)

      The AccessKey ID of your Alibaba Cloud account or RAM user.

      You can click the profile picture in the upper-right corner of the MaxCompute console and select AccessKey Management to obtain the AccessKey ID.

      Example: None

    • access_key (Required: Yes)

      The AccessKey secret that corresponds to the AccessKey ID.

      You can click the profile picture in the upper-right corner of the MaxCompute console and select AccessKey Management to obtain the AccessKey secret.

      Example: None

    • end_point (Required: Yes)

      The endpoint of MaxCompute.

      You must configure this parameter based on the region and network connection method that you selected when you created the MaxCompute project. For more information about the endpoints that correspond to different regions and network connection types, see Endpoints.

      Important

      • The value of this parameter is the endpoint of MaxCompute, which is used to access MaxCompute rather than MaxCompute Tunnel.

      • If you specify an invalid endpoint, an error occurs when you access the MaxCompute project.

      Example: http://service.cn-hangzhou.maxcompute.aliyun.com/api

    • log_view_host (Required: No)

      The Uniform Resource Locator (URL) of LogView. You can view the detailed runtime information of a job by using this URL. This information helps you troubleshoot job errors. Set the value to http://logview.odps.aliyun.com.

      Note

      We recommend that you configure this parameter. If you do not configure this parameter, you cannot identify the cause of job errors.

      Example: http://logview.odps.aliyun.com

    • https_check (Required: No)

      Specifies whether to enable HTTPS access. If HTTPS access is enabled, requests to access MaxCompute projects are encrypted. Valid values:

      • True: HTTPS access is enabled.

      • False: HTTP access is enabled.

      Default value: False.

      Example: True

    • data_size_confirm (Required: No)

      The maximum size of input data. Unit: GB. You can set this parameter to any value. We recommend that you set this parameter to 100.

      Example: 100

    • update_url (Required: No)

      A reserved parameter.

      Example: None

    • use_instance_tunnel (Required: No)

      Specifies whether to use InstanceTunnel to download the execution results of SQL statements. Valid values:

      • True: InstanceTunnel is used to download the execution results of SQL statements.

      • False: InstanceTunnel is not used to download the execution results of SQL statements.

      Default value: False.

      Example: True

    • instance_tunnel_max_record (Required: No)

      The maximum number of records in the SQL execution results that the client can return. If the use_instance_tunnel parameter is set to True, you must configure this parameter. Maximum value: 10000.

      Example: 10000

    • tunnel_endpoint (Required: No)

      The public endpoint of MaxCompute Tunnel. If you do not configure this parameter, traffic is automatically routed to the Tunnel endpoint that corresponds to the network in which MaxCompute resides. If you configure this parameter, traffic is routed to the specified endpoint and automatic routing is not performed.

      For more information about the Tunnel endpoints that correspond to each region and network, see Endpoints.

      Example: http://dt.cn-hangzhou.maxcompute.aliyun.com

    • set.<key> (Required: No)

      The properties of the MaxCompute project.

      For more information about the properties of MaxCompute projects, see Properties.

      Example: set.odps.sql.decimal.odps2=true

    Note

    Make sure that the settings of the preceding parameters are valid. Invalid settings will result in a project connection failure.
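
    The parameters above can be assembled into a configuration file programmatically. The following Python sketch renders key=value lines, assuming the layout suggested by the set.odps.sql.decimal.odps2=true example. The function name, the placeholder credentials, and the validation rules are illustrative; the rules mirror only what this topic states (four required parameters, and instance_tunnel_max_record is required when use_instance_tunnel is True).

```python
REQUIRED_KEYS = ("project_name", "access_id", "access_key", "end_point")


def render_odps_config(settings):
    """Render settings as key=value lines in the style of odps_config.ini."""
    missing = [key for key in REQUIRED_KEYS if not settings.get(key)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    uses_tunnel = str(settings.get("use_instance_tunnel", "False")).lower() == "true"
    if uses_tunnel and "instance_tunnel_max_record" not in settings:
        raise ValueError(
            "instance_tunnel_max_record is required when use_instance_tunnel is True"
        )
    lines = ["# Lines that start with a number sign are comments."]
    lines.extend(f"{key}={value}" for key, value in settings.items())
    return "\n".join(lines) + "\n"


example = render_odps_config({
    "project_name": "doc_test_dev",
    "access_id": "<your AccessKey ID>",          # placeholder, not a real value
    "access_key": "<your AccessKey secret>",     # placeholder, not a real value
    "end_point": "http://service.cn-hangzhou.maxcompute.aliyun.com/api",
    "log_view_host": "http://logview.odps.aliyun.com",
    "https_check": "True",
    "use_instance_tunnel": "True",
    "instance_tunnel_max_record": "10000",
    "set.odps.sql.decimal.odps2": "true",
})
print(example)
```

    Write the rendered text to conf/odps_config.ini only after you replace the placeholders, and keep the file out of version control because it contains the AccessKey secret.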

Run the MaxCompute client

You can start the MaxCompute client by using one of the following methods:

Method 1: Start the MaxCompute client by using the script file of the installation package

In the bin folder in the installation directory of the MaxCompute client, double-click the odpscmd.bat file on Windows or the odpscmd file on macOS to start the MaxCompute client. If the information shown in the following figure is returned, the MaxCompute project is connected.

[Figure: output of a successful MaxCompute client startup]

Method 2: Start the MaxCompute client by using the CLI of the system

In the CLI of the system, go to the bin folder in the installation directory of the MaxCompute client and run the odpscmd command for the Windows OS or run the sh odpscmd command for macOS or Linux OS to start the MaxCompute client. If the information shown in the following figure is returned, the MaxCompute project is connected.

Note

In Ubuntu, the sh odpscmd command returns an error. Run the ./odpscmd command instead to start the MaxCompute client.

[Figure: output of a successful MaxCompute client startup]

If you start the MaxCompute client by using the CLI of the system, you can specify parameters to run commands. For more information, see Specify startup parameters.

Perform operations on the MaxCompute client

Obtain the help information about all commands

You can obtain help information about the commands of the MaxCompute client by using one of the following methods:

View the help information about commands on the MaxCompute client

  • View the help information about all commands.

    odps@project_name>help;
    -- The preceding command is equivalent to the following command:
    odps@project_name>h;

  • Specify a keyword to view the help information about the related commands.

    Example: Obtain the help information about the commands that are related to table operations.

    odps@project_name>help table;
    -- The following results are returned:
    Usage: alter table <tablename> merge smallfiles
    Usage: export table <tablename>
    Usage: show tables [in <project_name>] [like '<prefix>']
           list|ls tables [-p,-project <project_name>]
    Usage: describe|desc [<projectname>.]<tablename> [partition(<spec>)]
    Usage: read [<project_name>.]<table_name> [(<col_name>[,..])] [PARTITION (<partition_spec>)] [line_num]

    Important

    The read command uses the SQL syntax. For more information about the billing method for SQL jobs, see Overview.

View the help information about all commands in the CLI of the system

In the CLI of the system, go to the bin folder in the installation directory of the MaxCompute client, and run the following command to view the help information about all commands. If you start the MaxCompute client by using the CLI of the system, you can specify a series of parameters in the commands. For more information about these parameters, see Specify startup parameters.

...\odpscmd\bin>odpscmd -h

Obtain information about the current logon user

You can run the following command in the CLI to obtain the information about the current logon user:

odps@project_name>whoami;

Description of the returned result:

  • Name: the account of the current logon user.

  • Source IP: the IP address of the machine on which the MaxCompute client resides.

  • End_Point: the endpoint of MaxCompute.

  • Project: the name of the MaxCompute project.

  • Schema: the schema information about the MaxCompute project.

Exit the MaxCompute client

You can run the following command in the CLI to exit the MaxCompute client:

odps@project_name>quit;
-- The preceding command is equivalent to the following command: 
odps@project_name>q;

What to do next

After you log on to the MaxCompute client, you can run SQL commands in the MaxCompute project. For more information, see Use the MaxCompute client.

Note

For more information about the command syntax supported by the MaxCompute client, see Common commands or SQL commands and functions.

FAQ

After the odps_config.ini file is configured, the following common errors may be reported when you start the MaxCompute client:

  • Error message: no java found

    • Possible cause

      The Java development environment is not installed on the machine where the MaxCompute client is deployed.

    • Solution

      Install the Java development environment on the machine where the MaxCompute client is deployed and specify environment variables.

      Note

      The MaxCompute client of version 0.28.0 or later supports JDK 1.9. The MaxCompute client of a version earlier than 0.28.0 supports only JDK 1.8.

  • Error message: Error: Cannot find or load the main class com.aliyun.openservices.odps.console.ODPSConsole

    • Possible cause

      The MaxCompute client installation package may have been downloaded twice. The second time the package is downloaded, the directory is automatically renamed odpscmd_public (1). This directory name contains invalid characters such as spaces. As a result, the system cannot resolve the directory path correctly and returns an error.

    • Solution

      Remove invalid characters such as spaces from the name of the directory.

  • Error message: Accessing project '<projectname>' failed: ODPS-0420111: Project not found - '<projectname>'.

    • Possible cause

      The project name in the odps_config.ini configuration file is invalid.

    • Solution

      Log on to the MaxCompute console. In the top navigation bar, select a region. In the left-side navigation pane, choose Workspace > Projects. On the Projects page, obtain the name of the MaxCompute project. Then, modify the odps_config.ini file.

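  • Error message: Accessing project '<projectname>' failed: ODPS-0420095: Access Denied - Authorization Failed [4002], You don't exist in project <projectname>.

    • Possible cause

      As the error message indicates, the account that you use to access the project is not a member of the project.

    • Solution

      Add the account to the project. For a RAM user, see the Prerequisites section of this topic.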

  • Error message: Accessing project '<projectname>' failed: { "Code": "InvalidProjectTable", "Message": "The specified project or table name is not valid or missing."} or Accessing project '<projectname>' failed: connect timed out

    • Possible cause

      The value of the end_point parameter is invalid. For example, you want to use the MaxCompute client on your on-premises machine to connect to a MaxCompute project over the Internet, but you enter the endpoint of the cloud product interconnection network of Alibaba Cloud or the endpoint of MaxCompute Tunnel.

    • Solution

      Set the end_point parameter to the endpoint that matches the region and network environment of the project to which you want to connect. You can obtain the endpoint from Endpoints.

      Note

      The value of the end_point parameter is the endpoint of MaxCompute, which is used to access MaxCompute rather than MaxCompute Tunnel.

  • Error message: Accessing project '<projectname>' failed: <endpoint>

    • Possible cause

      The value of the end_point parameter is invalid. For example, cn in http://service.cn-hangzhou.maxcompute.aliyun.com/api is mistakenly entered as ch.

    • Solution

      Copy the endpoint that matches the region and network environment of the project from Endpoints. We recommend that you do not manually enter the endpoint.
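
The endpoint mistakes described in the two preceding entries can be caught with a quick check before you start the client. The following Python sketch is a heuristic built only from the two example endpoints in this topic (a service. host with the /api path for MaxCompute, and a dt. host for MaxCompute Tunnel). The function name is hypothetical. An empty result does not prove that the endpoint matches your region and network, and the Endpoints topic remains the authoritative list.

```python
from urllib.parse import urlparse


def endpoint_warnings(end_point):
    """Return a list of warnings for an end_point value (heuristic only)."""
    warnings = []
    parsed = urlparse(end_point.strip())
    host = parsed.hostname or ""
    if parsed.scheme not in ("http", "https"):
        warnings.append("endpoint should start with http:// or https://")
    if not host:
        warnings.append("endpoint has no host name")
    if host.startswith("dt."):
        # The documented Tunnel example is http://dt.cn-hangzhou.maxcompute.aliyun.com
        warnings.append("host starts with 'dt.', which looks like a Tunnel endpoint")
    if host.startswith("service.ch-"):
        warnings.append("region prefix 'ch-' looks like a typo for 'cn-'")
    if parsed.path.rstrip("/") != "/api":
        warnings.append("path is not /api as in the documented example")
    return warnings


for message in endpoint_warnings("http://service.ch-hangzhou.maxcompute.aliyun.com/api"):
    print("Warning:", message)
```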

Specify startup parameters

In the CLI of the system, you can specify a series of parameters to run a command. The following code shows the usage of these parameters.

Usage: odpscmd [OPTION]...
where options include:
    --help                                  (-h)for help
    --config=<config_file>                  specify another config file
    --project=<prj_name>                    use project
    --endpoint=<http://host:port>           set endpoint
    -k <n>                                  will skip begining queries and start from specified position
    -r <n>                                  set retry times
    -f <"file_path;">                       execute command in file
    -e <"command;[command;]...">            execute command, include sql command

The following list describes the startup parameters and provides a sample command for each parameter.

  • --help or -h

    Obtains the help information about all commands of the MaxCompute client.

    Sample command: odpscmd --help

  • --config

    Specifies the path of the configuration file odps_config.ini. The default path is odpscmd_public/conf/odps_config.ini.

    Sample command: odpscmd --config=D:/odpscmd/conf/odps_config.ini

  • --project

    Specifies the name of the MaxCompute project that you want to access.

    Sample command: odpscmd --project=doc_test

  • --endpoint

    Specifies the endpoint of MaxCompute. For more information about the endpoints of MaxCompute, see Endpoints.

    Sample command: odpscmd --endpoint=http://service.cn-shanghai.maxcompute.aliyun.com/api

  • -k

    Executes statements starting from a specific position. If n is set to a value that is less than or equal to 0, the execution starts from the first statement. Multiple statements are separated by semicolons (;).

    Sample command (ignore the first two statements and start from the third statement): odpscmd -k 3 -e "drop table table_name;create table table_name (dummy string);insert overwrite table table_name select count(*) from table_name;"

  • -r

    Specifies the number of retries that are allowed after a job fails to run.

    Sample command: odpscmd -r 2 -e "select * from sale_detail;select * from table_test;"

  • -f

    Specifies a script file that contains the commands to run. Example:

    1. Prepare a script file named script.txt. In this example, the file is stored in drive D and contains the following data:

       drop table if exists test_table_mj;
       create table test_table_mj (id string, name string);
       drop table test_table_mj;

    2. In the CLI, go to the bin folder in the installation directory of the MaxCompute client and run the following command:

       ..\odpscmd\bin>odpscmd -f D:/script.txt;

  • -e

    Specifies the command that you want to run.

    Sample command: odpscmd -e "select * from sale_detail;"
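
The position rule of the -k parameter can be illustrated with a short sketch. The following Python function models only the documented behavior (positions are counted from 1, and a value less than or equal to 0 starts from the first statement). It is not the client's implementation, the function name is hypothetical, and the naive split on semicolons ignores semicolons inside string literals.

```python
def statements_run_with_k(command_text, k):
    """Return the statements that remain when execution starts at position k."""
    # Naive split: assumes that no statement contains a semicolon in a literal.
    statements = [part.strip() for part in command_text.split(";") if part.strip()]
    start = max(k, 1) - 1  # k <= 0 starts from the first statement
    return statements[start:]


commands = (
    "drop table table_name;"
    "create table table_name (dummy string);"
    "insert overwrite table table_name select count(*) from table_name;"
)
# With -k 3, the first two statements are skipped.
print(statements_run_with_k(commands, 3))
```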

A shell script that is run in a shell window or the Command Prompt in Windows can capture the output of an odpscmd -e <SQL statement> command in a variable and use the value in subsequent jobs. In this scenario, only field values are needed. Other information, such as the runtime information and headers, must not appear in the output. To facilitate shell calls, you must set the use_instance_tunnel parameter in the odps_config.ini file to false to disable InstanceTunnel. You can run the set odps.sql.select.output.format={"needHeader":false,"fieldDelim":" "}; command to disable header display.

For example, a table named noheader contains one column and three rows of data. The field values are 1, 2, and 3. After you run the following command to redirect the standard output of the query to a file, the file contains only field values.

-- Run the following command in the Command Prompt in Windows: 
...\odpscmd\bin>odpscmd -e "set odps.sql.select.output.format={""needHeader"":false,""fieldDelim"":"" ""};select * from noheader;" >D:\test.txt
-- The returned results are stored in the test.txt file in drive D. 

-- Run the following command in a shell window: 
/Users/.../odpscmd/bin/odpscmd -e "set odps.sql.select.output.format={\"needHeader\":false,\"fieldDelim\":\" \"};select * from noheader;" >/Users/A/temp/test.txt
-- The returned results are stored in the test.txt file in the /Users/A/temp directory. 
-- The following results are returned: 
1
2
3
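
The same call can be made from a script without shell quoting. The following Python sketch builds the odpscmd -e argument list with the output format setting shown above and splits the captured standard output into field values. The function names and the odpscmd path are hypothetical. The sketch assumes that use_instance_tunnel is set to false as described above, and run_query works only on a machine with a configured client.

```python
import json
import subprocess


def build_query_args(odpscmd_path, sql, field_delim=" "):
    """Build the argument list for a header-free `odpscmd -e` query."""
    output_format = json.dumps(
        {"needHeader": False, "fieldDelim": field_delim}, separators=(",", ":")
    )
    command = f"set odps.sql.select.output.format={output_format};{sql.rstrip(';')};"
    # Passing a list avoids the platform-specific quote escaping shown above.
    return [odpscmd_path, "-e", command]


def parse_field_values(stdout_text, field_delim=" "):
    """Split each non-empty output line into its field values."""
    return [line.split(field_delim) for line in stdout_text.splitlines() if line.strip()]


def run_query(odpscmd_path, sql):
    """Run the query and return the parsed rows (requires a configured client)."""
    result = subprocess.run(
        build_query_args(odpscmd_path, sql), capture_output=True, text=True, check=True
    )
    return parse_field_values(result.stdout)


print(build_query_args("odpscmd", "select * from noheader;"))
```

Because the output format of the client may change between versions, as noted in the Precautions section, treat the parsing step as best effort.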

Version updates

The following list describes the latest version updates of the MaxCompute client. Each entry shows the version, the operation type, and a description. For more information, click the URL of a specific version.

  • v0.47.0-public (New feature)

    • HTTP commands: HTTP commands are added to help users easily initiate HTTP requests as the current user.

    • Session persistence variable: The keep-session-variables startup parameter is added to manage session persistence. If you use this parameter, the previously configured session variables are not deleted when you run the USE command to switch projects. For example, the set a=b configuration is retained even if you switch between projects.

    • Upgrade of Tunnel commands:

      • The -qn option is added to the Tunnel Upload and Tunnel Download commands to specify the name of the Tunnel quota that is used to access MaxCompute.

      • The -dfp option is added to the Tunnel Upload command to specify the format of the DATETIME text that you want to upload.

      For more information about Tunnel commands, see Tunnel commands.

  • v0.46.5-public (New feature)

    • The USE command can retain the session-level flag parameters that were previously configured.

    • Upgrade of Tunnel commands:

      • The Tunnel Upsert command is supported. For more information, see Tunnel commands.

      • When you download a table with row-level access control, the execution information about the data filtering SQL statement is returned.

  • v0.45.1-public (New feature)

    The JSON and TIMESTAMP_NTZ data types of MaxCompute are supported.

  • v0.43.2-public (New feature)

    • External volumes can be created.

    • SHOW commands can be used to query all built-in functions or the built-in functions that meet specific rules in a MaxCompute project.

  • v0.40.10-public (Fixed issue)

    The dependency on log4j is removed.

  • v0.40.8-public (New feature)

    Project data can be stored by schema. For more information, see Schema-related operations.

  • v0.37.5-public (New feature)

    Data of complex data types can be uploaded or downloaded by running Tunnel commands.

  • v0.37.4-public (Enhanced feature)

    • The help information about commands is optimized.

    • The desc extended partition statement returns more information about the properties of partitions.

  • v0.36.0-public

    • New feature: An external project can be created and connected to Data Lake Formation (DLF). This helps implement the data lakehouse feature.

    • Fixed issue: The issue that caused the nanosecond part of data of the TIMESTAMP type to be incorrectly processed is fixed.