All Products
Search
Document Center

MaxCompute:SDK for Java

最終更新日:Oct 15, 2024

The Java SDK is a set of Java programming language interfaces provided by MaxCompute. Through these interfaces, you can use Java code to operate and manage MaxCompute services, such as accessing and managing projects, manipulating data tables, data transfer, and function management. This topic describes the packages that MaxCompute SDK for Java provides and the core classes of the SDK, such as instances, resources, tables, and functions.

Note

If you use MaxCompute SDK for Java to access the computing and storage services provided by MaxCompute, you are charged in the same way as you access the services by using other methods. For more information, see

Storage pricing (pay-as-you-go), Computing pricing, and Download pricing (pay-as-you-go).

Background information

This topic describes the core classes of MaxCompute SDK for Java. For more information about the SDK, see the ODPS SDK Core 0.45.1-public API.

You can use Maven to configure the version of the SDK that you want to use. For example, to specify the version of odps-sdk-core to use, add the following Maven dependency:

<dependency>
  <groupId>com.aliyun.odps</groupId>
  <artifactId>odps-sdk-core</artifactId>
  <version>X.X.X-public</version>
</dependency>
Note
  • Only the SDK of the 0.27.2-public version or later supports the new data types that are introduced in the MaxCompute V2.0 data type edition.

  • You can obtain the latest version of the SDK by searching for odps-sdk-core at search.maven.org.

The following table describes the packages that MaxCompute SDK for Java provides.

Package name

Description

odps-sdk-core

Provides the basic features of MaxCompute, such as the features that allow you to manage tables and projects. You can also use this package to access the Tunnel service.

odps-sdk-commons

Provides common utilities.

odps-sdk-udf

Provides user-defined function (UDF) features.

odps-sdk-mapred

Provides MapReduce features.

odps-sdk-graph

Provides Graph SDK for Java. You can obtain the SDK by searching for odps-sdk-graph.

AliyunAccount

You can call this API to obtain your Alibaba Cloud account. The constructor of the AliyunAccount class uses an AccessKey ID and an AccessKey secret as the input parameters. An AccessKey ID and an AccessKey secret constitute an AccessKey pair that is used to authenticate an Alibaba Cloud user. This API is also considered as a class that is used to initialize MaxCompute.

ODPS

The Odps class is the entry point of MaxCompute SDK for Java. The Odps class allows you to access all types of objects in MaxCompute, including projects, tables, resources, functions, and instances.

You can pass an AliyunAccount object to the constructor of the Odps class to create an Odps object. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the Resource Access Management (RAM) console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
odps.setDefaultProject("my_project");
for (Table t : odps.tables()) {
    ....
}              

MaxCompute Tunnel

MaxCompute Tunnel is developed based on Tunnel SDK. You can use MaxCompute Tunnel to upload data to and download data from MaxCompute. For more information, see MaxCompute Tunnel overview. MaxCompute Tunnel allows you to upload and download only tables, but not views.

MapReduce

For more information about MapReduce SDK supported by MapReduce, see Overview.

Projects

The Projects class defines a collection of all the projects in MaxCompute. Each MaxCompute project is an element in the collection. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
Project p = odps.projects().get("my_exists");
p.reload();
...

Project

The Project class defines the details of a project. You can obtain a specified project by calling the get method of the Projects class.

SQLTask

The SQLTask class allows you to run and process SQL statements in MaxCompute. You can execute SQL statements by calling the run method of the SQLTask class.

The run method returns an Instance object. You can obtain the status and execution result of the SQL statement from the Instance object. The following code shows an example:

import java.util.List;
import com.aliyun.odps.Instance;
import com.aliyun.odps.Odps;
import com.aliyun.odps.OdpsException;
import com.aliyun.odps.account.Account;
import com.aliyun.odps.account.AliyunAccount;
import com.aliyun.odps.data.Record;
import com.aliyun.odps.task.SQLTask;
public class TestSql {
        // The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
				// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
				// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
				private static String accessId = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
				private static String accessKey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");     
        private static final String endPoint = "http://service.odps.aliyun.com/api";
        private static final String project = "";
        private static final String sql = "select category from iris;";
    public static void
        main(String[] args) {
        Account account = new AliyunAccount(accessId, accessKey);
        Odps odps = new Odps(account);
        odps.setEndpoint(endPoint);
        odps.setDefaultProject(project);
        Instance i;
        try {
            i = SQLTask.run(odps, sql);
            i.waitForSuccess();
            List<Record> records = SQLTask.getResult(i);
            for(Record r:records){
                System.out.println(r.get(0).toString());
            }
        } catch (OdpsException e) {
            e.printStackTrace();
        }
    }
}
Note
  • You can submit and execute only one SQL statement at a time.

  • To create a table by using MaxCompute SDK for Java, you must use the SQLTask class instead of the Table class. You need to pass the CREATE TABLE statement to SQLTask before you use an SQLTask object to execute the CREATE TABLE statement. For more information about the CREATE TABLE statement, see Table operations.

Instances

The Instances class defines a collection of all the instances in MaxCompute. Each instance is an element in the collection. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
odps.setDefaultProject("my_project");
for (Instance i : odps.instances()) {
    ....
}

Instance

The Instance class defines the details of an instance. You can obtain a specified instance by calling the get method of the Instances class. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
Instance instance= odps.instances().get("instance id");
Date startTime = instance.getStartTime();
Date endTime = instance.getEndTime();
...
    Status instanceStatus = instance.getStatus();
String instanceStatusStr = null;
if (instanceStatus == Status.TERMINATED) {
    instanceStatusStr = TaskStatus.Status.SUCCESS.toString();
    Map<String, TaskStatus> taskStatus = instance.getTaskStatus();
    for (Entry<String, TaskStatus> status : taskStatus.entrySet()) {
        if (status.getValue().getStatus() != TaskStatus.Status.SUCCESS) {
            instanceStatusStr = status.getValue().getStatus().toString();
            break;
        }
    }
} else {
    instanceStatusStr = instanceStatus.toString();
}
...
    TaskSummary summary = instance.getTaskSummary("task name");
String s = summary.getSummaryText();

Tables

The Tables class defines a collection of all tables in MaxCompute. Each table is an element in the collection. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
odps.setDefaultProject("my_project");
for (Table t : odps.tables()) {
    ....
}

Table

The Table class defines the details of a table. You can obtain a specified table by calling the get method of the Tables class. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
odps.setDefaultProject("my_project");

Table table = odps.tables().get("tablename");
for(Column c : table.getSchema().getColumns()) 
{
String name = c.getName();
TypeInfo type = c.getTypeInfo();
 }

The following code shows how to obtain the data in partitions of a table:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
Table t = odps.tables().get("table name");
t.reload();
Partition part = t.getPartition(new PartitionSpec("partition_col=partition_col_value"));
part.reload();
...

Resources

The Resources class defines a collection of all the resources in MaxCompute. Each resource is an element in the collection. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
odps.setDefaultProject("my_project");
for (Resource r : odps.resources()) {
    ....
}

Resource

The Resource class defines the details of each resource. You can obtain a specified resource by calling the get method of the Resources class. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
Resource r = odps.resources().get("resource name");
r.reload();
if (r.getType() == Resource.Type.TABLE) {
    TableResource tr = new TableResource(r);
    String tableSource = tr.getSourceTable().getProject() + "."
        + tr.getSourceTable().getName();
    if (tr.getSourceTablePartition() != null) {
        tableSource += " partition(" + tr.getSourceTablePartition().toString()
            + ")";
    }
    ....
}

The following code shows how to create a file resource:

String projectName = "my_porject";
String source = "my_local_file.txt";
File file = new File(source);
InputStream is = new FileInputStream(file);
FileResource resource = new FileResource();
String name = file.getName();
resource.setName(name);
odps.resources().create(projectName, resource, is);

The following code shows how to create a table resource:

TableResource resource = new TableResource(tableName, tablePrj, partitionSpec);
//resource.setName(INVALID_USER_TABLE);
resource.setName("table_resource_name");
odps.resources().update(projectName, resource);

Functions

The Functions class defines a collection of all the functions in MaxCompute. Each function is an element in the collection. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
odps.setDefaultProject("my_project");
for (Function f : odps.functions()) {
    ....
}                

Function

The Function class defines the details of a function. You can obtain a specified function by calling the get method of the Functions class. The following code shows an example:

// The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. If you use an AccessKey pair to call API operations, risks may occur. We recommend that you create and use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
// In this example, the AccessKey ID and AccessKey secret are configured as environment variables. You can also save your AccessKey pair in the configuration file based on your business requirements.
// We recommend that you do not directly specify the AccessKey ID and AccessKey secret in code to prevent AccessKey pair leaks.
Account account = new AliyunAccount(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
Odps odps = new Odps(account);
String odpsUrl = "<your odps endpoint>";
odps.setEndpoint(odpsUrl);
Function f = odps.functions().get("function name");
List<Resource> resources = f.getResources();               

The following code shows how to create a function:

String resources = "xxx:xxx";
String classType = "com.aliyun.odps.mapred.open.example.WordCount";
ArrayList<String> resourceList = new ArrayList<String>();
for (String r : resources.split(":")) {
    resourceList.add(r);
}
Function func = new Function();
func.setName(name);
func.setClassType(classType);
func.setResources(resourceList);
odps.functions().create(projectName, func);              

Reference

If you want to use Python for interacting with and processing data in MaxCompute, see Overview.