MaxCompute: Use Java UDTFs to read MaxCompute resources

Last Updated: Sep 23, 2024

This topic provides an example of how to use a Java user-defined table-valued function (UDTF) to read MaxCompute resources, based on MaxCompute Studio.

Prerequisites

UDTF code examples

The following sample code shows the Java UDTF.

Note

Parameter category | Parameter type | Description
Input parameter    | String         | First input parameter.
Input parameter    | String         | Second input parameter.
Output parameter   | String         | Value of the first input parameter.
Output parameter   | Bigint         | Length of the second input parameter string.
Output parameter   | String         | Concatenated value of the line count from file_resource.txt, the row count from the table_resource1 table, and the row count from the table_resource2 table.

package com.aliyun.odps.examples.udf;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Iterator;

import com.aliyun.odps.udf.ExecutionContext;
import com.aliyun.odps.udf.UDFException;
import com.aliyun.odps.udf.UDTF;
import com.aliyun.odps.udf.annotation.Resolve;

/**
 * project: example_project
 * table: wc_in2
 * partitions: p1=2,p2=1
 * columns: cola,colc
 */
@Resolve("string,string->string,bigint,string")
public class UDTFResource extends UDTF {
  ExecutionContext ctx;
  long fileResourceLineCount;
  long tableResource1RecordCount;
  long tableResource2RecordCount;

  @Override
  public void setup(ExecutionContext ctx) throws UDFException {
    this.ctx = ctx;
    try {
      // Count the lines in the file resource.
      InputStream in = ctx.readResourceFileAsStream("file_resource.txt");
      BufferedReader br = new BufferedReader(new InputStreamReader(in));
      fileResourceLineCount = 0;
      while (br.readLine() != null) {
        fileResourceLineCount++;
      }
      br.close();

      // Count the records in the first table resource.
      Iterator<Object[]> iterator = ctx.readResourceTable("table_resource1").iterator();
      tableResource1RecordCount = 0;
      while (iterator.hasNext()) {
        tableResource1RecordCount++;
        iterator.next();
      }

      // Count the records in the second table resource.
      iterator = ctx.readResourceTable("table_resource2").iterator();
      tableResource2RecordCount = 0;
      while (iterator.hasNext()) {
        tableResource2RecordCount++;
        iterator.next();
      }
    } catch (IOException e) {
      throw new UDFException(e);
    }
  }

  @Override
  public void process(Object[] args) throws UDFException {
    // Output 1: the first input value, passed through unchanged.
    String a = (String) args[0];
    // Output 2: the length of the second input value (0 if it is NULL).
    long b = args[1] == null ? 0 : ((String) args[1]).length();
    // Output 3: the three resource counts joined into one string.
    forward(a, b, "fileResourceLineCount=" + fileResourceLineCount
        + "|tableResource1RecordCount=" + tableResource1RecordCount
        + "|tableResource2RecordCount=" + tableResource2RecordCount);
  }
}
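
The counting done in setup() can be tried without the MaxCompute SDK. The following is a minimal sketch, assuming in-memory stand-ins for the resources: the class name ResourceCountSketch, its method names, and the sample data are hypothetical and only mirror the two loops used above (one readLine() call per line, one iterator step per record).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the counting logic in UDTFResource.setup().
// It reads in-memory data instead of calling the MaxCompute ExecutionContext.
class ResourceCountSketch {

  // Counts lines the way setup() does: one readLine() call per line.
  static long countLines(BufferedReader reader) {
    long count = 0;
    try {
      while (reader.readLine() != null) {
        count++;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return count;
  }

  // Counts records by walking an Iterable, as setup() does for table resources.
  static long countRecords(Iterable<Object[]> records) {
    long count = 0;
    for (Object[] ignored : records) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    // Made-up sample data, not the contents of the real resources.
    BufferedReader file = new BufferedReader(new StringReader("a\nb\nc\n"));
    List<Object[]> table = Arrays.asList(new Object[] {"x", 1L}, new Object[] {"y", 2L});
    System.out.println(countLines(file));    // prints 3
    System.out.println(countRecords(table)); // prints 2
  }
}
```

Because the counts are computed once in setup(), every row forwarded by process() carries the same three values.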

The following code shows the dependency that is required in the pom.xml file for local testing.

<dependency>
    <groupId>com.aliyun.odps</groupId>
    <artifactId>odps-udf-local</artifactId>
    <version>0.48.0-public</version>
</dependency>

Procedure

Local testing

  1. In MaxCompute Studio, create a Java program of the UDTF type. For example, name the Java class UDTFResource and use the code from the UDTF code examples section.

  2. Configure the runtime parameters based on the warehouse resource in the Java Module.

    Note
    • The input parameters are the values of the first and third columns of each row in the partition p1=2, p2=1 of the wc_in2 table in the local resource.

    • At run time, the code reads the local resource file_resource.txt, the wc_in1 table mapped to table_resource1, and the wc_in2 table (partition p1=2, p2=1) mapped to table_resource2.

  3. Right-click the UDTFResource class and select Run to execute the program. The results are displayed.

Client testing

  1. Click Project Explorer in the upper-left corner of IntelliJ IDEA, and then select Add Resource.

  2. Add the file_resource.txt file based on the MaxCompute instance information.

  3. Add the table_resource1 and table_resource2 resources and set their type to table. Map these resources to the wc_in1 and wc_in2 tables created in MaxCompute, and insert data as necessary.

  4. Package the UDTF into a JAR file, upload it to the MaxCompute project, and register the function. In this example, the function name is my_udtf. To open the packaging and upload dialog, right-click the UDTFResource class and select Deploy to Server....

  5. Click Project Explorer in the upper-left corner of IntelliJ IDEA, right-click the target MaxCompute project, and select Open Console to start the MaxCompute client. Then execute an SQL command to call the newly created UDTF. The results are displayed.

    Sample SQL command:

    SELECT my_udtf("10","20") AS (a, b, fileResourceLineCount);
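
For the sample call above, process() receives "10" and "20", so column a is "10" and column b is 2 (the length of "20"). The third column depends on the contents of the uploaded resources. The following is a minimal sketch of that row, assuming a plain-Java mirror of process(): the class ProcessRowSketch, the method buildRow, and the counts 3, 4, and 5 are hypothetical placeholders, not values from a real project.

```java
// Hypothetical plain-Java mirror of UDTFResource.process(); not part of the MaxCompute SDK.
class ProcessRowSketch {

  // Builds the three output columns that process() passes to forward().
  static Object[] buildRow(String first, String second,
                           long fileLines, long table1Rows, long table2Rows) {
    // A NULL second argument yields a length of 0, as in the UDTF.
    long length = second == null ? 0 : second.length();
    String counts = "fileResourceLineCount=" + fileLines
        + "|tableResource1RecordCount=" + table1Rows
        + "|tableResource2RecordCount=" + table2Rows;
    return new Object[] {first, length, counts};
  }

  public static void main(String[] args) {
    // The counts 3, 4, and 5 are placeholders; real values come from the resources.
    Object[] row = buildRow("10", "20", 3, 4, 5);
    System.out.println(row[0]); // prints 10
    System.out.println(row[1]); // prints 2
    System.out.println(row[2]);
  }
}
```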

References

For more information about MaxCompute resources, see Resource.