MaxCompute V2.0 allows you to use external tables to access Object Storage Service (OSS) and Tablestore. MaxCompute Studio provides code templates to help you query unstructured data. This topic describes how to use MaxCompute Studio to query unstructured data.
Prerequisites
A MaxCompute project is connected. For more information, see Manage project connections.
A MaxCompute Java module is created. For more information, see Create a MaxCompute Java module.
Write a StorageHandler, Extractor, or Outputer program
In the left-side navigation pane of the Project tab, choose , right-click java, and then choose .
Configure Name, select Extractor, StorageHandler, or Outputer, and then press Enter.
Name: the name of the MaxCompute Java class that you want to create. If the package does not exist yet, enter the name in the Package name.Class name format. The system then creates the package automatically.
Select Extractor, StorageHandler, or Outputer as the class type.
Note: You can select Extractor, StorageHandler, or Outputer based on your business requirements.
Extractor: the class in which you define custom logic for reading unstructured data.
StorageHandler: the class that specifies which Extractor and Outputer classes are used, which ties the read and write logic together.
Outputer: the class in which you define custom logic for writing unstructured data.
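The following is a minimal sketch of how the three class types relate. It deliberately does not use the MaxCompute SDK: SketchExtractor, SketchOutputer, SketchStorageHandler, and the delimiter handling are hypothetical stand-ins, assumed here only for illustration. The framework code that MaxCompute Studio generates has its own base classes and method signatures.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.regex.Pattern;

// Hypothetical stand-ins for illustration only; these are NOT the MaxCompute SDK classes.
class UnstructuredSketch {
    // Reads every record through the extractor and writes it through the outputer.
    static String roundTrip(String input, String inDelimiter, String outDelimiter) {
        SketchStorageHandler handler = new SketchStorageHandler(inDelimiter, outDelimiter);
        StringWriter sink = new StringWriter();
        try {
            SketchExtractor extractor = handler.newExtractor(new StringReader(input));
            SketchOutputer outputer = handler.newOutputer(sink);
            String[] record;
            while ((record = extractor.extract()) != null) {
                outputer.output(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        // Converts pipe-delimited text to comma-delimited text.
        System.out.print(roundTrip("1|alice\n2|bob\n", "|", ","));
    }
}

// Read side: turns raw input into records (arrays of column values).
interface SketchExtractor {
    // Returns the next record, or null when the input is exhausted.
    String[] extract() throws IOException;
}

// Write side: turns records back into raw output.
interface SketchOutputer {
    void output(String[] record) throws IOException;
}

// Ties the two together by deciding which extractor and outputer are used.
class SketchStorageHandler {
    private final String inDelimiter;
    private final String outDelimiter;

    SketchStorageHandler(String inDelimiter, String outDelimiter) {
        this.inDelimiter = inDelimiter;
        this.outDelimiter = outDelimiter;
    }

    SketchExtractor newExtractor(Reader source) {
        return new DelimitedExtractor(source, inDelimiter);
    }

    SketchOutputer newOutputer(Writer sink) {
        return new DelimitedOutputer(sink, outDelimiter);
    }
}

class DelimitedExtractor implements SketchExtractor {
    private final BufferedReader reader;
    private final String delimiterRegex;

    DelimitedExtractor(Reader source, String delimiter) {
        this.reader = new BufferedReader(source);
        // Quote the delimiter so characters such as '|' are matched literally.
        this.delimiterRegex = Pattern.quote(delimiter);
    }

    @Override
    public String[] extract() throws IOException {
        String line = reader.readLine();
        // A limit of -1 keeps trailing empty columns.
        return line == null ? null : line.split(delimiterRegex, -1);
    }
}

class DelimitedOutputer implements SketchOutputer {
    private final Writer writer;
    private final String delimiter;

    DelimitedOutputer(Writer sink, String delimiter) {
        this.writer = sink;
        this.delimiter = delimiter;
    }

    @Override
    public void output(String[] record) throws IOException {
        writer.write(String.join(delimiter, record));
        writer.write('\n');
    }
}
```

Keeping the read logic and the write logic in separate classes lets each be developed and tested on its own, while the handler remains the single place that selects them.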
After the class is created, develop the Java program in the code editor. The template is pre-populated with framework code, so you only need to write the logic that your requirements call for.
Debug the Extractor or Outputer program
Write test cases for your Extractor or Outputer program, using the unit test examples in the examples directory as a reference, and then run them to debug the program.
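As a rough illustration of what such test cases might cover, the sketch below checks parsing logic that an Extractor could delegate to. It is framework-free plain Java and does not reproduce the unit test utilities in the examples directory. RecordParser, parseRecord, and the pipe delimiter are hypothetical names and values assumed for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Runs a few checks against the hypothetical parsing helper below.
class RecordParserCheck {
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // A well-formed line yields one value per column.
        String[] row = RecordParser.parseRecord("1|abc|3.5", "|", 3);
        check(Arrays.equals(row, new String[] {"1", "abc", "3.5"}), "well-formed row");

        // Empty columns, including trailing ones, are preserved as empty strings.
        String[] sparse = RecordParser.parseRecord("1||", "|", 3);
        check(sparse[1].isEmpty() && sparse[2].isEmpty(), "empty columns preserved");

        // A line with too few columns is rejected instead of silently accepted.
        boolean rejected = false;
        try {
            RecordParser.parseRecord("1|abc", "|", 3);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        check(rejected, "short row rejected");

        System.out.println("all checks passed");
    }
}

// Hypothetical helper; an Extractor could call this for each line it reads.
class RecordParser {
    static String[] parseRecord(String line, String delimiter, int expectedColumns) {
        // Quote the delimiter so it is matched literally; -1 keeps trailing empty columns.
        String[] parts = line.split(Pattern.quote(delimiter), -1);
        if (parts.length != expectedColumns) {
            throw new IllegalArgumentException(
                "expected " + expectedColumns + " columns but found " + parts.length + ": " + line);
        }
        return parts;
    }
}
```

Checking malformed input as well as well-formed input tends to surface problems before the program is packaged and uploaded.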
Package and upload the program
After you debug the program, package it into a JAR file and upload the file to the MaxCompute server as a resource. For more information, see Package a Java program, upload the package, and create a MaxCompute UDF.
Query unstructured data
In the Project tool window, right-click scripts under your MaxCompute project and choose .
Enter the name of an SQL script in the Script Name field, select a MaxCompute project from the MaxCompute Project drop-down list, and then click OK.
In the code editor, enter the SQL statement that is used to create an external table and click the icon.
Create a MaxCompute SQL script, enter the statement that queries the external table, and then click the icon to query data.