Creates a user-defined function (UDF) in a MaxCompute project.
Prerequisites
Resources that are required to create a UDF are added to the desired MaxCompute project by executing the add jar <localfile> [comment '<comment>'][-f]; statement. For more information, see ADD JAR.
Limits
Function names must be unique in a project. You cannot create a function that has the same name as an existing function in the project.
By default, a UDF cannot overwrite a built-in function of MaxCompute. Only the project owner can create a UDF that overwrites a built-in function. If you use a UDF that overwrites a built-in function, a warning is displayed in the Summary section of the Logview of your job after the SQL statement is executed.
Syntax
create function <function_name> as <'package_to_class'> using <'resource_list'>;
Parameters
function_name: required. The name of the UDF that you want to create.
package_to_class: required. The class of the UDF that you want to create. This parameter is case-sensitive and must be enclosed in single quotation marks (').
For a Java UDF, specify this name as a fully qualified class name from the top-level package name to the UDF class name.
For a Python UDF, specify this name in the <Python script name>.<class name> format.
Note: The Python script name refers to the underlying resource name that uniquely identifies a resource. The name of a MaxCompute resource is not case-sensitive. For example, the resource name is pyudf_test.py the first time you upload a resource. If you rename the resource to PYUDF_TEST.py in DataStudio or use PYUDF_TEST.py to overwrite pyudf_test.py on the MaxCompute client, the underlying resource name that uniquely identifies the resource is still pyudf_test.py. In this case, when you create a UDF based on the resource, the class name must be pyudf_test.SampleUDF. You can execute the list resource; statement to view the underlying resource names that uniquely identify all resources.
resource_list: required. The list of resources used by the UDF.
The resource list must include the resources that contain the UDF code. Make sure that the resources are uploaded to MaxCompute.
If the code calls the Distributed Cache API to read resource files, this resource list must also contain the list of resource files that are read by the UDF.
The resource list consists of one or more resource names, and the entire list must be enclosed in single quotation marks ('). Separate multiple resource names with commas (,).
To specify the project that contains the resource, configure the parameter in the <project_name>/resources/<resource_name> format.
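The parameter rules above can be sketched as a small Python helper that assembles the statement text. This is only an illustration: the function names python_class_path, qualify_resource, and build_create_function are hypothetical and are not part of MaxCompute or any of its SDKs. The helper only builds a string and does not submit anything to MaxCompute.

```python
from typing import List, Optional


def python_class_path(resource_name: str, class_name: str) -> str:
    """Return '<Python script name>.<class name>' for a Python UDF.

    resource_name should be the underlying resource name, for example
    'pyudf_test.py'. (Hypothetical helper, for illustration only.)
    """
    if not resource_name.endswith(".py"):
        raise ValueError("expected a resource name that ends with .py")
    return f"{resource_name[:-3]}.{class_name}"


def qualify_resource(resource_name: str, project_name: Optional[str] = None) -> str:
    """Prefix a resource with '<project_name>/resources/' when a project is given."""
    if project_name is None:
        return resource_name
    return f"{project_name}/resources/{resource_name}"


def build_create_function(function_name: str,
                          package_to_class: str,
                          resources: List[str]) -> str:
    """Assemble a CREATE FUNCTION statement from its three required parts."""
    if not resources:
        raise ValueError("resource_list is required")
    # The whole list is enclosed in one pair of single quotation marks,
    # and the resource names are separated by commas.
    resource_list = ", ".join(resources)
    return (f"create function {function_name} "
            f"as '{package_to_class}' using '{resource_list}';")


if __name__ == "__main__":
    statement = build_create_function(
        "my_lower",
        python_class_path("pyudf_test.py", "MyLower"),
        [qualify_resource("pyudf_test.py", "test_project")],
    )
    print(statement)
```

Run as a script, the sketch prints a statement for a Python UDF whose resource is in the test_project project. The generated text still has to be executed on the MaxCompute client or another SQL entry point.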
Examples
Example 1: Create the my_lower function. In this example, the Java UDF class org.alidata.odps.udf.examples.Lower is in my_lower.jar.
create function my_lower as 'org.alidata.odps.udf.examples.Lower' using 'my_lower.jar';
Example 2: Create the my_lower function. In this example, the Python UDF class MyLower is in the pyudf_test.py script of the test_project project.
create function my_lower as 'pyudf_test.MyLower' using 'test_project/resources/pyudf_test.py';
Example 3: Create the test_udtf function. In this example, the Java UDF class com.aliyun.odps.examples.udf.UDTFResource is in udtfexample1.jar. The function depends on the file resource file_resource.txt, the table resource table_resource1, and the archive resource test_archive.zip.
create function test_udtf as 'com.aliyun.odps.examples.udf.UDTFResource' using 'udtfexample1.jar, file_resource.txt, table_resource1, test_archive.zip';
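For Example 2, the pyudf_test.py script might look like the following minimal sketch. The body of MyLower is an assumption: the example names only the class and the script, so the lowercase conversion and the string->string signature are inferred from the function name. The fallback annotate decorator is also an assumption, added only so that the class can be tried outside the MaxCompute runtime.

```python
# pyudf_test.py: a possible implementation of the MyLower class (assumed body).
try:
    # Provided by the MaxCompute Python UDF runtime.
    from odps.udf import annotate
except ImportError:
    # Local fallback (assumption): a no-op decorator for trying the class
    # on a machine where the odps package is not installed.
    def annotate(signature):
        def decorator(cls):
            return cls
        return decorator


@annotate("string->string")
class MyLower(object):
    """Scalar UDF that converts its input string to lowercase."""

    def evaluate(self, value):
        # Pass NULL input through unchanged.
        if value is None:
            return None
        return value.lower()
```

After the script is uploaded as a resource and the function is created, the 'pyudf_test.MyLower' value in the statement resolves to this class: the part before the period is the script name without the .py extension, and the part after it is the class name.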
Related statements
FUNCTION: If you do not need to store SQL functions in the metadata system of MaxCompute, you can create temporary SQL functions. These functions apply only to the current SQL script.
DELETE FUNCTION: Deletes a function. You can write a UDF and call the delete_function() method of a MaxCompute entry object to delete the UDF.
DROP FUNCTION: Deletes an existing UDF from a MaxCompute project.
DESC FUNCTION: Views the information of a specified UDF in a MaxCompute project. The information includes the name, owner, creation time, class name, and resource list of the UDF.
LIST FUNCTIONS: Views the information of all UDFs in a MaxCompute project.
UPDATE FUNCTION: Updates a function. You can write a UDF and call the update method of MaxCompute to update the UDF.