AnalyticDB: UDFs

Last Updated:Aug 12, 2024

AnalyticDB for MySQL allows you to use the REMOTE_CALL() function to invoke custom functions that you create in Function Compute. This way, you can use user-defined functions (UDFs) in AnalyticDB for MySQL.

Prerequisites

  • An AnalyticDB for MySQL cluster of V3.2.1.0 or later is created.

  • The AnalyticDB for MySQL cluster is deployed in the same region as Function Compute. For more information, see the "Step 1: Activate Function Compute" section of the Quickly create a function topic.

  • A custom function is created. For more information, see the "Step 2: Create a service" and "Step 3: Create a function" sections of the Quickly create a function topic.

    Note

    To quickly start Function Compute, we recommend that you set the Runtime parameter to Python, Go, or Node.js when you create a custom function.

Overview

The remote UDF feature of AnalyticDB for MySQL uses Function Compute as a remote function server. You can use the REMOTE_CALL() function in AnalyticDB for MySQL to invoke custom functions that you create in Function Compute. The following steps describe the process.

  1. The client submits an SQL statement to AnalyticDB for MySQL.

  2. After you invoke the REMOTE_CALL() function, AnalyticDB for MySQL sends data to Function Compute in the JSON format.

  3. Function Compute uses a custom function to compute data.

  4. Function Compute returns the result to AnalyticDB for MySQL in the JSON format.

  5. AnalyticDB for MySQL returns the result to the client.

Usage notes

The remote UDF feature supports only scalar UDFs.

Syntax

remote_call('returnType', 'func_name', ['{external_config}'|NULL], X1, X2, ..., Xn)

Parameters

  • returnType: the data type of the return value. Only the BOOLEAN, DOUBLE, VARCHAR, INTEGER, TINYINT, BIGINT, TIME, DATE, TIMESTAMP, and DATETIME types are supported.

  • func_name: the name of the custom function that you create in Function Compute.

    • If the version of Function Compute is 3.0, specify only the name of the custom function.

    • If the version of Function Compute is 2.0, use a dollar sign ($) to separate the names of the service and the custom function. Format: serviceName$functionName.

    Note

    You can log on to the Function Compute console to view the version of Function Compute.

  • external_config|NULL: the extended parameters that are used to invoke the function. Specify the extended parameters in the JSON format and separate multiple parameters with commas (,). If you do not want to specify extended parameters, specify NULL.

    You can specify extended parameters in a query or execute the SET ADB_CONFIG statement to configure global parameters. If you configure global and query-level parameters at the same time, the query-level parameters take precedence. For more information, see the extended parameters described below.

  • X1, X2, ..., Xn: the input parameters. Only the BOOLEAN, DOUBLE, VARCHAR, INTEGER, TINYINT, BIGINT, TIME, DATE, TIMESTAMP, and DATETIME types are supported.

The following list describes the extended parameters that are supported by the external_config parameter.

  • XIHE_REMOTE_CALL_SERVER_ENDPOINT

    Query-level parameter: endpoint. Required: yes.

    The internal endpoint of Function Compute. For more information, see the "Internal endpoints" section of the Endpoints topic.

  • XIHE_REMOTE_CALL_SERVER_AK

    Query-level parameter: none. Required: yes, if specific conditions are met.

    The AccessKey ID of the Alibaba Cloud account or the Resource Access Management (RAM) user that has permissions on Function Compute.

    • If the edition of the AnalyticDB for MySQL cluster is Data Warehouse Edition, you must specify the AccessKey ID.

    • If the edition of the AnalyticDB for MySQL cluster is Data Lakehouse Edition, the following rules apply:

      • When you access Function Compute that belongs to the same Alibaba Cloud account or RAM user, you do not need to specify the AccessKey ID. Security Token Service (STS) is automatically used for authentication.

      • When you access Function Compute across Alibaba Cloud accounts or RAM users, you must specify the AccessKey ID.

    For information about how to obtain an AccessKey ID, see Accounts and permissions.

    Important

    The XIHE_REMOTE_CALL_SERVER_AK parameter can be configured only globally.

  • XIHE_REMOTE_CALL_SERVER_SK

    Query-level parameter: none. Required: yes, if specific conditions are met.

    The AccessKey secret of the Alibaba Cloud account or the RAM user that has permissions on Function Compute.

    • If the edition of the AnalyticDB for MySQL cluster is Data Warehouse Edition, you must specify the AccessKey secret.

    • If the edition of the AnalyticDB for MySQL cluster is Data Lakehouse Edition, the following rules apply:

      • When you access Function Compute that belongs to the same Alibaba Cloud account or RAM user, you do not need to specify the AccessKey secret. STS is automatically used for authentication.

      • When you access Function Compute across Alibaba Cloud accounts or RAM users, you must specify the AccessKey secret.

    For information about how to obtain an AccessKey secret, see Accounts and permissions.

    Important

    The XIHE_REMOTE_CALL_SERVER_SK parameter can be configured only globally.

  • XIHE_REMOTE_CALL_COMPRESS_ENABLED

    Query-level parameter: compressed. Required: no.

    Specifies whether AnalyticDB for MySQL compresses data into the GZIP format before sending the data to Function Compute. Valid values:

    • true (default)

    • false

  • XIHE_REMOTE_CALL_MAX_BATCH_SIZE

    Query-level parameter: max_batch_size. Required: no.

    The maximum number of data rows that can be sent to Function Compute per batch when the REMOTE_CALL() function is invoked. By default, the batch size is unlimited. A smaller batch size reduces the CPU and memory resources that are consumed in Function Compute, but increases the execution duration.
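To illustrate how the max_batch_size parameter shapes the requests that are sent to Function Compute, the following Python sketch splits a row set into batches. The split_into_batches function is a hypothetical helper for illustration only, not part of AnalyticDB for MySQL.

```python
def split_into_batches(rows, max_batch_size=None):
    """Split rows into batches of at most max_batch_size rows.

    A simplified sketch of the batching concept behind
    XIHE_REMOTE_CALL_MAX_BATCH_SIZE, not the actual implementation.
    """
    if max_batch_size is None:  # unlimited: send all rows in one batch
        return [rows] if rows else []
    return [rows[i:i + max_batch_size]
            for i in range(0, len(rows), max_batch_size)]

rows = [[1, "a"], [2, "b"], [3, "c"], [4, "d"], [5, "e"]]
print(split_into_batches(rows, max_batch_size=2))
# → [[[1, 'a'], [2, 'b']], [[3, 'c'], [4, 'd']], [[5, 'e']]]
```

Each inner list corresponds to one request payload, which is why a smaller batch size lowers the per-invocation resource consumption in Function Compute at the cost of more invocations.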

Examples

Create a custom function in Function Compute

After you invoke the REMOTE_CALL() function, AnalyticDB for MySQL sends data to Function Compute in the JSON format. Function Compute computes the data and returns the result in the JSON format. You must create a custom function in Function Compute based on the formats in which data is sent and returned.

Formats for sending and returning data

Format for sending data

{
  "rowCount": 3, 
  "compressed":true, 
  "data": 
  [
      [1, "a", "2023-08-22 11:30:00"],
      [2, "b", "2023-08-22 12:30:00"],
      [3, "c", null]
  ]
}

JSON data parameters:

  • rowCount: the number of data rows.

  • compressed: indicates whether the sent data is GZIP-compressed. Valid values: true and false.

  • data: the data that you want to send. This parameter consists of multiple JSON arrays. Each JSON array is a row of data.

Format for returning data

{
  "success": true,   
  "message": "",  
  "result" : [  ] 
}

JSON data parameters:

  • success: indicates whether the request was successful. Valid values: true and false.

  • message: the returned message. If the request was successful, an empty string is returned. If the request failed, an error message is returned.

  • result: the result that is obtained from Function Compute.
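As a complement to the Java sample below, the following minimal Python sketch implements the same request and response formats. It assumes the Python runtime's (event, context) handler convention, assumes that compression is disabled (compressed is set to false), and mirrors the two-integer concatenation logic of the ConcactNumberWithCompress example; it is an illustration, not production code.

```python
import json

def handler(event, context=None):
    """Minimal sketch of a Function Compute handler for the formats above.

    Assumes an uncompressed payload (compressed=false). Each row must
    contain two values, which are concatenated with the constant 1
    by using ampersands (&), as in the Java sample.
    """
    try:
        request = json.loads(event)
        if "data" not in request:
            return json.dumps({"success": False,
                               "message": "no data inside",
                               "result": []})
        result = []
        for row in request["data"]:
            if len(row) != 2:
                raise ValueError("row size is not 2")
            result.append(f"{row[0]}&{row[1]}&1")
        return json.dumps({"success": True, "message": "", "result": result})
    except Exception as e:
        return json.dumps({"success": False, "message": str(e), "result": []})

request = json.dumps({"rowCount": 2, "compressed": False,
                      "data": [[1, 2], [3, 4]]})
print(handler(request))
# → {"success": true, "message": "", "result": ["1&2&1", "3&4&1"]}
```

In a real function, you would also decompress the request and compress the response when the compressed flag is set to true.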

In this example, a custom function named ConcactNumberWithCompress is created in Function Compute. Sample code:

// Requires the fastjson, commons-io, and Function Compute Java runtime dependencies.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PushbackInputStream;
import java.util.logging.Logger;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.aliyun.fc.runtime.Context;
import com.aliyun.fc.runtime.FunctionInitializer;
import com.aliyun.fc.runtime.StreamRequestHandler;
import org.apache.commons.io.IOUtils;

public class App
        implements StreamRequestHandler, FunctionInitializer
{
    public static final Logger log = Logger.getLogger(App.class.getName());

    public void initialize(Context context)
            throws IOException
    {
        // TODO
    }

    @Override
    public void handleRequest(
            InputStream inputStream, OutputStream outputStream, Context context)
            throws IOException
    {
        // Decompress the request payload if it is GZIP-compressed.
        InputStream notCompressedInputStream1 = tryUnCompress(inputStream);

        JSONObject response = new JSONObject();
        boolean compressed = true;
        try {
            JSONObject requestJson = JSONObject.parseObject(IOUtils.toString(notCompressedInputStream1));
            // Mirror the request's compression setting in the response. Returning
            // a GZIP-compressed result for an uncompressed request causes a parse
            // error in AnalyticDB for MySQL (see the FAQ at the end of this topic).
            if (requestJson.containsKey("compressed")) {
                compressed = requestJson.getBooleanValue("compressed");
            }
            if (requestJson.containsKey("data")) {
                JSONArray result = new JSONArray();
                JSONArray data = requestJson.getJSONArray("data");
                for (int i = 0; i < data.size(); i++) {
                    JSONArray row = data.getJSONArray(i);
                    // The beginning of the Function Compute business logic. 
                    // For example, if the custom function contains two input parameters a and b, each row of data that is sent by AnalyticDB for MySQL contains two data items a and b. 
                    if (row.size() == 2) {
                        result.add(testFunc(row.getInteger(0), row.getInteger(1)));
                    }
                    else {
                        throw new RuntimeException("row size is not 2");
                    }
                    // The end of the Function Compute business logic. 
                }

                response.put("result", result);
                response.put("success", true);
                response.put("message", "");
            }
            else {
                response.put("success", false);
                response.put("message", "no data inside");
            }
        }
        catch (Exception e) {
            log.info("error happened: " + e.getMessage());
            response.put("success", false);
            response.put("message", e.getMessage());
        }

        byte[] responseBytes = response.toJSONString().getBytes();
        outputStream.write(compressed ? tryCompress(responseBytes) : responseBytes);
    }

    private String testFunc(int a, int b)
    {
        // Return input parameters a and b and a value of 1 by combining them with ampersands (&). 
        return String.valueOf(a) + '&' + b + '&' + 1;
    }

    public static byte[] tryCompress(byte[] bytes)
    {
        ByteArrayOutputStream byteOutputStream = new ByteArrayOutputStream();
        try {
            GZIPOutputStream gzipOutputStream = new GZIPOutputStream(byteOutputStream);
            gzipOutputStream.write(bytes);
            gzipOutputStream.close();
        }
        catch (IOException e) {
            throw new RuntimeException(e);
        }

        return byteOutputStream.toByteArray();
    }

    public static InputStream tryUnCompress(InputStream inputStream)
            throws IOException
    {
        // Peek at the first two bytes and decompress only if they match the
        // GZIP magic number (0x1f 0x8b). Otherwise, return the raw stream, so
        // that the function also works when compressed is set to false.
        PushbackInputStream pushbackStream = new PushbackInputStream(inputStream, 2);
        int b1 = pushbackStream.read();
        int b2 = pushbackStream.read();
        if (b2 != -1) {
            pushbackStream.unread(b2);
        }
        if (b1 != -1) {
            pushbackStream.unread(b1);
        }
        if (b1 == 0x1f && b2 == 0x8b) {
            return new GZIPInputStream(pushbackStream);
        }
        return pushbackStream;
    }
}

Invoke the REMOTE_CALL() function in AnalyticDB for MySQL

  • If you do not configure global parameters, specify the endpoint parameter in the query and invoke the ConcactNumberWithCompress function that is created in Function Compute to combine 1 and 2 by using ampersands (&).

    SELECT remote_call('varchar','ConcactNumberWithCompress','{endpoint:"1234567890000****.cn-zhangjiakou-internal.fc.aliyuncs.com"}',1,2);

    Sample result:

    1&2&1
  • If you do not configure global parameters, specify the endpoint, compressed, and max_batch_size parameters in the query and invoke the ConcactNumberWithCompress function to combine 3 and 4 by using ampersands (&).

    SELECT remote_call('varchar','ConcactNumberWithCompress','{endpoint:"1234567890000****.cn-zhangjiakou-internal.fc.aliyuncs.com",compressed:false,max_batch_size:5000000}',3,4);

    Sample result:

    3&4&1
  • If you have configured global parameters, specify null for the external_config parameter and invoke the ConcactNumberWithCompress function to combine 5 and 6 by using ampersands (&).

    SELECT remote_call('varchar','ConcactNumberWithCompress',null,5,6);

    Sample result:

    5&6&1

FAQ

Why does the java.util.zip.ZipException: Not in GZIP format error occur when I invoke the REMOTE_CALL() function?

  • AnalyticDB for MySQL sends data to Function Compute in the GZIP format, but the custom function that is created in Function Compute does not include decompression code. As a result, Function Compute fails to parse the data.

  • AnalyticDB for MySQL sends uncompressed data to Function Compute, but Function Compute sends the compressed result to AnalyticDB for MySQL in the GZIP format. As a result, AnalyticDB for MySQL fails to parse the data.
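A robust custom function can guard against the first cause by checking for the GZIP magic number (0x1f 0x8b) before decompressing. The following Python sketch shows the idea; gunzip_if_needed is a hypothetical helper name for illustration.

```python
import gzip

def gunzip_if_needed(payload: bytes) -> bytes:
    """Decompress the payload only if it starts with the GZIP magic number.

    Lets the same function handle both compressed and uncompressed
    requests, avoiding the "Not in GZIP format" error.
    """
    if payload[:2] == b"\x1f\x8b":
        return gzip.decompress(payload)
    return payload

print(gunzip_if_needed(gzip.compress(b'{"data": []}')))
# → b'{"data": []}'
```

The same check applies in reverse on the response path: compress the result only when the request was compressed, so that AnalyticDB for MySQL can parse it.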

What do I do if the parse remote_call config error occurs when I invoke the REMOTE_CALL() function?

Check whether the external_config parameter uses correct syntax in the REMOTE_CALL() function. If the syntax is incorrect, modify the SQL statement and try again.