Tablestore: Use Function Compute

Last Updated: Nov 15, 2024

This topic describes how to use Function Compute to perform real-time computing on incremental data in Tablestore.

Background information

Function Compute is a fully managed, event-driven computing service. It allows you to focus on coding without the need to procure and manage infrastructure resources such as servers. You only need to upload your code or image. Function Compute allocates computing resources, runs tasks in an elastic and reliable manner, and provides features such as log query, performance monitoring, and alerting. For more information, see What is Function Compute?

A Tablestore Stream is a tunnel that is used to obtain incremental data from Tablestore tables. After you create Tablestore triggers, a Tablestore Stream and a function in Function Compute can be automatically connected. This allows the custom program logic in the function to automatically process data modifications in Tablestore tables.

Scenarios

The following figure shows the tasks that you can perform by using Function Compute.

  • Data synchronization: You can use Function Compute to synchronize real-time data that is stored in Tablestore to data cache, search engines, or other database instances.

  • Data archiving: You can use Function Compute to archive incremental data that is stored in Tablestore to OSS for cold backup.

  • Event-driven application: You can create triggers to trigger functions to call API operations that are provided by IoT Hub and cloud applications. You can also create triggers to send notifications.

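[Figure: tasks that you can perform by using Function Compute with Tablestore: data synchronization, data archiving, and event-driven applications]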

Prerequisites

Usage notes

  • Tablestore triggers are supported in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Shenzhen), Japan (Tokyo), Singapore, Germany (Frankfurt), and China (Hong Kong).

  • The Tablestore data table must reside in the same region as the associated service in Function Compute.

  • If you want the function that is associated with the Tablestore trigger to access Tablestore over the internal network, you must use the virtual private cloud (VPC) endpoint of Tablestore. You cannot use the internal endpoint of Tablestore.

    The VPC endpoint is in the following format: {instanceName}.{regionId}.vpc.tablestore.aliyuncs.com. For more information, see Query endpoints.

  • When you write function code for Tablestore, make sure not to use the following logic: Function B is invoked by a trigger for Table A and then Function B updates the data in Table A. This logic creates an infinite loop of function invocations.

  • The execution duration of a function that is invoked by a trigger cannot exceed 1 minute.

  • If an exception occurs during function execution, the function is retried an indefinite number of times until the log data in Tablestore expires.

    Note
    • A function execution exception occurs in the following scenarios:

      • A function instance is started but the function code does not run as expected. In this case, fees are generated for the instance.

      • A function instance fails to start due to reasons such as startup command errors. In this case, fees are not generated for the instance.

    • If a function execution exception occurs, you can disable the Stream feature for the data table to prevent the function from being retried an indefinite number of times. Before you disable the Stream feature, make sure that no other triggers are using the data table. Otherwise, these triggers may not work as expected.
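The endpoint format and the loop warning in the preceding usage notes can be checked in code. The following is a minimal sketch, not an official helper: the function names vpc_endpoint and assert_no_trigger_loop are hypothetical, and the instance, region, and table names are placeholders.

```python
def vpc_endpoint(instance_name: str, region_id: str) -> str:
    """Build the VPC endpoint host in the format given in the usage notes."""
    if not instance_name or not region_id:
        raise ValueError("instance_name and region_id must be non-empty")
    return f"{instance_name}.{region_id}.vpc.tablestore.aliyuncs.com"


def assert_no_trigger_loop(trigger_table: str, write_tables: list) -> None:
    """Raise if the function would write to the table that triggers it.

    Hypothetical guard: call it at deployment or startup time with the table
    that the trigger watches and the tables that the function updates.
    """
    if trigger_table in write_tables:
        raise ValueError(
            f"Function writes to its own trigger table {trigger_table!r}; "
            "this would cause an infinite loop of invocations"
        )


if __name__ == "__main__":
    # Placeholder instance and region names.
    print(vpc_endpoint("myinstance", "cn-hangzhou"))
    # Passes because the function writes only to a different table.
    assert_no_trigger_loop("table_a", ["table_b"])
```

The guard only compares table names, so it does not detect indirect loops, for example when Function B updates Table C and a trigger on Table C updates Table A.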

Step 1: Enable the Stream feature for the data table

Before you create a trigger, you must enable the Stream feature for the data table in the Tablestore console to allow the function to process incremental data that is written to the table.

  1. Log on to the Tablestore console.

  2. In the top navigation bar, select a region.

  3. On the Overview page, click the name of the instance that you want to manage or click Manage Instance in the Actions column.

  4. On the Tables tab of the Instance Details tab, click the name of the data table that you want to manage and then click the Tunnels tab. Alternatively, you can click the fig_001 icon and then click Tunnels.

  5. On the Tunnels tab, click Enable in the Stream Information section.

  6. In the Enable Stream dialog box, configure the Log Expiration Time parameter and click Enable.

    The value of the Log Expiration Time parameter must be a non-zero integer. Unit: hours. Maximum value: 168.

    Important

    The Log Expiration Time parameter cannot be modified after it is specified. Proceed with caution.
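Because the Log Expiration Time parameter cannot be changed later, it can help to validate the value before you enter it. The following sketch assumes that "non-zero integer" means a positive number of hours. The function name validate_log_expiration_hours is hypothetical.

```python
MAX_LOG_EXPIRATION_HOURS = 168  # maximum value stated for Log Expiration Time


def validate_log_expiration_hours(hours: int) -> int:
    """Return hours unchanged if it is an integer from 1 to 168."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int):
        raise TypeError("Log Expiration Time must be an integer number of hours")
    if not 1 <= hours <= MAX_LOG_EXPIRATION_HOURS:
        raise ValueError(
            f"Log Expiration Time must be between 1 and {MAX_LOG_EXPIRATION_HOURS} hours"
        )
    return hours


if __name__ == "__main__":
    print(validate_log_expiration_hours(24))
```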

Step 2: Create a function and Tablestore trigger

  1. Create a function.

    1. Log on to the Function Compute console.

    2. Optional: In the upper-right corner of the page, click Go to Function Compute 3.0.

      Note
      • Function Compute 3.0 provides various enhanced features. In this example, Function Compute 3.0 is used.

      • If Back to Function Compute 2.0 is displayed in the upper-right corner of the page, skip this step because you are already in the Function Compute 3.0 console.

    3. In the left-side navigation pane, click Functions.

    4. In the top navigation bar, select a region. On the Functions page, click Create Function.

    5. On the Create Function page, select a method to create a function, configure the following parameters, and then click Create.

      In this example, Event Function is selected to illustrate how to create a function to compute data modifications in Tablestore in real time.

      Note

      You can select Event Function, Web Function, or Task Function as the method that is used to create a function to process data in Tablestore. For more information, see Selection of methods to create functions.

      • If you want data processing to be automatically triggered by data modifications in Tablestore, select Event Function. For more information, see Create an event function.

      • If you want data processing to be automatically triggered by specific HTTP requests, select Web Function. For more information, see Create a web function.

      • If you want data processing to be periodically or asynchronously triggered, select Task Function. For more information, see Create a task function.

      • Basic Settings: Configure the Function Name parameter.

      • Code: Configure the runtime and code-related information of the function.

        Parameter

        Description

        Example

        Runtime

        Select a runtime, such as Python, Java, PHP, Node.js, or Custom Container Image.

        In this example, Python 3.9 is selected.

        Code Upload Method

        Specify how you want code to be uploaded to Function Compute.

        • Use Sample Code: You can select the sample code provided by Function Compute to create a function based on your business requirements. This is the default method.

        • Upload ZIP: Select and upload a .zip file that contains your function code.

        • Upload Folder: Select and upload the folder that contains your function code.

        • OSS: Upload code from an Object Storage Service (OSS) bucket. In this case, you must specify the Bucket Name and Object Name parameters.

        In this example, Use Sample Code and the Hello, world! sample code are selected.

      • Advanced Settings: Configure instance information and the function execution timeout period.

        Parameter

        Description

        Example

        Specifications

        Configure instance specifications, such as vCPU Capacity and Memory Capacity based on your business requirements. For more information about the billing of resources, see Billing overview.

        Note

        The ratio of vCPU specification to memory capacity (in GB) must range from 1:1 to 1:4.

        0.35 vCPUs, 512 MB

        Size of Temporary Disk

        Specify the size of the disk used to temporarily store files based on your business requirements.

        Valid values:

        • 512 MB: the default value. You are not charged for using a temporary disk of this size. Function Compute provides you with a free disk space of 512 MB.

        • 10 GB: You are charged based on the disk size of 9.5 GB.

        Note

        Data can be written to all directories on the temporary disk. The directories share the space of the disk.

        The lifecycle of the temporary disk is the same as the lifecycle of the underlying instance. After the instance is recycled by the system, the data on the temporary disk is cleared. To persist files, you can use File Storage NAS or OSS. For more information, see Configure a NAS file system and Configure an OSS file system.

        512 MB

        Execution Timeout Period

        Specify the timeout period for execution of the function. The default timeout period is 180 seconds, and the maximum timeout period is 86,400 seconds.

        180

        Handler

        Specify the handler of the function. The Function Compute runtime loads and invokes the handler to process requests. If you select Web Function to create a function, skip this parameter.

        Note

        If you set the Code Upload Method parameter to Use Sample Code, retain the value of the Handler parameter. If you select another code upload method, modify the value of the Handler parameter based on your business requirements. Otherwise, an error is reported when the function runs.

        index.handler

        Time Zone

        Select the time zone of the function. After you specify the time zone of the function, the environment variable TZ is automatically added to the function. The value is the time zone that you specify.

        UTC

        Function Role

        Specify the Resource Access Management (RAM) role of the function. Function Compute uses this role to generate a temporary AccessKey pair that is used to access your Alibaba Cloud resources and passes the AccessKey pair to your code.

        AliyunFCDefaultRole

        Access to VPC

        Specify whether to allow the function to access VPC resources. For more information, see Configure network settings.

        Yes

        VPC

        Specify the VPC. This parameter is required if you set Access to VPC to Yes. Create a VPC or select the ID of an existing VPC that you want the function to access from the drop-down list.

        fc.auto.create.vpc.1632317****

        vSwitch

        Specify the vSwitch. This parameter is required if you set Access to VPC to Yes. Create a vSwitch or select the ID of an existing vSwitch from the drop-down list.

        fc.auto.create.vswitch.vpc-bp1p8248****

        Security Group

        Specify the security group. This parameter is required if you set Access to VPC to Yes. Create a security group or select an existing security group from the drop-down list.

        fc.auto.create.SecurityGroup.vsw-bp15ftbbbbd****

        Allow Default NIC to Access Internet

        Specify whether to allow the function to access the Internet by using the default network interface controller (NIC). If you select No, the function cannot access the Internet by using the default NIC of Function Compute.

        Important

        If you use a static public IP address, you must set Allow Default NIC to Access Internet to No. Otherwise, the configured static public IP address does not take effect. For more information, see Configure static public IP addresses.

        Yes

        Logging

        Specify whether to enable the logging feature. Valid values:

        • Enable: Function Compute sends function execution logs to Simple Log Service for persistent storage. You can use these logs to debug code, analyze failures, and analyze data.

          Note

          After you enable the logging feature, logs that are printed to standard output (stdout) are collected by Simple Log Service. Then, you can use these logs to debug code, analyze failures, and analyze data.

        • Disable: You cannot use Simple Log Service to store and query function execution logs.

        Enable

      • (Optional) Environment Variables: Configure the environment variables in the runtime environment of the function. For more information, see Configure environment variables.

  2. Create a Tablestore trigger.

    1. On the Function Details tab, click the Configurations tab. In the left-side navigation pane, click Triggers and then click Create Trigger.

    2. In the Create Trigger panel, configure the parameters and click OK.

      Parameter | Description | Example
      Trigger Type | The type of the trigger. Select Tablestore. | Tablestore
      Name | The name of the trigger. | Tablestore-trigger
      Version or Alias | The version or alias of the trigger. Default value: LATEST. If you want to create a trigger for another version or alias, select a version or alias from the Version or Alias drop-down list on the function details page. For more information about versions and aliases, see Manage versions and Manage aliases. | LATEST
      Instance | The name of the existing Tablestore instance. | d00dd8xm****
      Table | The name of the existing table. | mytable
      Role Name | Select AliyunTableStoreStreamNotificationRole. | AliyunTableStoreStreamNotificationRole

      Note

      After you configure the preceding parameters, click OK. The first time you create a trigger of this type, click Authorize Now in the dialog box that appears.

      After the trigger is created, it is displayed on the Triggers tab. To modify or delete a trigger, see Trigger management.
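The Specifications note in the Advanced Settings of the function requires a vCPU-to-memory ratio from 1:1 to 1:4, with memory measured in GB. The following sketch checks a candidate specification against that rule. It assumes that 1 GB equals 1,024 MB, and the function name spec_ratio_ok is hypothetical.

```python
def spec_ratio_ok(vcpu: float, memory_mb: int) -> bool:
    """Return True if memory (GB) per vCPU is within the 1:1 to 1:4 range."""
    if vcpu <= 0 or memory_mb <= 0:
        raise ValueError("vcpu and memory_mb must be positive")
    memory_gb = memory_mb / 1024  # assumption: 1 GB = 1,024 MB
    gb_per_vcpu = memory_gb / vcpu
    # A small tolerance avoids rejecting boundary values because of float rounding.
    return 1.0 - 1e-9 <= gb_per_vcpu <= 4.0 + 1e-9


if __name__ == "__main__":
    # The example specification from the table: 0.35 vCPUs and 512 MB.
    print(spec_ratio_ok(0.35, 512))
```

For the example values, 512 MB is 0.5 GB, which gives about 1.43 GB per vCPU and falls inside the allowed range.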

Step 3: Configure test parameters for the function

  1. On the Code tab of the Function Details tab, click the icon next to Test Function and select Configure Test Parameters from the drop-down list.

  2. In the Configure Test Parameters panel, click the Create New Test Event tab, set the Event Template parameter to Tablestore, and then specify the event name and event content. Click OK.

    Note

    You can select the name of a created event on the Modify Existing Test Event tab.

    A Tablestore trigger encodes incremental data in the Concise Binary Object Representation (CBOR) format to construct an event that is used to invoke a function in Function Compute. The following sample shows the format of the event content:

    {
        "Version": "Sync-v1",
        "Records": [
            {
                "Type": "PutRow",
                "Info": {
                    "Timestamp": 1506416585740836
                },
                "PrimaryKey": [
                    {
                        "ColumnName": "pk_0",
                        "Value": 1506416585881590900
                    },
                    {
                        "ColumnName": "pk_1",
                        "Value": "2017-09-26 17:03:05.8815909 +0800 CST"
                    },
                    {
                        "ColumnName": "pk_2",
                        "Value": 1506416585741000
                    }
                ],
                "Columns": [
                    {
                        "Type": "Put",
                        "ColumnName": "attr_0",
                        "Value": "hello_table_store",
                        "Timestamp": 1506416585741
                    },
                    {
                        "Type": "Put",
                        "ColumnName": "attr_1",
                        "Value": 1506416585881590900,
                        "Timestamp": 1506416585741
                    }
                ]
            }
        ]
    }

    The following table describes the parameters in the event content.

    Parameter

    Description

    Version

    The version of the payload. Example: Sync-v1. The value is a string.

    Records

    The array that stores the rows of incremental data in the data table. Each element contains the following parameters:

    • Type: the type of the operation performed on the row. Valid values: PutRow, UpdateRow, and DeleteRow. The value is a string.

    • Info: the information about the row, including the Timestamp parameter, which specifies the time when the row was last modified. The time is in UTC. The value is of the INT64 type.

    PrimaryKey

    The array that stores the primary key columns. Each element contains the following parameters:

    • ColumnName: the name of the primary key column. The value is a string.

    • Value: the value of the primary key column. The value is of the formated_value type, which can be INTEGER, STRING, or BLOB.

    Columns

    The array that stores the attribute columns. Each element contains the following parameters:

    • Type: the type of the operation performed on the attribute column. Valid values: Put, DeleteOneVersion, and DeleteAllVersions. The value is a string.

    • ColumnName: the name of the attribute column. The value is a string.

    • Value: the value of the attribute column. The value is of the formatted_value type, which can be INTEGER, BOOLEAN, DOUBLE, STRING, or BLOB.

    • Timestamp: the time when the attribute column was last modified. The time is in UTC. The value is of the INT64 type.
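The event structure described in the preceding step can be flattened into a simpler shape for downstream processing. The following is a minimal sketch that works on an already decoded event. The function name summarize_records and the output keys are hypothetical and not part of the event format.

```python
def summarize_records(event: dict) -> list:
    """Flatten each record into its operation type, primary key, and column changes."""
    summaries = []
    for record in event.get("Records", []):
        # Map primary key column names to their values.
        primary_key = {
            col["ColumnName"]: col["Value"] for col in record.get("PrimaryKey", [])
        }
        puts = {}
        deletes = []
        for col in record.get("Columns", []):
            if col["Type"] == "Put":
                puts[col["ColumnName"]] = col["Value"]
            else:
                # DeleteOneVersion or DeleteAllVersions: keep only the column name.
                deletes.append(col["ColumnName"])
        summaries.append(
            {
                "type": record["Type"],
                "primary_key": primary_key,
                "puts": puts,
                "deletes": deletes,
            }
        )
    return summaries


if __name__ == "__main__":
    sample = {
        "Version": "Sync-v1",
        "Records": [
            {
                "Type": "PutRow",
                "Info": {"Timestamp": 1506416585740836},
                "PrimaryKey": [{"ColumnName": "pk_0", "Value": 1}],
                "Columns": [
                    {
                        "Type": "Put",
                        "ColumnName": "attr_0",
                        "Value": "hello_table_store",
                        "Timestamp": 1506416585741,
                    }
                ],
            }
        ],
    }
    print(summarize_records(sample))
```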

Step 4: Write and test function code

After you create the Tablestore trigger, you can write function code and test the function code to verify whether the code is valid. Functions are automatically invoked by triggers when the data in Tablestore is updated.

  1. On the Code tab of the Function Details tab, write code in the code editor and click Deploy.

    In this example, the function code is written in Python. For information about how to write function code in other runtime environments, see Use Tablestore to trigger Function Compute in Node.js, PHP, Java, and C# runtimes.

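    import json
    import logging

    import cbor


    def get_attribute_value(record, column):
        """Return the value of the named attribute column, or None if it is absent."""
        for attr in record['Columns']:
            if attr['ColumnName'] == column:
                return attr['Value']
        return None


    def get_pk_value(record, column):
        """Return the value of the named primary key column, or None if it is absent."""
        for pk in record['PrimaryKey']:
            if pk['ColumnName'] == column:
                return pk['Value']
        return None


    def handler(event, context):
        logger = logging.getLogger()
        logger.info("Begin to handle event")
        # Events sent by a Tablestore trigger are CBOR-encoded.
        # JSON decoding is used here to test the function with a test event.
        # records = cbor.loads(event)
        records = json.loads(event)
        for record in records['Records']:
            logger.info("Handle record: %s", record)
            pk_0 = get_pk_value(record, "pk_0")
            attr_0 = get_attribute_value(record, "attr_0")
            logger.info("pk_0: %s, attr_0: %s", pk_0, attr_0)
        return 'OK'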
    
  2. Click Test Function.

    After the function is executed, you can view the results on the Code tab.

  3. Modify and deploy the code.

    1. If OK is returned when records = json.loads(event) is used, change the line to records = cbor.loads(event). Events that a Tablestore trigger sends are encoded in CBOR.

    2. Click Deploy.

    When data is written to Tablestore, the related function logic is triggered.
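As an alternative to editing the decoding line between testing and deployment, a function could accept both encodings. The following sketch is an assumption rather than the documented approach: it tries JSON first and falls back to CBOR. The function name decode_event is hypothetical, and the cbor package is imported optionally so that the sketch also runs where the package is not installed.

```python
import json

try:
    import cbor  # third-party package that the sample function code also uses
except ImportError:
    cbor = None  # allows local runs without the package


def decode_event(event):
    """Decode an event given as JSON text or bytes, or as CBOR bytes."""
    try:
        # JSON is tried first. A CBOR-encoded map does not start with a valid
        # UTF-8 character, so JSON decoding fails instead of misreading it.
        return json.loads(event)
    except ValueError:
        # ValueError covers json.JSONDecodeError and UnicodeDecodeError.
        pass
    if cbor is None:
        raise RuntimeError("Event is not JSON and the cbor package is not installed")
    return cbor.loads(event)


if __name__ == "__main__":
    print(decode_event(b'{"Version": "Sync-v1", "Records": []}'))
```

JSON is tried before CBOR because a CBOR decoder can interpret the first byte of a JSON document as a valid CBOR item and return a wrong result without raising an error.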

FAQ

  • If you cannot create a Tablestore trigger in a region, check whether the region supports Tablestore triggers. For more information, see Usage notes.

  • If you cannot find an existing Tablestore data table when you create a Tablestore trigger, check whether the data table resides in the same region as the associated service in Function Compute.

  • If an error indicating that the client canceled the invocation is repeatedly reported when you use a Tablestore trigger, the timeout period configured for function execution on the client is usually shorter than the actual function execution duration. In this case, we recommend that you increase the client timeout period. For more information, see What do I do if the client is disconnected and the message "Invocation canceled by client" is reported?

  • If data is written to a Tablestore data table but the associated Tablestore trigger is not triggered, you can troubleshoot the issue by performing the following steps. For more information about how to troubleshoot trigger failures, see What do I do if a trigger cannot trigger function execution?

    • Make sure that the Stream feature is enabled for the data table. For more information, see Step 1: Enable the Stream feature for the data table.

    • Check whether the role is correctly configured when you create the trigger. You can use the default role AliyunTableStoreStreamNotificationRole. For more information, see Create a Tablestore trigger.

    • View the function execution logs to check whether the function failed to be executed. If a function fails to be executed, the function is retried until the log data in Tablestore expires.

Appendix: Grant Function Compute the permissions to access Tablestore

To use the features provided by Function Compute in Tablestore, you must grant Function Compute the permissions to access Tablestore. To grant coarse-grained permissions, you can use the default RAM role AliyunFCDefaultRole that is provided by Function Compute. To grant fine-grained permissions, you can use a custom RAM role to which the required policy is attached.

  • Use the default RAM role

    Attach the AliyunOTSFullAccess policy to the AliyunFCDefaultRole RAM role to authorize the RAM role to manage Tablestore. For more information, see Grant permissions to a RAM role.

    Note

    The AliyunFCDefaultRole RAM role is the default RAM role of Function Compute. By default, this RAM role does not have the permissions to manage Tablestore.

    • The first time you use the RAM role, you must grant the RAM role the permissions to manage Tablestore.

    • If the RAM role already has the permissions to manage Tablestore, skip this step.

  • Use a custom RAM role

    For more information, see Example: Grant Function Compute the permissions to access Object Storage Service (OSS).