Tablestore: Create a delivery task

Last Updated: Jun 14, 2024

To deliver data in a Tablestore data table to an Object Storage Service (OSS) bucket, you can call the CreateDeliveryTask operation to create a delivery task.

Important

Make sure that the version of the installed Tablestore SDK for Go supports the data delivery feature.

Prerequisites

  • OSS is activated. A bucket is created in the same region as the Tablestore instance. For more information, see Activate OSS.

  • The Tablestore service-linked role (AliyunServiceRoleForOTSDataDelivery) is created in the Tablestore console. The Alibaba Cloud Resource Name (ARN) of the role is recorded. For more information, see Create a data delivery task.

    To obtain the ARN of the Tablestore service-linked role (AliyunServiceRoleForOTSDataDelivery), perform the following operations in the RAM console:

    On the Roles page, search for AliyunServiceRoleForOTSDataDelivery. Then, click the RAM role name. On the role details page, view and copy the ARN of the role.

  • A TableStoreClient instance is initialized. For more information, see Initialize an OTSClient instance.

  • A data table is created, and data is written to the data table.

Parameters

Parameter

Description

TableName

The name of the data table.

TaskName

The name of the delivery task.

The name must be 3 to 16 characters in length and can contain only lowercase letters, digits, and hyphens (-). The name must start and end with a lowercase letter or digit.

TaskConfig

The configuration of the delivery task, which includes the following fields:

  • OssPrefix: the prefix of the destination directory in the bucket. Tablestore delivers data to this directory. The path of the destination directory supports the following time variables: $yyyy, $MM, $dd, $HH, and $mm.

    • When the path uses time variables, OSS directories are generated dynamically based on the time when data is written. Data is partitioned in the same naming style that Hive uses for partitions, so objects in OSS are organized and partitioned by time.

    • When the path does not use time variables, all files are delivered to an OSS directory whose name contains the specified prefix.

  • OssBucket: the name of the OSS bucket.

  • OssEndpoint: the endpoint of the region in which the OSS bucket is deployed.

  • OssRoleName: the ARN of the Tablestore service-linked role.

  • Format: the format of the delivered data. The delivered data is stored in the Parquet format. By default, the data delivery feature uses PLAIN to encode data of all types.

  • EventTimeColumn: the event time column. If you specify this parameter, data is partitioned based on the time values in the specified column. If you do not specify this parameter, data is partitioned based on the time when the data is written to Tablestore.

  • Schema: the columns that you want to deliver. For each column, you must specify the source field, the destination field, and the destination field type.

    The schema determines the names of the source and destination fields and the order in which the source fields are delivered. After data is delivered to OSS, the data is arranged in the order of the fields in the schema.

    Important

    The data types of the source and destination fields must be consistent. Otherwise, the fields are discarded as dirty data. For more information about field type mappings, see Data type mapping.

TaskType

The type of the delivery task. Default value: BaseIncTask. Valid values:

  • IncTask: the incremental data delivery type. Only incremental data is synchronized.

  • BaseTask: the full data delivery type. All data in the table is scanned and synchronized.

  • BaseIncTask: the full and incremental data delivery type. After the full data is synchronized, Tablestore synchronizes the incremental data.

    While Tablestore synchronizes incremental data, you can view the time when data was last delivered and the current status of the delivery task.
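
The TaskName rule described above can be checked on the client before CreateDeliveryTask is called, which avoids a round trip for an obviously invalid name. The following is a minimal sketch under that assumption. The validTaskName helper and the taskNamePattern variable are hypothetical names introduced for illustration and are not part of the Tablestore SDK for Go. The service remains the authority on whether a name is accepted.

```go
package main

import (
	"fmt"
	"regexp"
)

// taskNamePattern encodes the documented rule: 3 to 16 characters,
// only lowercase letters, digits, and hyphens, and the first and last
// characters must be a lowercase letter or digit.
var taskNamePattern = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{1,14}[a-z0-9]$`)

// validTaskName is a hypothetical helper (not an SDK function) that
// reports whether name satisfies the documented TaskName rule.
func validTaskName(name string) bool {
	return taskNamePattern.MatchString(name)
}

func main() {
	candidates := []string{
		"sample-task-01",        // satisfies the rule
		"ab",                    // too short
		"-bad-start",            // starts with a hyphen
		"Upper-Case",            // contains uppercase letters
		"this-name-is-too-long", // longer than 16 characters
	}
	for _, name := range candidates {
		fmt.Printf("%-24q valid=%v\n", name, validTaskName(name))
	}
}
```

A check like this is only a convenience. Errors returned by CreateDeliveryTask still need to be handled.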

Examples

The following sample code shows how to create a delivery task for a data table:

import (
    "fmt"
    "log"

    "github.com/aliyun/aliyun-tablestore-go-sdk/tablestore"
)

// CreateTaskSample creates a delivery task that first delivers the full data
// of a table to OSS and then keeps delivering incremental data.
func CreateTaskSample(client *tablestore.TableStoreClient) {
    createTask := &tablestore.CreateDeliveryTaskRequest{
        TableName: "<TABLE_NAME>",
        TaskName:  "<TASK_NAME>",
        TaskType:  tablestore.BaseIncTask,
        TaskConfig: &tablestore.OSSTaskConfig{
            // Time variables partition the delivered objects by year and month.
            OssPrefix:   "sample/year=$yyyy/month=$MM",
            OssBucket:   "datadeliverytest",
            OssEndpoint: "oss-cn-hangzhou.aliyuncs.com",
            // The ARN of the AliyunServiceRoleForOTSDataDelivery role.
            OssRoleName: "acs:ram::17************45:role/aliyunserviceroleforotsdatadelivery",
            // Each entry maps a source column to a destination column and type.
            Schema: []*tablestore.TaskSchema{
                {
                    ColumnName:    "PK1",
                    OssColumnName: "PK1",
                    Type:          tablestore.ParquetInt64,
                },
                {
                    ColumnName:    "PK2",
                    OssColumnName: "PK2",
                    Type:          tablestore.ParquetUtf8,
                },
                {
                    ColumnName:    "Col1",
                    OssColumnName: "Col1",
                    Type:          tablestore.ParquetDouble,
                },
            },
        },
    }
    createResp, err := client.CreateDeliveryTask(createTask)
    if err != nil {
        log.Fatalf("create delivery task failed: %v", err)
    }
    fmt.Println("create delivery task succeeded:", createResp.RequestId)
}