All Products
Search
Document Center

Data Transmission Service:Add additional columns to a destination table

Last Updated:Aug 23, 2024

Data Transmission Service (DTS) allows you to add columns to a destination table and assign values to the columns when you create a data synchronization or migration task. You can filter the values in the additional columns to perform operations, such as metadata management, sorting, and deduplication, on the data that is transmitted to the destination table. This facilitates data management and processing.

Usage notes

  • You can add additional columns to a destination table in the following scenarios:

    • The Database Type of the destination database is DataHub, Lindorm, Kafka, or ClickHouse.

    • The Database Type of the source database is DB2 for LUW or DB2 for iSeries (AS/400), and the Database Type of the destination database is MySQL or PolarDB for MySQL.

    • The Database Type of the source database is MySQL, MariaDB, or PolarDB for MySQL, and the Database Type of the destination database is MySQL, MariaDB, or PolarDB for MySQL.

    • The Database Type of the source database is MySQL, and the Database Type of the destination database is ApsaraDB for Redis Enhanced Edition (Tair), AnalyticDB for PostgreSQL, or AnalyticDB for MySQL 3.0.

    • The Database Type of the source database is PolarDB for PostgreSQL, and the Database Type of the destination database is AnalyticDB for PostgreSQL.

  • Before you add additional columns to a destination table in a data synchronization task, make sure that the names of the additional columns are different from those of the existing columns in the destination table.

  • If the source database of the data synchronization task is a MongoDB database, the existing collections in the destination database cannot contain the _id and _value fields. Otherwise, the data synchronization task fails.

  • When you create a data synchronization or migration task, if you right-click a database in the Selected Objects section to add columns, DTS adds the specified additional columns to all tables in the corresponding destination database.

Procedure

This example shows you how to add additional columns to a destination table in a data synchronization task.

  1. Go to the Data Synchronization Tasks page.

    1. Log on to the Data Management (DMS) console.

    2. In the top navigation bar, click DTS.

    3. In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.

    Note
  2. On the right side of Data Synchronization Tasks, select the region in which the data synchronization instance resides.

    Note

    If you use the new DTS console, you must select the region in which the data synchronization instance resides in the top navigation bar.

  3. Click Create Task. On the Create Data Synchronization Task page, configure the source and destination databases based on your business requirements.

    Note

    If you want to add additional columns to a destination table in a data synchronization task that is in the Running state, click Reselect Objects in the Actions column of the task.

  4. Go to the Select Objects step and complete the configurations.

    In this step, you can add additional columns.

    1. In the Source Objects section, select a database or table as the object to be synchronized and click the image.png icon to move the object to the Selected Objects section.

    2. In the Selected Objects section, right-click the database or table to be synchronized.

    3. In the Additional Columns section of the dialog box that appears, click + Add Column.

    4. Configure the Column Name, Type, and Assign Value parameters.

      Note

      You can click the ... icon to the right of the Assign Value field to customize an expression for assigning a value to an addition column. For more information, see the Configure the Assign Value parameter section of this topic.

    5. Click OK.

  5. Perform subsequent steps as prompted to complete the configurations of the data synchronization task.

    Note

    If the extract, transform, and load (ETL) feature is enabled for the data synchronization task, DTS calculates the value of each additional column based on the Assign Value parameter, executes the ETL script in the data synchronization task to calculate a final value, and then synchronizes the final value to the destination database.

Configure the Assign Value parameter

The value of the Assign Value parameter of an additional column consists of constants, variables, operators, and expression functions.

Note
  • The parameter setting is compatible with the domain-specific language (DSL) syntax of ETL.

  • In an expression, the column name is enclosed in backquotes (``) rather than single quotation marks ('')

  • Constants

    Data type

    Example

    int

    123

    float

    123.4

    string

    "hello1_world"

    boolean

    true or false

    datetime

    DATETIME('2021-01-01 10:10:01')

  • Variables

    Variable

    Description

    Data type

    Example

    __TB__

    The name of the table in the database.

    string

    table

    __DB__

    The name of the database.

    string

    mydb

    __OPERATION__

    The type of the operation.

    string

    __OP_INSERT__,__OP_UPDATE__,__OP_DELETE__

    __COMMIT_TIMESTAMP__

    The time when the transaction was committed.

    datetime

    '2021-01-01 10:10:01'

    `column`

    The name of the column.

    string

    `id` or `name`

    __SCN__

    The system change number (SCN). This variable is used to record the version of a transaction that was committed in the database and the time when the transaction was committed. Each SCN is unique.

    string

    22509****

    __ROW_ID__

    The address ID of a data record. This variable is used to locate the data record and is unique.

    string

    AAAgWHAAKAAJgX****

  • Expression functions

    • Arithmetic operations

      Purpose

      Syntax

      Valid values

      Return value

      Example

      Addition (+)

      • op_sum(value1, value2)

      • value1+value2

      • value1: an integer or a floating-point number.

      • value2: an integer or a floating-point number.

      If value1 and value2 are integers, an integer is returned. Otherwise, a floating-point number is returned.

      • op_sum(`col1`, 1.0)

      • `col1`+1.0

      Subtraction (-)

      • op_sub(value1, value2)

      • value1-value2

      • value1: an integer or a floating-point number.

      • value2: an integer or a floating-point number.

      If value1 and value2 are integers, an integer is returned. Otherwise, a floating-point number is returned.

      • op_sub(`col1`, 1.0)

      • `col1`-1.0

      Multiplication (*)

      • op_mul(value1, value2)

      • value1*value2

      • value1: an integer or a floating-point number.

      • value2: an integer or a floating-point number.

      If value1 and value2 are integers, an integer is returned. Otherwise, a floating-point number is returned.

      • op_mul(`col1`, 1.0)

      • `col1`*1.0

      Division (/)

      • op_div_true(value1, value2)

      • value1/value2

      • value1: an integer or a floating-point number.

      • value2: an integer or a floating-point number.

      If value1 and value2 are integers, an integer is returned. Otherwise, a floating-point number is returned.

      • op_div_true(`col1`, 2.0). In this example, if the value of col1 is 15, 7.5 is returned.

      • `col1`/1.0

      Modulo

      op_mod(value1, value2)

      • value1: an integer or a floating-point number.

      • value2: an integer or a floating-point number.

      If value1 and value2 are integers, an integer is returned. Otherwise, a floating-point number is returned.

      op_mod(`col1`, 10). In this example, if the value of col1 is 23, 3 is returned.

    • Logical operations

      Operation

      Syntax

      Valid value

      Return value

      Example

      Equal to

      op_eq(value1, value2)

      • value1: an integer, a floating-point number, or a string.

      • value2: an integer, a floating-point number, or a string.

      true or false

      op_eq(`col1`, 23)

      Greater than

      op_gt(value1, value2)

      • value1: an integer, a floating-point number, or a string.

      • value2: an integer, a floating-point number, or a string.

      true or false

      op_gt(`col1`, 1.0)

      Less than

      op_lt(value1, value2)

      • value1: an integer, a floating-point number, or a string.

      • value2: an integer, a floating-point number, or a string.

      true or false

      op_lt(`col1`, 1.0)

      Greater than or equal to

      op_ge(value1, value2)

      • value1: an integer, a floating-point number, or a string.

      • value2: an integer, a floating-point number, or a string.

      true or false

      op_ge(`col1`, 1.0)

      Less than or equal to

      op_le(value1, value2)

      • value1: an integer, a floating-point number, or a string.

      • value2: an integer, a floating-point number, or a string.

      true or false

      op_le(`col1`, 1.0)

      AND

      op_and(value1, value2)

      • value1: a Boolean value.

      • value2: a Boolean value.

      true or false

      op_and(`is_male`, `is_student`)

      OR

      op_or(value1, value2)

      • value1: a Boolean value.

      • value2: a Boolean value.

      true or false

      op_or(`is_male`, `is_student`)

      IN

      op_in(value, json_array)

      • value: an arbitrary value.

      • json_array: a JSON string.

      true or false

      op_in(`id`,json_array('["0","1","2","3","4","5","6","7","8"]'))

      Determine whether the value is empty

      op_is_null(value)

      value: an arbitrary value.

      true or false

      op_is_null(`name`)

      Determine whether the value is not empty

      op_is_not_null(value)

      value: an arbitrary value.

      true or false

      op_is_not_null(`name`)

    • String functions

      Operation

      Syntax

      Valid value

      Return value

      Example

      Append strings

      op_add(str_1,str_2,...,str_n)

      • str_1: a string.

      • str_2: a string.

      • ...

      • str_n: a string.

      The string after the append operation.

      op_add(`col`,'hangzhou','dts')

      Format strings and append strings

      str_format(format, value1, value2, value3, ...)

      • format: a string. Braces ({}) are used as placeholders. Example: "part1: {}, part2: {}".

      • value1: an arbitrary value.

      • value2: an arbitrary value.

      The string after the format operation.

      str_format("part1: {}, part2: {}", `col1`, `col2`). In this example, if the value of col1 is ab and the value of col2 is 12, "part1: ab, part2: 12" is returned.

      Replace strings

      str_replace(original, oldStr, newStr, count)

      • original: the original string.

      • oldStr: the string to be replaced.

      • newStr: the string after the replacement.

      • count: an integer that indicates the maximum number of times that a string can be replaced. A value of -1 indicates that all oldStr is replaced with newStr.

      The string after the replace operation.

      Example 1: str_replace(`name`, "a", 'b', 1). In this example, if the name is aba, bba is returned. Example 2: str_replace(`name`, "a", 'b', -1). In this example, if the name is aba, bbb is returned.

      Replace strings in the values of fields of all string types, such as the VARCHAR, TEXT, or CHAR type

      tail_replace_string_field(search, replace, all)

      • search: the string to be replaced.

      • replace: the string after the replacement.

      • all: indicates whether to replace all matched strings. Only a value of true is supported.

        Note

        If you do not need to replace all matched strings, use the str_replace() function.

      The string after the replace operation.

      tail_replace_string_field('\u000f','',true). In this example, all "\u000f" strings in the field values of the string type are replaced with spaces.

      Remove specific characters at the start and end of a string

      str_strip(string_val, charSet)

      • string_val: the original string.

      • char_set: the set of the first characters and the last characters of the string.

      The string after the remove operation.

      str_strip(`name`, 'ab'). In this example, if the name is axbzb, xbz is returned.

      Convert strings to lowercase letters

      str_lower(value)

      value: a column of the string type or a string constant.

      The string after the convert operation.

      str_lower(`str_col`)

      Convert strings to uppercase letters

      str_upper(value)

      value: a column of the string type or a string constant.

      The string after the convert operation.

      str_upper(`str_col`)

      Convert strings to numbers

      cast_string_to_long(value)

      value: a string.

      The integer after the convert operation.

      cast_string_to_long(`col`)

      Convert numbers to strings

      cast_long_to_string(value)

      value: an integer.

      The string after the convert operation.

      cast_long_to_string(`col`)

      Count strings

      str_count(str,pattern)

      • str: a column of the string type or a string constant.

      • pattern: the substring to query.

      The number of times for which the substring appears.

      str_count(`str_col`, 'abc'). In this example, if the value of str_col is zabcyabcz, 2 is returned.

      Query strings

      str_find(str, pattern)

      • str: a column of the string type or a string constant.

      • pattern: the substring to query.

      The position in which the substring matches for the first time. If no match is found, -1 is returned.

      str_find(`str_col`, 'abc'). In this example, if the value of str_col is xabcy, 1 is returned.

      Determine whether a string contains only letters

      str_isalpha(str)

      str: a column of the string type or a string constant.

      true or false

      str_isalpha(`str_col`)

      Determine whether a string contains only digits

      str_isdigit(str)

      • str: a column of the string type or a string constant.

      true or false

      str_isdigit(`str_col`)

      Regular expression match

      regex_match(str,regex)

      • str: a column of the string type or a string constant.

      • regex: a regular expression.

      true or false

      regex_match(__TB__,'user_\\d+')

      Mask part of a string with specific characters. This operation can be used for data masking. For example, mask the last four digits of a mobile phone number with four asterisks (*).

      str_mask(str, start, end, maskStr)

      • str: a column of the string type or a string constant.

      • start: an integer that indicates the start position of the masking. The minimum value is 0.

      • end: an integer that indicates the end position of the masking. The maximum value is the length of the string minus 1.

      • maskStr: a string. The length is 1. Example: #.

      The string whose part from start to end is masked with the specified characters.

      str_mask(`phone`, 7, 10, '#')

      Truncate a string from the cond string to the last character

      substring_after(str, cond)

      • str: the original string.

      • cond: a string.

      The string after the truncate operation.

      Note

      The return value does not contain the cond string.

      substring_after(`col`, 'abc')

      Truncate a string from the first character to the cond string

      substring_before(str, cond)

      • str: the original string.

      • cond: a string.

      The string after the truncate operation.

      Note

      The return value does not contain the cond string.

      substring_before(`col`, 'efg')

      Truncate a string from the cond1 string to the cond2 string

      substring_between(str, cond1, cond2)

      • str: the original string.

      • cond1: a string.

      • cond2: a string.

      The string after the truncate operation.

      Note

      The return value does not contain the cond1 and cond2 strings.

      substring_between(`col`, 'abc','efg')

      Determine whether the value is of the string type

      is_string_value(value)

      value: a string or a column name.

      true or false

      is_string_value(`col1`)

      Replace strings in the field values of the string type. The replacement starts from the end of the field values.

      tail_replace_string_field(search, replace, all)

      search: the string to be replaced.

      replace: the string after the replacement.

      all: indicates whether to replace all matched strings. Valid values: true and false.

      The string after the replace operation.

      In the following example, all "\u000f" strings in the field values of the string type are replaced with spaces:

      tail_replace_string_field('\u000f','',true)

      Query the value of a field in a MongoDB document

      bson_value("field1","field2","field3", ...)

      • field1: the name of the level -1 field.

      • field2: the name of the level -2 field.

      The value of the corresponding field in the document.

      • e_set(`user_id`, bson_value("id"))

      • e_set(`user_name`, bson_value("person","name"))

    • Conditional expressions

      Operation

      Syntax

      Valid value

      Return value

      Example

      Return a value based on the result of the condition. This expression is similar to the ternary operator (? :) in the C language.

      (cond ? val_1 : val_2)

      • cond: a field or an expression whose value is a Boolean value.

      • val_1: return value 1.

      • val_2: return value 2.

        Note

        The values of val_1 and val_2 must be of the same data type.

      If the value of cond is true, val_1 is returned. Otherwise, val_2 is returned.

      (id>1000? 1 : 0)

    • Time functions

      Operation

      Syntax

      Valid value

      Return value

      Example

      Query the current time

      dt_now()

      N/A

      A value of the DATETIME data type that is accurate to seconds.

      dts_now()

      dt_now_millis()

      N/A

      A value of the DATETIME data type that is accurate to milliseconds.

      dt_now_millis()

      Convert a UTC timestamp in seconds to a DATETIME value

      dt_fromtimestamp(value,[timezone])

      • value: an integer.

      • timezone: the time zone. This parameter is optional.

      A value of the DATETIME data type that is accurate to seconds.

      dt_fromtimestamp(1626837629)

      dt_fromtimestamp(1626837629,'GMT+08')

      Convert a UTC timestamp in milliseconds to a DATETIME value

      dt_fromtimestamp_millis(value,[timezone])

      • value: an integer.

      • timezone: the time zone. This parameter is optional.

      A value of the DATETIME data type that is accurate to milliseconds.

      dt_fromtimestamp_millis(1626837629123);

      dt_fromtimestamp_millis(1626837629123,'GMT+08')

      Convert a DATETIME value to a UTC timestamp in seconds

      dt_parsetimestamp(value,[timezone])

      • value: a value of the DATETIME data type.

      • timezone: the time zone. This parameter is optional.

      The integer after the convert operation.

      dt_parsetimestamp(`datetime_col`)

      dt_parsetimestamp(`datetime_col`,'GMT+08')

      Convert a DATETIME value to a UTC timestamp in milliseconds

      dt_parsetimestamp_millis(value,[timezone])

      • value: a value of the DATETIME data type.

      • timezone: the time zone. This parameter is optional.

      The integer after the convert operation.

      dt_parsetimestamp_millis(`datetime_col`)

      dt_parsetimestamp_millis(`datetime_col`,'GMT+08')

      Convert a DATETIME value to a string

      dt_str(value, format)

      • value: a value of the DATETIME data type.

      • format: a string that indicates the time format. Example: yyyy-MM-dd HH:mm:ss.

      The string after the convert operation.

      dt_str(`col1`, 'yyyy-MM-dd HH:mm:ss')

      Convert a string to a DATETIME value

      dt_strptime(value,format)

      • value: a string.

      • format: a string that indicates the time format. Example: yyyy-MM-dd HH:mm:ss.

      The DATETIME value after the convert operation.

      dt_strptime('2021-07-21 03:20:29', 'yyyy-MM-dd hh:mm:ss')

      Change the time by increasing or decreasing the values of one or more time granularities, such as year, month, day, hour, minute, and second

      dt_add(value, [years=intVal],

      [months=intVal],

      [days=intVal],

      [hours=intVal],

      [minutes=intVal]

      )

      • value: a value of the DATETIME data type.

      • intVal: an integer.

        Note

        A minus sign (-) indicates that the value of the specified time granularity is decreased.

      The DATETIME value after the change operation.

      • dt_add(datetime_col,years=-1)

      • dt_add(datetime_col,years=1,months=1)