CreateDIJob - DataWorks - Alibaba Cloud Documentation Center

Creates a new-version synchronization task.

Debugging

You can call this operation directly in OpenAPI Explorer, which saves you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

The following table shows the authorization information for this API operation. You can use this information in the Action element of a RAM policy to grant a RAM user or RAM role the permissions to call this operation. The columns are described as follows:

Operation: the value that you can use in the Action element to specify the operation on a resource.
Access level: the access level of each operation. The levels are read, write, and list.
Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
- The required resource types are displayed in bold characters.
- If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
Condition Key: the condition key that is defined by the cloud service.
Associated operation: other operations that the RAM user or the RAM role must have permissions to perform to complete the operation. To complete the operation, the RAM user or the RAM role must have the permissions to perform the associated operations.

Operation	Access level	Resource type	Condition key	Associated operation
dataworks:CreateDIJob	create	All Resources	none	none
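
As a minimal sketch, a custom RAM policy that grants this action could be assembled as shown below. The layout (Version, Statement, Effect, Action, Resource) follows the general RAM policy syntax, and the wildcard resource reflects the All Resources entry in the table above. The variable names are illustrative only.

```python
import json

# Custom RAM policy that allows calling CreateDIJob.
# Resource is "*" because the table lists All Resources for this operation.
policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "dataworks:CreateDIJob",
            "Resource": "*",
        }
    ],
}

# Serialize the policy so it can be pasted into a policy editor.
policy_document = json.dumps(policy, indent=2)
print(policy_document)
```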

Request parameters

Parameter	Type	Required	Description	Example
DestinationDataSourceType	string	Yes	The destination type. Valid values: Hologres and Hive.	Hologres
Description	string	No	The description of the synchronization task.
SourceDataSourceType	string	Yes	The source type. Set this parameter to MySQL.	MySQL
ProjectId	long	No	The DataWorks workspace ID. You can log on to the DataWorks console and go to the Workspace page to query the ID. You must configure this parameter to specify the DataWorks workspace to which the API operation is applied.	10000
JobName	string	Yes	The name of the synchronization task.	mysql_to_holo_sync_8772
MigrationType	string	Yes	The synchronization type. Valid values: FullAndRealtimeIncremental (one-time full synchronization and real-time incremental synchronization), RealtimeIncremental (real-time incremental synchronization), Full (full synchronization), OfflineIncremental (batch incremental synchronization), and FullAndOfflineIncremental (one-time full synchronization and batch incremental synchronization).	FullAndRealtimeIncremental
SourceDataSourceSettings	array<object>	Yes	The settings of the source. Only a single source is supported.
	object	Yes	The settings of the source. Only a single source is supported.
DataSourceName	string	No	The name of the data source.	mysql_datasource_1
DataSourceProperties	object	No	The properties of the data source.
Encoding	string	No	The encoding format of the database.	UTF-8
Timezone	string	No	The time zone.	GMT+8
DestinationDataSourceSettings	array<object>	Yes	The settings of the destination. Only a single destination is supported.
	object	Yes	The settings of the destination. Only a single destination is supported.
DataSourceName	string	No	The name of the data source.	holo_datasource_1
ResourceSettings	object	Yes	The resource settings.
OfflineResourceSettings	object	No	The resource used for batch synchronization.
RequestedCu	double	No	The number of compute units (CUs) in the resource group for Data Integration that are used for batch synchronization.	2.0
ResourceGroupIdentifier	string	No	The identifier of the resource group for Data Integration used for batch synchronization.	S_res_group_111_222
RealtimeResourceSettings	object	No	The resource used for real-time synchronization.
RequestedCu	double	No	The number of CUs in the resource group for Data Integration that are used for real-time synchronization.	2.0
ResourceGroupIdentifier	string	No	The identifier of the resource group for Data Integration used for real-time synchronization.	S_res_group_111_222
ScheduleResourceSettings	object	No	The resource used for scheduling.
RequestedCu	double	No	The number of CUs in the resource group for scheduling that are used for batch synchronization.	2.0
ResourceGroupIdentifier	string	No	The identifier of the resource group for scheduling used for batch synchronization.	S_res_group_235454102432001_1579085295030
TransformationRules	array<object>	No	The list of transformation rules for objects involved in the synchronization task. Each entry in the list defines a transformation rule.
	object	No	The transformation rule for objects involved in the synchronization task.
RuleActionType	string	No	The action type. Valid values: DefinePrimaryKey, Rename, AddColumn, HandleDml, DefineIncrementalCondition, DefineCycleScheduleSettings, DefineRuntimeSettings, and DefinePartitionKey.	Rename
RuleExpression	string	No	The expression of the rule. The expression must be a JSON string. The format depends on the rule type:
Renaming rule. Example: {"expression":"${srcDatasourceName}_${srcDatabaseName}_0922","variables":[{"variableName":"srcDatabaseName","variableRules":[{"from":"fromdb","to":"todb"}]}]}
- expression: the expression of the renaming rule. You can use the following variables in an expression: ${srcDatasourceName} (the name of the source), ${srcDatabaseName} (the name of a source database), and ${srcTableName} (the name of a source table).
- variables: the generation rules for the variables used in the expression. By default, a variable takes the original value of the object that it indicates. You can define a group of string replacement rules to change the original value based on your business requirements.
- variableName: the name of the variable. The variable name cannot be enclosed in ${}.
- variableRules: the string replacement rules for the variable. The system runs the string replacement rules in sequence. from specifies the original string. to specifies the new string.
Rule used to add a specific field to the destination and assign a value to the field. Example: {"columns":[{"columnName":"my_add_column","columnValueType":"Constant","columnValue":"123"}]} If you do not configure such a rule, no fields are added to the destination and no values are assigned by default.
- columnName: the name of the field that you want to add.
- columnValueType: the value type of the field. Valid values: Constant and Variable.
- columnValue: the value of the field. If you set columnValueType to Constant, set columnValue to a custom constant of the STRING type. If you set columnValueType to Variable, set columnValue to one of the following built-in variables: EXECUTE_TIME (LONG, the execution time), DB_NAME_SRC (STRING, the name of a source database), DATASOURCE_NAME_SRC (STRING, the name of the source), TABLE_NAME_SRC (STRING, the name of a source table), DB_NAME_DEST (STRING, the name of a destination database), DATASOURCE_NAME_DEST (STRING, the name of the destination), TABLE_NAME_DEST (STRING, the name of a destination table), and DB_NAME_SRC_TRANSED (STRING, the database name obtained after a transformation).
Rule used to specify primary key fields for a destination table. Example: {"columns":["ukcolumn1","ukcolumn2"]} If you do not configure such a rule, the primary key fields in the mapped source table are used for the destination table by default.
- If the destination table is an existing table, Data Integration does not modify the schema of the destination table. If the specified primary key fields do not exist in the destination table, an error is reported when the synchronization task starts to run.
- If the destination table is automatically created by the system, Data Integration automatically creates the schema of the destination table. The schema contains the primary key fields that you specify. If the specified primary key fields do not exist in the destination table, an error is reported when the synchronization task starts to run.
Rule used to process DML messages. Example: {"dmlPolicies":[{"dmlType":"Delete","dmlAction":"Filter","filterCondition":"id > 1"}]} If you do not configure such a rule, the default processing policy for messages generated for insert, update, and delete operations is Normal.
- dmlType: the DML operation. Valid values: Insert, Update, and Delete.
- dmlAction: the processing policy for DML messages. Valid values: Normal, Ignore, Filter, and LogicalDelete. Filter indicates conditional processing. You can set dmlAction to Filter only when dmlType is set to Update or Delete.
- filterCondition: the condition used to filter DML messages. This parameter is required only when dmlAction is set to Filter.	{"expression":"${srcDatasourceName}_${srcDatabaseName}"}
RuleName	string	No	The name of the rule. If the values of the RuleActionType parameter and the RuleTargetType parameter are the same for multiple transformation rules, you must make sure that the transformation rule names are unique.	rename_rule_1
RuleTargetType	string	No	The type of the object on which you want to perform the action. Valid values: Table and Schema.	Table
TableMappings	array<object>	Yes	The list of table mappings. Each entry pairs the rules used to select synchronization objects in the source with the transformation rules applied to the selected objects.
	object	Yes	The mapping between a rule used to select synchronization objects in the source and a transformation rule applied to the selected synchronization objects.
SourceObjectSelectionRules	array<object>	No	The list of rules used to select synchronization objects in the source. The objects can be databases or tables.
	object	No	The rule used to select synchronization objects in the source. The objects can be databases or tables.
Action	string	No	The operation that is performed to select objects. Valid values: Include and Exclude.	Include
Expression	string	No	The expression.	mysql_table_1
ExpressionType	string	No	The expression type. Valid values: Exact and Regex.	Exact
ObjectType	string	No	The object type. Valid values: Table and Database.	Table
TransformationRules	array<object>	No	The list of transformation rules that you want to apply to the synchronization objects selected from the source. Each entry in the list defines a transformation rule.
	object	No	The transformation rule that you want to apply to the synchronization objects selected from the source.
RuleName	string	No	The name of the rule. If the values of the RuleActionType parameter and the RuleTargetType parameter are the same for multiple transformation rules, you must make sure that the transformation rule names are unique.	rename_rule_1
RuleActionType	string	No	The action type. Valid values: DefinePrimaryKey, Rename, AddColumn, HandleDml, DefineIncrementalCondition, DefineCycleScheduleSettings, DefineRuntimeSettings, and DefinePartitionKey.	Rename
RuleTargetType	string	No	The type of the object on which you want to perform the action. Valid values: Table and Schema.	Table
JobSettings	object	No	The task-level settings of the synchronization task. The settings include processing policies for DDL messages, policies for data type mappings between source fields and destination fields, and runtime parameters of the synchronization task.
ChannelSettings	string	No	The channel control settings for the synchronization task. The value of this parameter must be a JSON string.	{"structInfo":"MANAGED","storageType":"TEXTFILE","writeMode":"APPEND","partitionColumns":[{"columnName":"pt","columnType":"STRING","comment":""}],"fieldDelimiter":""}
ColumnDataTypeSettings	array<object>	No	The data type mappings between source fields and destination fields.
	object	No	The data type mapping between a source field and a destination field.
DestinationDataType	string	No	The data type of the destination field.	text
SourceDataType	string	No	The data type of the source field.	bigint
CycleScheduleSettings	object	No	The settings for periodic scheduling.
CycleMigrationType	string	No	The synchronization type that requires periodic scheduling. Valid values: Full (full synchronization) and OfflineIncremental (batch incremental synchronization).	Full
ScheduleParameters	string	No	The scheduling parameters.	bizdate=$bizdate
DdlHandlingSettings	array<object>	No	The processing settings for DDL messages.
	object	No	The processing setting for a specific type of DDL message.
Action	string	No	The processing policy. Valid values: Ignore: ignores a DDL message. Critical: reports an error for a DDL message. Normal: normally processes a DDL message.	Critical
Type	string	No	The type of the DDL operation. Valid values: RenameColumn, ModifyColumn, CreateTable, TruncateTable, DropTable, DropColumn, and AddColumn.	AddColumn
RuntimeSettings	array<object>	No	The runtime settings.
	object	No	A single runtime setting, specified as a name and a value.
Name	string	No	The name of the configuration item. Valid values: runtime.offline.speed.limit.mb: specifies the maximum transmission rate that is allowed for a batch synchronization task. This configuration item takes effect only when runtime.offline.speed.limit.enable is set to true. runtime.offline.speed.limit.enable: specifies whether throttling is enabled for a batch synchronization task. dst.offline.connection.max: specifies the maximum number of connections that are allowed for writing data to the destination of a batch synchronization task. runtime.offline.concurrent: specifies the maximum number of parallel threads that are allowed for a batch synchronization task. dst.realtime.connection.max: specifies the maximum number of connections that are allowed for writing data to the destination of a real-time synchronization task. runtime.enable.auto.create.schema: specifies whether schemas are automatically created in the destination of a synchronization task. src.offline.datasource.max.connection: specifies the maximum number of connections that are allowed for reading data from the source of a batch synchronization task. runtime.realtime.concurrent: specifies the maximum number of parallel threads that are allowed for a real-time synchronization task.	runtime.offline.concurrent
Value	string	No	The value of the configuration item.	1
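
Because RuleExpression must be a JSON string whose shape depends on the rule type, it can help to generate it from code rather than by hand. The following is a minimal sketch, not an official client: the helper names (rename_rule, add_column_rule, handle_dml_rule, preview_rename) are hypothetical, and preview_rename only approximates the documented behavior locally, assuming that each from/to rule is a plain substring replacement applied in order. The database name fromdb_sales is made up for illustration.

```python
import json


def rename_rule(expression, variables=None):
    """Build a Rename RuleExpression as a JSON string."""
    body = {"expression": expression}
    if variables:
        body["variables"] = variables
    return json.dumps(body)


def add_column_rule(name, value, value_type="Constant"):
    """Build an AddColumn RuleExpression as a JSON string."""
    if value_type not in ("Constant", "Variable"):
        raise ValueError("columnValueType must be Constant or Variable")
    column = {"columnName": name, "columnValueType": value_type, "columnValue": value}
    return json.dumps({"columns": [column]})


def handle_dml_rule(dml_type, dml_action, filter_condition=None):
    """Build a HandleDml RuleExpression, enforcing the documented constraints."""
    if dml_type not in ("Insert", "Update", "Delete"):
        raise ValueError("dmlType must be Insert, Update, or Delete")
    if dml_action not in ("Normal", "Ignore", "Filter", "LogicalDelete"):
        raise ValueError("dmlAction must be Normal, Ignore, Filter, or LogicalDelete")
    policy = {"dmlType": dml_type, "dmlAction": dml_action}
    if dml_action == "Filter":
        if dml_type not in ("Update", "Delete"):
            raise ValueError("Filter is allowed only for Update or Delete")
        if not filter_condition:
            raise ValueError("filterCondition is required when dmlAction is Filter")
        policy["filterCondition"] = filter_condition
    return json.dumps({"dmlPolicies": [policy]})


def preview_rename(rule_json, originals):
    """Approximate locally how a Rename expression resolves.

    Assumption: variableRules are substring replacements applied in sequence.
    """
    rule = json.loads(rule_json)
    values = dict(originals)
    for var in rule.get("variables", []):
        name = var["variableName"]
        for repl in var.get("variableRules", []):
            values[name] = values[name].replace(repl["from"], repl["to"])
    result = rule["expression"]
    for name, value in values.items():
        result = result.replace("${" + name + "}", value)
    return result


# Usage: the renaming rule from the RuleExpression description.
rule = rename_rule(
    "${srcDatasourceName}_${srcDatabaseName}_0922",
    [{"variableName": "srcDatabaseName",
      "variableRules": [{"from": "fromdb", "to": "todb"}]}],
)
preview = preview_rename(
    rule,
    {"srcDatasourceName": "mysql_datasource_1", "srcDatabaseName": "fromdb_sales"},
)
print(preview)
```

Validating the Filter constraint on the client side is a design choice: it surfaces a malformed rule before the request is sent, but the service remains the authority on what it accepts.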

Response parameters

Parameter	Type	Description	Example
	object	The response parameters.
DIJobId	long	The ID of the synchronization task.	11792
RequestId	string	The request ID. You can use the ID to query logs and troubleshoot issues.	4F6AB6B3-41FB-5EBB-AFB2-0C98D49DA2BB

Examples

Sample success responses

JSON format

{
  "DIJobId": 11792,
  "RequestId": "4F6AB6B3-41FB-5EBB-AFB2-0C98D49DA2BB"
}
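
To tie the request and response together, the sketch below assembles a parameter set from the example values in the request table and reads the task ID from a response shaped like the sample above. No SDK or HTTP call is shown. The nesting of the dict simply mirrors the parameter table, which is an assumption about how a client would serialize the request, and the REQUIRED list is derived from the Required column.

```python
import json

# Request parameters mirroring the request table; values are the documented examples.
request = {
    "ProjectId": 10000,
    "JobName": "mysql_to_holo_sync_8772",
    "SourceDataSourceType": "MySQL",
    "DestinationDataSourceType": "Hologres",
    "MigrationType": "FullAndRealtimeIncremental",
    "SourceDataSourceSettings": [{"DataSourceName": "mysql_datasource_1"}],
    "DestinationDataSourceSettings": [{"DataSourceName": "holo_datasource_1"}],
    "ResourceSettings": {
        "OfflineResourceSettings": {
            "RequestedCu": 2.0,
            "ResourceGroupIdentifier": "S_res_group_111_222",
        },
        "RealtimeResourceSettings": {
            "RequestedCu": 2.0,
            "ResourceGroupIdentifier": "S_res_group_111_222",
        },
    },
    "TransformationRules": [
        {
            "RuleName": "rename_rule_1",
            "RuleActionType": "Rename",
            "RuleTargetType": "Table",
            # RuleExpression is a JSON string, not a nested object.
            "RuleExpression": json.dumps(
                {"expression": "${srcDatasourceName}_${srcDatabaseName}"}
            ),
        }
    ],
    "TableMappings": [
        {
            "SourceObjectSelectionRules": [
                {
                    "Action": "Include",
                    "Expression": "mysql_table_1",
                    "ExpressionType": "Exact",
                    "ObjectType": "Table",
                }
            ],
            "TransformationRules": [
                {
                    "RuleName": "rename_rule_1",
                    "RuleActionType": "Rename",
                    "RuleTargetType": "Table",
                }
            ],
        }
    ],
    "JobSettings": {
        "RuntimeSettings": [{"Name": "runtime.offline.concurrent", "Value": "1"}]
    },
}

# Parameters marked "Yes" in the Required column of the request table.
REQUIRED = [
    "DestinationDataSourceType", "SourceDataSourceType", "JobName",
    "MigrationType", "SourceDataSourceSettings",
    "DestinationDataSourceSettings", "ResourceSettings", "TableMappings",
]
missing = [name for name in REQUIRED if name not in request]
if missing:
    raise ValueError("missing required parameters: " + ", ".join(missing))

# Parse a response shaped like the sample success response.
sample_response = '{"DIJobId": 11792, "RequestId": "4F6AB6B3-41FB-5EBB-AFB2-0C98D49DA2BB"}'
response = json.loads(sample_response)
job_id = response["DIJobId"]
print("created task", job_id, "request", response["RequestId"])
```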

Error codes

For a list of error codes, see Service error codes.