CreateFile - DataWorks - Alibaba Cloud Documentation Center

Creates a file in DataStudio. You cannot call this operation to create files for Data Integration nodes.

Debugging

You can call this operation directly in OpenAPI Explorer, which saves you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

The following table describes the authorization information for this API operation. You can specify the operation in the Action element of a policy to grant a RAM user or RAM role the permissions to call this API operation. The columns of the table are defined as follows:

Operation: the value that you can use in the Action element to specify the operation on a resource.
Access level: the access level of each operation. The levels are read, write, and list.
Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
- The required resource types are displayed in bold characters.
- If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
Condition Key: the condition key that is defined by the cloud service.
Associated operation: other operations that the RAM user or the RAM role must have permissions to perform to complete the operation. To complete the operation, the RAM user or the RAM role must have the permissions to perform the associated operations.

Operation	Access level	Resource type	Condition key	Associated operation
dataworks:*	create	All Resources	none	none
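
As an illustration of how the table maps to a policy, the following minimal sketch assembles a RAM policy document that allows the operation listed above. It assumes the standard RAM policy layout (Version, Statement, Effect, Action, Resource); `build_policy` is a hypothetical helper, and the wildcard resource stands in for "All Resources".

```python
import json


def build_policy(action="dataworks:*", resource="*"):
    """Return a RAM policy document (as a dict) that allows one action.

    The action and resource defaults mirror the authorization table;
    narrow them as needed for your own account.
    """
    return {
        "Version": "1",
        "Statement": [
            {"Effect": "Allow", "Action": action, "Resource": resource}
        ],
    }


# Print the policy as JSON so it can be pasted into a policy editor.
print(json.dumps(build_policy(), indent=2))
```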

Request parameters

Parameter	Type	Required	Description	Example
FileFolderPath	string	Yes	The path of the file.
ProjectId	long	Yes	The ID of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace Management page to obtain the workspace ID. You must configure this parameter or the ProjectIdentifier parameter to determine the DataWorks workspace to which the operation is applied.	10000
FileName	string	Yes	The name of the file.
FileDescription	string	No	The description of the file.
FileType	integer	Yes	The type of the code for the file. The code for files varies based on the file type. For more information, see DataWorks nodes. You can call the ListFileType operation to query the type of the code for the file.	10
Owner	string	No	The ID of the Alibaba Cloud account used by the file owner. If this parameter is not configured, the ID of the Alibaba Cloud account of the user who calls the operation is used.	1000000000001
Content	string	No	The code for the file. The code format varies based on the file type. To view the code format for a specific file type, go to Operation Center, right-click a node of the file type, and then select View Code.	SHOW TABLES;
AutoRerunTimes	integer	No	The number of automatic reruns that are allowed after an error occurs. Maximum value: 10.	3
AutoRerunIntervalMillis	integer	No	The interval between automatic reruns after an error occurs. Unit: milliseconds. Maximum value: 1800000 (30 minutes). This parameter corresponds to the Rerun Interval parameter that is displayed after the Auto Rerun upon Error check box is selected in the Schedule section of the Properties tab in the DataWorks console. The interval that you specify in the DataWorks console is measured in minutes. Pay attention to the conversion between the units of time when you call the operation.	120000
RerunMode	string	No	Specifies whether the node that corresponds to the file can be rerun. Valid values: ALL_ALLOWED: The node can be rerun regardless of whether it is successfully run or fails to run. FAILURE_ALLOWED: The node can be rerun only after it fails to run. ALL_DENIED: The node cannot be rerun regardless of whether it is successfully run or fails to run. This parameter corresponds to the Rerun parameter in the Schedule section of the Properties tab on the DataStudio page in the DataWorks console.	ALL_ALLOWED
Stop	boolean	No	Specifies whether to suspend the scheduling of the node. Valid values: true and false. This parameter corresponds to the Recurrence parameter in the Schedule section of the Properties tab on the DataStudio page in the DataWorks console.	false
ParaValue	string	No	The scheduling parameters of the node. Separate multiple parameters with spaces. This parameter corresponds to the Parameters section of the Properties tab in the DataWorks console. For more information about the configurations of the scheduling parameters, see Configure scheduling parameters.	a=x b=y
StartEffectDate	long	No	The start time of automatic scheduling. Set the value to a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC. Configuring this parameter is equivalent to specifying a start time for the Validity Period parameter in the Schedule section of the Properties tab on the DataStudio page in the DataWorks console.	1671608450000
EndEffectDate	long	No	The end time of automatic scheduling. Set the value to a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC. This parameter corresponds to the Validity Period parameter in the Schedule section of the Properties tab in the DataWorks console.	1671694850000
CronExpress	string	No	The CRON expression that represents the periodic scheduling policy of the node. This parameter corresponds to the Cron Expression parameter in the Schedule section of the Properties tab on the DataStudio page in the DataWorks console. After you configure the Scheduling Cycle and Scheduled time parameters in the DataWorks console, DataWorks generates the value of the Cron Expression parameter. Examples: CRON expression for a node that is scheduled to run at 05:30 every day: `00 30 05 * * ?` CRON expression for a node that is scheduled to run at the fifteenth minute of each hour: `00 15 00-23/1 * * ?` CRON expression for a node that is scheduled to run every 10 minutes: `00 00/10 * * * ?` CRON expression for a node that is scheduled to run every 10 minutes from 08:00 to 17:00 every day: `00 00-59/10 8-17 * * ?` CRON expression for a node that is scheduled to run at 00:20 on the first day of each month: `00 20 00 1 * ?` CRON expression for a node that is scheduled to run every three months from 00:10 on January 1: `00 10 00 1 1-12/3 ?` CRON expression for a node that is scheduled to run at 00:05 every Tuesday and Friday: `00 05 00 * * 2,5` The scheduling system of DataWorks imposes the following limits on CRON expressions: The minimum interval specified in a CRON expression to schedule a node is 5 minutes. The earliest time specified in a CRON expression to schedule a node every day is 00:05.	00 05 00 * * ?
CycleType	string	No	The type of the scheduling cycle of the node that corresponds to the file. Valid values: NOT_DAY and DAY. The value NOT_DAY indicates that the node is scheduled to run by minute or hour. The value DAY indicates that the node is scheduled to run by day, week, or month. This parameter corresponds to the Scheduling Cycle parameter in the Schedule section of the Properties tab on the DataStudio page in the DataWorks console.	DAY
DependentType	string	No	The type of the cross-cycle scheduling dependency of the node. Valid values: SELF: The instance generated for the node in the current cycle depends on the instance generated for the node in the previous cycle. CHILD: The instance generated for the node in the current cycle depends on the instances generated for the descendant nodes at the nearest level of the node in the previous cycle. USER_DEFINE: The instance generated for the node in the current cycle depends on the instances generated for one or more specified nodes in the previous cycle. NONE: No cross-cycle scheduling dependency type is selected for the node. USER_DEFINE_AND_SELF: The instance generated for the node in the current cycle depends on the instance generated for the node in the previous cycle and the instances generated for one or more specified nodes in the previous cycle. CHILD_AND_SELF: The instance generated for the node in the current cycle depends on the instances generated for the descendant nodes at the nearest level of the node in the previous cycle and the instance generated for the node in the previous cycle.	NONE
DependentNodeIdList	string	No	The IDs of the nodes that generate instances in the previous cycle on which the current node depends.	abc
InputList	string	Yes	The output name of the parent file on which the current file depends. If you specify multiple output names, separate them with commas (,). This parameter corresponds to the Output Name parameter under Parent Nodes in the Dependencies section of the Properties tab in the DataWorks console.	project_root,project.file1,project.001_out
ProjectIdentifier	string	No	The name of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace Management page to obtain the workspace name. You must configure this parameter or the ProjectId parameter to determine the DataWorks workspace to which the operation is applied.	dw_project
ResourceGroupIdentifier	string	No	The identifier of the resource group that is used to run the node. You can call the ListResourceGroups operation to query the available resource groups in the workspace. The Identifier parameter in the response of the operation indicates the identifier of an available resource group. Note You must make sure that the available resource groups in the response of the ListResourceGroups operation are associated with the workspace for which you want to create a file by calling the CreateFile operation.	group_375827434852437
ResourceGroupId	long	No	This parameter is deprecated. Do not use this parameter. The identifier of the resource group that is used to run the node. This parameter corresponds to the Resource Group parameter in the Resource Group section of the Properties tab in the DataWorks console. You must configure one of the ResourceGroupId and ResourceGroupIdentifier parameters to determine the resource group that is used to run the node. You can call the ListResourceGroups operation to query the available resource groups in the workspace. When you call the operation, set the ResourceGroupType parameter to 1. The response parameter Id indicates the ID of an available resource group.	375827434852437
ConnectionName	string	No	The name of the data source for which the node is run. You can call the ListDataSources operation to query the available data sources in the workspace.	odps_first
AutoParsing	boolean	No	Specifies whether to enable the automatic parsing feature for the file. Valid values: true and false. This parameter corresponds to the Analyze Code parameter that is displayed after Same Cycle is selected in the Dependencies section of the Properties tab on the DataStudio page in the DataWorks console.	true
SchedulerType	string	No	The scheduling type of the node. Valid values: NORMAL: The node is an auto triggered node. MANUAL: The node is a manually triggered node. Manually triggered nodes cannot be automatically triggered. They correspond to the nodes in the Manually Triggered Workflows pane. PAUSE: The node is a paused node. SKIP: The node is a dry-run node. Dry-run nodes are started as scheduled, but the system sets the status of the nodes to successful when it starts to run them.	NORMAL
AdvancedSettings	string	No	The advanced configurations of the node. This parameter is valid only for an EMR Spark Streaming node or an EMR Streaming SQL node. This parameter corresponds to the Advanced Settings tab of the node in the DataWorks console. The value of this parameter must be in the JSON format.	{"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"}
StartImmediately	boolean	No	Specifies whether to immediately run a node after the node is deployed. This parameter is valid only for an EMR Spark Streaming node or an EMR Streaming SQL node. This parameter corresponds to the Start Method parameter in the Schedule section of the Configure tab in the DataWorks console.	true
InputParameters	string	No	The input parameters of the node. The value of this parameter must be in the JSON format. For more information about the input parameters, see the InputContextParameterList parameter in the Response parameters section of the GetFile operation. This parameter corresponds to the Input Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.	[{"ValueSource": "project_001.first_node:bizdate_param","ParameterName": "bizdate_input"}]
OutputParameters	string	No	The output parameters of the node. The value of this parameter must be in the JSON format. For more information about the output parameters, see the OutputContextParameterList parameter in the Response parameters section of the GetFile operation. This parameter corresponds to the Output Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.	[{"Type": 1,"Value": "${bizdate}","ParameterName": "bizdate_param"}]
ApplyScheduleImmediately	boolean	No	Specifies whether scheduling configurations immediately take effect after the node is deployed.	true
Timeout	integer	No	The timeout period.	1
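
The unit conversions and limits described in the table can be sketched as follows. This is a minimal illustration, not an SDK call: `build_params`, `minutes_to_millis`, and `to_epoch_millis` are hypothetical helpers, the folder path and file name are placeholder values, and only the parameter names, valid values, and maximums come from the table above.

```python
from datetime import datetime, timezone

MAX_RERUN_TIMES = 10               # documented maximum for AutoRerunTimes
MAX_RERUN_INTERVAL_MS = 1_800_000  # documented maximum (30 minutes)
RERUN_MODES = {"ALL_ALLOWED", "FAILURE_ALLOWED", "ALL_DENIED"}


def minutes_to_millis(minutes):
    """Convert the console's minute-based rerun interval to milliseconds."""
    return int(minutes * 60 * 1000)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to a UNIX timestamp in milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    return round(moment.timestamp() * 1000)


def build_params(project_id, folder_path, file_name, file_type, input_list,
                 rerun_times=3, rerun_interval_minutes=2,
                 rerun_mode="ALL_ALLOWED", start=None, end=None):
    """Assemble a CreateFile parameter dict and enforce the documented limits."""
    interval_ms = minutes_to_millis(rerun_interval_minutes)
    if rerun_times > MAX_RERUN_TIMES:
        raise ValueError("AutoRerunTimes must not exceed 10")
    if interval_ms > MAX_RERUN_INTERVAL_MS:
        raise ValueError("AutoRerunIntervalMillis must not exceed 1800000")
    if rerun_mode not in RERUN_MODES:
        raise ValueError(f"unknown RerunMode: {rerun_mode}")
    params = {
        "ProjectId": project_id,
        "FileFolderPath": folder_path,
        "FileName": file_name,
        "FileType": file_type,
        "InputList": input_list,
        "AutoRerunTimes": rerun_times,
        "AutoRerunIntervalMillis": interval_ms,
        "RerunMode": rerun_mode,
    }
    # The validity period is optional; add each bound only when it is given.
    if start is not None:
        params["StartEffectDate"] = to_epoch_millis(start)
    if end is not None:
        params["EndEffectDate"] = to_epoch_millis(end)
    return params


# Placeholder folder path and file name, for illustration only.
params = build_params(
    10000, "Business Flow/demo/MaxCompute", "demo_sql", 10, "project_root",
    start=datetime(2022, 12, 21, 7, 40, 50, tzinfo=timezone.utc),
)
print(params["AutoRerunIntervalMillis"])  # 120000
print(params["StartEffectDate"])          # 1671608450000
```

Keeping the minute-to-millisecond conversion in one helper avoids the unit mix-up that the AutoRerunIntervalMillis description warns about.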

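A CronExpress value can be sanity-checked before it is sent. The sketch below assumes the six-field layout implied by the documented examples (second, minute, hour, day of month, month, day of week) and checks only the field count and the minute-level step against the 5-minute minimum. The helper names are hypothetical, and the check is far from a full CRON parser.

```python
import re

# Field order inferred from the documented examples, e.g. "00 30 05 * * ?".
FIELD_NAMES = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")
MIN_INTERVAL_MINUTES = 5  # documented minimum scheduling interval


def split_cron(expr):
    """Split an expression into named fields, rejecting a wrong field count."""
    fields = expr.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(fields)}")
    return dict(zip(FIELD_NAMES, fields))


def minute_step(expr):
    """Return the step of the minute field (the N in 'x/N'), or None if absent."""
    minute = split_cron(expr)["minute"]
    match = re.fullmatch(r"[\d\-*]+/(\d+)", minute)
    return int(match.group(1)) if match else None


def check_cron(expr):
    """Raise ValueError if the minute-level step is below the documented minimum."""
    step = minute_step(expr)
    if step is not None and step < MIN_INTERVAL_MINUTES:
        raise ValueError("the minimum scheduling interval is 5 minutes")
    return True


print(split_cron("00 30 05 * * ?")["hour"])  # 05
print(minute_step("00 00/10 * * * ?"))       # 10
```
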
Response parameters

Parameter	Type	Description	Example
	object
HttpStatusCode	integer	The HTTP status code returned.	200
Data	long	The ID of the file that was created.	1000001
RequestId	string	The ID of the request. You can use the ID to troubleshoot issues.	0000-ABCD-EFG
ErrorMessage	string	The error message returned.	The connection does not exist.
Success	boolean	Indicates whether the request was successful. Valid values: true: The request was successful. false: The request failed.	true
ErrorCode	string	The error code returned.	Invalid.Tenant.ConnectionNotExists

Examples

Sample success responses

JSON format

{
  "HttpStatusCode": 200,
  "Data": 1000001,
  "RequestId": "0000-ABCD-EFG",
  "ErrorMessage": "The connection does not exist.",
  "Success": true,
  "ErrorCode": "Invalid.Tenant.ConnectionNotExists"
}
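
A caller typically checks Success before reading Data. The following minimal sketch shows one way to do that, assuming the response body has already been received as a JSON string or a dict. `CreateFileError` and `file_id_from_response` are hypothetical names, not part of any SDK.

```python
import json


class CreateFileError(Exception):
    """Raised when a CreateFile response reports Success = false."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (RequestId={request_id})")
        self.code = code
        self.request_id = request_id


def file_id_from_response(body):
    """Return the new file ID (Data) or raise with the error details."""
    resp = json.loads(body) if isinstance(body, str) else body
    if not resp.get("Success"):
        raise CreateFileError(
            resp.get("ErrorCode"), resp.get("ErrorMessage"), resp.get("RequestId")
        )
    return resp["Data"]


sample = '{"HttpStatusCode": 200, "Data": 1000001, "RequestId": "0000-ABCD-EFG", "Success": true}'
print(file_id_from_response(sample))  # 1000001
```

Carrying RequestId in the exception keeps the value that the response table recommends for troubleshooting.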

Error codes

HTTP status code	Error code	Error message	Description
403	Forbidden.Access	Access is forbidden. Please first activate DataWorks Enterprise Edition or Flagship Edition.	You do not have the required permissions. Request authorization first.
429	Throttling.Api	The request for this resource has exceeded your available limit.	-
429	Throttling.System	The DataWorks system is busy. Try again later.	-
429	Throttling.User	Your request is too frequent. Try again later.	-
500	InternalError.System	An internal system error occurred. Try again later.	-
500	InternalError.UserId.Missing	An internal system error occurred. Try again later.	-

For a list of error codes, see Service error codes.
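
Several of the errors above ask the caller to try again later. One way to honor that is exponential backoff, sketched below under stated assumptions: `send` is a hypothetical zero-argument callable that performs the request and returns the response as a dict, and the choice of retryable codes (those whose messages say "Try again later") is a judgment call rather than documented behavior.

```python
import time

# Codes whose documented messages say "Try again later".
RETRYABLE_CODES = {
    "Throttling.System",
    "Throttling.User",
    "InternalError.System",
    "InternalError.UserId.Missing",
}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or attempts run out.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between tries.
    Returns the last response dict received.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        resp = send()
        retryable = resp.get("ErrorCode") in RETRYABLE_CODES
        if resp.get("Success") or not retryable or attempt == max_attempts:
            return resp
        sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` argument is injectable so that the backoff schedule can be exercised in tests without real waiting.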

Change history

Change time	Summary of changes	Operation
2024-12-13	The error codes have changed. The request parameters of the API have changed.	View Change Details
2024-09-02	The error codes have changed. The request parameters of the API have changed.	View Change Details
2024-04-03	The error codes have changed.	View Change Details
2023-07-14	The error codes have changed. The request parameters of the API have changed.	View Change Details
2023-04-25	The error codes have changed. The request parameters of the API have changed.	View Change Details