Creates a file in DataStudio. You cannot call this operation to create files for Data Integration nodes.
Request Parameters
Parameter | Type | Required | Example | Description |
Action | String | Yes | CreateFile | The operation that you want to perform. Set the value to CreateFile. |
FileFolderPath | String | Yes | Workflow/1/MaxCompute/Folder 1/Folder 2 | The path of the file. |
ProjectId | Long | Yes | 10000 | The DataWorks workspace ID. You can log on to the DataWorks console and go to the Workspace page to obtain the ID. You must configure either this parameter or the ProjectIdentifier parameter to determine the DataWorks workspace to which the operation is applied. |
FileName | String | Yes | File name | The name of the file. |
FileDescription | String | No | File description | The description of the file. |
FileType | Integer | Yes | 10 | The type of the code for the file. Valid values: 6 (Shell), 10 (ODPS SQL), 11 (ODPS MR), 24 (ODPS Script), 99 (zero load), 221 (PyODPS 2), 225 (ODPS Spark), 227 (EMR Hive), 228 (EMR Spark), 229 (EMR Spark SQL), 230 (EMR MR), 239 (OSS object inspection), 257 (EMR Shell), 258 (EMR Spark Shell), 259 (EMR Presto), 260 (EMR Impala), 900 (real-time synchronization), 1089 (cross-tenant collaboration), 1091 (Hologres development), 1093 (Hologres SQL), 1100 (assignment), and 1221 (PyODPS 3). You can call the ListFileType operation to query the type of the code for the file. |
Owner | String | No | 1000000000001 | The ID of the Alibaba Cloud account used by the file owner. If this parameter is not configured, the ID of the Alibaba Cloud account of the user who calls the operation is used by default. |
Content | String | No | SHOW TABLES; | The code for the file. The code format varies based on the file type. To view the code format for a specific file type, go to Operation Center, find a node of the file type, and then open the directed acyclic graph (DAG) of the node. Right-click the node in the DAG and select View Code. |
AutoRerunTimes | Integer | No | 3 | The number of automatic reruns that are allowed after an error occurs. Maximum value: 10. |
AutoRerunIntervalMillis | Integer | No | 120000 | The interval between automatic reruns after an error occurs. Unit: milliseconds. Maximum value: 1800000 (30 minutes). This parameter corresponds to the Rerun Interval parameter that is displayed after the Auto Rerun upon Error check box is selected in the Schedule section of the Properties tab in the DataWorks console. The DataWorks console measures this interval in minutes, so convert minutes to milliseconds when you call the operation. For example, 2 minutes is 120000 milliseconds. |
RerunMode | String | No | ALL_ALLOWED | Specifies whether the node that corresponds to the file can be rerun. Valid values:
This parameter corresponds to the Rerun parameter in the Schedule section of the Properties tab in the DataWorks console. |
Stop | Boolean | No | false | Specifies whether to suspend the scheduling of the node. Valid values:
This parameter corresponds to the Recurrence parameter in the Schedule section of the Properties tab in the DataWorks console. |
ParaValue | String | No | a=x b=y | The scheduling parameters of the node. Separate multiple parameters with spaces. This parameter corresponds to the Parameters section of the Properties tab in the DataWorks console. For more information about the configurations of the scheduling parameters, see Configure scheduling parameters. |
StartEffectDate | Long | No | 1671608450000 | The start time of automatic scheduling. Set the value to a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC. Configuring this parameter is equivalent to specifying a start time for the Validity Period parameter in the Schedule section of the Properties tab in the DataWorks console. |
EndEffectDate | Long | No | 1671694850000 | The end time of automatic scheduling. Set the value to a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC. Configuring this parameter is equivalent to specifying an end time for the Validity Period parameter in the Schedule section of the Properties tab in the DataWorks console. |
CronExpress | String | No | 00 05 00 * * ? | The CRON expression that represents the periodic scheduling policy of the node. This parameter corresponds to the Cron Expression parameter in the Schedule section of the Properties tab in the DataWorks console. After you configure the Scheduling Cycle and Run At parameters in the DataWorks console, DataWorks generates the value of the Cron Expression parameter. Examples:
The scheduling system of DataWorks imposes the following limits on CRON expressions:
|
CycleType | String | No | DAY | The type of the scheduling cycle. Valid values: NOT_DAY and DAY. The value NOT_DAY indicates that the node is scheduled to run by minute or hour. The value DAY indicates that the node is scheduled to run by day, week, or month. This parameter corresponds to the Scheduling Cycle parameter in the Schedule section of the Properties tab in the DataWorks console. |
DependentType | String | No | NONE | The type of the cross-cycle scheduling dependency of the node. Valid values:
|
DependentNodeIdList | String | No | abc | The IDs of the nodes that generate instances in the previous cycle on which the current node depends. |
InputList | String | Yes | project_root,project.file1,project.001_out | The output name of the parent file on which the current file depends. If you specify multiple output names, separate them with commas (,). This parameter corresponds to the Output Name of Ancestor Node parameter that is displayed after you select Same Cycle in the Dependencies section of the Properties tab in the DataWorks console. |
ProjectIdentifier | String | No | dw_project | The name of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace page to obtain the workspace name. You must configure either this parameter or ProjectId to determine the DataWorks workspace to which the operation is applied. |
ResourceGroupIdentifier | String | No | group_375827434852437 | The identifier of the resource group that is used to run the node. You can call the ListResourceGroups operation to query the available resource groups in the workspace. The Identifier parameter in the response of the operation indicates the identifier of an available resource group. |
ResourceGroupId | Long | No | 375827434852437 | This parameter is deprecated. Use ResourceGroupIdentifier to specify the resource group instead. The ID of the resource group that is used to run the node. This parameter corresponds to the Resource Group parameter in the Resource Group section of the Properties tab in the DataWorks console. You can call the ListResourceGroups operation to query the available resource groups in the workspace. When you call the operation, set ResourceGroupType to 1. The response parameter Id indicates the ID of an available resource group. |
ConnectionName | String | No | odps_source | The name of the data source for which the node is run. You can call the ListDataSources operation to query the available data sources in the workspace. |
AutoParsing | Boolean | No | true | Specifies whether to enable the automatic parsing feature for the file. Valid values:
This parameter corresponds to the Automatic Parsing From Code Before Node Committing parameter that is displayed after you select Same Cycle in the Dependencies section of the Properties tab in the DataWorks console. |
SchedulerType | String | No | NORMAL | The scheduling type of the node. Valid values:
|
AdvancedSettings | String | No | {"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"} | The advanced configurations of the node. This parameter is valid only for an EMR Spark Streaming node or an EMR Streaming SQL node. This parameter corresponds to the Advanced Settings tab of the node in the DataWorks console. The value of this parameter must be in the JSON format. |
StartImmediately | Boolean | No | true | Specifies whether to immediately run a node after the node is deployed to the production environment. This parameter is valid only for an EMR Spark Streaming node or an EMR Streaming SQL node. This parameter corresponds to the Start Method parameter in the Schedule section of the Configure tab in the DataWorks console. |
InputParameters | String | No | [{"ValueSource": "project_001.first_node:bizdate_param","ParameterName": "bizdate_input"}] | The input parameters of the node. The value of this parameter must be in the JSON format. For more information about the input parameters, see the InputContextParameterList parameter in the Response parameters section of the GetFile operation. This parameter corresponds to the Input Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console. |
OutputParameters | String | No | [{"Type": 1,"Value": "${bizdate}","ParameterName": "bizdate_param"}] | The output parameters of the node. The value of this parameter must be in the JSON format. For more information about the output parameters, see the OutputContextParameterList parameter in the Response parameters section of the GetFile operation. This parameter corresponds to the Output Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console. |
IgnoreParentSkipRunningProperty | Boolean | No | false | Specifies whether to use the dry-run property of the previous cycle. Valid values:
|
CreateFolderIfNotExists | Boolean | No | false | Specifies whether to automatically create the directory that is specified by the FileFolderPath parameter if the directory does not exist. Valid values: true: The system automatically creates the directory if the directory does not exist. false: The system does not automatically create the directory if the directory does not exist. In this case, the call fails. |
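Several of the parameters above carry limits or unit conversions that are easy to get wrong: AutoRerunTimes is capped at 10, AutoRerunIntervalMillis is capped at 1800000 and is expressed in milliseconds rather than the minutes shown in the console, and StartEffectDate and EndEffectDate are millisecond UNIX timestamps. The following is a minimal sketch of a client-side check, not part of any official SDK. The function names (`to_epoch_millis`, `minutes_to_millis`, `build_create_file_params`) are hypothetical, and the sketch assumes, based on the parameter descriptions, that either ProjectId or ProjectIdentifier is sufficient to identify the workspace.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_AUTO_RERUN_TIMES = 10               # limit documented for AutoRerunTimes
MAX_AUTO_RERUN_INTERVAL_MS = 1_800_000  # 30 minutes, limit for AutoRerunIntervalMillis


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the UNIX epoch."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return (moment - EPOCH) // timedelta(milliseconds=1)


def minutes_to_millis(minutes):
    """Convert the console's minute-based interval to the API's milliseconds."""
    return int(minutes * 60_000)


def build_create_file_params(folder_path, file_name, file_type, input_list,
                             project_id=None, project_identifier=None,
                             rerun_times=None, rerun_interval_minutes=None,
                             start=None, end=None):
    """Assemble a CreateFile parameter dict and check the documented limits.

    Hypothetical helper: it only builds a dict and does not send a request.
    """
    if project_id is None and project_identifier is None:
        raise ValueError("set ProjectId or ProjectIdentifier")
    params = {
        "Action": "CreateFile",
        "FileFolderPath": folder_path,
        "FileName": file_name,
        "FileType": file_type,
        "InputList": ",".join(input_list),  # comma-separated output names
    }
    if project_id is not None:
        params["ProjectId"] = project_id
    if project_identifier is not None:
        params["ProjectIdentifier"] = project_identifier
    if rerun_times is not None:
        if not 0 <= rerun_times <= MAX_AUTO_RERUN_TIMES:
            raise ValueError("AutoRerunTimes must be between 0 and 10")
        params["AutoRerunTimes"] = rerun_times
    if rerun_interval_minutes is not None:
        interval_ms = minutes_to_millis(rerun_interval_minutes)
        if interval_ms > MAX_AUTO_RERUN_INTERVAL_MS:
            raise ValueError("AutoRerunIntervalMillis must not exceed 1800000")
        params["AutoRerunIntervalMillis"] = interval_ms
    if start is not None:
        params["StartEffectDate"] = to_epoch_millis(start)
    if end is not None:
        params["EndEffectDate"] = to_epoch_millis(end)
    return params
```

Calling the helper with `rerun_interval_minutes=2` and a start time of 2022-12-21 07:40:50 UTC yields the same AutoRerunIntervalMillis and StartEffectDate values as the examples in the table.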
Response parameters
Parameter | Type | Example | Description |
HttpStatusCode | Integer | 200 | The HTTP status code. |
Data | Long | 1000001 | The ID of the file that is created. |
RequestId | String | 0000-ABCD-EFG | The request ID. |
ErrorMessage | String | The connection does not exist. | The error message. |
Success | Boolean | true | Indicates whether the request was successful. Valid values:
|
ErrorCode | String | Invalid.Tenant.ConnectionNotExists | The error code. |
Examples
Sample requests
http(s)://[Endpoint]/?Action=CreateFile
&FileFolderPath=Workflow/1/MaxCompute/Folder 1/Folder 2
&ProjectId=10000
&FileName=File name
&FileDescription=File description
&FileType=10
&Owner=1000000000001
&Content=SHOW TABLES;
&AutoRerunTimes=3
&AutoRerunIntervalMillis=120000
&RerunMode=ALL_ALLOWED
&Stop=false
&ParaValue=a=x b=y
&StartEffectDate=1671608450000
&EndEffectDate=1671694850000
&CronExpress=00 05 00 * * ?
&CycleType=DAY
&DependentType=NONE
&DependentNodeIdList=abc
&InputList=project_root,project.file1,project.001_out
&ProjectIdentifier=dw_project
&ResourceGroupIdentifier=group_375827434852437
&ResourceGroupId=375827434852437
&ConnectionName=odps_source
&AutoParsing=true
&SchedulerType=NORMAL
&AdvancedSettings={"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"}
&StartImmediately=true
&InputParameters=[{"ValueSource": "project_001.first_node:bizdate_param","ParameterName": "bizdate_input"}]
&OutputParameters=[{"Type": 1,"Value": "${bizdate}","ParameterName": "bizdate_param"}]
&IgnoreParentSkipRunningProperty=false
&CreateFolderIfNotExists=false
&<Common request parameters>
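The sample request above is shown unencoded for readability. Values such as the folder path, the CRON expression, and the JSON parameters contain spaces and reserved characters, so they need percent-encoding before they are placed in a query string. The following is a minimal sketch using only the Python standard library. The helper names `to_wire_value` and `encode_query` are hypothetical, and the sketch omits the common request parameters and the request signature that a real call requires.

```python
from urllib.parse import quote, urlencode


def to_wire_value(value):
    """Render a Python value in the string form used by the sample request."""
    if isinstance(value, bool):
        # The sample request uses lowercase true/false, not Python's True/False.
        return "true" if value else "false"
    return str(value)


def encode_query(params):
    """Percent-encode a parameter dict into a query string (spaces become %20)."""
    pairs = [(name, to_wire_value(value)) for name, value in params.items()]
    return urlencode(pairs, quote_via=quote)


query = encode_query({
    "Action": "CreateFile",
    "FileFolderPath": "Workflow/1/MaxCompute/Folder 1/Folder 2",
    "ProjectId": 10000,
    "Content": "SHOW TABLES;",
    "CronExpress": "00 05 00 * * ?",
    "Stop": False,
})
print(query)
```

Passing `quote_via=quote` makes `urlencode` emit `%20` for spaces instead of `+`, and because its default `safe` argument is empty, the slashes in FileFolderPath are encoded as well.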
Sample success responses
XML format
HTTP/1.1 200 OK
Content-Type:application/xml
<CreateFileResponse>
<HttpStatusCode>200</HttpStatusCode>
<Data>1000001</Data>
<RequestId>0000-ABCD-EFG</RequestId>
<ErrorMessage>The connection does not exist.</ErrorMessage>
<Success>true</Success>
<ErrorCode>Invalid.Tenant.ConnectionNotExists</ErrorCode>
</CreateFileResponse>
JSON format
HTTP/1.1 200 OK
Content-Type:application/json
{
"HttpStatusCode" : 200,
"Data" : 1000001,
"RequestId" : "0000-ABCD-EFG",
"ErrorMessage" : "The connection does not exist.",
"Success" : true,
"ErrorCode" : "Invalid.Tenant.ConnectionNotExists"
}
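A caller typically needs only the file ID from a successful response, or the error details from a failed one. The following is a minimal sketch of reading the JSON body shown above. `CreateFileError` and `extract_file_id` are hypothetical names, and the sketch assumes that the Success field is the authoritative indicator of the outcome, because the sample success response also carries placeholder ErrorCode and ErrorMessage values.

```python
import json


class CreateFileError(Exception):
    """Raised when a CreateFile response reports Success as false (hypothetical)."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (RequestId={request_id})")
        self.code = code
        self.request_id = request_id


def extract_file_id(body):
    """Return the created file's ID from a JSON response body, or raise."""
    payload = json.loads(body)
    if not payload.get("Success"):
        raise CreateFileError(
            payload.get("ErrorCode"),
            payload.get("ErrorMessage"),
            payload.get("RequestId"),
        )
    return payload["Data"]


sample = """{
  "HttpStatusCode": 200,
  "Data": 1000001,
  "RequestId": "0000-ABCD-EFG",
  "ErrorMessage": "The connection does not exist.",
  "Success": true,
  "ErrorCode": "Invalid.Tenant.ConnectionNotExists"
}"""
print(extract_file_id(sample))
```

Keeping the RequestId on the exception makes it easier to quote the failing request when contacting support.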
Error codes
HTTP status code | Error code | Error message | Description |
429 | Throttling.Api | The request for this resource has exceeded your available limit. | The number of requests for the resource has exceeded the upper limit. |
429 | Throttling.System | The DataWorks system is busy. Try again later. | The DataWorks system is busy. Try again later. |
429 | Throttling.User | Your request is too frequent. Try again later. | Excessive requests have been submitted within a short period of time. Try again later. |
500 | InternalError.System | An internal system error occurred. Try again later. | An internal error has occurred. Try again later. |
500 | InternalError.UserId.Missing | An internal system error occurred. Try again later. | An internal error has occurred. Try again later. |
For a list of error codes, see Service error codes.
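All of the error codes listed in the table advise trying again later, which suggests retrying with a delay rather than failing immediately. The following is a minimal sketch of exponential backoff keyed on those codes. The `send` callable is a hypothetical stand-in for whatever client code performs the request and returns the parsed response as a dict. The retry count, the base delay, and the choice to retry on the `Throttling.` and `InternalError.` prefixes are assumptions, not documented DataWorks behavior.

```python
import time

# Prefixes of the error codes in the table that say "Try again later".
RETRYABLE_PREFIXES = ("Throttling.", "InternalError.")


def is_retryable(error_code):
    """Return True if the error code belongs to a retryable family."""
    return bool(error_code) and error_code.startswith(RETRYABLE_PREFIXES)


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, hits a non-retryable error, or runs out of attempts.

    Delays double after each retryable failure: base_delay, 2x, 4x, and so on.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        # Success is checked first; a successful response is returned as-is.
        if response.get("Success") or not is_retryable(response.get("ErrorCode")):
            return response
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return response
```

Injecting `sleep` as a parameter keeps the function testable without real waiting. A production client might also add random jitter so that many throttled callers do not retry at the same instant.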