
DataWorks:GetFile

Last Updated:Nov 21, 2024

Queries the information about a file.

Debugging

You can call this operation directly in OpenAPI Explorer, which spares you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK code samples.

Authorization information

The following table shows the authorization information for this API operation. You can use it in the Action element of a policy to grant a RAM user or RAM role the permissions to call this operation. The columns are described as follows:

  • Operation: the value that you can use in the Action element to specify the operation on a resource.
  • Access level: the access level of each operation. The levels are read, write, and list.
  • Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
    • The required resource types are displayed in bold characters.
    • If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
  • Condition Key: the condition key that is defined by the cloud service.
  • Associated operation: other operations that the RAM user or the RAM role must also have permissions to perform in order to complete the operation.
Operation | Access level | Resource type | Condition key | Associated operation
dataworks:* | get | All Resources (*) | none | none
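
As a rough illustration of how the Operation and Resource type values above could appear in a RAM policy, the following sketch builds a policy document as a Python dictionary and serializes it to JSON. The broad `dataworks:*` action and `*` resource are taken from the table; whether such a wide grant is appropriate for your account is an assumption you should review.

```python
import json

# Illustrative policy document only; narrow Action and Resource where possible.
policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "dataworks:*",  # value from the Operation column
            "Resource": "*",          # All Resources
        }
    ],
}

policy_document = json.dumps(policy, indent=2)
print(policy_document)
```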

Request parameters

Parameter | Type | Required | Description | Example
ProjectId | long | No

The ID of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace Management page to obtain the workspace ID.

You must configure this parameter or the ProjectIdentifier parameter to determine the DataWorks workspace to which the operation is applied.

10000
ProjectIdentifier | string | No

The name of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace Management page to obtain the workspace name.

You must configure this parameter or the ProjectId parameter to determine the DataWorks workspace to which the operation is applied.

dw_project
FileId | long | No

The ID of the file. You can call the ListFiles operation to obtain the ID.

100000001
NodeId | long | No

The ID of the node that is scheduled. You can call the ListFiles operation to obtain the node ID.

200000001
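
The request parameters above can be assembled with a small helper before they are passed to whichever client you use. The following is a minimal sketch: `build_get_file_request` is a hypothetical helper, not part of any SDK, and it only enforces the rule stated above that ProjectId or ProjectIdentifier must be set.

```python
from typing import Optional


def build_get_file_request(
    project_id: Optional[int] = None,
    project_identifier: Optional[str] = None,
    file_id: Optional[int] = None,
    node_id: Optional[int] = None,
) -> dict:
    """Assemble GetFile request parameters, dropping the ones left unset."""
    if project_id is None and project_identifier is None:
        raise ValueError("Set ProjectId or ProjectIdentifier to select the workspace.")
    candidates = {
        "ProjectId": project_id,
        "ProjectIdentifier": project_identifier,
        "FileId": file_id,
        "NodeId": node_id,
    }
    return {name: value for name, value in candidates.items() if value is not None}


params = build_get_file_request(project_id=10000, file_id=100000001)
print(params)  # {'ProjectId': 10000, 'FileId': 100000001}
```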

Response parameters

Parameter | Type | Description | Example
object

The response parameters.

HttpStatusCode | integer

The HTTP status code returned.

200
ErrorMessage | string

The error message returned.

The connection does not exist.
RequestId | string

The ID of the request. You can use the ID to troubleshoot issues.

0000-ABCD-EFG****
ErrorCode | string

The error code returned.

Invalid.Tenant.ConnectionNotExists
Success | boolean

Indicates whether the request is successful. Valid values:

  • true: The request is successful.
  • false: The request fails.
true
Data | object

The details of the file.

File | object

The basic information about the file.

CommitStatus | integer

Indicates whether the latest code in the file is committed. Valid values: 0 and 1. The value 0 indicates that the latest code in the file is not committed. The value 1 indicates that the latest code in the file is committed.

0
AutoParsing | boolean

Indicates whether the automatic parsing feature is enabled for the file. Valid values:

  • true: The automatic parsing feature is enabled for the file.
  • false: The automatic parsing feature is not enabled for the file.

This parameter corresponds to the Analyze Code parameter that is displayed after Same Cycle is selected in the Dependencies section of the Properties tab in the DataWorks console.

true
Owner | string

The ID of the Alibaba Cloud account used by the file owner.

7775674356****
CreateTime | long

The time when the file was created. This value is a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

1593879116000
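
Millisecond timestamps such as CreateTime, LastEditTime, StartEffectDate, and EndEffectDate can be converted to readable dates as sketched below. The helper name `from_epoch_millis` is an illustrative choice, not part of the API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_millis(millis: int) -> datetime:
    """Convert a millisecond UNIX timestamp to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


created = from_epoch_millis(1593879116000)
print(created.isoformat())  # 2020-07-04T16:11:56+00:00
```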
FileType | integer

The type of the code for the file. Valid values: 6 (Shell), 10 (ODPS SQL), 11 (ODPS MR), 23 (Data Integration), 24 (ODPS Script), 99 (zero load), 221 (PyODPS 2), 225 (ODPS Spark), 227 (EMR Hive), 228 (EMR Spark), 229 (EMR Spark SQL), 230 (EMR MR), 239 (OSS object inspection), 257 (EMR Shell), 258 (EMR Spark Shell), 259 (EMR Presto), 260 (EMR Impala), 900 (real-time synchronization), 1089 (cross-tenant collaboration), 1091 (Hologres development), 1093 (Hologres SQL), 1100 (assignment), and 1221 (PyODPS 3).

10
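
The FileType codes listed above can be kept in a lookup table so that responses are easier to read. This sketch copies the codes and names from the description; the function name `describe_file_type` and the fallback text for unlisted codes are assumptions.

```python
# Codes and names copied from the FileType description above.
FILE_TYPE_NAMES = {
    6: "Shell", 10: "ODPS SQL", 11: "ODPS MR", 23: "Data Integration",
    24: "ODPS Script", 99: "zero load", 221: "PyODPS 2", 225: "ODPS Spark",
    227: "EMR Hive", 228: "EMR Spark", 229: "EMR Spark SQL", 230: "EMR MR",
    239: "OSS object inspection", 257: "EMR Shell", 258: "EMR Spark Shell",
    259: "EMR Presto", 260: "EMR Impala", 900: "real-time synchronization",
    1089: "cross-tenant collaboration", 1091: "Hologres development",
    1093: "Hologres SQL", 1100: "assignment", 1221: "PyODPS 3",
}


def describe_file_type(code: int) -> str:
    """Return the documented name for a FileType code, or a fallback label."""
    return FILE_TYPE_NAMES.get(code, f"unknown ({code})")


print(describe_file_type(10))  # ODPS SQL
```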
CurrentVersion | integer

The latest version number of the file.

3
BizId | long

The ID of the workflow to which the file belongs. This parameter is deprecated and replaced by the BusinessId parameter.

1000001
LastEditUser | string

The ID of the Alibaba Cloud account used to last modify the file.

62465892****
FileName | string

The name of the file.

ods_user_info_d
ConnectionName | string

The ID of the compute engine instance that is used to run the node that corresponds to the file.

odps_first
UseType | string

The module to which the file belongs. Valid values:

  • NORMAL: The file is used for DataStudio.
  • MANUAL: The file is used for a manually triggered node.
  • MANUAL_BIZ: The file is used for a manually triggered workflow.
  • SKIP: The file is used for a dry-run DataStudio node.
  • ADHOCQUERY: The file is used for an ad hoc query.
  • COMPONENT: The file is used for a snippet.
NORMAL
FileFolderId | string

The ID of the folder to which the file belongs.

2735c2****
ParentId | long

The ID of the node group file to which the current file belongs. This parameter is returned only if the current file is an inner file of the node group file.

-1
CreateUser | string

The ID of the Alibaba Cloud account used to create the file.

424732****
IsMaxCompute | boolean

Indicates whether the file needs to be uploaded to MaxCompute.

This parameter is returned only if the file is a MaxCompute resource file.

true
BusinessId | long

The ID of the workflow to which the file belongs.

1000001
FileDescription | string

The description of the file.

DeletedStatus | string

The status of the file. Valid values:

  • NORMAL: The file is not deleted.
  • RECYCLE_BIN: The file is stored in the recycle bin.
  • DELETED: The file is deleted.
RECYCLE
LastEditTime | long

The time when the file was last modified. This value is a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

1593879116000
Content | string

The code in the file.

SHOW TABLES;
NodeId | long

The ID of the auto triggered node that is generated in the scheduling system after the file is committed.

300001
AdvancedSettings | string

The advanced configurations of the node.

This parameter is valid for an EMR node. This parameter corresponds to the Advanced Settings tab in the right-side navigation pane on the configuration tab of the node in the DataWorks console.

Note: You cannot configure advanced parameters for EMR Shell nodes.

For information about the advanced parameters of each type of EMR node, see Develop EMR tasks.

{"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"}
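
Because AdvancedSettings is returned as a string that contains JSON, it usually needs to be decoded before individual settings can be read. A minimal sketch using the example value above; treating an empty string as "no settings" is an assumption.

```python
import json

# Example value from the AdvancedSettings row above.
advanced_settings = '{"queue":"default","SPARK_CONF":"--conf spark.driver.memory=2g"}'

# Assumption: an empty or missing string means no advanced settings are configured.
settings = json.loads(advanced_settings) if advanced_settings else {}

print(settings["queue"])       # default
print(settings["SPARK_CONF"])  # --conf spark.driver.memory=2g
```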
NodeConfiguration | object

The scheduling configurations of the file.

RerunMode | string

Indicates whether the node that corresponds to the file can be rerun. Valid values:

  • ALL_ALLOWED: The node can be rerun regardless of whether it is successfully run or fails to run.
  • FAILURE_ALLOWED: The node can be rerun only after it fails to run.
  • ALL_DENIED: The node cannot be rerun regardless of whether it is successfully run or fails to run.

This parameter corresponds to the Rerun parameter in the Schedule section of the Properties tab in the DataWorks console.

ALL_ALLOWED
SchedulerType | string

The scheduling type of the node. Valid values:

  • NORMAL: The node is an auto triggered node.
  • MANUAL: The node is a manually triggered node. Manually triggered nodes cannot be automatically triggered. They correspond to the nodes in the Manually Triggered Workflows pane.
  • PAUSE: The node is a paused node.
  • SKIP: The node is a dry-run node. Dry-run nodes are started as scheduled but the system sets the status of the nodes to successful when it starts to run them.
NORMAL
Stop | boolean

Indicates whether the scheduling for the node is suspended. Valid values:

  • true: The scheduling for the node is suspended.
  • false: The scheduling for the node is not suspended.

This parameter corresponds to the Recurrence parameter in the Schedule section of the Properties tab in the DataWorks console.

false
ParaValue | string

The scheduling parameters of the node.

This parameter corresponds to the Parameters section of the Properties tab in the DataWorks console. For more information about the configurations of the scheduling parameters, see Configure scheduling parameters.

a=x b=y
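
The example value `a=x b=y` suggests a list of name=value pairs separated by spaces. The sketch below parses that shape into a dictionary. It assumes that values contain no spaces, which the documentation does not guarantee, and `parse_para_value` is a hypothetical helper.

```python
def parse_para_value(para_value: str) -> dict:
    """Parse 'name=value' tokens separated by whitespace into a dict."""
    pairs = {}
    for token in para_value.split():
        name, separator, value = token.partition("=")
        if separator:  # skip tokens that are not name=value pairs
            pairs[name] = value
    return pairs


print(parse_para_value("a=x b=y"))  # {'a': 'x', 'b': 'y'}
```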
StartEffectDate | long

The start time of automatic scheduling. This value is a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

This parameter corresponds to the Validity Period parameter in the Schedule section of the Properties tab in the DataWorks console.

936923400000
EndEffectDate | long

The end time of automatic scheduling. This value is a UNIX timestamp representing the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

This parameter corresponds to the Validity Period parameter in the Schedule section of the Properties tab in the DataWorks console.

4155787800000
CycleType | string

The type of the scheduling cycle. Valid values: NOT_DAY and DAY. The value NOT_DAY indicates that the node is scheduled to run by minute or hour. The value DAY indicates that the node is scheduled to run by day, week, or month.

This parameter corresponds to the Scheduling Cycle parameter in the Schedule section of the Properties tab in the DataWorks console.

DAY
DependentNodeIdList | string

The ID of the node on which the node corresponding to the file depends when the DependentType parameter is set to USER_DEFINE. Multiple IDs are separated by commas (,).

The value of this parameter is equivalent to the ID of the node that you specified after you select Previous Cycle and set Depend On to Other Nodes in the Dependencies section of the Properties tab in the DataWorks console.

5,10,15,20
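
Because DependentNodeIdList is a comma-separated string rather than an array, it can be split into integers before use. A minimal sketch; the helper name `parse_node_id_list` is an assumption.

```python
def parse_node_id_list(raw: str) -> list:
    """Split a comma-separated ID string into a list of integers."""
    return [int(part) for part in raw.split(",") if part.strip()]


print(parse_node_id_list("5,10,15,20"))  # [5, 10, 15, 20]
```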
ResourceGroupId | long

The ID of the resource group that is used to run the node. You can call the ListResourceGroups operation to query the available resource groups in the workspace.

375827434852437
DependentType | string

The type of the cross-cycle scheduling dependency of the node. Valid values:

  • SELF: The instance generated for the node in the current cycle depends on the instance generated for the node in the previous cycle.
  • CHILD: The instance generated for the node in the current cycle depends on the instances generated for the descendant nodes at the nearest level of the node in the previous cycle.
  • USER_DEFINE: The instance generated for the node in the current cycle depends on the instances generated for one or more specified nodes in the previous cycle.
  • NONE: No cross-cycle scheduling dependency type is selected for the node.
USER_DEFINE
AutoRerunTimes | integer

The number of automatic reruns that are allowed after an error occurs.

3
AutoRerunIntervalMillis | integer

The interval between automatic reruns after an error occurs. Unit: milliseconds.

This parameter corresponds to the Rerun Interval parameter that is displayed after the Auto Rerun upon Error check box is selected in the Schedule section of the Properties tab in the DataWorks console.

The interval that you specify in the DataWorks console is measured in minutes, whereas this parameter is measured in milliseconds. Convert between the two units when you call the operation.

120000
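
The unit difference between the console (minutes) and the API (milliseconds) comes down to a factor of 60,000. A small sketch of the conversion; the function names are illustrative.

```python
MILLIS_PER_MINUTE = 60_000


def minutes_to_millis(minutes: int) -> int:
    """Convert a console-style interval in minutes to API milliseconds."""
    return minutes * MILLIS_PER_MINUTE


def millis_to_minutes(millis: int) -> float:
    """Convert an API interval in milliseconds back to minutes."""
    return millis / MILLIS_PER_MINUTE


print(millis_to_minutes(120000))  # 2.0
print(minutes_to_millis(2))       # 120000
```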
CronExpress | string

The CRON expression that represents the periodic scheduling policy of the node.

00 05 00 * * ?
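
The example expression has six space-separated fields. The sketch below labels them under the assumption that they follow the Quartz-style order of second, minute, hour, day of month, month, and day of week; the page itself does not state the field order, so verify it before relying on the labels.

```python
# Assumed field order (Quartz-style); not confirmed by this page.
CRON_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Label the fields of a six-field CRON expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


fields = split_cron("00 05 00 * * ?")
print(fields["hour"], fields["minute"])  # 00 05
```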
InputList | array<object>

The output names of the parent files on which the current file depends.

NodeInputOutput | object
Input | string

The output name of the parent file on which the current file depends.

This parameter corresponds to the Output Name of Ancestor Node parameter under Parent Nodes after Same Cycle is selected in the Dependencies section of the Properties tab in the DataWorks console.

project.001_out
ParseType | string

The mode in which the scheduling dependencies of the file are configured. Valid values:

  • MANUAL: Scheduling dependencies are manually configured.
  • AUTO: Scheduling dependencies are automatically parsed.
MANUAL
OutputList | array<object>

The output names of the current file.

This parameter corresponds to the Output Name parameter under Output after Same Cycle is selected in the Dependencies section of the Properties tab in the DataWorks console.

NodeInputOutput | object
RefTableName | string

The output table name of the current file.

This parameter corresponds to the Output Table Name parameter under Output after Same Cycle is selected in the Dependencies section of the Properties tab in the DataWorks console.

ods_user_info_d
Output | string

The output name of the current file.

This parameter corresponds to the Output Name parameter under Output after Same Cycle is selected in the Dependencies section of the Properties tab in the DataWorks console.

dw_project.002_out
StartImmediately | boolean

Indicates whether a node is immediately run after the node is deployed to the production environment.

This parameter is valid only for an EMR Spark Streaming node or an EMR Streaming SQL node. This parameter corresponds to the Start Method parameter in the Schedule section of the Configure tab in the DataWorks console.

true
InputParameters | array<object>

Input parameters of the node.

This parameter corresponds to the Input Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.

InputContextParameter | object
ParameterName | string

The name of the input parameter of the node. In the code, you can use the ${...} syntax to reference the input parameter of the node.

This parameter corresponds to the Parameter Name parameter in the Input Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.

input
ValueSource | string

The value source of the input parameter of the node.

This parameter corresponds to the Value Source parameter in the Input Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.

project_001.parent_node:outputs
OutputParameters | array<object>

Output parameters of the node.

This parameter corresponds to the Output Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.

OutputContextParameter | object
ParameterName | string

The name of the output parameter of the node.

This parameter corresponds to the Parameter Name parameter in the Output Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.

output
Value | string

The value of the output parameter of the node.

This parameter corresponds to the Value parameter in the Output Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.

${bizdate}
Type | string

The type of the output parameter of the node. Valid values:

  • 1: indicates a constant.
  • 2: indicates a variable.
  • 3: indicates a pass-through variable.

This parameter corresponds to the Type parameter in the Output Parameters table in the Input and Output Parameters section of the Properties tab in the DataWorks console.

1
Description | string

The description of the output parameter of the node.

It's a context output parameter.
ApplyScheduleImmediately | string

Indicates whether scheduling configurations immediately take effect after the deployment.

true

Examples

Sample success responses

JSON format

{
  "HttpStatusCode": 200,
  "ErrorMessage": "The connection does not exist.",
  "RequestId": "0000-ABCD-EFG****",
  "ErrorCode": "Invalid.Tenant.ConnectionNotExists",
  "Success": true,
  "Data": {
    "File": {
      "CommitStatus": 0,
      "AutoParsing": true,
      "Owner": "7775674356****",
      "CreateTime": 1593879116000,
      "FileType": 10,
      "CurrentVersion": 3,
      "BizId": 1000001,
      "LastEditUser": "62465892****",
      "FileName": "ods_user_info_d",
      "ConnectionName": "odps_first",
      "UseType": "NORMAL",
      "FileFolderId": "2735c2****",
      "ParentId": -1,
      "CreateUser": "424732****",
      "IsMaxCompute": true,
      "BusinessId": 1000001,
      "FileDescription": "",
      "DeletedStatus": "RECYCLE",
      "LastEditTime": 1593879116000,
      "Content": "SHOW TABLES;",
      "NodeId": 300001,
      "AdvancedSettings": "{\"queue\":\"default\",\"SPARK_CONF\":\"--conf spark.driver.memory=2g\"}",
      "FileId": 100000001
    },
    "NodeConfiguration": {
      "RerunMode": "ALL_ALLOWED",
      "SchedulerType": "NORMAL",
      "Stop": false,
      "ParaValue": "a=x b=y",
      "StartEffectDate": 936923400000,
      "EndEffectDate": 4155787800000,
      "CycleType": "DAY",
      "DependentNodeIdList": "5,10,15,20",
      "ResourceGroupId": 375827434852437,
      "DependentType": "USER_DEFINE",
      "AutoRerunTimes": 3,
      "AutoRerunIntervalMillis": 120000,
      "CronExpress": "00 05 00 * * ?",
      "InputList": [
        {
          "Input": "project.001_out",
          "ParseType": "MANUAL"
        }
      ],
      "OutputList": [
        {
          "RefTableName": "ods_user_info_d",
          "Output": "dw_project.002_out"
        }
      ],
      "StartImmediately": true,
      "InputParameters": [
        {
          "ParameterName": "input",
          "ValueSource": "project_001.parent_node:outputs"
        }
      ],
      "OutputParameters": [
        {
          "ParameterName": "output",
          "Value": "${bizdate}",
          "Type": "1",
          "Description": "It's a context output parameter."
        }
      ],
      "ApplyScheduleImmediately": "true"
    },
    "ResourceDownloadLink": {
      "downloadLink": ""
    }
  }
}
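
A response shaped like the sample above can be checked and summarized as sketched below. The function `extract_file_summary` and the fields it selects are illustrative choices. The sketch assumes that the response has already been decoded into a Python dictionary and that a false Success value should be treated as a failure.

```python
def extract_file_summary(response: dict) -> dict:
    """Raise on a failed call, otherwise pick a few fields from Data."""
    if not response.get("Success"):
        raise RuntimeError(
            f"GetFile failed: {response.get('ErrorCode')}: "
            f"{response.get('ErrorMessage')} (RequestId={response.get('RequestId')})"
        )
    data = response.get("Data") or {}
    file_info = data.get("File") or {}
    node_config = data.get("NodeConfiguration") or {}
    return {
        "name": file_info.get("FileName"),
        "type": file_info.get("FileType"),
        "committed": file_info.get("CommitStatus") == 1,  # 1 means committed
        "cron": node_config.get("CronExpress"),
    }


# Trimmed copy of the sample response above.
sample = {
    "Success": True,
    "RequestId": "0000-ABCD-EFG****",
    "Data": {
        "File": {"FileName": "ods_user_info_d", "FileType": 10, "CommitStatus": 0},
        "NodeConfiguration": {"CronExpress": "00 05 00 * * ?"},
    },
}
summary = extract_file_summary(sample)
print(summary["name"], summary["committed"])  # ods_user_info_d False
```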

Error codes

HTTP status code | Error code | Error message | Description
403 | Forbidden.Access | Access is forbidden. Please first activate DataWorks Enterprise Edition or Flagship Edition. | You do not have the required permissions. Obtain authorization first.
429 | Throttling.Api | The request for this resource has exceeded your available limit. | -
429 | Throttling.System | The DataWorks system is busy. Try again later. | -
429 | Throttling.User | Your request is too frequent. Try again later. | -
500 | InternalError.System | An internal system error occurred. Try again later. | -
500 | InternalError.UserId.Missing | An internal system error occurred. Try again later. | -
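
Several of the errors above advise trying again later, so a caller may want to retry with increasing delays. The sketch below shows one possible policy. Treating every 429 and 500 status as retryable, and the exponential backoff parameters, are assumptions rather than documented behavior.

```python
# Assumption: 429 and 500 are worth retrying; 403 is not.
RETRYABLE_STATUS_CODES = {429, 500}


def is_retryable(http_status: int) -> bool:
    """Decide whether a failed call should be attempted again."""
    return http_status in RETRYABLE_STATUS_CODES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff in seconds: base * 2**attempt, limited to cap."""
    return min(cap, base * (2 ** attempt))


print(is_retryable(429), is_retryable(403))   # True False
print([backoff_delay(n) for n in range(6)])   # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```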

For the full list of error codes, see Service error codes.

Change history

Change time | Summary of changes | Operation
2024-09-03 | The error codes have changed. The response structure of the API has changed. | View Change Details
2024-09-02 | The error codes have changed. The response structure of the API has changed. | View Change Details
2023-09-12 | The error codes have changed. The response structure of the API has changed. | View Change Details