ALIYUN::PAIDLC::Job is used to create a Machine Learning Platform for AI (PAI) job to run in a cluster.
Syntax
{
  "Type": "ALIYUN::PAIDLC::Job",
  "Properties": {
    "ThirdpartyLibs": List,
    "Options": String,
    "Priority": Integer,
    "Envs": String,
    "JobMaxRunningTimeMinutes": Integer,
    "WorkspaceId": String,
    "CodeSource": Map,
    "UserVpc": Map,
    "JobSpecs": List,
    "UserCommand": String,
    "DataSources": List,
    "JobType": String,
    "ResourceId": String,
    "ThirdpartyLibDir": String,
    "DisplayName": String,
    "SuccessPolicy": String,
    "Settings": Map
  }
}
Properties
Property | Type | Required | Editable | Description | Constraint |
ThirdpartyLibs | List | No | No | The third-party Python libraries and their versions. | Example: |
Options | String | No | No | The additional configurations of the job. | You can use this property to adjust the behavior of the attached data source. For example, if the attached data source of the job is of the Object Storage Service (OSS) type, you can use this property to add the following configurations to override the default parameters of |
Priority | Integer | No | Yes | The priority of the job. | Default value: 1. Valid values: 1 to 9. Each value specifies a different priority: |
Envs | String | No | No | The environment variable configurations. | None. |
JobMaxRunningTimeMinutes | Integer | No | No | The maximum running duration of the job. | Unit: minutes. |
WorkspaceId | String | Yes | No | The workspace ID. | None. |
CodeSource | Map | No | No | The code source of the job. | Before a node of the job starts to run, Deep Learning Containers (DLC) automatically downloads the configured code from the code source and mounts the code to a local path in the container. For more information, see CodeSource properties. |
UserVpc | Map | No | No | The configurations of the user virtual private cloud (VPC). | For more information, see UserVpc properties. |
JobSpecs | List | Yes | No | The running configurations of the job. | For more information, see JobSpecs properties. |
UserCommand | String | Yes | No | The startup command for all nodes of the job. | None. |
DataSources | List | No | No | The data sources of the job. | Each data source represents a distributed file system. Based on its configuration, each data source is mounted to a local path in the container that runs on each node. The MountPath property in DataSources specifies that local path. Processes started by the startup command of the job can access the distributed file system directly at the path specified by MountPath. For more information, see DataSources properties. |
JobType | String | Yes | No | The job type. | The value is case-sensitive. The following job types are supported: |
ResourceId | String | No | No | The ID of the resource group to which the job belongs. | This property is optional. |
ThirdpartyLibDir | String | No | No | The name of the folder in which the requirements.txt file of the Python third-party library resides. | Before each node runs the startup command specified by UserCommand, DLC fetches the |
DisplayName | String | Yes | No | The name of the job. | The name must meet the following requirements: |
SuccessPolicy | String | No | No | The policy that is used to determine whether a distributed multi-node job is successful. | This property applies only to TensorFlow distributed multi-node jobs. Valid values: |
Settings | Map | No | No | The additional parameter configurations of the job. | None. |
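The following template is a minimal sketch that sets only the properties marked as required in the preceding table and in JobSpecs properties. The logical ID Job, the parameter names, the display name, and the startup command are hypothetical. The JobType value PyTorchJob and the node type Worker are assumptions; check them against the supported job types and node types before you use the template.

```json
{
  "ROSTemplateFormatVersion": "2015-09-01",
  "Parameters": {
    "WorkspaceId": { "Type": "String" },
    "Image": { "Type": "String" },
    "EcsSpec": { "Type": "String" }
  },
  "Resources": {
    "Job": {
      "Type": "ALIYUN::PAIDLC::Job",
      "Properties": {
        "WorkspaceId": { "Ref": "WorkspaceId" },
        "DisplayName": "example-job",
        "JobType": "PyTorchJob",
        "UserCommand": "python train.py",
        "JobSpecs": [
          {
            "Type": "Worker",
            "PodCount": 1,
            "UseSpotInstance": false,
            "EcsSpec": { "Ref": "EcsSpec" },
            "Image": { "Ref": "Image" }
          }
        ]
      }
    }
  }
}
```

The workspace ID, image address, and instance specifications are passed in as template parameters so that no environment-specific value is hard-coded in the template.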
CodeSource syntax
"CodeSource": {
  "MountPath": String,
  "Commit": String,
  "Branch": String,
  "CodeSourceId": String
}
CodeSource properties
Property | Type | Required | Editable | Description | Constraint |
MountPath | String | No | No | The local path in the container to which the code is mounted. | By default, the mount path that is configured in the code source is used. |
Commit | String | No | No | The commit ID of the code that is required to be downloaded for the job. | By default, the commit ID that is configured in the code source is used. |
Branch | String | No | No | The branch of the code repository that is referenced when the job is running. | By default, the branch that is configured in the code source is used. |
CodeSourceId | String | Yes | No | The ID of the code source. | None. |
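As an illustration of the properties above, the following fragment references an existing code source and overrides its branch and mount path. The ID placeholder, the branch name, and the path are hypothetical values, not defaults.

```json
{
  "CodeSource": {
    "CodeSourceId": "<code-source-id>",
    "Branch": "main",
    "MountPath": "/workspace/code"
  }
}
```

With such a configuration, a UserCommand value such as python /workspace/code/train.py could run a script from the mounted repository, assuming that the repository contains a file with that name.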
UserVpc syntax
"UserVpc": {
  "VpcId": String,
  "SecurityGroupId": String,
  "SwitchId": String,
  "ExtendedCIDRs": List
}
UserVpc properties
Property | Type | Required | Editable | Description | Constraint |
VpcId | String | Yes | No | The VPC ID. | None. |
SecurityGroupId | String | No | No | The ID of the security group. | None. |
SwitchId | String | No | No | The vSwitch ID. | This property is optional. |
ExtendedCIDRs | List | No | No | The extended CIDR blocks. | Valid values: |
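The following fragment is a sketch of a UserVpc configuration that attaches the job to an existing VPC. All three IDs are placeholders that you must replace with the IDs of your own resources. ExtendedCIDRs is omitted because it is optional.

```json
{
  "UserVpc": {
    "VpcId": "<vpc-id>",
    "SecurityGroupId": "<security-group-id>",
    "SwitchId": "<vswitch-id>"
  }
}
```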
JobSpecs syntax
"JobSpecs": [
  {
    "PodCount": Integer,
    "ImageConfig": Map,
    "UseSpotInstance": Boolean,
    "Type": String,
    "EcsSpec": String,
    "ResourceConfig": Map,
    "Image": String,
    "ExtraPodSpec": Map
  }
]
JobSpecs properties
Property | Type | Required | Editable | Description | Constraint |
PodCount | Integer | Yes | No | The number of replicas. | None. |
ImageConfig | Map | No | No | The private image configurations. | None. |
UseSpotInstance | Boolean | Yes | No | Specifies whether to use preemptible instances. | Valid values: true (use preemptible instances) and false (do not use preemptible instances). |
Type | String | Yes | No | The node type. | Type is closely related to JobType. The valid values of Type vary based on the value of JobType:
The master node for a PyTorch or XGBoost job is optional. If you do not specify the master node for a PyTorch or XGBoost job, the system automatically uses the first worker node as the master node. |
EcsSpec | String | Yes | No | The Elastic Compute Service (ECS) instance specifications of the worker node. | The price varies based on instance specifications. For more information, see Billing of DLC. |
ResourceConfig | Map | No | No | The resource configurations. | None. |
Image | String | Yes | No | The address of the image that is run by the worker node. | You can call the ListImages operation to query community images provided by PAI and images optimized by PAI. You can also specify a third-party public image. |
ExtraPodSpec | Map | No | No | The additional pod configurations. | None. |
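The following fragment sketches a JobSpecs list for a PyTorch job that has one explicit master node and two worker nodes. The Type values Master and Worker are assumptions based on the note about master nodes in the preceding table. The instance specifications and image address are placeholders.

```json
{
  "JobSpecs": [
    {
      "Type": "Master",
      "PodCount": 1,
      "UseSpotInstance": false,
      "EcsSpec": "<ecs-instance-type>",
      "Image": "<image-address>"
    },
    {
      "Type": "Worker",
      "PodCount": 2,
      "UseSpotInstance": false,
      "EcsSpec": "<ecs-instance-type>",
      "Image": "<image-address>"
    }
  ]
}
```

Because the master node is optional for PyTorch and XGBoost jobs, the first entry could be dropped. In that case, the system uses the first worker node as the master node.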
DataSources syntax
"DataSources": [
  {
    "MountPath": String,
    "DataSourceId": String
  }
]
DataSources properties
Property | Type | Required | Editable | Description | Constraint |
MountPath | String | No | No | The local path in the container to which the data source is mounted. | By default, the mount path that is configured in the data source is used. |
DataSourceId | String | Yes | No | The ID of the data source. | None. |
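As a sketch of how these properties combine, the following fragment mounts two data sources. The first overrides the mount path, and the second falls back to the path that is configured in the data source. The IDs are placeholders and the path is hypothetical.

```json
{
  "DataSources": [
    {
      "DataSourceId": "<training-data-source-id>",
      "MountPath": "/mnt/data"
    },
    {
      "DataSourceId": "<output-data-source-id>"
    }
  ]
}
```

The startup command in UserCommand can then read files under /mnt/data as if they were local files.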
Return values
Fn::GetAtt
JobId: the job ID.
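The following Outputs section is a sketch of how the job ID could be exposed from a stack. It assumes that the ALIYUN::PAIDLC::Job resource is declared under the hypothetical logical ID Job in the Resources section of the same template.

```json
{
  "Outputs": {
    "JobId": {
      "Description": "The ID of the created job.",
      "Value": { "Fn::GetAtt": ["Job", "JobId"] }
    }
  }
}
```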