Deep Learning Containers (DLC) of Platform for AI (PAI) is a service that is used to run training tasks in a distributed manner. DataWorks provides PAI DLC nodes. You can use this type of node to load existing DLC tasks and configure scheduling dependencies to implement periodic scheduling of DLC tasks.
Prerequisites
DataWorks is authorized to access PAI.
You can complete the authorization with one click on the authorization page. For more information about the service-linked role that is created based on the authorization, see Role 1: AliyunServiceRoleForDataworksEngine. Only an Alibaba Cloud account or a RAM user to which the AliyunDataWorksFullAccess policy is attached can perform one-click authorization.
A workflow is created.
In DataStudio, development operations are organized by workflow and performed on different compute engines. You must create a workflow before you can create a node. For more information, see Create a workflow.
Precautions
Each time a PAI DLC node is run, a new DLC task is generated on the DLC platform of PAI. To prevent multiple tasks that have the same name from being generated in PAI when you use DataWorks to periodically schedule PAI DLC nodes, we recommend that you configure an appropriate scheduling cycle based on your business requirements when you develop DLC tasks in DataWorks. We also recommend that you add a datetime variable to the task name and assign a time-based scheduling parameter to the variable as a value. This way, you can add a date and time to the task name. For more information, see the Step 2: Develop a PAI DLC task section in this topic.
You cannot use the shared resource group for scheduling to run PAI DLC tasks.
The operations described in this topic are performed in the China (Shanghai) region. You can perform operations in other regions based on the instructions displayed in the DataWorks console.
Step 1: Create a PAI DLC node
Go to the DataStudio page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and Governance > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
On the DataStudio page, find the desired workflow, right-click the workflow name, and then choose Create Node > Machine Learning > PAI DLC.
In the Create Node dialog box, configure the Name parameter and click Confirm. Then, you can use the node to develop tasks and configure task scheduling properties.
Step 2: Develop a PAI DLC task
Develop task code: Simple example
On the configuration tab of the PAI DLC node, you can use one of the following methods to write DLC task code:
Write task code based on an existing DLC task.
You can load a DLC task that was created in PAI by specifying the task name. After the task is loaded, the DLC node editor generates node code based on the configurations of the task in PAI. You can then modify the task configurations in the code.
Note: If you do not have the required permissions to load or create a task, follow the on-screen instructions to obtain the required permissions.
If no task is available, create a task in the PAI console. Various methods are provided for you to create a DLC task in PAI. You can select a method based on your business requirements. For more information, see Submit training jobs, Submit jobs by using the SDK for Python, and Submit jobs by using CLIs.
Directly write DLC task code.
In the code editor of the PAI DLC node in DataWorks, write task code based on your business requirements.
After you write and run the task code, a new DLC task is generated in PAI based on the task code. Sample task code:
dlc submit xgboostjob \ #Submit a DLC task.
--name=wsytest_pai04_XGBoost \ #The name of the DLC task. We recommend that you include a datetime variable in the task name to prevent duplicate task names.
--command='echo '\''${Variable}'\'';' \ #The command to run in the DLC task.
--workspace_id=80593 \ #The ID of the workspace in which you want to run the DLC task.
--priority=1 \ #The task priority. Valid values: 1 to 9. The value 1 specifies the lowest priority, and the value 9 specifies the highest priority.
--workers=1 \ #The number of nodes on which you want to run the task. If the number of nodes is greater than 1, the task is a distributed task and is concurrently run on multiple nodes.
--worker_image=registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3-cpu-py36-ubuntu18.04 \ #The node image that provides the runtime environment for the DLC task.
--worker_spec=ecs.g6.xlarge #The instance type of the compute nodes.
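As described in the Precautions section, you can add a datetime variable to the task name to prevent multiple tasks that have the same name from being generated in PAI. The following sketch assumes that a custom variable named datetime is defined in the Scheduling Parameter section of the Properties tab; only the --name parameter differs from the preceding sample:
dlc submit xgboostjob \ #Submit a DLC task.
--name=wsytest_pai04_XGBoost_${datetime} \ #Assumption: DataWorks replaces ${datetime} with a time-based scheduling parameter value, such as 20240101150000, at run time.
--command='echo '\''${Variable}'\'';' \
--workspace_id=80593 \
--priority=1 \
--workers=1 \
--worker_image=registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3-cpu-py36-ubuntu18.04 \
--worker_spec=ecs.g6.xlarge
This way, each scheduled run generates a task that has a unique name in PAI.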
Develop task code: Use scheduling parameters
DataWorks provides scheduling parameters. In periodic scheduling scenarios, the values of scheduling parameters are dynamically replaced in the code of a node based on the configurations of the scheduling parameters. You can define variables in the node code in the ${Variable} format and assign values to the variables in the Scheduling Parameter section of the Properties tab. For information about the supported formats of scheduling parameters, see Supported formats of scheduling parameters.
Sample code of scheduling parameters:
--command='echo '\''${Variable}'\'';' \ #You can assign a specific scheduling parameter to the ${Variable} variable as a value.
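For example, you might assign a time-based value to the variable on the Properties tab. The following sketch assumes that the variable is named Variable and uses the $[yyyymmddhh24miss] format; for the formats that are supported, see Supported formats of scheduling parameters.
#Scheduling Parameter configuration on the Properties tab (assumed values):
#  Parameter Name:  Variable
#  Parameter Value: $[yyyymmddhh24miss]
#At run time, DataWorks replaces ${Variable} in the code with a value such as 20240101150000.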
Step 3: Configure task scheduling properties
If you want to periodically run tasks on the created node, click Properties in the right-side navigation pane of the node configuration tab to configure the scheduling information of the node based on your business requirements. For more information, see Overview.
You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit a task on the node.
Step 4: Debug task code
You can perform the following operations to check whether the task is configured as expected based on your business requirements:
Optional. Select a resource group and assign custom parameters to variables.
Click the Run with Parameters icon in the top toolbar of the configuration tab of the node. In the Parameters dialog box, select the resource group for scheduling that you want to use to debug and run the task code.
If you use scheduling parameters in your task code, you can assign the scheduling parameters to variables as values in the task code for debugging. For more information about the value assignment logic of scheduling parameters, see What are the differences in the value assignment logic of scheduling parameters among the Run, Run with Parameters, and Perform Smoke Testing in Development Environment modes?
Save and run the task code.
In the top toolbar, click the Save icon to save the task code. Then, click the Run icon to run the code.
Optional. Perform smoke testing.
You can perform smoke testing on the task in the development environment to check whether the task is run as expected when you commit the task or after you commit the task. For more information, see Perform smoke testing.
Step 5: Commit and deploy the task
After a task on a node is configured, you must commit and deploy the task. After you commit and deploy the task, the system runs the task on a regular basis based on scheduling configurations.
Click the Save icon in the top toolbar to save the task.
Click the Submit icon in the top toolbar to commit the task.
In the Submit dialog box, configure the Change description parameter. Then, determine whether to have the task code reviewed after you commit the task based on your business requirements.
Note: You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit the task.
You can use the code review feature to ensure the code quality of tasks and prevent task execution errors caused by invalid task code. If you enable the code review feature, the task code that is committed can be deployed only after the task code passes the code review. For more information, see Code review.
If you use a workspace in standard mode, you must deploy the task in the production environment after you commit the task. To deploy a task on a node, click Deploy in the upper-right corner of the configuration tab of the node. For more information, see Deploy nodes.
What to do next
After you commit and deploy the task, the task is periodically run based on the scheduling configurations. You can click Operation Center in the upper-right corner of the configuration tab of the corresponding node to go to Operation Center and view the scheduling status of the task. For more information, see View and manage auto triggered nodes.