File Storage NAS (NAS) integrates with Alibaba Cloud Platform for AI (PAI). You can configure a NAS file system as a dataset to persistently store data during deployment and training. This topic describes how to mount a NAS file system to a Data Science Workshop (DSW) instance in the console.
Prerequisites
PAI is activated and a default workspace is created. For more information, see Activate PAI and create a default workspace.
NAS is activated.
The first time you visit the product page of NAS, follow the instructions to activate the NAS service.
A virtual private cloud (VPC) is created. If no VPC is available, create a VPC in the VPC console.
Step 1: Create a file system
If you already have a file system, skip to Step 2.
Log on to the NAS console.
In the lower part of the Overview page, click Create General-purpose NAS File System.
On the General-purpose NAS (Pay-as-you-go) page, configure the parameters described in the following table.
Parameter: Description
Region: The file system and the DSW instance must reside in the same region. In this example, select the China (Hangzhou) region.
Zone: Select the zone where the vSwitch resides to prevent cross-zone latency. In this example, select Hangzhou Zone G.
Storage Class: Select Capacity.
Protocol Type: Select NFS.
Recycle Bin: We recommend that you enable the recycle bin feature. After you enable the feature, deleted files or directories are temporarily stored in the recycle bin to prevent accidental deletion.
Lifecycle Management: Configure the lifecycle management feature based on your business scenario. We recommend that you disable this feature.
Encryption Type: Select Not Encrypted.
Data Backup: Select Disable.
Network Type: Select VPC.
VPC: Select the VPC that you created from the drop-down list.
vSwitch: Select a vSwitch that resides in the VPC.
Click Buy Now and follow the instructions to complete the payment.
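If you prefer to script this step, the console settings in this example map roughly onto the CreateFileSystem operation of the NAS API. The following is a minimal sketch, not part of the console procedure: it assumes that Alibaba Cloud CLI (aliyun) is installed and configured, and it only prints a candidate command for review instead of running it. The helper name print_create_fs_cmd is hypothetical, and the parameter names and values are assumptions that you should check against the current NAS API reference. The recycle bin, lifecycle management, and data backup settings, as well as the mount target in your VPC, are not covered by this call.

```shell
#!/bin/sh
# Sketch: print (not run) an aliyun CLI command that mirrors the console
# settings used in this example. Parameter names are assumptions based on
# the NAS CreateFileSystem operation; verify them before running the output.

# Hypothetical helper. $1 = region ID, $2 = zone ID
print_create_fs_cmd() {
  printf 'aliyun nas CreateFileSystem --region %s --ZoneId %s --FileSystemType standard --ChargeType PayAsYouGo --StorageType Capacity --ProtocolType NFS --EncryptType 0\n' "$1" "$2"
}

# China (Hangzhou) region, Hangzhou Zone G
print_create_fs_cmd cn-hangzhou cn-hangzhou-g
```

Printing the command first lets you compare each parameter with the table above before anything is created or billed.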
Step 2: Create a DSW instance
This step describes how to create a DSW instance in a shared resource group.
Log on to the PAI console.
On the Overview page, select a region in the top navigation bar.
In the left-side navigation pane, click Workspaces. On the Workspace page, click the name of the workspace.
In the left-side navigation pane of the workspace page, choose Model Training > Data Science Workshop (DSW) to go to the DSW page.
Click New Instance.
In the Configure Instance step, configure the key parameters described in the following table. Retain the default settings of other parameters. For more information, see Create a DSW instance.
Parameter: Description
Region and Zone: In this example, select China (Hangzhou).
Instance Name: You can customize the instance name. In this example, enter test_01.
Resource Type: In this example, select GPU Specifications and set the specification name to ecs.gn7i-c8g1.2xlarge.
Storage: In the Shared Datasets section, click Create Dataset. In the From Alibaba Cloud Create Dataset panel, configure the following parameters. Retain the default settings of other parameters. For more information about datasets, see Create and manage datasets.
  Select Data Storage: Select General-purpose NAS file system.
  Select File System: Select the file system whose name contains NAS. In this example, select the NAS file system created in the preceding step.
Select Image: In this example, select stable-diffusion-webui-develop:1.0.0-pytorch2.01-gpu-py310-cu117-ubuntu22.04 in Alibaba Cloud Image.
In the Confirm step, check the parameter configurations, read and select Machine Learning DSW Terms of Service, and then click Create Instance.
Creating a DSW instance requires about 10 minutes. After the DSW instance is created, it is in the Running state.
Step 3: Verify the mounting
Return to the DSW page and click Open in the Actions column of the created DSW instance.
On the DSW Instances page, click the Terminal tab in the top navigation bar. Then, follow the instructions to open the terminal.
On the Terminal page, run the following command to check whether the NAS dataset is mounted:
mount | grep nas
If the output contains an entry for your NAS file system, the dataset is mounted.
In the command output, /mnt/data is the mount path that you specified when you created the DSW instance. As long as your NAS file system runs properly, data and code stored in the mount path are persisted.
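The check above can also be scripted. The following is a minimal sketch that looks for an NFS entry at the mount path and then writes and reads back a small marker file. It assumes a Linux mount table at /proc/mounts, that the NAS dataset appears there with a file system type starting with nfs, and that the mount path is /mnt/data as in this example. The function names and the MOUNT_PATH variable are hypothetical.

```shell
#!/bin/sh
# Sketch: confirm that an NFS file system is mounted at a path and that
# the path is writable. Assumes mount table lines of the form
# "device mountpoint fstype options ...".

# $1 = mount path, $2 = mount table file (defaults to /proc/mounts)
is_nfs_mounted() {
  awk -v path="$1" '
    $2 == path && $3 ~ /^nfs/ { found = 1 }
    END { exit found ? 0 : 1 }
  ' "${2:-/proc/mounts}" 2>/dev/null
}

# Write a marker file, read it back, and remove it.
# $1 = directory to test
can_write_and_read() {
  marker="$1/.nas_write_test_$$"
  { echo "ok" > "$marker"; } 2>/dev/null || return 1
  content=$(cat "$marker")
  rm -f "$marker"
  [ "$content" = "ok" ]
}

# Assumed mount path from this example; override by setting MOUNT_PATH.
MOUNT_PATH="${MOUNT_PATH:-/mnt/data}"
if is_nfs_mounted "$MOUNT_PATH" && can_write_and_read "$MOUNT_PATH"; then
  echo "NAS dataset is mounted and writable at $MOUNT_PATH"
else
  echo "No writable NFS mount found at $MOUNT_PATH"
fi
```

If your instance uses a different mount path, set MOUNT_PATH before you run the script. To check persistence across restarts, you can create a file under the mount path, restart the DSW instance, and confirm that the file still exists.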
References
For more information about how DSW is billed, see Billing of DSW.
If you use NAS to store data, we recommend that you purchase resource plans to offset data storage fees. For more information, see Resource plan overview.
For more information about how to release a NAS file system, see Release a NAS file system.
NAS also supports data storage for Deep Learning Containers (DLC) and Elastic Algorithm Service (EAS). For more information, see Create and manage datasets or Mount storage to services (advanced).
For more information about the best practices of DSW, see General solutions that use DSW.