Data Science Workshop (DSW) of Platform for AI (PAI) is a one-stop Integrated Development Environment (IDE) for AI tailored for algorithm developers. DSW integrates multiple development environments, such as Notebook, VSCode, and Terminal, for coding, debugging, and task running. DSW provides various heterogeneous computing resources and open-source images, and supports mounting Object Storage Service (OSS), File Storage NAS (NAS), and Cloud Parallel File Storage (CPFS) datasets. You can manage the lifecycle of DSW instances and develop in DSW easily and efficiently.
Features
Flexibility and ease of use
DSW provides built-in development environments, such as Notebook, VSCode, and Terminal, to meet various development requirements.
DSW provides images of multiple open-source frameworks such as PyTorch and TensorFlow, and supports custom images.
DSW provides various heterogeneous computing resources, including public resource groups, dedicated resource groups, and Lingjun resources. You can flexibly configure and manage resources in DSW.
In addition to Python, DSW supports writing and running R code and SQL statements.
One-stop service
DSW allows you to mount storage, such as OSS, NAS, and CPFS file systems, and to access MaxCompute data.
DSW provides Deep Learning Containers (DLC) and Elastic Algorithm Service (EAS) tools, which cover the full AI development pipeline: data processing, coding, debugging, model training, and model deployment.
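As an illustration of working with mounted storage, the following minimal sketch assumes that a mounted dataset appears inside the instance as an ordinary directory. The helper name list_dataset_files and the demo file layout are hypothetical, and a temporary directory stands in for the real mount path, which depends on how the dataset is configured.

```python
import tempfile
from pathlib import Path


def list_dataset_files(mount_path, suffix=".csv"):
    """Return the sorted paths of all files under mount_path that end in suffix.

    mount_path is a hypothetical stand-in for wherever the dataset is mounted.
    """
    root = Path(mount_path)
    if not root.is_dir():
        raise FileNotFoundError(f"Mount path not found: {root}")
    # rglob walks the directory tree recursively.
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())


# Demo: a temporary directory plays the role of a mounted dataset.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train").mkdir()
    (Path(tmp) / "train" / "part-0.csv").write_text("id,label\n1,0\n")
    (Path(tmp) / "README.txt").write_text("not a data file\n")
    found = [p.relative_to(tmp).as_posix() for p in list_dataset_files(tmp)]
    print(found)
```

Because the code depends only on a directory path, the same function should work regardless of which storage type backs the mount.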
Fine-grained management
DSW allows you to configure scheduled stop for an instance or auto stop for idle instances to reduce costs.
DSW monitors CPU, GPU, and memory usage in real time to help you analyze how an instance uses its resources.
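To make the idea of stopping idle instances concrete, the following sketch shows one way an idle check could be expressed over a series of usage samples. It is an illustration only, not the actual DSW policy: the UsageSample type, the is_idle function, the 5% thresholds, and the three-sample window are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class UsageSample:
    """One hypothetical monitoring sample, in percent utilization."""
    cpu_percent: float
    gpu_percent: float


def is_idle(samples, cpu_threshold=5.0, gpu_threshold=5.0, min_samples=3):
    """Return True if the latest min_samples samples are all below both thresholds.

    The thresholds and window size are illustrative assumptions.
    """
    if len(samples) < min_samples:
        return False  # Not enough history to decide.
    recent = samples[-min_samples:]
    return all(
        s.cpu_percent < cpu_threshold and s.gpu_percent < gpu_threshold
        for s in recent
    )


# Demo: a training burst followed by three quiet samples.
history = [
    UsageSample(cpu_percent=80.0, gpu_percent=95.0),
    UsageSample(cpu_percent=2.0, gpu_percent=0.0),
    UsageSample(cpu_percent=1.5, gpu_percent=0.0),
    UsageSample(cpu_percent=1.0, gpu_percent=0.0),
]
print(is_idle(history))
```

Requiring several consecutive low samples, rather than a single one, avoids treating a brief pause between cells as idleness.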
Scenario-based tutorials
DSW provides Notebook Gallery as a content platform for developers. You can use the tutorials in Notebook Gallery, which cover large language models (LLMs) and AI-generated content, to quickly get started with development.
Enterprise-class capabilities
In DSW, workspace administrators can allocate resources globally and configure resource reclaim policies.
Scenarios
Machine learning and data science
DSW supports the Notebook interactive programming environment and provides various images such as PyTorch and TensorFlow images. You can easily perform tasks such as data engineering, model development and training, and visual analysis without the need to perform resource O&M and environment configuration.
Generative AI and LLM
Notebook Gallery provides use cases and best practices for common scenarios, such as Stable Diffusion, Llama2, and Tongyi Qianwen, that you can open in Notebook. You can run the tutorials directly in DSW and build your own code on top of them.
AI and big data integration
In addition to Python and R languages, DSW supports big data integration. You can use the SQL File plug-in to query data from MaxCompute data sources by using SQL statements or connect to E-MapReduce clusters by using Notebook to submit Spark jobs.