Platform for AI (PAI) provides an all-in-one solution for machine learning, covering data labeling, model development, model training, and model deployment. This topic describes what PAI is.
What is PAI?
PAI is a one-stop machine learning platform for developers. With core modules such as Machine Learning Designer, Data Science Workshop (DSW), Deep Learning Containers (DLC), and Elastic Algorithm Service (EAS), PAI provides an all-in-one solution for machine learning, covering data labeling, model development, model training, and model deployment. PAI is flexible and easy to use, supports multiple open-source frameworks, and provides AI optimization capabilities.
Core modules
| Name | Description | Scenario | Example |
| --- | --- | --- | --- |
| Machine Learning Designer | Drag and drop components to build a visualized AI pipeline for model development. | Model code development, model training, workflow development, or workflow scheduling. | |
| Data Science Workshop (DSW) | A cloud-based IDE integrated with Notebook, VSCode, and Terminal. | Model code development and training. | |
| Deep Learning Containers (DLC) | A cloud-native AI training platform capable of handling large-scale distributed deep learning tasks. | Model training or code execution after coding, with distributed execution across machines. | |
| Elastic Algorithm Service (EAS) | Deploys models as online services, featuring elastic scaling, version management, and resource monitoring. | Deploy trained models as online services. | |
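As a concrete illustration of the EAS row above, an EAS online service is typically described by a JSON configuration. The following is a minimal sketch; the service name, processor type, OSS model path, and resource values are illustrative assumptions rather than values from this topic, so verify the exact fields against the EAS documentation before use.

```json
{
  "name": "demo_model_service",
  "processor": "tensorflow_cpu_1.15",
  "model_path": "oss://examplebucket/models/demo_model/",
  "metadata": {
    "instance": 2,
    "cpu": 1,
    "memory": 2000
  }
}
```

Here `instance` controls how many replicas serve traffic, which is what elastic scaling adjusts, while `processor` selects the runtime that loads the model at `model_path`.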
For a list of features, see Features of PAI.
Benefits
Full-lifecycle end-to-end service for AI research and development:
Supports data labeling, model development, model training, model optimization, model deployment, and AI O&M as a one-stop AI platform.
Provides more than 140 optimized built-in algorithm components.
Provides core capabilities such as multiple modes, deep integration with big data engines, multi-framework compatibility, and custom images.
Provides cloud-native AI development, training, and deployment services.
Support for multiple open-source frameworks:
The stream processing framework Flink.
Deeply optimized deep learning frameworks based on open-source versions, including TensorFlow, PyTorch, Megatron, and DeepSpeed.
The Parameter Server framework for parallel computing over trillions of feature samples.
Widely used open-source frameworks such as Spark, PySpark, and MapReduce.
Industry-leading AI optimization:
Provides a high-performance training framework for sparse training scenarios, supporting billions to tens of billions of sparse features, tens to hundreds of billions of samples, and distributed incremental training across thousands of workers.
Supports accelerating mainstream framework models such as ResNet-50 and Transformer language models (LMs) by using PAI Blade.
Diverse service modes:
Supports fully managed and semi-managed services on the public cloud.
Provides high-performance AI computing clusters and lightweight service modes.
Integration with DataWorks:
Allows you to process data by using SQL, user-defined functions (UDFs), user-defined aggregation functions (UDAFs), and MapReduce, for greater flexibility and efficiency.
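To sketch the kind of UDF logic this refers to: a MaxCompute Python UDF is a class with an `evaluate` method, registered with the `@annotate` decorator from `odps.udf`. The decorator is omitted below so the snippet runs standalone, and the class name and text-normalization logic are hypothetical examples, not part of PAI itself.

```python
# Hypothetical UDF: normalizes a raw text field before model training.
# On MaxCompute this class would carry the @annotate("string->string")
# decorator from odps.udf; it is omitted so the sketch runs standalone.
class NormalizeText:
    def evaluate(self, s):
        # Follow the SQL convention: NULL input yields NULL output.
        if s is None:
            return None
        # Collapse internal whitespace and lowercase the text.
        return " ".join(s.split()).lower()

udf = NormalizeText()
print(udf.evaluate("  Hello   PAI  "))  # hello pai
```

Once registered as a function, such a UDF can be called directly from SQL (for example, `SELECT normalize_text(title) FROM articles;`), which is what makes mixing SQL with custom code flexible.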
Supports using DataWorks to schedule tasks in either the staging or the production environment, keeping the data of the two environments isolated.