Use OSS Connector for AI/ML to access and store OSS data in PyTorch training jobs - Object Storage Service


Object Storage Service (OSS) Connector for AI/ML is a Python library that is used to efficiently access and store OSS data in PyTorch training jobs.

Benefits

Item	Without OSS Connector for AI/ML	With OSS Connector for AI/ML
Performance	You must manually optimize performance, which may be inefficient.	OSS Connector for AI/ML automatically optimizes the performance of OSS data download and checkpoint storage.
Data loading method	You must download data in advance, which increases costs and management workloads.	OSS Connector for AI/ML streams data directly from OSS during training, which reduces cost and management complexity.
Data access	You must read and write data by using adapters, which increases access complexity.	OSS Connector for AI/ML directly reads and writes data in OSS to simplify access.
Configuration difficulty	You must compile code, which makes configuration difficult.	OSS Connector for AI/ML provides simple configuration items to improve development efficiency.

How it works

The following figure shows how OSS Connector for AI/ML runs PyTorch training jobs by using data in OSS.

Feature description

The following table describes the main features of OSS Connector for AI/ML.

Item	Feature	Class	Methods
Map-style dataset	Supports random access, so that specific samples can be fetched quickly during training.	OssMapDataset	from_prefix(), from_objects(), from_manifest_file()
Iterable-style dataset	Supports sequential streaming reads, which lets you efficiently process large continuous data streams.	OssIterableDataset	from_prefix(), from_objects(), from_manifest_file()
Checkpoint API operations	Loads checkpoints from OSS during model training and periodically saves checkpoints to OSS, which simplifies the training workflow.	OssCheckpoint	OssCheckpoint(), reader(), writer()

The OssMapDataset and OssIterableDataset classes provide the same methods to build a dataset:
- from_prefix(): Builds a dataset from an OSS_URI prefix. This method is suitable for scenarios in which the storage paths of OSS data follow a uniform rule.
- from_objects(): Builds a dataset from a list of OSS_URIs. This method is suitable for scenarios in which the storage paths of OSS data are known but scattered.
- from_manifest_file(): Builds a dataset from a manifest file that you create. This method is suitable for scenarios in which the dataset contains a large number of files (for example, tens of millions), the dataset is loaded frequently, and data indexing is enabled for the bucket.

The OssCheckpoint class provides the following methods:
- OssCheckpoint(): Initializes an OssCheckpoint object that is used to read and write checkpoints during model training.
- reader(): Reads checkpoints from OSS.
- writer(): Writes checkpoints to OSS.
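
The difference between the two dataset styles can be illustrated with a small, library-free sketch. It does not use OSS Connector for AI/ML or PyTorch: FAKE_BUCKET, MapStyleView, and IterableStyleView are hypothetical names, and an in-memory dictionary stands in for a bucket. It only shows why a map-style dataset suits random access (it exposes a length and positional indexing), while an iterable-style dataset suits sequential streaming (it exposes iteration only).

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical in-memory stand-in for a bucket: object key -> object bytes.
FAKE_BUCKET: Dict[str, bytes] = {
    "train/0001.bin": b"sample-1",
    "train/0002.bin": b"sample-2",
    "train/0003.bin": b"sample-3",
    "val/0001.bin": b"held-out",
}


class MapStyleView:
    """Random access: supports len() and indexing, like a map-style dataset."""

    def __init__(self, store: Dict[str, bytes], prefix: str) -> None:
        self._store = store
        # Listing the keys up front is what makes positional indexing possible.
        self._keys: List[str] = sorted(k for k in store if k.startswith(prefix))

    def __len__(self) -> int:
        return len(self._keys)

    def __getitem__(self, index: int) -> Tuple[str, bytes]:
        key = self._keys[index]
        return key, self._store[key]


class IterableStyleView:
    """Sequential access: supports iteration only, like an iterable-style dataset."""

    def __init__(self, store: Dict[str, bytes], prefix: str) -> None:
        self._store = store
        self._prefix = prefix

    def __iter__(self) -> Iterator[Tuple[str, bytes]]:
        # Objects are yielded one at a time, in key order, without an index.
        for key in sorted(self._store):
            if key.startswith(self._prefix):
                yield key, self._store[key]


map_view = MapStyleView(FAKE_BUCKET, "train/")
stream_view = IterableStyleView(FAKE_BUCKET, "train/")

# Random access: jump straight to the last training sample.
last_key, last_body = map_view[len(map_view) - 1]

# Sequential access: consume the same prefix as a stream.
streamed_keys = [key for key, _ in stream_view]
```

In a real training job, a shuffling sampler typically needs the indexing that only the map-style form offers, whereas the iterable-style form avoids building the full index before the first sample is read.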

Procedure

Before you access and store data in OSS in a PyTorch training job, you must install and configure OSS Connector for AI/ML. For more information, see Install OSS Connector for AI/ML and Configure OSS Connector for AI/ML.
After you install and configure OSS Connector for AI/ML, you can perform the following operations in PyTorch training jobs:
- Use OssMapDataset to build a map-style dataset suitable for random reading. For more information, see Use data in OSS to build a map dataset suitable for random reading.
- Use OssIterableDataset to build an iterable-style dataset suitable for sequential streaming reading. For more information, see Use data in OSS to build an iterable dataset suitable for sequential streaming reading.
- Use OssCheckpoint to store and access checkpoints. For more information, see Store and access checkpoints in OSS.
Note: Data in map-style datasets, iterable-style datasets, and checkpoints is of the same type. For more information about the methods that this data type supports, see Data type in OSS Connector for AI/ML.
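
The operations listed above might be combined as in the following sketch. Only the class and method names (OssMapDataset, OssIterableDataset, OssCheckpoint, from_prefix(), reader(), writer()) come from this page. The package name osstorchconnector, the keyword arguments endpoint, cred_path, and config_path, and the use of reader() and writer() as context managers are assumptions that you should verify against the installation and configuration topics. The helper names connector_kwargs, build_datasets, and save_and_reload are hypothetical.

```python
from typing import Any, Dict, Tuple


def connector_kwargs(endpoint: str, cred_path: str, config_path: str) -> Dict[str, str]:
    """Collect the connection settings that are assumed to be shared by all classes."""
    return {"endpoint": endpoint, "cred_path": cred_path, "config_path": config_path}


def build_datasets(prefix_uri: str, **kwargs: str) -> Tuple[Any, Any]:
    """Build a map-style and an iterable-style dataset from the same OSS_URI prefix."""
    # Imported lazily so that this module loads even if the connector is not installed.
    # The package name below is an assumption.
    from osstorchconnector import OssIterableDataset, OssMapDataset

    map_dataset = OssMapDataset.from_prefix(prefix_uri, **kwargs)
    iterable_dataset = OssIterableDataset.from_prefix(prefix_uri, **kwargs)
    return map_dataset, iterable_dataset


def save_and_reload(model: Any, checkpoint_uri: str, **kwargs: str) -> Any:
    """Write a model's state dict to OSS, then read it back into the same model."""
    import torch
    from osstorchconnector import OssCheckpoint  # assumed package name

    checkpoint = OssCheckpoint(**kwargs)
    # writer() and reader() are assumed to return file-like context managers.
    with checkpoint.writer(checkpoint_uri) as writer:
        torch.save(model.state_dict(), writer)
    with checkpoint.reader(checkpoint_uri) as reader:
        model.load_state_dict(torch.load(reader))
    return model


# Placeholder values for illustration only; replace them with your own settings.
settings = connector_kwargs(
    endpoint="<your-oss-endpoint>",
    cred_path="<path-to-credentials-file>",
    config_path="<path-to-connector-config-file>",
)
```

With real settings in place, build_datasets("oss://<your-bucket>/<prefix>/", **settings) would return both dataset styles for the same prefix, and either one could then be passed to a PyTorch DataLoader.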

Use cases

If you want to quickly learn how to use OSS data to run a PyTorch training job and save the training results to OSS, we provide a demo that uses OSS Connector for AI/ML to train a handwritten digit recognition model. For more information, see Get started with OSS Connector for AI/ML.
To further improve the performance of OSS Connector for AI/ML, we recommend that you use the accelerated endpoint of an OSS accelerator instead of the OSS internal endpoint. For more information about the performance comparison between OSS Connector for AI/ML that uses an OSS internal endpoint and OSS Connector for AI/ML that uses the accelerated endpoint of an OSS accelerator, see Performance testing.
If you want to use OSS Connector for AI/ML in a containerized environment, you can use a Docker image that contains an OSS Connector for AI/ML environment. For more information about how to build a Docker image, see Build a Docker image that contains an OSS Connector for AI/ML environment.
