Efficient data loading and processing are crucial to model training in machine learning (ML) and deep learning. This topic compares the performance of loading data from an OssIterableDataset, an OssMapDataset, and a dataset created by using ossfs together with ImageFolder, both over the internal endpoint and with the OSS accelerator enabled. You can use the test results in this topic to optimize your data access.
Test description
Test scenarios: Measure the throughput of reading data from each dataset over the internal endpoint and over the accelerated endpoint.
Test data: Approximately 1 TB of data consisting of 10,000,000 images with an average size of 100 KB.
Test environment: A network-enhanced general-purpose g7nex Elastic Compute Service (ECS) instance with 128 vCPUs, 512 GB of memory, and 160 Gbit/s of internal bandwidth.
Datasets: The OssIterableDataset and OssMapDataset datasets are created by using OSS Connector for AI/ML. The ImageFolder dataset is created on a bucket mounted by using ossfs.
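For reference, creating the two connector datasets looks roughly like the following sketch. It assumes the osstorchconnector package from OSS Connector for AI/ML and its from_prefix constructors; the bucket URI, endpoint, and credential/config paths are placeholders that you must replace with your own values, and the pass-through transform stands in for the transforms used in the tests below.

```python
from osstorchconnector import OssIterableDataset, OssMapDataset
from torch.utils.data import DataLoader

# Placeholders: replace with your own bucket, region, and local paths.
OSS_URI = "oss://example-bucket/images/"
ENDPOINT = "oss-cn-beijing-internal.aliyuncs.com"  # or the accelerated endpoint
CONFIG_PATH = "/etc/oss-connector/config.json"
CRED_PATH = "/root/.alibabacloud/credentials"

def transform(object):
    # Stand-in for the preprocessing described in the test parameters.
    return object.read(), object.label

iterable_dataset = OssIterableDataset.from_prefix(
    OSS_URI, endpoint=ENDPOINT, transform=transform,
    cred_path=CRED_PATH, config_path=CONFIG_PATH,
)
map_dataset = OssMapDataset.from_prefix(
    OSS_URI, endpoint=ENDPOINT, transform=transform,
    cred_path=CRED_PATH, config_path=CONFIG_PATH,
)

# The tests below use the same DataLoader settings for both datasets.
loader = DataLoader(iterable_dataset, batch_size=256, num_workers=32)
```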
Performance tests
Test parameters
| Parameter | Value/Operation | Description |
| --- | --- | --- |
| dataloader batch size | 256 | Each batch processes 256 samples. |
| dataloader workers | 32 | Data is loaded in parallel by 32 worker processes. |
| transform | See the following code. | Each image is decoded, resized, center-cropped, converted to a tensor, and normalized. |

The transform used in this test:

```python
import io

from PIL import Image
from torchvision import transforms

trans = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def transform(object):
    img = Image.open(io.BytesIO(object.read())).convert('RGB')
    val = trans(img)
    return val, object.label
```
Test results
| Dataset created by using | Dataset type | Internal endpoint | OSS accelerator for data preloading + accelerated endpoint |
| --- | --- | --- | --- |
| OSS Connector for AI/ML | OssIterableDataset | 4582 img/s | 4744 img/s |
| OSS Connector for AI/ML | OssMapDataset | 4010 img/s | 4370 img/s |
| ossfs | ImageFolder | 56 img/s | 251 img/s |
Optimal performance tests
Test parameters
| Parameter | Value/Operation | Description |
| --- | --- | --- |
| dataloader batch size | 256 | Each batch processes 256 samples. |
| dataloader workers | 32 | Data is loaded in parallel by 32 worker processes. |
| transform | See the following code. | Data is read but not preprocessed, which measures pure read throughput. |

The transform used in this test:

```python
def transform(object):
    data = object.read()  # read the object body without decoding it
    return object.key, object.label
```
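The read-only transform can be exercised locally with a stand-in object. The MockOssObject class below is hypothetical: it only mimics the read()/key/label interface that objects from OSS Connector for AI/ML expose to the transform.

```python
import io

class MockOssObject:
    """Hypothetical stand-in for an object yielded by an OSS Connector dataset.

    Real objects come from OssIterableDataset or OssMapDataset; this mock
    only mimics the read()/key/label interface used by the transform.
    """
    def __init__(self, key, label, data):
        self.key = key
        self.label = label
        self._buf = io.BytesIO(data)

    def read(self):
        return self._buf.read()

def transform(object):
    data = object.read()  # read the object body without decoding it
    return object.key, object.label

obj = MockOssObject("images/cat/0001.jpg", 3, b"\x00" * 1024)
print(transform(obj))  # ('images/cat/0001.jpg', 3)
```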
Test results
| Dataset created by using | Dataset type | Internal endpoint | OSS accelerator for data preloading + accelerated endpoint |
| --- | --- | --- | --- |
| OSS Connector for AI/ML | OssIterableDataset | 99920 img/s | 123043 img/s |
| OSS Connector for AI/ML | OssMapDataset | 56564 img/s | 78264 img/s |
Performance data analysis
When data is read with preprocessing enabled, the OssIterableDataset and OssMapDataset datasets offer significantly better performance than the dataset created by using ossfs and ImageFolder: approximately 80 times higher throughput when the OSS accelerator is not used (4582 img/s versus 56 img/s) and approximately 18 times higher when the OSS accelerator is used (4744 img/s versus 251 img/s). The tests show that OSS Connector for AI/ML can significantly improve data processing speed and model training efficiency.
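The ratios follow directly from the first results table; a quick sketch of the arithmetic, using the throughput figures from this topic:

```python
# Throughput (img/s) from the test with preprocessing enabled
internal = {"OssIterableDataset": 4582, "OssMapDataset": 4010, "ossfs + ImageFolder": 56}
accelerated = {"OssIterableDataset": 4744, "OssMapDataset": 4370, "ossfs + ImageFolder": 251}

# Speedup of the connector datasets over ossfs + ImageFolder
without_accel = internal["OssIterableDataset"] / internal["ossfs + ImageFolder"]
with_accel = accelerated["OssIterableDataset"] / accelerated["ossfs + ImageFolder"]
print(f"{without_accel:.0f}x")  # 82x, roughly the 80x cited above
print(f"{with_accel:.0f}x")     # 19x, roughly the 18x cited above
```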
Reading data from the OssIterableDataset and OssMapDataset datasets with the OSS accelerator enabled is approximately 1.2 to 1.4 times faster than reading with the OSS accelerator disabled (123043 img/s versus 99920 img/s, and 78264 img/s versus 56564 img/s, in the tests without preprocessing). Even with the OSS accelerator disabled, OSS Connector for AI/ML can handle highly concurrent access at a high bandwidth level; combining it with the OSS accelerator provides even more powerful performance.
Conclusion
You can use OSS Connector for AI/ML in your Python code to access OSS objects as streams. OSS Connector for AI/ML increases data read speeds and is suitable for most model training scenarios. If you need even higher data processing performance for model training, use OSS Connector for AI/ML together with the OSS accelerator.