AIACC Graph Speeding (AIACC-AGSpeed, or AGSpeed) is an optimizing compiler for AI training developed by Alibaba Cloud. It optimizes the computing performance of PyTorch models on Alibaba Cloud GPU-accelerated compute-optimized instances. AGSpeed can be considered an improved version of the original AIACC. It is an independent product that applies computing optimizations without requiring changes to your code.
Introduction
AGSpeed is an in-house optimizing compiler for AI training developed by Alibaba Cloud. It has significant performance advantages in PyTorch model training scenarios. It consists of the following components.
Component | Description |
---|---|
Frontend | The AGSpeed frontend integrates a version of TorchDynamo that has been optimized by the AIACC training performance and acceleration team. It obtains the computational graph directly from the PyTorch eager API and passes the graph to the AGSpeed Backend Autotuner, without the need to modify your code. Autotuner automatically selects the optimal backend implementation for your use case. |
Backend | The AGSpeed backend integrates an in-house intermediate representation (IR) optimization pass plugin that is developed based on TorchScript IR. The plugin enables more fusion operations to improve performance. The backend also integrates an optimized version of NvFuser. Compared with the native NvFuser, the optimized NvFuser is more robust and provides better performance. |
Limits
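Dynamic tensor shapes reduce the effect of AGSpeed optimization. If your model uses dynamic tensor shapes, use agspeed.optimize() to optimize the static part of the model. The following sections describe the causes and suggestions.
Causes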
- Frontend: dynamic tensor shapes can cause TorchDynamo to repeatedly capture the computational graph and rerun the convert frame operation. This greatly reduces the benefit of the optimization.
- Backend: with dynamic tensor shapes, TorchScript repeats graph specialization and all optimization passes. NvFuser may also recompile a new kernel for each new tensor shape, which greatly reduces performance.
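The cost of repeated specialization can be illustrated with a toy model. The sketch below is not AGSpeed, TorchDynamo, TorchScript, or NvFuser code. ShapeSpecializingCompiler and count_compiles are hypothetical names, and the only behavior modeled is the assumption that one compilation is triggered per previously unseen input shape.

```python
from typing import Callable, Dict, List, Tuple

Shape = Tuple[int, ...]
Kernel = Callable[[List[float]], List[float]]


class ShapeSpecializingCompiler:
    """Toy stand-in (hypothetical) for a compiler that specializes per input shape."""

    def __init__(self) -> None:
        self._cache: Dict[Shape, Kernel] = {}
        self.compile_count = 0

    def _compile(self, shape: Shape) -> Kernel:
        # A real compiler would capture the graph, run optimization passes,
        # and generate a kernel here. The toy only counts the event.
        self.compile_count += 1
        return lambda values: [2.0 * v for v in values]

    def run(self, shape: Shape, values: List[float]) -> List[float]:
        # Reuse the cached kernel when the shape was seen before;
        # otherwise pay the compilation cost again.
        kernel = self._cache.get(shape)
        if kernel is None:
            kernel = self._compile(shape)
            self._cache[shape] = kernel
        return kernel(values)


def count_compiles(shapes: List[Shape]) -> int:
    """Run one step per shape and report how many compilations occurred."""
    compiler = ShapeSpecializingCompiler()
    for shape in shapes:
        numel = 1
        for dim in shape:
            numel *= dim
        compiler.run(shape, [1.0] * numel)
    return compiler.compile_count


static_shapes = [(8, 128)] * 100                      # same shape every step
dynamic_shapes = [(8, 100 + i) for i in range(100)]   # new shape every step

print("static :", count_compiles(static_shapes))
print("dynamic:", count_compiles(dynamic_shapes))
```

With a fixed shape the compilation cost is paid once and amortized over all steps. When every step brings a new shape, the toy compiles on every step, which mirrors the repeated capture and recompilation described above.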
Suggestions
Use agspeed.optimize() to optimize only the static part of the model. This avoids the preceding consequences. For example, the head of the model may produce dynamic shapes during computation and degrade performance. In this case, use agspeed.optimize() to optimize only the backbone of the model and leave the head unoptimized.
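A minimal sketch of this suggestion follows. The source names only agspeed.optimize(); that it accepts a module and returns an optimized callable is an assumption here, so check the call against your installed AGSpeed version. The backbone and head layers, their sizes, and the forward helper are hypothetical. The sketch falls back to an identity function when AGSpeed is not installed so that it still runs.

```python
import torch
from torch import nn

try:
    import agspeed  # available only where AGSpeed is installed
    # Assumption: optimize() takes a module and returns an optimized callable.
    optimize = agspeed.optimize
except ImportError:
    def optimize(module: nn.Module) -> nn.Module:
        # Fallback so the sketch runs without AGSpeed: no optimization applied.
        return module

# Static part: the input shape is always (batch, 16), so it is wrapped.
backbone = optimize(
    nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())
)

# Dynamic part: the number of rows it sees depends on the data, so it is
# left as a plain eager-mode module.
head = nn.Linear(32, 4)


def forward(x: torch.Tensor, keep: torch.Tensor) -> torch.Tensor:
    features = backbone(x)       # fixed shape: (batch, 32)
    selected = features[keep]    # row count varies from batch to batch
    return head(selected)


x = torch.randn(8, 16)
keep = torch.tensor([True, False, True, True, False, False, True, False])
out = forward(x, keep)
print(out.shape)
```

Only the tensors entering the wrapped backbone need a stable shape. The boolean-mask selection happens after the backbone, so the varying row count never reaches the optimized part.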
Contact us
If you need assistance with AIACC, join the Alibaba Cloud AIACC support group for external users on DingTalk (Group ID: 33617640).