To improve the execution efficiency of pipelines, Machine Learning Designer of Machine Learning Platform for AI (PAI) allows you to group multiple Alink nodes on the canvas and run them together as a single unit. Machine Learning Designer also provides the Alink intelligent aggregation feature, which automatically identifies the Alink nodes that can be grouped.
Background information
Alink is a new-generation machine learning algorithm framework and component library developed by the Alibaba Cloud PAI team based on Realtime Compute for Apache Flink. Machine Learning Designer provides the stream processing and batch processing components of Alink. These components can help you streamline machine learning workflows based on Flink. The workflows include data preprocessing, feature engineering, model training, and prediction.
Group multiple Alink nodes
Alink components can be used in the same way as the components of other frameworks. In addition, Machine Learning Designer allows you to place Alink nodes in a group and run all nodes in the group together. This feature is built on Flink, a high-performance in-memory data processing engine. To group multiple Alink nodes, perform the following steps:
- Select multiple Alink nodes on the canvas.
You can press Shift and click multiple Alink nodes. Alternatively, you can click the icon in the upper-left corner of the canvas and draw a box to select multiple Alink nodes.
- Right-click a blank area on the canvas and select Select Nodes into Alink from the shortcut menu that appears. On the canvas, Alink nodes that belong to one group are displayed in a dashed and rounded rectangle, as shown in the following figure.
You can click the icon in the upper-right corner of the rectangle and set the Workers and Memory per Worker Node parameters for the Alink group. The settings of the Alink group take priority over the settings of the individual Alink nodes in the group. The Alink nodes in the group are executed together, and the system does not write the intermediate data that is generated during execution to disk. This improves execution efficiency and resource utilization.
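The two behaviors described above, group settings overriding node settings and grouped nodes passing data along without storing intermediate results, can be illustrated with a minimal Python sketch. This is not how Machine Learning Designer or Alink is implemented. The `Node` and `Group` classes, the `effective_resources` and `run_group` functions, and the `workers` and `memory_mb` fields are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for one Alink node on the canvas."""
    name: str
    transform: Callable[[Iterator[dict]], Iterator[dict]]
    workers: int = 1
    memory_mb: int = 1024


@dataclass
class Group:
    """Hypothetical stand-in for an Alink group; None means 'not set'."""
    nodes: List[Node]
    workers: Optional[int] = None
    memory_mb: Optional[int] = None


def effective_resources(group: Group, node: Node) -> Tuple[int, int]:
    """Group-level settings, when present, override the node's own settings."""
    workers = group.workers if group.workers is not None else node.workers
    memory_mb = group.memory_mb if group.memory_mb is not None else node.memory_mb
    return workers, memory_mb


def run_group(group: Group, source: Iterable[dict]) -> List[dict]:
    """Chain every node's transform lazily; only the final output is materialized."""
    stream: Iterator[dict] = iter(source)
    for node in group.nodes:
        # Each transform wraps the previous iterator, so rows flow from node
        # to node in memory and no intermediate table is written out.
        stream = node.transform(stream)
    return list(stream)


def drop_missing(rows: Iterator[dict]) -> Iterator[dict]:
    """Illustrative preprocessing step: skip rows whose 'x' value is missing."""
    return (r for r in rows if r["x"] is not None)


def scale(rows: Iterator[dict]) -> Iterator[dict]:
    """Illustrative feature step: multiply 'x' by 10."""
    return ({**r, "x": r["x"] * 10} for r in rows)


group = Group(
    nodes=[
        Node("drop_missing", drop_missing, workers=2, memory_mb=2048),
        Node("scale", scale, workers=4, memory_mb=4096),
    ],
    workers=8,
    memory_mb=8192,
)

result = run_group(group, [{"x": 1}, {"x": None}, {"x": 3}])
resources = effective_resources(group, group.nodes[0])
```

In this sketch, `resources` resolves to the group values rather than the values set on the first node, and the rows pass through both transforms in a single in-memory chain.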
Alink intelligent aggregation
The Alink intelligent aggregation feature automatically identifies the Alink nodes on the canvas that can be grouped and groups them to reduce the overhead of transmitting intermediate data between nodes. This improves resource utilization and the execution efficiency of pipelines.
To enable the Alink intelligent aggregation feature for a pipeline in Machine Learning Designer, click the icon in the upper-left corner of the canvas.
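The rules that the intelligent aggregation feature uses to decide which nodes can be grouped are not described here. As a rough illustration of the idea only, the following Python sketch assumes that Alink nodes directly connected to each other in the pipeline graph can share a group, and finds such groups with a union-find structure. The `find_alink_groups` function, the `"alink"` framework label, and the node names are hypothetical. The sketch also ignores constraints that a real scheduler would need to check, such as dependencies that pass through non-Alink nodes.

```python
from typing import Dict, List, Set, Tuple


def find_alink_groups(
    frameworks: Dict[str, str],
    edges: List[Tuple[str, str]],
) -> List[Set[str]]:
    """Return sets of two or more Alink nodes that are directly connected.

    frameworks maps a node name to its framework label (assumed labels).
    edges lists (upstream, downstream) connections on the canvas.
    """
    # Only Alink nodes take part in grouping; each starts in its own set.
    parent = {name: name for name, fw in frameworks.items() if fw == "alink"}

    def find(name: str) -> str:
        # Follow parent links to the set representative, halving the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for src, dst in edges:
        # Merge two sets only when both ends of the edge are Alink nodes.
        if src in parent and dst in parent:
            parent[find(src)] = find(dst)

    groups: Dict[str, Set[str]] = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)

    # A single isolated Alink node gains nothing from grouping.
    return [members for members in groups.values() if len(members) > 1]


frameworks = {
    "read_table": "sql",
    "split": "alink",
    "scaler": "alink",
    "train": "alink",
    "evaluate": "other",
    "predict": "alink",
}
edges = [
    ("read_table", "split"),
    ("split", "scaler"),
    ("scaler", "train"),
    ("train", "evaluate"),
    ("evaluate", "predict"),
]

groups = find_alink_groups(frameworks, edges)
```

In this example, `split`, `scaler`, and `train` form one group because they are chained directly, while `predict` stays ungrouped because its only neighbor is a non-Alink node.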