By Queyue
With the development of deep learning, all areas of our lives are undergoing intelligent transformations. As the team positioned closest to users, the frontend also wants to use AI capabilities to improve our efficiency, reduce labor costs, and provide users with a better experience. Intelligent transformation is seen as an important area of growth for the future of frontend development. However, frontend engineers have the following doubts:
To resolve these doubts, we need a solution that lets AI improve frontend efficiency while minimizing the cost and difficulty of using machine learning. We therefore came up with the idea of developing a JavaScript (JS) framework that is friendly to frontend engineers. It would allow them to quickly collect and process data and run machine learning experiments without having to master advanced mathematics or deep learning theory. The framework must also be flexible, scalable, and ready for industrial use. To achieve these goals, we launched Pipcook, a frontend algorithm framework based on tfjs-node.
Through communication with frontend engineers and research, we discovered the main reasons that prevent frontend teams from entering the AI field:
A Problem Analysis Diagram
With the rapid development of AI, intelligence has empowered many industries, and we believe AI can be applied in some web scenarios. However, non-algorithm engineers often cannot identify the scenarios where machine learning is applicable. In addition, because they lack an in-depth understanding of models and algorithms, they are unsure how far deep learning can solve a problem and whether it performs better than a traditional rule engine. To solve this problem, we can use either of the following methods:
We know that data and models are the core elements of deep learning. If the model is a rocket engine, data is its fuel. Machine learning needs a large amount of high-quality fuel to allow it to realize its full potential. The frontend has accumulated some data over the years, and we also have advantages in data collection because we are the team closest to users.
The frontend possesses the following data:
The data can be classified into computer vision (CV) data and text data. CV and natural language processing (NLP) are also the focus of machine learning. However, frontend engineers often do not know how to process data so that it can be turned into fuel for their models. Our framework must provide fast and simple data processing, as well as convenient capabilities, such as data quality assessment and data visualization.
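As an illustration of what a simple data quality assessment could look like, the sketch below counts how many samples each label has and reports how unbalanced the classes are. The function name `assessLabelBalance` and the `{ input, label }` sample shape are assumptions made for this example; they are not part of Pipcook.

```javascript
// Hypothetical helper (not a Pipcook API): summarize the label distribution.
// `samples` is assumed to be an array of { input, label } objects.
function assessLabelBalance(samples) {
  const counts = new Map();
  for (const { label } of samples) {
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  const values = [...counts.values()];
  return {
    total: samples.length,
    counts: Object.fromEntries(counts),
    // Largest class size divided by smallest; 1 means perfectly balanced.
    imbalanceRatio:
      values.length > 0 ? Math.max(...values) / Math.min(...values) : null,
  };
}

const report = assessLabelBalance([
  { input: 'button-1.png', label: 'button' },
  { input: 'button-2.png', label: 'button' },
  { input: 'button-3.png', label: 'button' },
  { input: 'icon-1.png', label: 'icon' },
]);
console.log(report);
```

A high imbalance ratio is one hint that more samples should be collected for the smaller classes before training.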
For non-algorithm engineers, models and algorithms represent another huge obstacle. They always worry that they do not understand the mathematical principles of a model and do not know how to use deep learning frameworks, such as TensorFlow. This problem is both easy and difficult to solve.
It is easy to solve because experience has accumulated in the traditional deep learning fields over many years, and almost every field has its own popular, mature models that are ready for industrial use. We only need to provide implementations of these models in the framework. Non-algorithm engineers can then use the models without any configuration and without worrying about their internal implementation. However, the problem is also difficult to solve because some non-algorithm engineers find models too much like black boxes and want to adjust them slightly based on the algorithm knowledge they already have. Therefore, the framework must also provide ways to intervene in and adjust the models.
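One common way to combine "works without configuration" with "can still be adjusted" is to ship sensible defaults and accept optional overrides. The sketch below shows that pattern only; the option names and values are hypothetical and are not Pipcook's real settings.

```javascript
// Hypothetical defaults for a model plugin; names and values are
// illustrative assumptions, not Pipcook's actual options.
const DEFAULT_MODEL_OPTIONS = Object.freeze({
  epochs: 10,
  batchSize: 16,
  learningRate: 0.001,
});

// Merge user overrides into the defaults and reject unknown keys,
// so that a typo does not silently fall back to a default value.
function resolveModelOptions(overrides = {}) {
  for (const key of Object.keys(overrides)) {
    if (!Object.prototype.hasOwnProperty.call(DEFAULT_MODEL_OPTIONS, key)) {
      throw new Error(`Unknown model option: ${key}`);
    }
  }
  return { ...DEFAULT_MODEL_OPTIONS, ...overrides };
}

// Zero-configuration use: every value comes from the defaults.
const zeroConfig = resolveModelOptions();
// Targeted adjustment: only the learning rate changes.
const tuned = resolveModelOptions({ learningRate: 0.01 });
console.log(zeroConfig, tuned);
```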
JS vs. Python
The language problem is both simple and complex. The simple part is the choice of language: we use JS, the language frontend developers know best. We therefore wrote Pipcook entirely in TypeScript, exposed JS-based APIs, and implemented the data processing and model plugins on top of tfjs-node. The complex part is the ecosystem: JS-based machine learning is still developing, and we cannot expect it to match the richness of Python's ecosystem in a short time. A framework that only uses JS is therefore bound to be incomplete to some extent. Our solution is to bridge Python into Node.js, much as Swift offers Python interoperability, so that Python libraries can be called from Node.js to help the frontend team.
After solving the preceding problems, we know why we need to use machine learning, when it can be used, and how to use it. In addition, we have provided solutions for each problem from the perspective of frontend engineers. As Pipcook and the entire JS-based machine learning ecosystem gradually mature, we believe that frontend engineers will get better at using intelligent capabilities.
A Diagram of the Pipcook Architecture
After we solved the scenario, algorithm, data processing, and language problems, we designed a pipeline-based machine learning framework for the frontend, as shown in the preceding figure. Models and data flow through the pipeline as a stream. We can embed plugins in the pipeline to process models and data and pass them downstream. Each plugin is responsible for a specific task in the machine learning cycle. Pipcook defines a series of specifications that allow third-party developers to develop plugins that extend Pipcook's capabilities. Machine learning and training in the framework are based on TensorFlow.js, and the Python ecosystem is also available through Python bridging. The following sections introduce several key parts of the framework.
A Sequence Diagram
Pipcook is a pipeline-based framework that covers data collection, data access, data processing, model configuration, model training, model service deployment, and online training. A specific plugin is responsible for each stage. Plugins allow you to customize each stage, and pipelines allow you to connect plugins in series to implement algorithm engineering. The whole process runs on Node.js, and the plugins are managed and maintained through npm. The plugins for data processing and model service deployment can be deeply integrated with the existing frontend technical system.
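The idea of connecting plugins in series can be sketched in a few lines of plain JS. This is a minimal illustration under stated assumptions, not Pipcook's actual plugin API: the plugin shape `{ name, run(context) }`, the `runPipeline` function, and the toy lookup-table "model" are all hypothetical.

```javascript
// Hypothetical plugin shape: { name, run(context) => newContext }.
// Each plugin receives the output of the previous one.
function runPipeline(plugins, initialContext = {}) {
  return plugins.reduce((context, plugin) => {
    const next = plugin.run(context);
    // Record which plugins have run so the flow is easy to inspect.
    return { ...next, history: [...(context.history || []), plugin.name] };
  }, initialContext);
}

const collect = {
  name: 'collect',
  run: (ctx) => ({
    ...ctx,
    samples: [
      { text: ' Buy Now ', label: 'button' },
      { text: 'Home', label: 'link' },
    ],
  }),
};

const processData = {
  name: 'process',
  // Normalize the raw text before it reaches the model stage.
  run: (ctx) => ({
    ...ctx,
    samples: ctx.samples.map((s) => ({
      ...s,
      text: s.text.trim().toLowerCase(),
    })),
  }),
};

const train = {
  name: 'train',
  // Stand-in for real training: memorize the samples in a lookup table.
  run: (ctx) => {
    const table = new Map(ctx.samples.map((s) => [s.text, s.label]));
    return {
      ...ctx,
      model: { predict: (text) => table.get(text) || 'unknown' },
    };
  },
};

const result = runPipeline([collect, processData, train]);
console.log(result.history, result.model.predict('buy now'));
```

Because every plugin only depends on the context it receives, any stage can be swapped for a third-party implementation without touching the others.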
Pipcook defines a set of dataset specifications. This avoids the data access and usage costs that arise when plugins for data collection, access, and processing each follow a different dataset standard, and it ensures that data can be shared between pipelines. Through these protocols, the plugins can produce standard, unified datasets from the output of different labeling tools. The data processing plugin makes it easier to understand and optimize datasets.
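To show why a shared dataset format helps, the sketch below converts the output of two imaginary labeling tools into one unified structure. The format `{ labels, samples: [{ uri, labelIndex }] }` and all function names are assumptions made for this example; Pipcook's real specification may look different.

```javascript
// Hypothetical unified format (not Pipcook's actual specification):
// { labels: string[], samples: [{ uri, labelIndex }] }
function toDataset(pairs) {
  // Sort the label names so the same labels always get the same indexes.
  const labels = [...new Set(pairs.map(([, label]) => label))].sort();
  const samples = pairs.map(([uri, label]) => ({
    uri,
    labelIndex: labels.indexOf(label),
  }));
  return { labels, samples };
}

// Adapter for an imaginary tool that exports rows such as "a.png,button".
function fromCsv(text) {
  const pairs = text
    .trim()
    .split('\n')
    .map((line) => line.split(',').map((cell) => cell.trim()));
  return toDataset(pairs);
}

// Adapter for an imaginary tool that exports [{ file, category }] records.
function fromRecords(records) {
  return toDataset(records.map(({ file, category }) => [file, category]));
}

const csvDataset = fromCsv('a.png,button\nb.png,icon');
const recordDataset = fromRecords([
  { file: 'a.png', category: 'button' },
  { file: 'b.png', category: 'icon' },
]);
console.log(csvDataset, recordDataset);
```

Both adapters yield the same structure, so a downstream data processing or training plugin only has to understand one format.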
The underlying models and algorithm capabilities of Pipcook are provided by tfjs-node, the Node.js version of the well-known machine learning framework TensorFlow. tfjs-node makes it much easier to use JS for machine learning, and our JS-based machine learning platform can use it directly. For example, we can use mature official models (such as MobileNets), build a new model from basic operators, or use its tensor capabilities to make up for the fact that the JS platform has nothing similar to NumPy.
As a brand new JS-based machine learning platform that has only been open source for a short time, Pipcook still has many imperfections. To push the whole frontend industry towards intelligent development, we will work to continually optimize Pipcook.
Currently, Pipcook's built-in plugins support a pipeline for image classification and object detection, and the pipeline for object detection uses Python capabilities. In the future, we hope to develop models based on the native tfjs-node to expand the JS-based machine learning ecosystem. In addition, Pipcook will continue to provide more plugins to support popular deep learning tasks, such as NLP and image segmentation. We also welcome third-party developers to contribute to these models.
As data volumes and model complexity increase, our computing power may prove insufficient. In the future, we will train models on multiple devices, support parallel, distributed, and asynchronous training, and use clusters to solve computing power problems.
Currently, Pipcook only supports simple solutions, such as local deployment. In the future, Pipcook will cooperate with cloud service providers, such as Alibaba Cloud, AWS, and Google Cloud, so that models can be deployed to their machine learning deployment services as a step in the pipeline. This will allow you to start using prediction services as soon as training is completed.
In the future, we hope to combine the power of Alibaba's intelligent frontend team and the entire open-source community to continuously optimize Pipcook and the push for intelligent frontend capabilities it represents. This way, we can provide inclusive technical solutions for intelligent frontend capabilities, accumulate more competitive samples and models, provide intelligent code generation services with higher accuracy and availability, and improve frontend R&D efficiency. In addition, frontend engineers will no longer have to do simple and repetitive work, giving them more time to focus on challenging work.