By Tianke
During frontend development, icons that appear in design images need to be reproduced in code. In most cases, these icons do not come with a corresponding type
field. If users have to pick out the one they need from hundreds of icons by eye, the experience is poor.
Therefore, last year, I submitted a pull request to the Ant Design open-source project. The pull request adds a new feature, searching icons by screenshot, based on deep learning. When users upload an icon screenshot from a design or any other image by clicking, dragging, or pasting, the search returns the best-matching icons along with their matching rates. Note that all recognition is done in the frontend!
The following figure shows the effect:
You can also try it on the official website: https://ant.design/components/icon/
How is this feature implemented? This article explains the technology behind it.
As described in the preceding section, this feature is implemented based on deep learning. What is deep learning? Deep learning is a type of machine learning. Machine learning is the study of computer algorithms that automatically improve based on "experience."
The keyword here is experience. Humans have long addressed problems based on their experience. For example, as early as medieval times, someone estimated the average foot length of all men by measuring the average foot length of 16 men.
Here is another example. If you are given a lot of height and weight data as well as the height of a single person, can you estimate the person's weight?
Of course you can! You can calculate the values of a and b in the formula y = ax + b shown in the preceding figure, and then calculate a person's weight from his or her height. In machine learning, a is called the weight, and b is called the bias. This technique is called linear regression.
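As a minimal sketch of the idea, the following JavaScript fits y = ax + b with ordinary least squares. The function name `fitLine` and the height and weight numbers are made up for illustration; they do not come from the article's figure.

```javascript
// Fit y = a * x + b to paired samples with ordinary least squares.
function fitLine(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error('Need at least two (x, y) pairs of equal length');
  }
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;
  let covXY = 0;
  let varX = 0;
  for (let i = 0; i < n; i += 1) {
    covXY += (xs[i] - meanX) * (ys[i] - meanY);
    varX += (xs[i] - meanX) ** 2;
  }
  if (varX === 0) {
    throw new Error('All x values are identical, so the slope is undefined');
  }
  const a = covXY / varX; // the "weight" (slope)
  const b = meanY - a * meanX; // the "bias" (intercept)
  return { a, b };
}

// Illustrative data only: heights in cm, body weights in kg.
const heights = [150, 160, 170, 180, 190];
const weights = [50, 56, 62, 68, 74];
const { a, b } = fitLine(heights, weights);

// Estimate the body weight of a person from his or her height.
const predictWeight = (height) => a * height + b;
console.log(`a = ${a}, b = ${b}, estimate for 175 cm = ${predictWeight(175)}`);
```

With this sample data the fitted line is approximately y = 0.6x - 40, so a height of 175 cm gives an estimate of about 65 kg.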
A computer can learn patterns in numbers. If we convert images, audio, or text into numbers, can a computer recognize the patterns in them too? Of course it can! However, the model is much more complicated.
We use a deep learning model called a convolutional neural network (CNN) to classify icon screenshots.
Whether we are talking about simple linear regression or complicated deep learning, the model learns from "experience." The "experience" here is called "samples" in machine learning. Therefore, we must first generate samples for machine learning.
In this icon classification task, the samples consist of two parts: images and labels.
The labels are the category names of the images. For example, if you want to tell whether an image shows a cat or a dog, then "cat" and "dog" are the labels.
Studies show that the more samples you provide, the better a deep learning model tends to learn. Therefore, we adopted a sample-page approach that uses Puppeteer with FaaS to quickly generate tens of thousands of icon images and their corresponding labels. How do we achieve it?
As a result, a large number of samples are available.
Once samples are available, you can start to train the model. We use the TensorFlow framework, which provides an image classification example that you can download. When you run it, set its parameters to point at the samples we just generated. https://github.com/tensorflow/hub/tree/master/examples/image_retraining
You can train the model on your PC. Training is slow, but it can finish during a lunch break!
Alibaba Cloud Machine Learning Platform for AI (PAI) provides ready-made image classification algorithms and GPUs for accelerated training. I did not use the image classification algorithms on PAI. Instead, I deployed the TensorFlow code to PAI for model training. It's remarkably fast!
After the model is trained, it can be used to recognize images. However, the Python code must be deployed on a server before you can use the model. This has the following disadvantages:
In view of these disadvantages, we convert our model into a TensorFlow.js model and let users download it to their browsers for recognition. This brings the following benefits:
Both model conversion and compression use the tfjs-converter: https://github.com/tensorflow/tfjs/tree/master/tfjs-converter
We use MobileNet for transfer learning. The resulting model is 16 MB. It is compressed to about 3 MB and published to jsDelivr for global acceleration.
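The article does not say which compression option was used. The tfjs-converter offers weight quantization, so the sketch below assumes that technique and shows in plain JavaScript why storing each 32-bit float weight as one byte shrinks the weights by about four times. The functions `quantizeUint8` and `dequantize` are hypothetical illustrations, not the converter's implementation.

```javascript
// Map float weights linearly onto the integers 0..255 (one byte per weight).
function quantizeUint8(weights) {
  let min = Infinity;
  let max = -Infinity;
  for (const w of weights) {
    if (w < min) min = w;
    if (w > max) max = w;
  }
  // Guard against a zero range when all weights are equal.
  const scale = max > min ? (max - min) / 255 : 1;
  const quantized = Uint8Array.from(weights, (w) => Math.round((w - min) / scale));
  return { quantized, min, scale };
}

// Recover approximate float weights from the quantized bytes.
function dequantize({ quantized, min, scale }) {
  return Float32Array.from(quantized, (q) => q * scale + min);
}

// Illustrative weights only.
const original = Float32Array.from([-0.5, -0.1, 0, 0.25, 0.5]);
const packed = quantizeUint8(original);
const restored = dequantize(packed);

console.log(original.byteLength, packed.quantized.byteLength); // 20 5
console.log(Array.from(restored));
```

The restored values differ from the originals by at most about half a quantization step, which is the accuracy traded for the smaller download.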
Now, you need to write some TensorFlow.js code. First, load the model file. The code snippet is shown below:
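import * as tf from '@tensorflow/tfjs';
import * as tfconv from '@tensorflow/tfjs-converter';
const MODEL_PATH = 'https://cdn.jsdelivr.net/gh/lewis617/antd-icon-classifier@0.0.1/model/model.json';
const model = await tfconv.loadGraphModel(MODEL_PATH);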
Next, convert icon screenshots into tensors.
A tensor is a type of data structure that is similar to a multidimensional array. In TensorFlow, the inputs and outputs of a model are tensors. Therefore, data must be converted into tensors before training and recognition.
// Convert images into tensors
const img = tf.browser.fromPixels(imgEl).toFloat();
const offset = tf.scalar(127.5);
// Normalize an image from [0, 255] to [-1, 1]
const normalized = img.sub(offset).div(offset);
// Change the image size
let resized = normalized;
if (img.shape[0] !== IMAGE_SIZE || img.shape[1] !== IMAGE_SIZE) {
  const alignCorners = true;
  resized = tf.image.resizeBilinear(
    normalized, [IMAGE_SIZE, IMAGE_SIZE], alignCorners,
  );
}
// Change the shape of a tensor to meet the model requirements
const batched = resized.reshape([-1, IMAGE_SIZE, IMAGE_SIZE, 3]);
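To make the arithmetic in this step concrete, here is a small sketch in plain JavaScript, without TensorFlow.js, of what the normalization and the -1 in the reshape call compute. The helper names are hypothetical, and the image size of 224 is an assumption (a common MobileNet input size); the snippet above does not show the value of IMAGE_SIZE.

```javascript
// Map 8-bit channel values from [0, 255] to [-1, 1],
// mirroring img.sub(127.5).div(127.5) in the snippet above.
function normalizePixels(values) {
  const offset = 127.5;
  return Float32Array.from(values, (v) => (v - offset) / offset);
}

// A -1 in a reshape target means "infer this dimension":
// it becomes the total number of values divided by the known dimensions.
function inferBatchSize(totalValues, imageSize, channels = 3) {
  const perImage = imageSize * imageSize * channels;
  if (totalValues % perImage !== 0) {
    throw new Error('Value count does not fit the requested shape');
  }
  return totalValues / perImage;
}

const normalizedPixels = normalizePixels([0, 127.5, 255]);
console.log(Array.from(normalizedPixels)); // [-1, 0, 1]

const ASSUMED_IMAGE_SIZE = 224; // assumption, not taken from the article
const batchSize = inferBatchSize(
  ASSUMED_IMAGE_SIZE * ASSUMED_IMAGE_SIZE * 3,
  ASSUMED_IMAGE_SIZE,
);
console.log(batchSize); // 1, so one screenshot becomes a batch of one image
```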
Then, start to recognize images. The code snippet is shown below:
const pred = model.predict(batched).squeeze().arraySync();
// Find the categories with the highest matching degree
const predictions = findIndicesOfMax(pred, 5).map(i => ({
  className: ICON_CLASSES[i],
  score: pred[i],
}));
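The snippet above calls findIndicesOfMax without showing its definition. One possible implementation is sketched below; it is an assumption about the helper's behavior (return the indices of the highest scores, best first) and may differ from the author's version. The class names and scores are placeholders.

```javascript
// Hypothetical helper: return the indices of the `count` largest values,
// ordered from the highest score to the lowest.
function findIndicesOfMax(values, count) {
  return Array.from(values, (value, index) => ({ value, index }))
    .sort((p, q) => q.value - p.value)
    .slice(0, count)
    .map((entry) => entry.index);
}

// Placeholder labels and scores, for illustration only.
const sampleClasses = ['check', 'close', 'search', 'user'];
const sampleScores = [0.05, 0.7, 0.2, 0.05];

const topMatches = findIndicesOfMax(sampleScores, 2).map((i) => ({
  className: sampleClasses[i],
  score: sampleScores[i],
}));
console.log(topMatches);
// [ { className: 'close', score: 0.7 }, { className: 'search', score: 0.2 } ]
```

Sorting every score is simple and fast enough for a few hundred icon classes. For a much larger number of classes, a partial selection would avoid sorting the whole array.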
Then, the final result is shown!
The complete code is available on GitHub.