By Qinqi
This article is part of a series introducing interesting tasks in the field of code intelligence. The series aims to give readers a deeper understanding of code intelligence by covering a brief introduction to each task, its history, and its current state.
The focus of this article is code search: using a natural language description to find the code that best matches the programmer's intent.
The task itself leans toward text understanding and semantic analysis. The main scenario it models is a developer looking for code that matches their intent, which should be familiar to every programmer. It is similar to searching for content with keywords or paragraphs on a search engine such as Baidu or Google. Code search is treated as broadly as possible here. We will discuss everything related to it, not only how to train a machine learning model to perform code search on a fixed dataset.
The most common scenario for code search is to search for common usages and practices when programmers are unfamiliar with a function or a tool. Generally, it is divided into several situations:
In these common scenarios, relevant documents are easy to find with the help of a powerful search engine. The first scenario typically leads to blogs, forums, or code exchange platforms such as Stack Overflow. The second leads to a documentation center, such as MDN, which is commonly used in front-end development. The third leads to the corresponding GitHub repository or official website.
In all of these cases, the code is found through the capabilities of a general search engine. After all, being proficient with Google is a necessary skill for programmers.
There are also websites dedicated to code search that aim to help programmers find the code they want.
These search methods are similar, and they are all based on a query expression syntax. For example, on GitHub the expression react stars:>100 searches for the keyword react and requires the repository to have more than 100 stars. This is similar to writing the corresponding SQL statement and executing it directly against a database. This type of syntax is highly flexible, and its parts can be freely combined into complex search expressions to meet a wide range of search needs.
However, this flexibility makes the syntax harder to learn. You must be familiar with it to find the target code quickly, and you must combine its parts flexibly and sometimes add logic (such as and/or/not), so the complexity rises sharply. In practice, such complex search expressions are rarely used.
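To make the idea of a qualifier-based expression concrete, here is a minimal sketch of how a query such as react stars:>100 could be split into keywords and a filter. It is not GitHub's implementation: the class QualifierSearch, the Repo type, and the in-memory repository list are all hypothetical names introduced for illustration, and only the stars:>N qualifier is handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a toy matcher for queries shaped like "react stars:>100". */
public class QualifierSearch {

    /** Hypothetical repository record; the fields are assumptions for this sketch. */
    static final class Repo {
        final String name;
        final int stars;

        Repo(String name, int stars) {
            this.name = name;
            this.stars = stars;
        }
    }

    /** Keeps repositories whose name contains every keyword and whose stars exceed the bound. */
    static List<Repo> search(String query, List<Repo> repos) {
        List<String> keywords = new ArrayList<>();
        int minStars = -1; // exclusive lower bound; -1 lets every repository through

        // Split the query on whitespace and separate qualifiers from plain keywords.
        for (String token : query.trim().split("\\s+")) {
            if (token.startsWith("stars:>")) {
                minStars = Integer.parseInt(token.substring("stars:>".length()));
            } else if (!token.isEmpty()) {
                keywords.add(token.toLowerCase());
            }
        }

        List<Repo> hits = new ArrayList<>();
        for (Repo repo : repos) {
            boolean matches = repo.stars > minStars;
            for (String keyword : keywords) {
                matches = matches && repo.name.toLowerCase().contains(keyword);
            }
            if (matches) {
                hits.add(repo);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample data, not real star counts.
        List<Repo> repos = List.of(
                new Repo("react", 200),
                new Repo("react-tiny-demo", 12),
                new Repo("vue", 300));
        for (Repo hit : search("react stars:>100", repos)) {
            System.out.println(hit.name + " (" + hit.stars + " stars)");
        }
    }
}
```

With the sample data above, only the entry named react passes both the keyword and the star filter. Supporting more qualifiers or and/or/not logic would need a real grammar, which is exactly where the learning cost described above comes from.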
Compared with this type of search, a search that can be completed through a natural language description is undoubtedly a better choice. It has a low learning cost and resembles ordinary conversation. For example, searching for show me the code about js string reverse quickly returns actual code for reversing a string. Since we have not found a service or website with good support for natural language code search, we use Google search as the example. Whether the query is a set of keywords or natural language, Google analyzes and processes it well, which makes it convenient for users to find the code they are looking for quickly.
In addition, Meta has implemented natural language code search models based on machine learning: NCS and UNIF. NCS takes an unsupervised approach and UNIF a supervised one, and together they explore how machine learning can improve natural language code search. Here is an example (the code comes from the Facebook blog):
Q: How to get the ActionBar height?
A:
public int getActionBarHeight() {
    return mActionBarHeight;
}

public static int getActionBarHeightPixel(Context context) {
    TypedValue tv = new TypedValue();
    if (context.getTheme().resolveAttribute(android.R.attr.actionBarSize, tv, true)) {
        return TypedValue.complexToDimensionPixelSize(tv.data,
                context.getResources().getDisplayMetrics());
    } else if (context.getTheme().resolveAttribute(R.attr.actionBarSize, tv, true)) {
        return TypedValue.complexToDimensionPixelSize(tv.data,
                context.getResources().getDisplayMetrics());
    } else {
        return 0;
    }
}
The relevant code is recommended based on the natural language description. Whether it can be used in production still has to be verified, and whether the retrieved code raises license problems also needs to be considered.
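To give a rough feel for how a natural language query can be matched against code, here is a deliberately simplified sketch. It is not NCS or UNIF: those models rely on learned embeddings, whereas this example substitutes plain token counts and cosine similarity. The class name TokenOverlapSearch, its methods, and the second candidate snippet are assumptions made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: ranks code snippets by token overlap with a natural language query. */
public class TokenOverlapSearch {

    /** Splits camelCase and punctuation, lowercases, and counts the resulting tokens. */
    static Map<String, Integer> tokenCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Insert a space at lower-to-upper boundaries, e.g. "ActionBar" -> "Action Bar".
        String spaced = text.replaceAll("([a-z0-9])([A-Z])", "$1 $2");
        for (String token : spaced.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Cosine similarity between two token-count vectors; 0 when either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        double normA = 0;
        double normB = 0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            normA += entry.getValue() * entry.getValue();
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }
        for (int value : b.values()) {
            normB += value * value;
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Returns the snippet with the highest similarity to the query, or null for an empty list. */
    static String bestMatch(String query, List<String> snippets) {
        Map<String, Integer> queryCounts = tokenCounts(query);
        String best = null;
        double bestScore = -1;
        for (String snippet : snippets) {
            double score = cosine(queryCounts, tokenCounts(snippet));
            if (score > bestScore) {
                bestScore = score;
                best = snippet;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> snippets = List.of(
                "public int getActionBarHeight() { return mActionBarHeight; }",
                "public String reverse(String s) { return new StringBuilder(s).reverse().toString(); }");
        System.out.println(bestMatch("How to get the ActionBar height?", snippets));
    }
}
```

Token overlap only works when the query and the code happen to share words. A learned model is meant to close that gap by mapping differently worded queries and code into a shared vector space.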
Code search is essential for checking documents, usage, and best practices in daily development, and the process of searching is itself a process of learning. Currently, Google still gives better results, but semantic code search is progressing rapidly, and its core is the ability to perform semantic analysis. Semantic analysis is also widely used in other fields, such as nl2sql (generating SQL from natural language), where doing data analysis with a single sentence is a popular scenario. In short, the future is promising, and I look forward to smarter and better code search tools.