Researchers are increasingly interested in applying deep learning models to natural language processing (NLP), focusing on the representation and learning of words, sentences, and articles, and on related applications. For example, Bengio et al. used a neural network model to obtain a new vector representation called a word embedding, or word vector [27]. This is a low-dimensional, dense, continuous vector that carries semantic and grammatical information about a word. At present, word vector representations underlie most neural network-based NLP methods.
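To make the idea of a dense, low-dimensional word vector concrete, here is a minimal sketch that compares words by cosine similarity. The four-dimensional vectors and the helper names (`EMBEDDINGS`, `cosine_similarity`, `most_similar`) are made up for illustration; real embeddings are learned from a corpus and typically have tens to hundreds of dimensions.

```python
import numpy as np

# Hypothetical toy embeddings, hand-written for illustration only.
EMBEDDINGS = {
    "king":  np.array([0.9, 0.8, 0.1, 0.0]),
    "queen": np.array([0.9, 0.7, 0.2, 0.1]),
    "apple": np.array([0.0, 0.1, 0.9, 0.8]),
    "pear":  np.array([0.1, 0.0, 0.8, 0.9]),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def most_similar(word, table=EMBEDDINGS):
    """Return the other word in the table whose vector is closest to `word`."""
    target = table[word]
    candidates = [w for w in table if w != word]
    return max(candidates, key=lambda w: cosine_similarity(target, table[w]))


print(most_similar("king"))   # nearest neighbour of "king" in the toy table
print(most_similar("apple"))  # nearest neighbour of "apple" in the toy table
```

Because related words point in similar directions, nearest-neighbour lookups of this kind are one simple way to probe the semantic information a word vector carries.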
Researchers have also designed DNN models to learn vector representations of sentences, including sentence models based on the recursive neural network, the recurrent neural network (RNN), and the convolutional neural network (CNN) [28-30]. Sentence representations have been applied to a large number of NLP tasks with notable results, such as machine translation [31, 32] and sentiment analysis [33, 34]. The representation and learning of articles remain relatively difficult and have received little research. One example is the work of Li and his team, who built article representations by encoding and decoding articles with a hierarchical RNN [35].
Given the language representation capability that CNNs and RNNs have shown in the NLP field in recent years, more researchers are applying deep learning methods to key tasks in the QA field, such as question classification, answer selection, and automatic response generation. In addition, the naturally annotated data [50] that internet users generate when exchanging information, such as microblog replies and community QA pairs, provides reliable training data for DNN models and largely solves the data shortage problem in QA research.
DNNs are gaining popularity in the world of machine translation. Researchers have designed various kinds of DNNs, such as deep stacking networks (DSNs), deep belief networks (DBNs), recurrent neural networks (RNNs), and convolutional neural networks (CNNs). In NLP, the primary aim of all DNNs is to learn syntactic and semantic representations of words, phrases, structures, and sentences so that they can recognize similar words (or phrases or structures).
CNN-based sentence modeling can be viewed as a "combination operator" with a local selection function. As the model gets deeper, each output representation covers a wider range of words in the sentence. After multiple layers, the model produces a sentence representation vector of fixed dimension. This process is functionally similar to the recurrent operation mechanism [33] of the "recursive autoencoder."
The sentence model formed by one layer of convolution followed by global max pooling is called a shallow convolutional neural network model. It is widely used for sentence-level classification in NLP, for example, sentence classification [36] and relation classification [37]. However, the shallow model can neither capture complicated local semantic relations in a sentence nor represent deeper levels of semantic composition. Global max pooling also discards the word order of the sentence. As a result, the shallow convolutional neural network model can only be used for local attribute matching between sentences. To handle the complex and diverse natural language found in questions and answers, QA matching models [38-40] usually use a deep convolutional neural network (DCNN) to model the question and answer sentences, and perform QA matching by passing the high-level semantic representations of the question and answer to a multilayer perceptron (MLP).
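The shallow model described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the architecture of any cited paper: the function name, the ReLU nonlinearity, the zero padding of short sentences, and the random weights are all assumptions made for the example.

```python
import numpy as np


def shallow_cnn_sentence_vector(word_vectors, filters, bias):
    """One convolution layer followed by global max pooling.

    word_vectors: (n_words, dim) array, one row per word
    filters:      (n_filters, window, dim) array
    bias:         (n_filters,) array
    returns:      (n_filters,) sentence vector, independent of n_words
    """
    n_words, dim = word_vectors.shape
    n_filters, window, _ = filters.shape
    if n_words < window:
        # Zero-pad short sentences so that at least one window exists.
        pad = np.zeros((window - n_words, dim))
        word_vectors = np.vstack([word_vectors, pad])
        n_words = window
    # Collect every run of `window` consecutive words: (n_windows, window, dim).
    windows = np.stack(
        [word_vectors[i:i + window] for i in range(n_words - window + 1)]
    )
    # Convolution: dot product of each filter with each window, then ReLU.
    responses = np.einsum("wkd,fkd->wf", windows, filters) + bias
    feature_maps = np.maximum(0.0, responses)
    # Global max pooling keeps the strongest response per filter and
    # forgets where in the sentence it occurred.
    return feature_maps.max(axis=0)


rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 2, 8))   # 4 filters over 2-word windows
bias = np.zeros(4)
short_sentence = rng.normal(size=(5, 8))
long_sentence = rng.normal(size=(12, 8))
print(shallow_cnn_sentence_vector(short_sentence, filters, bias).shape)
print(shallow_cnn_sentence_vector(long_sentence, filters, bias).shape)
```

Sentences of different lengths map to vectors of the same size, which is what makes the output usable by a classifier. The pooling step also shows why word order is lost: with one-word windows, reversing the sentence yields exactly the same vector.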
For more information, please go to QA Systems and Deep Learning Technologies – Part 2.
Nowadays, neural machine translation has dramatically improved translation fluency and can almost meet the high requirements for target-language readability.
Quantity is not the only indicator for data collection; quality matters as well. Neural network systems in particular place very high requirements on the quality of training data. Alibaba Translation rates the collected data with methods commonly used in academic circles, such as the IBM model and recurrent neural network-based (RNN-based) forced decoding. Data of different quality levels is used in different application scenarios.
Alibaba Machine Intelligence Technology (MIT) Lab mainly consists of four large teams: Speech Technologies, Natural Language Processing (comprising smaller groups such as machine translation, question answering, and sentiment analysis), Image/Video Content Analysis, and Deep Learning & Optimization.
All machine translation is the result of learning from data. Alibaba has also been using data selection methods from academic circles, including methods based on data source information, subject model-based, voice model-based, and convolutional neural network-based data selection.
Neural networks and deep learning technologies underpin most of the advanced intelligent applications today. In this article, Dr. Sun Fei (Danfeng), a high-level algorithms expert from Alibaba's Search Department, will provide a brief overview of the evolution of neural networks and discuss the latest approaches in the field. The article is primarily centered on the following five items:
AlexNet is a CNN developed in 2012 by Alex Krizhevsky, built from five convolutional layers and three fully connected layers, that won the ImageNet competition (ILSVRC). AlexNet demonstrated the effectiveness of CNNs for classification with a top-5 error rate of 15.3%, compared with roughly 25% for earlier image recognition approaches. The emergence of this network marked a milestone for deep learning applications in the computer vision field.
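As a rough illustration of how the five convolutional layers shrink an image, the sketch below traces feature-map sizes with the standard convolution output formula. The kernel sizes, strides, padding, channel counts, and the 227x227 input are the commonly cited AlexNet configuration and are assumptions here, since this article does not list them; the names `conv_output_size`, `LAYERS`, and `trace_shapes` are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, output channels; None means pooling)
# Assumed values from the commonly cited AlexNet configuration.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]


def trace_shapes(input_size=227, input_channels=3):
    """Return (layer name, height, width, channels) after each layer."""
    size, channels = input_size, input_channels
    shapes = []
    for name, kernel, stride, padding, out_channels in LAYERS:
        size = conv_output_size(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels  # pooling keeps the channel count
        shapes.append((name, size, size, channels))
    return shapes


for name, height, width, channels in trace_shapes():
    print(f"{name}: {height} x {width} x {channels}")
```

Under these assumptions, the feature map after the last pooling stage is flattened into a single vector before classification.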
AlexNet is also a common benchmark for deep learning frameworks. TensorFlow provides the alexnet_benchmark.py tool to test GPU and CPU performance. This document uses AlexNet as an example to illustrate how to run a GPU application in Alibaba Cloud Container Service easily and quickly.
This tutorial introduces text analysis technologies for machine learning. These include String Similarity Calculation, a basic operation in machine learning that is mainly used for information retrieval, natural language processing, bioinformatics, and similar fields, and Keyword Extraction, an important natural language processing technique.
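One common way to compute string similarity is the Levenshtein (edit) distance. The sketch below is a generic illustration and is not necessarily the method the tutorial or the platform uses; the function names and the normalization into a 0 to 1 score are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))
print(round(similarity("kitten", "sitting"), 3))
```

Keeping only two rows of the dynamic programming table holds memory use proportional to the shorter string, which matters when comparing many long strings.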
Alibaba Cloud Machine Learning Platform for AI provides an all-in-one machine learning service that requires little technical skill from users while delivering high-performance results.
Machine Translation is an online intelligent machine translation service developed by Alibaba. With the cutting-edge natural language processing technology and massive Internet data, Alibaba successfully launched its attention-based deep neural machine translation (NMT), helping users cross the language divide, share and access information smoothly, and achieve barrier-free communication.