A Hybrid Neural Network for Sentence Classification

Renquan Zhou, Xiaoping Du
2017 Proceedings of The 7th International Conference on Computer Engineering and Networks — PoS(CENet2017)   unpublished
The sentence classification is the foundation of many Natural Language Processing applications. Prior neural network which use one type network for sentence classification can't use the abundant information in a sentence. In this paper, we proposed a hybrid neural network in combination with recurrent neural network and convolutional neural networks for sentence classification. The recurrent neural network can model long distance global information in a text, but it can't effectively extract
more » ... ectively extract the local information and convolutional neural network inversely. The proposed hybrid neural network takes full advantage of the advantages of these two networks while extracting global feature and local feature at the same time. In order to get the global feature, we also proposed three different methods to make use of hidden states generated by recurrent neural network. We conducted experiments on four public open datasets. The results show that our hybrid neural network does better than models by using the recurrent neural network or the convolutional neural networks alone, higher and completive classification accuracy is obtained. 1.Introduction With the development of deep learning, deep neural network has demonstrated great capability in various tasks of Natural Language Processing, such as text classification, machine translation and other fields [1] . The sentence classification is the foundation of many Natural Language Processing applications, such as question classification and sentiment analysis [2] . According to the number of words that each record contains, the text can be divided into sentence, paragraph and document. This study is based on sentence, which contains dozens of words. Kim [3] first applied the convolution neural network for text classification by using convolution filter to extract the local feature with different widths, then used the pooling layer (Max-pooling) to get text vector representation and proved that the convolution neural network can get considerable or better results than traditional feature engineering method. Kalchbrenner[4] proposed a convolutional neural network for text classification too while the pooling method was changed to dynamic K-Max poolingto extract multiple ordered features simultaneously and use some global information of the whole text. The recurrent neural network is widely used in the sequence-sequence language tasks. Bahdanau[5] applied the recurrent neural network to Machine Translation tasks. Bi-LSTM model was used to encode source language and decode target language to get better results than the feature-based Machine Translation methods. Zichao Yang[6] proposed the hierarchical attention network for document classification. The first layer used Bi-LSTM and attention mechanism to get sentence vector representation with processing word vectors one by one, the second layer used the same structure as the first layer to obtain document vector representation sentence by sentence too. Attention mechanism can focus on core words and sentences in a document. There are some studies about text classification by using the model with conjunction of recurrent neural network and convolutional neural network. Chunting Zhou [7] proposed a network called C-LSTM for text classification which used convolutional neural network for feature extraction firstly, and then used the LSTM network to generate text representation, the results were better than that by using convolutional neural network alone. Siwei Lai[8] used recurrent convolution neural network for text classification. The recurrent network was used to capture contextual information of words in text from left and right respectively as far as possible, and used the convolutional network for feature extraction to generate document representation. This model not only used the information above a word but used the information below. At present, most proposed model have used only one network for text classification, either recurrent neural network or convolutional neural network. Even if the combined model uses one network as the main feature extractor, the other is just an auxiliary. Intuitively, the recurrent neural network is more conducive to model the global information. Whereas the convolutional neural network has advantage for modeling local information, we can learn better representation for text with fully utilization of local information and global information together to improve the text classification results. This paper proposed a hybrid neural network by using both two type networks for sentence classfication. This paper contributes mainly from three aspects: 1) it proposes a hybird neural network in combination with CNN and Bi-LSTM for sentence calssfication; 2) it proposes three effective methods to get global feature from the hidden states
doi:10.22323/1.299.0057 fatcat:ddm5onutjrfylik7h2bf4mcg34