Investigation of Senone-based Long-Short Term Memory RNNs for Spoken Language Recognition

Yao Tian, Liang He, Yi Liu, Jia Liu
2016 Odyssey 2016  
Recently, the integration of deep neural networks (DNNs) trained to predict senone posteriors with conventional language modeling methods has been proved effective for spoken language recognition. This work extends some of the senone-based DNN frameworks by replacing the DNN with the LSTM RNN. Two of these approaches use the LSTM RNN to generate features. The features are extracted from the recurrent projection layer in the LSTM RNN either as frame-level acoustic features or utterance-level
more » ... ures and are then processed in different ways to produce scores for each target language. In the third approach, the conventional i-vector model is modified to use the LSTM RNN to produce frame alignments for sufficient statistics extraction. Experiments on the NIST LRE 2015 demonstrate the effectiveness of the proposed methods.
doi:10.21437/odyssey.2016-13 dblp:conf/odyssey/TianHLL16 fatcat:oaaqran3lncjnpxgzdykphea34