Improved phonotactic language recognition based on RNN feature reconstruction
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Phone recognition followed by support vector machine (PR-SVM) has been proposed for language recognition tasks and has shown encouraging results. However, it still suffers from problems such as the curse of dimensionality caused by the increasing order of the N-gram feature supervector and the rapidly growing number of possible parameters caused by exact matching of the phoneme history. These problems hamper the capability of the N-gram vector space model (VSM) of handling long-term
... xts. In this paper, a recurrent neural network (RNN) based feature reconstruction (FR) method is presented to compensate for the deficiency of the N-gram feature in phonotactic language recognition. Experiments are conducted on the 2009 National Institute of Standards and Technology language recognition evaluation (NIST LRE) database. The results show that, compared with the baseline system, the proposed method gives relative error rate reductions of 8.76%, 3.82%, and 11.93% for the 30 s, 10 s, and 3 s conditions, respectively.
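The dimensionality problem mentioned in the abstract can be illustrated with a small sketch. This is not the authors' system: the phone inventory, the decoded phone string, and the function name `ngram_supervector` are hypothetical, chosen only to show that an order-N supervector over V phones has V^N entries while a single utterance fills very few of them.

```python
from collections import Counter
from itertools import product


def ngram_supervector(phones, inventory, order):
    """Return the N-gram relative-frequency vector for one phone sequence.

    The vector has len(inventory) ** order entries, one per possible N-gram,
    in the order produced by itertools.product.
    """
    # Slide a window of length `order` over the decoded phone string.
    grams = [tuple(phones[i:i + order]) for i in range(len(phones) - order + 1)]
    counts = Counter(grams)
    total = max(len(grams), 1)  # avoid division by zero on very short input
    return [counts[g] / total for g in product(inventory, repeat=order)]


# Hypothetical toy inventory and phone string (assumptions, not from the paper).
inventory = ["a", "b", "k", "s"]
phones = ["k", "a", "b", "a", "s", "a", "k", "a"]

for order in (1, 2, 3):
    vec = ngram_supervector(phones, inventory, order)
    nonzero = sum(1 for v in vec if v > 0)
    print(f"order={order} dims={len(vec)} nonzero={nonzero}")
```

Even with four phones the vector grows from 4 to 64 entries between order 1 and order 3, and most entries stay zero. With a hypothetical inventory of 50 phones, a trigram supervector would already have 125,000 dimensions.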