Bidirectional LSTM-RNN for Improving Automated Assessment of Non-Native Children's Speech

Yao Qian, Keelan Evanini, Xinhao Wang, Chong Min Lee, Matthew Mulholland
2017, Interspeech 2017
Recent advances in ASR and spoken language processing have led to improved systems for automated assessment of spoken language. However, it is still challenging for automated scoring systems to achieve high agreement with human experts when applied to non-native children's spontaneous speech. The subpar performance is mainly caused by the relatively low recognition rate on non-native children's speech. In this paper, we investigate different neural network architectures for improving non-native children's speech recognition and the impact of the features extracted from the corresponding ASR output on the automated assessment of speaking proficiency. Experimental results show that a bidirectional LSTM-RNN can outperform a feed-forward DNN in ASR, with an overall relative WER reduction of 13.4%. The improved speech recognition can then boost the language proficiency assessment performance. Correlations between the rounded automated scores and expert scores range from 0.66 to 0.70 for the three speaking tasks studied, similar to the human-human agreement levels for these tasks.
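The two figures quoted in the abstract, a relative WER reduction and a correlation between rounded automated scores and expert scores, can be illustrated with a minimal sketch. It assumes the correlation is a Pearson correlation, which the abstract does not state. The function names and every number below are hypothetical and are not taken from the paper.

```python
import math


def relative_wer_reduction(baseline_wer, improved_wer):
    """Relative WER reduction in percent: (baseline - improved) / baseline * 100."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - improved_wer) / baseline_wer * 100.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical WERs (percent) for a baseline and an improved recognizer.
reduction = relative_wer_reduction(20.0, 18.0)

# Hypothetical raw machine scores, rounded to the integer scale before
# being compared with hypothetical expert scores.
raw_machine_scores = [1.2, 2.6, 3.4, 3.9, 2.1]
expert_scores = [1, 2, 3, 4, 3]
rounded_machine_scores = [round(s) for s in raw_machine_scores]
agreement = pearson(rounded_machine_scores, expert_scores)
```

Note that Python's built-in `round` rounds exact halves to the nearest even integer, so a scoring pipeline that needs round-half-up behavior would have to implement it explicitly.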
doi:10.21437/interspeech.2017-250 fatcat:qbzbflvuznf4rf2xfxlmt4z56e