Multi-Language Neural Network Language Models

Anton Ragni, Edgar Dakin, Xie Chen, Mark J.F. Gales, Kate M. Knill
Interspeech 2016
In recent years there has been considerable interest in neural network based language models. These models typically consist of vocabulary dependent input and output layers and one or more hidden layers. A standard problem with these networks is that large quantities of training data are needed to robustly estimate the model parameters. This poses a challenge when only limited data is available for the target language. One way to address this issue is to make use of overlapping vocabularies between related languages. However, this is only applicable to a small set of languages, and the impact is expected to be limited for more general applications. This paper describes a general solution that allows data from any language to be used. Here, only the input and output layers are vocabulary dependent whilst the hidden layers are shared and language independent. This multi-task training set-up allows the quantity of data available to train the hidden layers to be increased. The resulting multi-language network can be used in a range of configurations, including as initialisation for previously unseen languages. As a proof of concept, this paper examines multilingual recurrent neural network language models. Experiments are conducted using language packs released within the IARPA Babel program.
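The layer sharing described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the class name `MultiLanguageRNNLM`, the helper functions, the `tanh` activation, the random initialisation and the language labels are all assumptions made for illustration, and training is omitted. The sketch only shows that each language owns its input and output layers while a single set of recurrent weights is shared across languages and reused when a new language is added.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a small random matrix as a list of row lists."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MultiLanguageRNNLM:
    """Hypothetical sketch: shared recurrent layer, per-language input/output layers."""

    def __init__(self, vocab_sizes, hidden_size, seed=0):
        self.rng = random.Random(seed)
        self.hidden_size = hidden_size
        # Shared, language-independent recurrent weights.
        self.recurrent = make_matrix(hidden_size, hidden_size, self.rng)
        # Vocabulary-dependent layers, one pair per language.
        self.input_layers = {}
        self.output_layers = {}
        for language, vocab_size in vocab_sizes.items():
            self.add_language(language, vocab_size)

    def add_language(self, language, vocab_size):
        """Attach new input/output layers; the shared recurrent weights are untouched."""
        self.input_layers[language] = make_matrix(vocab_size, self.hidden_size, self.rng)
        self.output_layers[language] = make_matrix(vocab_size, self.hidden_size, self.rng)

    def step(self, language, word_id, hidden):
        """Consume one word and return (next-word distribution, new hidden state)."""
        embedding = self.input_layers[language][word_id]  # input-layer row for this word
        recurrent_term = matvec(self.recurrent, hidden)
        # tanh is an assumed activation, chosen only for this sketch.
        new_hidden = [math.tanh(e + r) for e, r in zip(embedding, recurrent_term)]
        probs = softmax(matvec(self.output_layers[language], new_hidden))
        return probs, new_hidden


# Two languages with different vocabulary sizes share one hidden layer.
model = MultiLanguageRNNLM({"lang_a": 5, "lang_b": 8}, hidden_size=4)
start = [0.0] * 4
probs_a, hidden_a = model.step("lang_a", 2, start)
probs_b, hidden_b = model.step("lang_b", 7, start)

# A previously unseen language reuses the shared recurrent weights as initialisation.
model.add_language("lang_new", 6)
probs_new, hidden_new = model.step("lang_new", 0, start)
```

In a multi-task training loop under this design, every batch from any language would update `recurrent`, whereas only the input and output layers of that batch's language would receive gradients.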
doi:10.21437/interspeech.2016-371 dblp:conf/interspeech/RagniDCGK16 fatcat:pfsjqehdbnemfo3ahqqmvud43y