Training Recurrent Neural Network through Moment Matching for NLP Applications

Yue Deng, Yilin Shen, KaWai Chen, Hongxia Jin
Interspeech 2018
A recurrent neural network (RNN) is conventionally trained in supervised mode but used in free-running mode for inference on test samples. The supervised mode takes ground-truth token values as RNN inputs, whereas the free-running mode can only use self-predicted token values as surrogate inputs. Such inconsistency inevitably results in poor generalization of RNNs on out-of-sample data. We propose a moment matching (MM) training strategy to alleviate this inconsistency by simultaneously taking these two distinct modes and their corresponding dynamics into consideration. Our MM-RNN shows significant performance improvements over existing approaches when tested on practical NLP applications including logic form generation and image captioning.
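The two modes contrasted in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' MM-RNN objective, which the abstract does not specify. It unrolls a toy RNN once with ground-truth inputs (supervised mode) and once with its own predictions (free-running mode), then measures the squared gap between the time-averaged hidden states of the two runs as one simple first-moment discrepancy. All names (`make_params`, `step`, `rollout`, `first_moment_gap`), the toy architecture, and the choice of statistic are assumptions made for illustration.

```python
import math
import random


def make_params(vocab, hidden, seed=0):
    """Random weights for a toy RNN (hypothetical architecture, not the paper's)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "embed": mat(vocab, hidden),   # one embedding row per token
        "w_hh": mat(hidden, hidden),   # recurrent weights
        "w_out": mat(vocab, hidden),   # hidden state -> token logits
    }


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def step(params, h, token):
    """One RNN step: consume `token`, return the new state and the argmax prediction."""
    pre = matvec(params["w_hh"], h)
    h_new = [math.tanh(p + e) for p, e in zip(pre, params["embed"][token])]
    logits = matvec(params["w_out"], h_new)
    pred = max(range(len(logits)), key=logits.__getitem__)
    return h_new, pred


def rollout(params, targets, start_token, free_running):
    """Unroll over len(targets) steps.

    Supervised mode feeds the ground-truth token as the next input;
    free-running mode feeds the model's own previous prediction.
    """
    h = [0.0] * len(params["w_hh"])
    token = start_token
    states, preds = [], []
    for target in targets:
        h, pred = step(params, h, token)
        states.append(h)
        preds.append(pred)
        token = pred if free_running else target
    return states, preds


def first_moment_gap(states_a, states_b):
    """Squared distance between time-averaged hidden states (an assumed statistic)."""
    mean_a = [sum(col) / len(states_a) for col in zip(*states_a)]
    mean_b = [sum(col) / len(states_b) for col in zip(*states_b)]
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))


params = make_params(vocab=5, hidden=4)
targets = [1, 3, 2, 4, 0]
supervised_states, _ = rollout(params, targets, start_token=0, free_running=False)
free_states, _ = rollout(params, targets, start_token=0, free_running=True)
print("first-moment gap:", first_moment_gap(supervised_states, free_states))
```

Both runs share the same first input, so their first hidden states coincide; any gap comes from later steps where the free-running inputs diverge from the ground truth. In a training setting, a differentiable version of such a gap could be added to the usual supervised loss, which is only one possible reading of "moment matching" here.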
doi:10.21437/interspeech.2018-1369 dblp:conf/interspeech/DengSCJ18 fatcat:lvw3w2s22bgnxloeyoxuctsufi