Sequence Labeling and Transduction with Output-Adjusted Actor-Critic Training of RNNs

Saeed Najafi
2018
Neural approaches to sequence labeling often use a Conditional Random Field (CRF) to model their output dependencies, while Recurrent Neural Networks (RNNs) serve the same purpose in other tasks. We set out to establish RNNs as an attractive alternative to CRFs for sequence labeling. To do so, we address one of the RNN's most prominent shortcomings: under maximum-likelihood training, it is never exposed to its own errors. We frame the prediction of the output sequence as a sequential decision-making process, in which the RNN takes a series of actions without being conditioned on the ground-truth labels. We then train the network with an output-adjusted actor-critic algorithm (AC-RNN). We comprehensively compare this strategy with maximum-likelihood training for both RNNs and CRFs on three structured-output tasks. The proposed AC-RNN efficiently matches the performance of the CRF on NER and CCG tagging, and outperforms it on machine transliteration. We also show that output-adjusted actor-critic training is significantly better than other techniques for addressing the RNN's exposure bias, such as Scheduled Sampling and Self-Critical policy training.

I was very lucky to have two great supervisors; thank you, Greg and Colin. I was a naive graduate student whom you helped to do research. I am also grateful to our great NLP group for all those exciting shared tasks we participated in. Thank you Garrett, Bradley, Mohammad, Rashed, and Leyuan.
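The core idea of the abstract — decoding by sampling from the model's own distribution rather than conditioning on gold labels, then weighting updates by reward minus a critic's estimate — can be illustrated with a minimal sketch. This is not the thesis's AC-RNN (which uses an RNN actor and a learned critic network); it is a toy with a linear policy table, a hypothetical start-tag id, and per-token match rewards, all of which are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sample_decode(W, inputs, rng, prev_weight=0.1):
    """Decode by sampling each tag from the model's own distribution,
    so each step is conditioned on the model's previous action rather
    than the gold label (no teacher forcing)."""
    actions, logps = [], []
    prev = 0  # hypothetical start-tag id (assumption for this sketch)
    for x in inputs:
        logits = W[x].copy()
        logits[prev] += prev_weight  # toy dependence on the previous action
        p = softmax(logits)
        a = int(rng.choice(len(p), p=p))
        actions.append(a)
        logps.append(float(np.log(p[a])))
        prev = a
    return actions, logps

def advantages(rewards, values, gamma=1.0):
    """Per-step advantage A_t = G_t - V_t, where G_t is the return
    (discounted sum of future rewards) and V_t is the critic's estimate."""
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        returns.append(G)
    returns.reverse()
    return [g - v for g, v in zip(returns, values)]

# Toy run: 3 input word ids, a 5-word vocabulary, 4 tags.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 4))          # (vocab, tags) policy table
acts, logps = sample_decode(W, [1, 3, 2], rng)
gold = [0, 2, 2]
rewards = [1.0 if a == g else 0.0 for a, g in zip(acts, gold)]
adv = advantages(rewards, [0.5, 0.5, 0.5])
# The actor gradient would weight each -log p(a_t) by adv[t];
# the critic would be regressed toward the returns.
```

In an actual actor-critic training step, the advantage-weighted log-probabilities become the policy loss and the critic is trained to predict the returns, which is what lets the model learn from its own sampled (possibly wrong) prefixes.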
doi:10.7939/r39z90t8b