A Neural Network Architecture for Multilingual Punctuation Generation

Miguel Ballesteros, Leo Wanner
2016 Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing  
Even syntactically correct sentences are perceived as awkward if they do not contain correct punctuation. Still, the problem of automatic generation of punctuation marks has been largely neglected for a long time. We present a novel model that introduces punctuation marks into raw text material with transition-based algorithm using LSTMs. Unlike the state-of-the-art approaches, our model is language-independent and also neutral with respect to the intended use of the punctuation. Multilingual
more » ... periments show that it achieves high accuracy on the full range of punctuation marks across languages.
doi:10.18653/v1/d16-1111 dblp:conf/emnlp/BallesterosW16 fatcat:xgntnaogljex5pynbp23gctvtm