Paragraph text segmentation into lines with Recurrent Neural Networks

Bastien Moysset, Christopher Kermorvant, Christian Wolf, Jerome Louradour
2015 13th International Conference on Document Analysis and Recognition (ICDAR)
The detection of text lines, as a first processing step, is critical in all text recognition systems. State-of-the-art methods to locate lines of text are based on handcrafted heuristics fine-tuned by the image processing community's experience. They succeed under certain constraints; for instance, the background has to be roughly uniform. We propose to use more "agnostic" Machine Learning-based approaches to address text line location. The main motivation is to be able to process either damaged documents, or flows of documents with a high variety of layouts and other characteristics. A new method is presented in this work, inspired by the latest generation of optical models used for text recognition, namely Recurrent Neural Networks. As these models are sequential, a column of text lines in our application plays the same role as a line of characters in more traditional text recognition settings. A key advantage of the proposed method over other data-driven approaches is that compiling a training dataset does not require labeling line boundaries: only the number of lines is required for each paragraph. Experimental results show that our approach gives similar or better results than traditional handcrafted approaches, with little engineering effort and less hyper-parameter tuning.
doi:10.1109/icdar.2015.7333803 dblp:conf/icdar/MoyssetKWL15 fatcat:m5766frr25d4dlvekk6z6f2nwu
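
The weak supervision described in the abstract (only a line count per paragraph, no line boundaries) can be illustrated with a small sketch. It assumes, and the abstract does not state this, that the network emits one label per pixel row of the paragraph image and is trained with a sequence-level objective of the CTC kind, so that a paragraph with n lines gives a target of n line symbols. The labels LINE and BLANK and the functions target_from_line_count and rows_to_line_spans are hypothetical names chosen for illustration, not the authors' implementation.

```python
from itertools import groupby

LINE = "L"    # hypothetical label: this pixel row belongs to a text line
BLANK = "-"   # hypothetical label: this pixel row lies between lines


def target_from_line_count(n_lines):
    """Build the training target from the only annotation needed: the line count."""
    if n_lines < 0:
        raise ValueError("n_lines must be non-negative")
    return [LINE] * n_lines


def rows_to_line_spans(row_labels):
    """Turn per-row predictions into (first_row, last_row) spans, one per line.

    Consecutive rows carrying the same label are merged, which mirrors how a
    CTC-style decoding collapses repeated symbols (an assumption of this sketch).
    """
    spans = []
    row = 0
    for label, run in groupby(row_labels):
        length = len(list(run))
        if label == LINE:
            spans.append((row, row + length - 1))
        row += length
    return spans


# Example: 12 pixel rows, read top to bottom, for a three-line paragraph.
predicted = list("--LLL--LL-LL")
spans = rows_to_line_spans(predicted)
target = target_from_line_count(3)
```

In this sketch the number of recovered spans matches the length of the target built from the line count, which is the only quantity the training data would need to provide; the row positions of the spans come from the model's output rather than from annotation.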