Deep CNN-based Speech Balloon Detection and Segmentation for Comic Books [article]

David Dubray, Jochen Laubrock
2019 arXiv pre-print
We develop a method for the automated detection and segmentation of speech balloons in comic books, including their carrier and tails. Our method is based on a deep convolutional neural network that was trained on annotated pages of the Graphic Narrative Corpus. More precisely, we are using a fully convolutional network approach inspired by the U-Net architecture, combined with a VGG-16 based encoder. The trained model delivers state-of-the-art performance with an F1-score of over 0.94.
Qualitative results suggest that wiggly tails, curved corners, and even illusory contours do not pose a major problem. Furthermore, the model has learned to distinguish speech balloons from captions. We compare our model to earlier results and discuss some possible applications.
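
For readers unfamiliar with the F1-score quoted in the abstract, the following is a minimal sketch of how such a score could be computed for a segmentation result. It assumes a pixel-wise definition over binary masks (1 = balloon, 0 = background); the abstract does not state how the authors aggregate the score, and the function name `f1_score` and the list-of-lists mask format are illustrative choices, not taken from the paper.

```python
def f1_score(pred, truth):
    """Pixel-wise F1 for two binary masks given as equally sized 2D lists of 0/1.

    Assumption: the score is computed per pixel; the paper may aggregate differently.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # balloon pixel correctly predicted
            elif p and not t:
                fp += 1          # predicted balloon, actually background
            elif t and not p:
                fn += 1          # missed balloon pixel
    if tp == 0:
        # Convention chosen here: no true positives yields a score of 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
predicted = [[1, 1, 0],
             [0, 1, 0]]
annotated = [[1, 0, 0],
             [0, 1, 1]]
print(f1_score(predicted, annotated))
```

In this toy case precision and recall are both 2/3, so the F1-score is also 2/3. A score above 0.94, as reported in the abstract, would mean precision and recall are both high at the same time, since the harmonic mean is pulled toward the lower of the two.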
arXiv:1902.08137v1 fatcat:mwpogbg2jfgozdu3qbwx5wyo4i