In document image understanding, public datasets with ground truth are an important part of scientific work. They are not only helpful for developing new methods, but also provide a way of comparing performance. Generating these datasets, however, is time-consuming and cost-intensive work that requires a lot of manual effort. In this paper we both propose a way to semi-automatically generate ground-truthed datasets for newspapers and provide a comprehensive dataset. The focus of this paper is
doi:10.1109/icdar.2009.214 dblp:conf/icdar/StreckerBAB09 fatcat:u5vlxrj47jeqtdlpl66d2nlisq