A Complete Approach to the Conversion of Typewritten Historical Documents for Digital Archives [chapter]

Apostolos Antonacopoulos, Dimosthenis Karatzas
2004, Lecture Notes in Computer Science, Springer Berlin Heidelberg
This paper presents a complete system that historians and archivists can use to digitize whole collections of documents containing personal information. The system integrates tools and processes that facilitate scanning, image indexing, definition of physical and logical document structure, document image analysis, recognition, proofreading/correction, and semantic tagging. The system is described in the context of different types of typewritten documents relating to prisoners in World War II concentration camps and is the result of a multinational collaboration under the MEMORIAL project, funded (€1.5M) by the European Union (www.memorial-project.info). Results on a representative selection of documents show a significant improvement not only in OCR accuracy but also in the overall time and cost of converting these documents for digital archives.
doi:10.1007/978-3-540-28640-0_9 (https://doi.org/10.1007/978-3-540-28640-0_9), fatcat:olce7jt4abgstkvs252f3l4zwy
