Text line extraction for historical document images

Raid Saabni, Abedelkadir Asi, Jihad El-Sana
2014 Pattern Recognition Letters  
The first algorithm works on binary document images and assumes it is possible to extract the components along text lines.  ...  In this paper we present a language independent global method for automatic text line extraction.  ...  A dynamic programming based approach was presented by Liwicki et al. (2007) for on-line text line segmentation, and adapted by Fischer et al. (2010) for historical documents.  ... 
doi:10.1016/j.patrec.2013.07.007 fatcat:hrzv7ujnxza5vp7fvba6jsz4d4

Text extraction from gray scale historical document images using adaptive local connectivity map

Zhixin Shi, S. Setlur, V. Govindaraju
2005 Eighth International Conference on Document Analysis and Recognition (ICDAR'05)  
This paper presents an algorithm using adaptive local connectivity map for retrieving text lines from the complex handwritten documents such as handwritten historical manuscripts.  ...  These problems include fluctuating text lines, touching or crossing text lines and low quality image that do not lend themselves easily to binarizations.  ...  Extraction of Text The text line patterns that we have extracted are location masks of text lines. The extraction of text from a gray scale image using these locations bring up two issues.  ... 
doi:10.1109/icdar.2005.229 dblp:conf/icdar/ShiSG05 fatcat:abizguavubgbnlkoctujckarzu

Using local maxima profile and Piece-Wise technique for line segmentation on Thai handwritten historical documents

Seksan Sangsawad, Rapeeporn Chamchong, Chun Che Fung
2011 2011 International Conference on Machine Learning and Cybernetics  
The interested mask text is used to map with text image in order to extract the text lines.  ...  This paper presents a new approach for segmenting text lines on Thai handwritten documents.  ...  Finally, the location mask block is mapped with the binary image for extracting text line. In this paper, a new text line location and extraction algorithm for historical documents is proposed.  ... 
doi:10.1109/icmlc.2011.6016974 dblp:conf/icmlc/SangsawadCF11 fatcat:slbrzkucufaulm6rgc75h22hii

A new Connected Component Analysis based System for Text Segmentation in Degraded Historical Document Images

So, there is a need for text segmentation and feature extraction to convert these manuscripts into machine editable format.  ...  In this paper, horizontal histogram, vertical histogram and connected component analysis is used to segment text documents images.  ...  segmentation for historical document images" (2019).  ... 
doi:10.35940/ijitee.f3503.049620 fatcat:kz5joulttnarzljot6adjn3p5e

Segmentation and Recognition for Historical Tibetan Document Images

Longlong Ma, Congjun Long, Lijuan Duan, Xiqun Zhang, Yanxing Li, Quanchao Zhao
2020 IEEE Access  
Thirdly, in order to solve the problems of touching strokes between text-lines and curvilinear text-lines, we present a text-line segmentation method based on graph model for historical Tibetan text-line  ...  This paper proposes an overall segmentation and recognition framework for historical Tibetan document images.  ...  TEXT-LINE SEGMENTATION Few researches have been done for text-line segmentation of historical Tibetan documents.  ... 
doi:10.1109/access.2020.2975023 fatcat:qlvqseky65d37jfcp6hv65whcq

Contextual Word Spotting in Historical Handwritten Documents

David Fernández
2015 ELCVIA Electronic Letters on Computer Vision and Image Analysis  
Once the text lines are extracted, words are localized inside the text lines using a word segmentation technique from the state of the art.  ...  There are countless collections of historical documents in archives and libraries that contain plenty of valuable information for historians and researchers.  ...  The most frequent words in each semantic cluster are extracted and the same text is used to transcribe all them.  ... 
doi:10.5565/rev/elcvia.741 fatcat:fbir4l2bx5bjvpkgnc2w3lddlm

Text line segmentation of historical documents: a survey

Laurence Likforman-Sulem, Abderrazak Zahour, Bruno Taconet
2006 International Journal on Document Analysis and Recognition  
For all these tasks, a major step is document segmentation into text lines.  ...  Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use  ...  When words of the image document are extracted by top down segmentation, which is generally the case, text lines are extracted first.  ... 
doi:10.1007/s10032-006-0023-z fatcat:suqv7dkiwvaypd5hyn4jdiztgi

Learning-Free Text Line Segmentation for Historical Handwritten Documents

Berat Kurar Barakat, Rafi Cohen, Ahmad Droby, Irina Rabaev, Jihad El-Sana
2020 Applied Sciences  
We present a learning-free method for text line segmentation of historical handwritten document images.  ...  Historical handwritten documents contain noise, heterogeneous text line heights, skews and touching characters among text lines.  ...  Acknowledgments: This research was supported by the Frankel Center for Computer Science in Ben-Gurion University of the Negev. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app10228276 fatcat:mp277cng65e2xlyg7ui4an5k5u

Indexing of Historical Document Images: Ad Hoc Dewarping Technique for Handwritten Text [chapter]

Federico Bolelli
2017 Communications in Computer and Information Science  
The novelty introduced with this work regards the possibility of applying dewarping to document images which contain both handwritten and typewritten text.  ...  This work presents a research project, named XDOCS, aimed at extending to a much wider audience the possibility to access a variety of historical documents published on the web.  ...  Fig. 7. Examples of dewarping applied on historical digital document images.  ... 
doi:10.1007/978-3-319-68130-6_4 fatcat:oo47dm6eqjf2bk5mceseji4oji

A Complete Optical Character Recognition Methodology for Historical Documents

G. Vamvakas, B. Gatos, N. Stamatopoulos, S.J. Perantonis
2008 2008 The Eighth IAPR International Workshop on Document Analysis Systems  
In this paper a complete OCR methodology for recognizing historical documents, either printed or handwritten without any knowledge of the font, is presented.  ...  This methodology consists of three steps: The first two steps refer to creating a database for training using a set of documents, while the third one refers to recognition of new document images.  ...  Acknowledgments This research is carried out within the framework of the Greek Ministry of Research funded R&D project POLYTIMO [22] which aims to process and provide access to the content of valuable historical  ... 
doi:10.1109/das.2008.73 dblp:conf/das/VamvakasGSP08 fatcat:raf5w3l7evevbcu7rh7jr524gu

Robust text and drawing segmentation algorithm for historical documents

Rafi Cohen, Abedelkadir Asi, Klara Kedem, Jihad El-Sana, Itshak Dinstein
2013 Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing - HIP '13  
We present a method to segment historical document images into regions of different content. First, we segment text elements from non-text elements using a binarized version of the document.  ...  We examine the segmentation quality on 252 pages of a historical manuscript, for which the suggested method achieves about 92% and 90% segmentation accuracy of drawings and text elements, respectively.  ...  FI 1494/3-2, the Ministry of Science and Technology of Israel, the Council of Higher Education of Israel, the Lynn and William Frankel Center for Computer Sciences and by the Paul Ivanier Center for Robotics  ... 
doi:10.1145/2501115.2501117 dblp:conf/icdar/CohenAKED13 fatcat:7q3mhoqtfrfwld4vqjqdqpsq34

Segmentation of historical machine-printed documents using Adaptive Run Length Smoothing and skeleton segmentation paths

Nikos Nikolaou, Michael Makridis, Basilis Gatos, Nikolaos Stamatopoulos, Nikos Papamarkos
2010 Image and Vision Computing  
text columns or text lines, and (iv) use of skeleton segmentation paths in order to isolate possible connected characters.  ...  Comparative experiments using several historical machine-printed documents prove the efficiency of the proposed technique.  ...  For the purpose of the evaluation, we manually marked and extracted the ground truth on a set of 63 images for the case of evaluating the text line segmentation (3880 text line segments), 43 images for  ... 
doi:10.1016/j.imavis.2009.09.013 fatcat:iqdu2sjlgngwfe4pv7ki76vok4

Ground truth creation for handwriting recognition in historical documents

Andreas Fischer, Emanuel Indermühle, Horst Bunke, Gabriel Viehhauser, Michael Stolz
2010 Proceedings of the 8th IAPR International Workshop on Document Analysis Systems - DAS '10  
Handwriting recognition in historical documents is vital for the creation of digital libraries.  ...  For historical documents, ground truth creation is more difficult and time-consuming when compared with modern documents.  ...  Furthermore, we would like to thank Amrei Schroettke, Chatrina Casutt, and Matthias Zaugg for creating the ground truth of the IAM-HistDB dataset.  ... 
doi:10.1145/1815330.1815331 dblp:conf/das/FischerIBVS10 fatcat:imkg4tdi6fampltbqtiqt6hooy

Text Extraction from Document Images- A Review

Deepika Ghai, Neelu Jain
2013 International Journal of Computer Applications  
[20] proposed an algorithm using adaptive local connectivity map (ALCM) for text extraction from complex handwritten historical document. The gray-scale document image is transformed into ALCM.  ...  [19] suggested a method for text line extraction in handwritten document with Kalman filter applied on low resolution images.  ... 
doi:10.5120/14559-2661 fatcat:iw3m6ooygjamhm3lkbsfe4ryt4

Text line segmentation for gray scale historical document images

Abedelkadir Asi, Raid Saabni, Jihad El-Sana
2011 Proceedings of the 2011 Workshop on Historical Document Imaging and Processing - HIP '11  
In this paper we present a new approach for text line segmentation that works directly on gray-scale document images.  ...  The medial seam determines a text line and the separating seams define the upper and lower boundaries of the text line.  ...  We would like to thank the reviewers for their insightful comments which led to several improvements in the presentation of this paper.  ... 
doi:10.1145/2037342.2037362 dblp:conf/icdar/AsiSE11 fatcat:j23z5ohrqvfelkhgcxfphdq2w4