Robust text and drawing segmentation algorithm for historical documents

Rafi Cohen, Abedelkadir Asi, Klara Kedem, Jihad El-Sana, Itshak Dinstein
2013 Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing - HIP '13  
We examine the segmentation quality on 252 pages of a historical manuscript, for which the suggested method achieves about 92% and 90% segmentation accuracy of drawings and text elements, respectively.  ...  We present a method to segment historical document images into regions of different content. First, we segment text elements from non-text elements using a binarized version of the document.  ...  FI 1494/3-2, the Ministry of Science and Technology of Israel, the Council of Higher Education of Israel, the Lynn and William Frankel Center for Computer Sciences and by the Paul Ivanier Center for Robotics  ... 
doi:10.1145/2501115.2501117 dblp:conf/icdar/CohenAKED13 fatcat:7q3mhoqtfrfwld4vqjqdqpsq34

Layout Analysis for Arabic Historical Document Images Using Machine Learning

Syed Saqib Bukhari, Thomas M. Breuel, Abedelkadir Asi, Jihad El-Sana
2012 2012 International Conference on Frontiers in Handwriting Recognition  
Simple and discriminative features are extracted in a connected-component level and subsequently robust feature vectors are generated.  ...  We introduce an approach that segments text appearing in page margins (a.k.a side-notes text) from manuscripts with complex layout format.  ...  FI 1494/3-1 and the Lynn and William Frankel Center for Computer Science at Ben-Gurion University of the Negev.  ... 
doi:10.1109/icfhr.2012.227 dblp:conf/icfhr/BukhariBAE12 fatcat:npw7po3mjrad7cg62bti736uzq

A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts [article]

Ruggero Pintus, Ying Yang, Enrico Gobbetti, Holly Rushmeier
2014 Eurographics Workshop on Graphics and Cultural Heritage  
In this paper, we present a completely automatic algorithm to perform a robust text segmentation of old handwritten manuscripts on a per-book basis, and we show how to exploit this outcome to find two  ...  layout elements, i.e., text blocks and text lines.  ...  Efficient algorithms exist to cope with modern machine-printed documents or historical documents from the handpress period, and to solve the traditional problems of layout analysis, such as document classification  ... 
doi:10.2312/gch.20141302 fatcat:bzzmqmaafncp7nw4phy3nipbby

Illustrations Segmentation in Digitized Documents Using Local Correlation Features

Dalia Coppi, Costantino Grana, Rita Cucchiara
2014 Procedia Computer Science  
In this paper we propose an approach for Document Layout Analysis based on local correlation features.  ...  The proposal has been demonstrated to be effective on historical datasets and to outperform the state-of-the-art in presence of challenging documents with a large variety of pictorial elements.  ...  The XY cut is a well known recursive algorithm for top-down page segmentation.  ... 
doi:10.1016/j.procs.2014.10.014 fatcat:32xydpox4je5nemm2hsngpaihi

Extraction of Homogeneous Regions in Historical Document Images

Maroua Mehri, Pierre Héroux, Nabil Sliti, Petra Gomez-Krämer, Najoua Essoukri Ben Amara, Rémy Mullot
2015 Proceedings of the 10th International Conference on Computer Vision Theory and Applications  
Indeed, determining graphic regions can help to segment and analyze the graphical part in historical heritage, while finding text zones can be used as a pre-processing stage for character recognition,  ...  Thus, we propose in this article an automatic segmentation method for historical document images based on extraction of homogeneous or similar content regions.  ...  Figure 3 (b)) and the other represents the foreground (e.g. noise, text fields, drawings, etc.) (cf. Figure 3 (c)).  ... 
doi:10.5220/0005265500470054 dblp:conf/visapp/MehriHSGAM15 fatcat:z4dhhtjvuzfj5hiukasqi5nelu

Using Scale-Space Anisotropic Smoothing for Text Line Extraction in Historical Documents [chapter]

Rafi Cohen, Itshak Dinstein, Jihad El-Sana, Klara Kedem
2014 Lecture Notes in Computer Science  
Text line extraction is a vital prerequisite for various document processing tasks.  ...  The final stage of the algorithm is based on an energy minimization framework for removing spurious text lines and assigning connected components to lines.  ...  FI 1494/3-2, the Ministry of Science and Technology of Israel, the Council of Higher Education of Israel, the Lynn and William Frankel Center for Computer Sciences and by the Paul Ivanier Center for Robotics  ... 
doi:10.1007/978-3-319-11758-4_38 fatcat:g7zcn7cfdfhq3i3twuozuuwkum

A* Path Planning for Line Segmentation of Handwritten Documents

Olarik Surinta, Michiel Holtkamp, Faik Karabaa, Jean-Paul Van Oosten, Lambert Schomaker, Marco Wiering
2014 2014 14th International Conference on Frontiers in Handwriting Recognition  
This paper describes the use of a novel A* path-planning algorithm for performing line segmentation of handwritten documents.  ...  We have performed experiments on the Saint Gall and Monk line segmentation (MLS) datasets.  ...  Such a technique takes into account the diversity of document images, texts, images, mixtures of texts and images, line drawings, and noisy or degraded document images. Bulacu et al.  ... 
doi:10.1109/icfhr.2014.37 dblp:conf/icfhr/SurintaHKOSW14 fatcat:3u3o7kibifgjxdfn4pmrznmgni

Text segmentation in degraded historical document images

A.S. Kavitha, P. Shivakumara, G.H. Kumar, Tong Lu
2016 Egyptian Informatics Journal  
In this paper, we present a new method for segmenting text and non-text in Indus documents based on the fact that text components are less cursive compared to non-text ones.  ...  Text segmentation from degraded Historical Indus script images helps Optical Character Recognizer (OCR) to achieve good recognition rates for Hindus scripts; however, it is challenging due to complex background  ...  Omar and Lu [4] proposed an algorithm to extract text lines from historical document images using Steerable Directional filters.  ... 
doi:10.1016/j.eij.2015.11.003 fatcat:sbc7ftyihvewtbdoy3lqdsz5ai

Evolution Maps for Connected Components in Text Documents

Ofer Biller, Klara Kedem, Itshak Dinstein, Jihad El-Sana
2012 2012 International Conference on Frontiers in Handwriting Recognition  
For highly degraded text documents, common tasks such as binarization and line extraction remain difficult.  ...  We use these maps to provide a robust algorithm for extracting information about character dimensions in degraded documents, and demonstrate improvement in binarization results using this information.  ...  Paul Ivanier Center for Robotics and Production Management at Ben-Gurion University, Israel.  ... 
doi:10.1109/icfhr.2012.201 dblp:conf/icfhr/BillerKDE12 fatcat:7w4nknua75ce3a4ai4rouuwoo4

Handwritten document image segmentation into text lines and words

Vassilis Papavassiliou, Themos Stafylakis, Vassilis Katsouros, George Carayannis
2010 Pattern Recognition  
Two novel approaches to extract text lines and words from handwritten document are presented.  ...  The line segmentation algorithm is based on locating the optimal succession of text and gap areas within vertical zones by applying Viterbi algorithm.  ...  Acknowledgment The authors would like to thank for the support by the Greek Secretariat for Research and Technology under the program PENED-03/251.  ... 
doi:10.1016/j.patcog.2009.05.007 fatcat:2weprb5frnaubiixvi6dulu3pe

Robustness Assessment of Texture Features for the Segmentation of Ancient Documents

Maroua Mehri, Van Cuong Kieu, Mohamed Mhiri, Pierre Heroux, Petra Gomez-Kramer, Mohamed Ali Mahjoub, Remy Mullot
2014 2014 11th IAPR International Workshop on Document Analysis Systems  
For the segmentation of ancient digitized document images, it has been shown that texture feature analysis is a consistent choice for meeting the need to segment a page layout under significant and various  ...  This study shows the robustness of texture feature extraction for segmentation in the case of noise and the uselessness of a denoising step.  ...  [11] propose to separate drawings from background and noise of historical documents by using spatial and color features of superpixels. Liu et al.  ... 
doi:10.1109/das.2014.22 dblp:conf/das/MehriKMHGMM14 fatcat:itg2xllne5cmnbo5jzr7ng6ote

Historical Document Processing: A Survey of Techniques, Tools, and Trends [article]

James P. Philips, Nasseh Tabrizi
2020 arXiv   pre-print
Historical Document Processing is the process of digitizing written material from the past for future use by historians and other scholars.  ...  This paper surveys the major phases of, standard algorithms, tools, and datasets in the field of Historical Document Processing, discusses the results of a literature review, and finally suggests directions  ...  and text-line segmentation.  ... 
arXiv:2002.06300v2 fatcat:nxufntuk7famfph6ownyuys2py

3D models over the centuries: From old floor plans to 3D representation

Christophe Riedinger, Michel Jordan, Hedi Tabia
2014 2014 International Conference on 3D Imaging (IC3D)  
This paper presents a set of algorithms dedicated to the 3D modeling of historical buildings from a collection of old architecture plans, including floor plans, elevations and cutoffs.  ...  We compute height information and add textures to the model by analyzing the elevation images from the same collection of documents.  ...  These results show that our algorithms are robust enough to be used on various data sources.  ... 
doi:10.1109/ic3d.2014.7032583 dblp:conf/ic3d/RiedingerJT14 fatcat:q6tomykqyvhbvabte4x5pbyoje

Learning Texture Features for Enhancement and Segmentation of Historical Document Images

Maroua Mehri, Nibal Nayef, Pierre Héroux, Petra Gomez-Krämer, Rémy Mullot
2015 Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing - HIP '15  
EXPERIMENTS We have experimentally evaluated the proposed ancient document enhancement and segmentation algorithm on 100 pages of ancient documents.  ...  The experimental corpus is selected for historical document layout analysis and historical book recognition competitions in the context of ICDAR conference and HIP workshop.  ... 
doi:10.1145/2809544.2809545 dblp:conf/icdar/MehriNHGM15 fatcat:nuztk7z7m5eqjeb7yoxk3r7qni

HAH manuscripts: A holistic paradigm for classifying and retrieving historical Arabic handwritten documents

Zaher Al Aghbari, Salama Brook
2009 Expert systems with applications  
This paper presents a novel holistic technique for classifying and retrieving Arabic handwritten text documents. The retrieval of Arabic handwritten documents is performed in several steps.  ...  First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts.  ...  For the document to line segmentation, the problem is easy and thus our algorithm can easily detect the line borders.  ... 
doi:10.1016/j.eswa.2009.02.024 fatcat:wyyleohskjbpdndurdxavibyja