Extraction of Virtual Baselines from Distorted Document Images Using Curvilinear Projection

Gaofeng Meng, Zuming Huang, Yonghong Song, Shiming Xiang, Chunhong Pan
2015 2015 IEEE International Conference on Computer Vision (ICCV)  
The baselines of a document page are a set of virtual horizontal and parallel lines, to which the printed contents of document, e.g., text lines, tables or inserted photos, are aligned.  ...  Accurate baseline extraction is of great importance in the geometric correction of curved document images.  ...  Acknowledgments The authors would like to thank the anonymous reviewers and area chairs for their valuable remarks and sugges-  ... 
doi:10.1109/iccv.2015.447 dblp:conf/iccv/MengHSXP15 fatcat:j5s5uvyicfhi5hettlmibh346q

New baseline correction algorithm for text-line recognition with bidirectional recurrent neural networks

Olivier Morillot, Laurence Likforman-Sulem, Emmanuèle Grosicki
2013 Journal of Electronic Imaging (JEI)  
However, recently, recognition systems have dealt with text blocks and their compound text lines.  ...  Our approach is based on a sliding window within which the vertical position of the baseline is estimated. Segmentation of text lines into subparts is, thus, avoided.  ...  Acknowledgments This work has been supported by Direction Générale  ... 
doi:10.1117/1.jei.22.2.023028 fatcat:wgclsxd4sjalpk2lhdg5p6hivm

Text line segmentation of historical documents: a survey

Laurence Likforman-Sulem, Abderrazak Zahour, Bruno Taconet
2006 International Journal on Document Analysis and Recognition  
Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field.  ...  Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use  ...  The process of extracting text lines grows more difficult as interlines are narrowing; the lower baseline of the first line is becoming closer to the upper baseline of the second line; also, descenders  ... 
doi:10.1007/s10032-006-0023-z fatcat:suqv7dkiwvaypd5hyn4jdiztgi

Text Normalization Framework for Handwritten Cursive Languages by Detection and Straightness the Writing Baseline

Tarik Abu-Ain, Siti Norul Huda Sheikh Abdullah, Bilal Bataineh, Waleed Abu-Ain, Khairuddin Omar
2013 Procedia Technology - Elsevier  
In this work, a new framework for baseline detection and straightness for cursive handwritten texts is proposed based on analysis and extraction the directions features from the subwords of the text skeleton  ...  It is widely used in many various preprocessing stages as a text normalization including skew, slant and slope corrections, writing lines straightness and characters segmentation, as well as in feature  ...  The proposed framework The detection process of baseline location is very useful in extracting accurate information such as writing directions, ascenders, descenders, dots and diacritics.  ... 
doi:10.1016/j.protcy.2013.12.243 fatcat:kmytcjxdj5ep5lfz7svbid3mbu

Arabic Text Detection in News Video based on Line Segment Detector

Mbarek Charhad, Mounir Zrigui, Sadek Mansouri
2017 Zenodo  
The last stage concerns the text line estimation and text detection in video frames.  ...  However, its detection and extraction is still an open problem due to the variety of its size and the complexity of the backgrounds.  ...  Estimating the baseline is a useful task for the reader as well as for Arabic text extraction and recognition.  ... 
doi:10.5281/zenodo.6333944 fatcat:mrwvatf4tjclfjz6fsaaepi5re

A Multi-Agent Approach to Arabic Handwritten Text Segmentation

Ashraf Elnagar, Rahima Bentrcia
2012 Journal of Intelligent Learning Systems and Applications  
Feature points (end points) are extracted from the remaining regions of the word-image.  ...  In this paper, a novel approach is proposed to segment handwritten Arabic text (words). We consider the "Naskh" font style.  ...  Figure 2. Two lines of text; horizontal projection of both lines; and the resulting separate lines of text.  ... 
doi:10.4236/jilsa.2012.43021 fatcat:yaytgrhkkjcgzbmwiptq4udzha
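The figure caption quoted in this entry mentions separating two text lines with a horizontal projection. As a rough illustration of that general technique only (not the authors' multi-agent method), the sketch below sums ink pixels per row and cuts wherever a row is empty. The function name `split_lines_by_projection`, the "nonzero means ink" convention, and the toy page are all assumptions made for the example.

```python
import numpy as np

def split_lines_by_projection(binary):
    """Return (top, bottom) row ranges of text lines, bottom exclusive.

    `binary` is a 2-D array in which nonzero pixels are ink (an assumed
    convention). A row belongs to a text line if it contains any ink.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = (np.asarray(binary) != 0).sum(axis=1)
    has_ink = profile > 0

    lines = []
    start = None
    for row, ink in enumerate(has_ink):
        if ink and start is None:
            start = row                  # a text line begins
        elif not ink and start is not None:
            lines.append((start, row))   # an empty row ends the line
            start = None
    if start is not None:                # line touching the bottom edge
        lines.append((start, len(has_ink)))
    return lines

# Toy page: two ink bands separated by blank rows.
page = np.zeros((9, 12), dtype=np.uint8)
page[1:3, 2:10] = 1
page[5:8, 1:11] = 1
bands = split_lines_by_projection(page)
```

Cutting on strictly empty rows only works when lines neither touch nor overlap; the other entries in this listing address exactly the cases where that assumption fails.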

Page Layout Analysis System for Unconstrained Historic Documents [article]

Oldřich Kodym, Michal Hradiš
2021 arXiv   pre-print
Extraction of text regions and individual text lines from historic documents is necessary for automatic transcription.  ...  We propose extending a CNN-based text baseline detection system by adding line height and text block boundary predictions to the model output, allowing the system to extract more comprehensive layout information  ...  Although it reaches high baseline detection accuracy even on challenging documents, it does not allow direct extraction of text lines and text blocks.  ... 
arXiv:2102.11838v1 fatcat:eoqmoeoqg5fmnmya7tkqvdblia

Novel Approach for Baseline Detection and Text Line Segmentation

Mahdi KeshavarzBahaghighat, Javad Mohammadi
2012 International Journal of Computer Applications  
Baseline detection and line segmentation are essential preprocessing steps of any OCR system.  ...  Obtained results indicate that in spite of narrow interline spaces and noisy components our method is capable to extract baseline in documents precisely.  ...  Based on the authors, this technique is able to detect text line in handwritten documents which may contain lines oriented in different directions, erasures and annotations between main lines.  ... 
doi:10.5120/8013-1039 fatcat:wfwygcbrnnbj7oolxnsfvdce5e

General text line extraction approach based on locally orientation estimation

Nazih Ouwayed, Abdel Belaïd, François Auger, Laurence Likforman-Sulem, Gady Agam
2010 Document Recognition and Retrieval XVII  
Afterwards, the text lines are extracted locally in each zone basing on the follow-up of the baselines and the proximity of connected components.  ...  This paper presents a novel approach for the multi-oriented text line extraction from historical handwritten Arabic documents.  ...  First, the multi-skewed zones are detected using an automatic paving and the Wigner-Ville Distribution. Then, the text lines are extracted based on the orientation of each zone and the baselines.  ... 
doi:10.1117/12.839518 dblp:conf/drr/OuwayedBA10 fatcat:lquxejp77zdthbtyhwxensdifu

Text Normalization Method for Arabic Handwritten Script

Tarik Abu-Ain, Siti Norul Huda Sheikh Abdullah, Khairuddin Omar, Ashraf Abu-Ein, Bilal Bataineh, Waleed Abu-Ain
2013 Journal of ICT Research and Applications  
as text segmentation, feature extraction and characters recognition.  ...  After that, selection of the correct baseline region is done, and finally, the baselines of all components are aligned with the writing line. The experiments IFN/ENIT benchmark Arabic dataset.  ...  The method consists of several steps, which are: components labeling and segmentation, components text thinning, skeletons direction features extraction, candidate baselines regions determination, correct  ... 
doi:10.5614/itbj.ict.res.appl.2013.7.2.5 fatcat:jdvo555yi5hv5glktmohecr47a

The QCRI Recognition System for Handwritten Arabic [chapter]

Felix Stahlberg, Stephan Vogel
2015 Lecture Notes in Computer Science  
We propose novel text line image normalization procedures and a new feature extraction method.  ...  We show that the combination of sophisticated text image normalization and state-of-the art techniques originating from ASR results in a very robust and accurate recognizer.  ...  Text Line Image Normalization Baseline Estimation We describe our baseline estimation in [18].  ... 
doi:10.1007/978-3-319-23234-8_26 fatcat:i5tvmxmlmbbxhcrldhwjsmdvfu

Detecting Text Baselines in Historical Documents with Baseline Primitives

Wei Jia, Chixiang Ma, Lei Sun, Qiang Huo
2021 IEEE Access  
Then, a rough text line orientation is estimated by computing the variances of those centers along horizontal and vertical directions because its value can vary significantly along the text line direction  ...  Finally, the detected baseline primitives are grouped into individual text lines according to their predicted link relationships, and the text baselines are extracted accordingly.  ... 
doi:10.1109/access.2021.3093568 fatcat:bzawhelpbrecvkszjgws2ag47m
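The snippet above describes estimating a rough text line orientation from the variances of center points along the horizontal and vertical directions. A minimal sketch of that idea follows, assuming the centers are given as (x, y) pairs and that the direction with the larger variance is taken as the line direction. The function name `rough_orientation` and this decision rule are assumptions for illustration; the paper's actual procedure is not shown in the snippet.

```python
import numpy as np

def rough_orientation(centers):
    """Guess whether a group of center points runs horizontally or vertically.

    `centers` is a sequence of (x, y) pairs. Points spread out along the
    text line direction, so the axis with the larger variance is taken as
    that direction (an assumed rule, ties resolved as horizontal).
    """
    pts = np.asarray(centers, dtype=float)
    var_x, var_y = pts.var(axis=0)
    return "horizontal" if var_x >= var_y else "vertical"

# Centers spread along x with little vertical jitter.
row_like = rough_orientation([(0, 10), (5, 11), (10, 9), (15, 10)])
# Centers spread along y with little horizontal jitter.
column_like = rough_orientation([(3, 0), (4, 8), (3, 16)])
```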

Multi-oriented Text Line Extraction from Handwritten Arabic Documents

Nazih Ouwayed, Abdel Belaïd
2008 2008 The Eighth IAPR International Workshop on Document Analysis Systems  
Afterwards, the text lines are extracted in each zone basing on the follow-up of the baselines and the proximity of connected components.  ...  In this paper, we present a novel approach for the multi-oriented text line extraction from handwritten Arabic documents.  ...  After then, it follows the baselines to extract the text lines.  ... 
doi:10.1109/das.2008.14 dblp:conf/das/OuwayedB08 fatcat:utlybywgg5dz5flzcpuhu6qemy

Segmentation of Touching, Overlapping, Skewed and Short Handwritten Text Lines

Rohini S., Uma Devi R.S., Mohanavel S.
2012 International Journal of Computer Applications  
Presence of touching or overlapping text lines, short-lines, curvilinear or skewed lines and small or variant gaps between the text lines make the segmentation challenging.  ...  Text line segmentation is an inherent part of document recognition system and important preprocessing step for word and character segmentation.  ...  Midpoint line between upper baseline and MU separator and midpoint line between lower baseline and ML separator are marked as segmentation points for the adjacent text lines.  ... 
doi:10.5120/7877-1163 fatcat:x5kthfzfznafnnn2ccjtvilmma

A generic method of cleaning and enhancing handwritten data from business forms

Xiangyun Ye, Mohamed Cheriet, Ching Y. Suen
2001 International Journal on Document Analysis and Recognition  
In reality, handwritten data usually touch or cross the preprinted form frames and texts, creating tremendous problems for the recognition engines.  ...  When the handwriting is found touching or crossing preprinted texts, morphological operations based on statistical features are used to clean it.  ...  Claude Rheault from DocImage Inc. for providing training and testing form images, Ms. Christine P. Nadal for her assistance in data collection, and Mr.  ... 
doi:10.1007/s100320100056 fatcat:jofq5eiop5dehijp4cheblenoa