Marginal Noise Reduction in Historical Handwritten Documents -- A Survey

Arpita Chakraborty, Michael Blumenstein
2016 2016 12th IAPR Workshop on Document Analysis Systems (DAS)  
This paper presents a survey on different approaches for removing the marginal noise from document images, and analysing the research challenges of those methods relating to handwritten historical datasets  ...  This survey discusses the difficulties and suitability of the state-of-the-art methods to remove marginal noise as well as preserving the text content from handwritten historical documents.  ...  Double Page Segmentation In [17], the proposed method detects the optimal page frames of double-page document images based on the white run projections.  ... 
doi:10.1109/das.2016.78 dblp:conf/das/ChakrabortyB16 fatcat:zbhz3tnvdncibamwdohc6x5jky

Border Noise Removal of Camera-Captured Document Images Using Page Frame Detection [chapter]

Syed Saqib Bukhari, Faisal Shafait, Thomas M. Breuel
2012 Lecture Notes in Computer Science  
Experimental results show the effectiveness of our method in comparison to other state-of-the-art page frame detection approaches.  ...  Page frame detection is one of the newly investigated areas in document image processing, which is used to remove border noise and to identify the actual content area of document images.  ...  [9] proposed a method for splitting double-page scanned document images into two pages without noisy borders. Their method is based on vertical and horizontal white runs projections.  ... 
doi:10.1007/978-3-642-29364-1_10 fatcat:l33kxrpfbbhw3di2ge6tvchhly

Image Enhancement of Complex Document Images Using Histogram of Gradient Features

Sajan A. Jain, N. Shobha Rani, N. Chandan
2018 International Journal of Engineering & Technology  
The proposed technique carries out the block wise interpretation of document contents to remove the marginal noise that is present usually at the borders of images.  ...  Complex document images are one of the varied image categories that are difficult to process compared to other types of images.  ...  Including the effects produced by neighbor pages. The algorithm consists of techniques for removing textual and non-textual noise using a method of projection profile analysis.  ... 
doi:10.14419/ijet.v7i4.36.24244 fatcat:ajzzzjv7fbfm7midm3wqf45h5m

A simple and effective approach for border noise removal from document images

Faisal Shafait, Thomas M. Breuel
2009 2009 IEEE 13th International Multitopic Conference  
In this paper, we present a simple and effective approach for removing both textual and non-textual noise by finding borders of noise regions using projection profile analysis.  ...  We demonstrate the effectiveness of our approach by evaluating it quantitatively on the widely used University of Washington (UW3) dataset.  ...  ACKNOWLEDGMENTS This work was partially funded by the BMBF (German Federal Ministry of Education and Research), project PaREn (01 IW 07001).  ... 
doi:10.1109/inmic.2009.5383115 fatcat:ogcm4rirfrhxnpgj4p4pdz3hc4

Development of Nom character segmentation for collecting patterns from historical document pages

Truyen Van Phan, Bilan Zhu, Masaki Nakagawa
2011 Proceedings of the 2011 Workshop on Historical Document Imaging and Processing - HIP '11  
To improve the performance of segmentation, we use the recursive x-y cut method to segment separated regions. We evaluate the performance of this method on several pages in different layouts.  ...  The results confirm that the method is effective for character segmentation in Nom documents.  ...  They may become noises that can affect the segmentation process. Therefore, we have improved the method using projection profile in [11] to remove marginal noises effectively.  ... 
doi:10.1145/2037342.2037365 dblp:conf/icdar/PhanZN11 fatcat:emklf4fwozfofcjvu2ilgrasve

PageNet: Page Boundary Extraction in Historical Handwritten Documents [article]

Chris Tensmeyer, Brian Davis, Curtis Wigington, Iain Lee, Bill Barrett
2017 arXiv   pre-print
In this work, we present a deep learning based system, PageNet, which identifies the main page region in an image in order to segment content from both textual and non-textual border noise.  ...  We evaluate PageNet on 4 collections of historical handwritten documents and obtain over 94% mean intersection over union on all datasets and approach human performance on 2 of these collections.  ...  Stamatopoulos et al. proposed a system based on projection profiles to find the two individual page frames in images of books where two pages are shown.  ... 
arXiv:1709.01618v1 fatcat:sr4tku4uvrg2dd66dkhdw47cjq

Performance Comparison of Six Algorithms for Page Segmentation [chapter]

Faisal Shafait, Daniel Keysers, Thomas M. Breuel
2006 Lecture Notes in Computer Science  
However, we observe that the three best-performing algorithms are those based on constrained text-line finding, Docstrum, and the Voronoi-diagram.  ...  This paper presents a quantitative comparison of six algorithms for page segmentation: X-Y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, and Voronoi-diagram-based.  ...  Acknowledgments This work was partially funded by the BMBF (German Federal Ministry of Education and Research), project IPeT (01 IW D03).  ... 
doi:10.1007/11669487_33 fatcat:a7lw4t3z6jchnpbx23rio3jayq

Text Line Segmentation Based on Morphology and Histogram Projection

Rodolfo P. dos Santos, Gabriela S. Clemente, Tsang Ing Ren, George D.C. Cavalcanti
2009 2009 10th International Conference on Document Analysis and Recognition  
These procedures, however, cause some loss on the text line area. So, a recovery method is proposed to minimize this effect.  ...  Following, a sequence of histogram projection and recovery is proposed to obtain the line segmented region of the text.  ...  Once the page document has been preprocessed, a technique based on projection profiles is applied.  ... 
doi:10.1109/icdar.2009.183 dblp:conf/icdar/SantosCTC09 fatcat:6aslbhftdnfcxp4xsv24lvdloq

Document cleanup using page frame detection

Faisal Shafait, Joost van Beusekom, Daniel Keysers, Thomas M. Breuel
2008 International Journal on Document Analysis and Recognition  
Acknowledgments This work was partially funded by the BMBF (German Federal Ministry of Education and Research), project IPeT (01 IW D03).  ...  The approach in [10] tries to identify borders of noise regions based on an analysis of the projection profiles of the edges in the image.  ...  The approach in [11] tries to detect page borders based on an analysis of the projection profiles of the smeared image combined with a connected component labeling process.  ... 
doi:10.1007/s10032-008-0071-7 fatcat:57wemirbefgprplvvpcri6cjuu

A Robust Page Frame Detection Method for Complex Historical Document Images

Mohammad Reza, Md. Rakib, Syed Bukhari, Andreas Dengel
2019 Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods  
However, our target is first to remove both textual and non-textual noise from more challenging historical documents that occur in the page border area and then detect the page frame area based on the  ...  Shafait and Breuel have combined projection profile analysis with connected component removal to implement their method that can recognize the borders of noise areas.  ... 
doi:10.5220/0007382405560564 dblp:conf/icpram/RezaRBD19 fatcat:beqm4is7ajckndbtken2n7wxdy

Collecting Handwritten Nom Character Patterns from Historical Document Pages

Truyen Van Phan, Bilan Zhu, Masaki Nakagawa
2012 2012 10th IAPR International Workshop on Document Analysis Systems  
We have employed a projection profile based method for segmenting hundreds of pages into individual characters.  ...  In this paper, we present methods of segmenting Nom historical documents and clustering character patterns to build a Nom character pattern database.  ...  ACKNOWLEDGMENT The authors thank the National Library of Vietnam and the Vietnamese Nom Preservation Foundation for providing Nom historical document pages. The authors also thank Mr.  ... 
doi:10.1109/das.2012.25 dblp:conf/das/PhanZN12 fatcat:3qstvryphnalfemog4cxyibooe

Counteracting Dark Web Text-Based CAPTCHA with Generative Adversarial Learning for Proactive Cyber Threat Intelligence [article]

Ning Zhang, Mohammadreza Ebrahimi, Weifeng Li, Hsinchun Chen
2022 arXiv   pre-print
DW-GAN significantly outperformed the state-of-the-art benchmark methods on all datasets, achieving over 94.4% success rate on a carefully collected real-world dark web dataset...  ...  In particular, text-based CAPTCHA serves as the most prevalent and prohibiting type of these measures in the dark web.  ...  ACKNOWLEDGMENTS This material is based upon work supported by the National Science Foundation (NSF) under the following grants: SaTC-1936370, CICI-1917117, and SFS-1921485.  ... 
arXiv:2201.02799v2 fatcat:nepnavt6onaf7ncinm2x6ar5ey

Image Purification Technique for Myanmar OCR Applying Skew Angle Detection and Free Skew

Chit San Lwin, Wu Xiangqian
2019 International Journal of Scientific Research in Science and Technology  
Our technique implements skew angle detection and free skew, noisy border correction, extra page elimination, line segmentation from scanned images of Myanmar text.  ...  Performance of the proposed method is tested with 430 documents comprising different printed and handwritten Myanmar text of various fonts, sizes, multi-column, tables, stamps or photos, background effects  ...  effects, global and local skews, flow-charts, border, and extra pages.  ... 
doi:10.32628/ijsrst19615 fatcat:6ometi2xond4hatlo62tldcvxa

Page Frame Detection for Marginal Noise Removal from Scanned Documents [chapter]

Faisal Shafait, Joost van Beusekom, Daniel Keysers, Thomas M. Breuel
2007 Lecture Notes in Computer Science  
We describe and evaluate a method to robustly detect the page frame in document images, locating the actual page contents area and removing textual and non-textual noise along the page borders.  ...  We define suitable performance measures and evaluate the algorithm on the UW-III database. The results show that the error rates are below 4% for each of the performance measures used.  ...  This work was partially funded by the BMBF (German Federal Ministry of Education and Research), project IPeT (01 IW D03).  ... 
doi:10.1007/978-3-540-73040-8_66 fatcat:zfo7rqjs65cltdqpb5uukyksiu

The Skincare project, an interactive deep learning system for differential diagnosis of malignant skin lesions. Technical Report [article]

Daniel Sonntag, Fabrizio Nunnari, Hans-Jürgen Profitlich
2020 arXiv   pre-print
We give an overall description of the outcome of the Skincare project, and we focus on the steps to support communication and coordination between humans and machine in IML.  ...  This article describes the Skincare project (H2020, EIT Digital).  ...  See our project page about Skincare and the Interactive Machine Learning Lab (http: //  ... 
arXiv:2005.09448v1 fatcat:ofwcqevb5fagvaxf6cxrtbtb24