Document Recognition for a Million Books
2006
D-Lib Magazine
Rather, Gamera is mentioned in this context because it provides a useful benchmark for the types of document recognition capabilities that might be useful with a large corpus of digitized books. ...
The presence of a large-scale book image corpus significantly raises the possibilities for these important document recognition capabilities, especially given the potential for statistical inferences or ...
doi:10.1045/march2006-choudhury
fatcat:lzcde56ervbp7bsfggues4276a
A Generic Method for Automatic Ground Truth Generation of Camera-captured Documents
[article]
2016
arXiv
pre-print
The third contribution is a novel method for the recognition of camera-captured document images. ...
The first contribution is a novel, generic method for automatic ground truth generation of camera-captured document images (books, magazines, articles, invoices, etc.). ...
ACKNOWLEDGMENTS This work is supported in part by CREST and JSPS Grant-in-Aid for Scientific Research (A)(25240028). ...
arXiv:1605.01189v1
fatcat:ldykx3z7yrb63aloo2arqk5mvy
Single Interface For Music Score Searching And Analysis (Simssa)
2015
Zenodo
Single Interface for Music Score Searching and Analysis (SIMSSA) project targets digitized music scores to design a global infrastructure for searching and analyzing music scores. ...
Specifically, we seek to provide researchers, musicians, and others to access the contents and metadata of a large number of scores in a searchable, digital format. ...
The Discovery sub-axis is developing a system that will automatically crawl millions of page images looking for digitized books with musical examples [8]. ...
doi:10.5281/zenodo.923821
fatcat:rpyvc6e26nepdouyet45ad5zfu
Digitizing a Million Books: Challenges for Document Analysis
[chapter]
2006
Lecture Notes in Computer Science
This paper describes the challenges for document image analysis community for building large digital libraries with diverse document categories. ...
The challenges are identified from the experience of the on-going activities toward digitizing and archiving one million books. ...
Raj Reddy, CMU for his valuable guidance of this project and also for his suggestions towards this paper. We thank Prof. N. Balakrishnan of IISC-Bangalore and Prof. ...
doi:10.1007/11669487_38
fatcat:up52vbsmizh7voy6g5lp56rzfu
Camlens – An Innovative Android Phone Application To Empower The Blind And Visually Impaired In Reading Any Kind Of Printed Text In Real-Time Using Opencv, Optical Character Recognition And Text-To-Speech
2018
Zenodo
by the blind and visually impaired person the usage of earphones. The genesis of the research comes from the fact that the three edges of a page of the book are easier to find with lesser possibilities ...
the perspective transformation of the cropped photograph to obtain an image containing the scanned document. ...
INTRODUCTION 285 million people are estimated to be visually impaired worldwide: 39 million are blind and 246 million have low vision, and 90% of these live in low-income settings. ...
doi:10.5281/zenodo.1451741
fatcat:g4yvu66adbbvtmdfyica46refa
Digital Document Image Retrieval Using Optical Music Recognition
2012
Zenodo
ACKNOWLEDGEMENTS This work would not have been possible without the efforts of a number of people. ...
Further funding was provided by the Centre for Interdisciplinary Research in Music Media and Technology and the Canadian Foundation for Innovation. ...
This means that for their required goal of 10 million books, their expected index size is two terabytes of which most of the information is OCR coordinate data. ...
doi:10.5281/zenodo.1415562
fatcat:4hlvg7rx25br5d3ivlyprz25li
Enabling Search over Large Collections of Telugu Document Images – An Automatic Annotation Based Approach
[chapter]
2006
Lecture Notes in Computer Science
For the first time, search is enabled over a massive collection of 21 Million word images from digitized document images. ...
Character recognition based approaches yield poor results for developing search engines for Indian language document images, due to the complexity of the script and the poor quality of the documents. ...
We demonstrate the power and scalability of our solution by creating a search engine over 500 books of Telugu language document images. The collection contained 75,000 pages with 21 million words. ...
doi:10.1007/11949619_75
fatcat:f2t7th6dtnfo5jwtnyaicwncly
KuroNet: Regularized Residual U-Nets for End-to-End Kuzushiji Character Recognition
2020
SN Computer Science
Over 3 million books on a diverse array of topics, such as literature, science, mathematics and even cooking are preserved. ...
Our proposed model KuroNet (which builds on Clanuwat et al. in International conference on document analysis and recognition (ICDAR), 2019) outperforms other model for Kuzushiji recognition. ...
Overall it has been estimated that there are over 3 million books preserved nationwide [4]. ...
doi:10.1007/s42979-020-00186-z
fatcat:4e5bdbmvxzfpzagc2a7ayzgpse
Universal Digital Library—Future research directions
2005
Journal of Zhejiang University: Science A
Other than the Digital Library of India Initiative which is part of the Million Books to the Web Project initiated by Prof Raj Reddy of Carnegie Mellon University, there are a few more initiatives in India ...
This paper presents the future directions for the Digital Library of India Initiative both in terms of growing collection and the technical challenges in managing such large collection poses. ...
Currently more than 120 000 books (around 50 million pages) have been scanned and most of them are available on the Web for free browsing. ...
doi:10.1631/jzus.2005.a1204
fatcat:ieoxjyhfwja5vj3ynqorwsfyyu
KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning
[article]
2019
arXiv
pre-print
Over 3 million books on a diverse array of topics, such as literature, science, mathematics and even cooking are preserved. ...
The result has been datasets with hundreds of millions of photographs of historical documents which can only be read by a small number of specially trained experts. ...
For these reasons the vast majority of these books and documents have not yet been transcribed into modern Japanese characters.
arXiv:1910.09433v1
fatcat:ap7u6mnaabfxxdknkwauwynqhe
Page 678 of MH: Mental Hygiene Vol. 38, Issue 4
[page]
1954
MH: Mental Hygiene
problems for millions of individuals. ...
In both books there is also recognition that with aging there occur frustrations and deprivations that may be beyond individual control, to pose, not a problem of the aged for society, but rather personal ...
Nearest neighbor based collection OCR
2010
Proceedings of the 8th IAPR International Workshop on Document Analysis Systems - DAS '10
We show from a selection of 33 Telugu books that starting with OCR labels for only 30% of the collection we can recognize the remaining 70% of the words in the collection with 70% accuracy using this approach ...
Conventional optical character recognition (OCR) systems operate on individual characters and words, and do not normally exploit document or collection context. ...
Manmatha was supported in part by the Center for Intelligent Information Retrieval and in part by NSF IIS-0910884. ...
doi:10.1145/1815330.1815357
dblp:conf/das/SankarJM10
fatcat:szhehqsdj5gyzareox2tzcd7fm
The Objectives and Activities of the Publishers Association's Serial Publishers Executive (SPE)
1997
Serials: The Journal for the Serials Community
The SPE aims to ensure that serial publishing is given its rightful place and recognition in the scheme of things. ...
Publishing turnover for academic and professional books in the United Kingdom was £694 million last year. Turnover for academic and professional journals was £626 million. ...
It was apparent during the preparations for the Dearing submission that the market for books is much better documented than the market for journals. ...
doi:10.1629/1024
fatcat:7koqk55ggbao7gt5jyyd5ceuqq
A Smart Reader for Blind People
2019
International Journal of Engineering and Advanced Technology
To read the text a human needs a vision. Survey conducted on several papers and systems provides hardware consisting of a camera interface with Raspberry Pi for processing the text. ...
The raspberry pi makes use of Optical Character Recognition (OCR) software installed in it, to perform the conversion of an image to text and similarly text to speech conversion. ...
Optical character recognition (OCR) is the technology used for translating a captured image of written text into machine-encoded text. ...
doi:10.35940/ijeat.f1285.0986s319
fatcat:dwdoi73l7nf5rg46mj7syaseum
Estimating the Effects of Text Genre, Image Resolution and Algorithmic Complexity needed for Sinhala Optical Character Recognition
2021
The International Journal on Advances in ICT for Emerging Regions
While optical character recognition for Latin based scripts have seen near human quality performance, the accuracy for the rounded scripts of South Asia still lags behind. ...
a realistic estimation of the complexity of recognizing the rounded script of Sinhala. ...
ACKNOWLEDGMENT This work was carried out as a part of a project funded by Theekshana -Research and Development Company. We acknowledge Mrs. ...
doi:10.4038/icter.v14i3.7231
fatcat:zbq2kjrlnrepbew5zt4grl2vuy