Course Evaluation Generator (Ceg): An Automated Academic Advising System with Optical Character Recognition

2018 International Journal of Technology and Engineering Studies  
Using ISO 25010, the weighted mean for accuracy is 3.14, indicating that user-respondents strongly agreed on the correctness of the courses generated to offer to a student.  ...  This research aims to explain the requirements of constructing software intended for the teaching department of ACLC College of Butuan (ACB) to optimize the course selection process, lessen the academic  ...  Monalee A. dela Cerna for the guidance and sincere suggestions to make this study more interesting and challenging; 2.  ... 
doi:10.20469/ijtes.4.10003-5 fatcat:cgip6zs5lrdovl34ey3fe2rauu

Automatic Extraction of Linguistic Data from Digitized Documents

Terrence Szymanski
2013 Proceedings of the annual meeting of the Berkeley Linguistics Society  
In lieu of an abstract, here is a brief excerpt: This paper presents a system for automatically extracting linguistic data from digitized linguistic documents using a combination of existing software packages  ...  The system is designed to leverage existing resources in online digital libraries in order to bootstrap the creation of large, multi-lingual linguistic corpora, which can then be used to conduct data-driven  ...  In order to evaluate the feasibility of this approach, we conducted an experiment on a controlled parallel corpus taken from the Tatoeba database.  ... 
doi:10.3765/bls.v39i1.3886 fatcat:u6i5xs3a7jgcdfdic3zbhcu5gq

OCR based slide retrieval

N. Daddaoua, J.M. Odobez, A. Vinciarelli
2005 Eighth International Conference on Document Analysis and Recognition (ICDAR'05)  
Retrieval experiments performed on a corpus of 570 slides (26 presentations) gathered at a workshop show that performance obtained with the OCR transcriptions is close to that obtained by extracting  ...  Since the most suitable acquisition technique, in such a context, is the use of a framegrabber (a device capturing as images the slides displayed on a screen), the slides must be transcribed with an Optical  ...  The text result is selected from all the generated hypotheses based on a confidence value computed for each recognized string from language modeling and OCR recognition statistics.  ... 
doi:10.1109/icdar.2005.169 dblp:conf/icdar/DaddaouaOV05 fatcat:ls7o45hddjgrfjjqil64hhqbni

Reusing the Model and Components of an IIR Study for Perceived Effects of OCR Quality Change

Kimmo Kettunen, Heikki Keskustalo, Birger Larsen, Tuula Pääkkönen, Juha Rautiainen
2022 Zenodo  
However, the research design and its general model could be utilized in the future to study the effects of OCR quality on professional settings entailing historians performing naturalistic phases of their  ...  However, it remains challenging to measure how the user's subjective perception is affected by the amount of OCR noise remaining in the documents.  ...  OCR Software and OCR Quality: The baseline OCR for Uusi Suometar was performed using a series of ABBYY FineReader® products.  ... 
doi:10.5281/zenodo.6513586 fatcat:cg673qdtdbgsnlfav5kw6btt3m

Document Image Quality Assessment: A Brief Survey

Peng Ye, David Doermann
2013 2013 12th International Conference on Document Analysis and Recognition  
This paper provides a brief survey of research on the topic of document image quality assessment. We first present a detailed analysis of the types and sources of document degradations.  ...  of degradations and develop reliable methods for estimating the levels of degradations.  ...  When the consumer of a document image is a machine, OCR software for example, document image quality may be defined as the OCR accuracy, and DIQA metrics are factors that can be used to reliably predict OCR  ... 
doi:10.1109/icdar.2013.148 dblp:conf/icdar/YeD13 fatcat:f7imuvlkozdr5ha4qdyqcuv37e

A survey on Arabic character segmentation

Yasser M. Alginahi
2012 International Journal on Document Analysis and Recognition  
This is due to both the cursive nature of Arabic writing in both printed and handwritten forms and the scarcity of Arabic databases and dictionaries.  ...  This survey presents the description of the Arabic script characteristics with an overview on OCR systems and a comprehensive review mainly on off-line printed Arabic character segmentation techniques.  ...  To the best knowledge of the author, very few references are available which provide some evaluation of Arabic OCR software.  ... 
doi:10.1007/s10032-012-0188-6 fatcat:w5hszp2ksbcb3kw627yw2cwehy

The OCRopus open source OCR system

Thomas M. Breuel, Berrin A. Yanikoglu, Kathrin Berkner
2008 Document Recognition and Retrieval XV  
This paper describes the current status of the system, its general architecture, as well as the major algorithms currently being used for layout analysis and text line recognition.  ...  Above, we saw generally how the processing steps of the OCRopus system fit together. Let us now look at each of the processing steps in more detail.  ...  of the OCR engine to generate this markup.  ... 
doi:10.1117/12.783598 dblp:conf/drr/Breuel08 fatcat:k4cdglpamvee7ajcmrarop66bq

Extensible System for Optical Character Recognition of Maintenance Documents

John Anthony Labarga, Amardeep Singh, Vera Zaychik Moffitt
2018 Proceedings of the Annual Conference of the Prognostics and Health Management Society, PHM  
In the course of maintenance and operations, equipment operators and manufacturers frequently generate large volumes of paper documents.  ...  To implement analytics or automated monitoring, these documents must later be converted to digital copies, which can be ingested into a database.  ...  This generally requires scanning a document to produce an image, using OCR to produce a digital copy of the text, in an appropriate format for a database.  ... 
doi:10.36001/phmconf.2018.v10i1.480 fatcat:bribomxkkbfzjnhe4sygmaz65e

AutoDBT: A Framework for Automatic Testing of Web Database Applications [chapter]

Lihua Ran, Curtis E. Dyreson, Anneliese Andrews
2004 Lecture Notes in Computer Science  
AutoDBT uses the model along with the test criteria to generate test cases for functional testing of the application.  ...  AutoDBT automatically generates a guard query for each test case. The guard determines whether the test can be performed given the current state of the database.  ...  In particular they show how to quickly generate, in parallel, a large database that obeys certain statistical properties among the records generated.  ... 
doi:10.1007/978-3-540-30480-7_20 fatcat:aemcdgam2vgdvltm6rc6jchcam

An automatic linking service of document images reducing the effects of OCR errors with latent semantics

Renato F. Bulcão-Neto, José Camacho-Guerrero, Álvaro Barreiro, Javier Parapar, Alessandra A. Macedo
2010 Proceedings of the 2010 ACM Symposium on Applied Computing - SAC '10  
This paper presents a novel approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR).  ...  Results show the feasibility of LinkDI relating OCR output with high degradation.  ...  the funding support.  ... 
doi:10.1145/1774088.1774092 dblp:conf/sac/NetoGBPM10 fatcat:xqujwg56qvhgrm7zev6467qfna

Does Removing Pooling Layers from Convolutional Neural Networks Improve Results?

Claudio Filipi Goncalves dos Santos, Thierry Pinheiro Moreira, Danilo Colombo, João Paulo Papa
2020 SN Computer Science  
In this context, there is a trend already in motion to replace convolutional pooling layers with a stride operation in the previous layer to save time.  ...  In this work, we evaluate the speedup of such an approach and how it trades off with accuracy loss in multiple computer vision domains, deep neural architectures, and datasets.  ...  On behalf of all authors, the corresponding author states that there is no conflict of interest.  ... 
doi:10.1007/s42979-020-00295-9 fatcat:shesvjh4c5bf7fdhpyyumyjxim

Survey of Automatic Spelling Correction

Daniel Hládek, Ján Staš, Matúš Pleva
2020 Electronics  
Although each article contains a brief introduction to the topic, there is a lack of work that would summarize the theoretical framework and provide an overview of the approaches developed so far.  ...  The survey describes selected approaches in a common theoretical framework based on Shannon's noisy channel. A separate section describes evaluation methods and benchmarks.  ...  The two following evaluation methodologies are used to evaluate spelling: • Mean reciprocal rank: A statistical measure for evaluating any process that produces a list of possible responses to a sample  ... 
doi:10.3390/electronics9101670 fatcat:pgf65dpwp5b2xc2hc6xxf5pplm

Information Retrieval Based on OCR Errors in Scanned Documents

Y. Fataicha, M. Cheriet, J. Y. Nie, C. Y. Suen
2003 2003 Conference on Computer Vision and Pattern Recognition Workshop  
The proposed algorithm consists of two basic steps. In the first step, we apply editing operations on OCR words that generate a collection of error-grams and correction rules.  ...  In this paper, we describe an approach that integrates the detection of errors in scanned texts without relying on a lexicon, and this detection is integrated in the research process.  ...  For example, if the word "light" is a term query, it is statistically uncertain because OCR confuses "i" with "l" and "c" with "e" etc. Thus, we generate 32 words.  ... 
doi:10.1109/cvprw.2003.10020 dblp:conf/cvpr/FataichaCNS03 fatcat:fmzhvw2bz5aepipgucu3xa5nsy

Optical Character Recognition [chapter]

2016 Practical Laboratory Automation  
The first OCR machines appear; First generation OCR; Second generation OCR; Third generation OCR; OCR to the people.  ...  Table 2: Evaluation of feature extraction techniques.  ...  In statistical classification, a probabilistic approach to recognition is applied.  ... 
doi:10.1002/9783527801954.app2 fatcat:i7yhlctvwnh23fvgfyqubxsvhm

optical character recognition [chapter]

Martin H. Weik
2000 Computer Science and Communications Dictionary  
doi:10.1007/1-4020-0613-6_12944 fatcat:6gd2qmtoxbdvjebz3mc6yeizia