Partitioning and searching dictionary for correction of optically read Devanagari character strings

V. Bansal, R.M.K. Sinha
2002 International Journal on Document Analysis and Recognition  
This paper describes a method for correction of optically read Devanagari character strings using a Hindi word dictionary.  ...  A tag is a string of fixed length associated with each partition. The correction process uses a distance matrix for assigning penalty for a mismatch.  ...  Part of the work has been supported by Department of Electronics, Govt. of India.  ... 
doi:10.1007/s100320100066 fatcat:ejoamexjk5d6vnsdiijg5td5ce

Partitioning and searching dictionary for correction of optically read Devanagari character strings

V. Bansal, R.M.K. Sinha
1999 Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318)  
This paper describes a method for correction of optically read Devanagari character strings using a Hindi word dictionary.  ...  A tag is a string of fixed length associated with each partition. The correction process uses a distance matrix for assigning penalty for a mismatch.  ...  Part of the work has been supported by Department of Electronics, Govt. of India.  ... 
doi:10.1109/icdar.1999.791872 dblp:conf/icdar/BansalS99a fatcat:6kyqe4bx4zhird5rhk7dhlhb34

Integrating knowledge sources in Devanagari text recognition system

V. Bansal, R.M.K. Sinha
2000 IEEE transactions on systems, man and cybernetics. Part A. Systems and humans  
The knowledge sources we use are mostly statistical in nature or in the form of a word dictionary tailored specifically for optical character recognition (OCR).  ...  A performance of approximately 90% correct recognition is achieved. Index Terms-Devanagari document processing, knowledge-based systems, optical character recognition.  ...  An optical character recognition (OCR) for Devanagari and Bangla (an Indian language script) printed script has been described by Chaudhuri and Pal [9] , [10] .  ... 
doi:10.1109/3468.852443 fatcat:rxgiruwpl5aonbjjv2rulquare

Genetic studies of variation in rayleigh and photometric matches in normal trichromats

Margaret Lutze, Nancy J. Cox, Vivianne C. Smith, Joel Pokorny
1990 Vision Research  
doi:10.1016/0042-6989(90)90134-7 pmid:2321360 fatcat:nftmglimfngfbmzh6wvyjye5mi

Shape Encoded Post Processing of Gurmukhi OCR

Dharam Veer Sharma, Gurpreet Singh Lehal, Sarita Mehta
2009 2009 10th International Conference on Document Analysis and Recognition  
A post-processor is an integral part of any OCR system. This paper proposes a method for detection and correction of errors in recognition results of handwritten and machine printed Gurmukhi OCR.  ...  The corresponding code is then searched in the dictionary. If it exists then words from the list of the code are matched with the source word.  ...  The accuracy obtained for OCR result improvement was between 4.65% for machine printed words and names and 7.45% for names obtained from form processing.  ... 
doi:10.1109/icdar.2009.180 dblp:conf/icdar/SharmaLM09 fatcat:zs64rhodzzhd5cdgycnpjp3vk4

Vartani Spellcheck – Automatic Context-Sensitive Spelling Correction of OCR-generated Hindi Text Using BERT and Levenshtein Distance [article]

Aditya Pal, Abhijit Mustafi
2020 arXiv   pre-print
We use a lookup dictionary and context-based named entity recognition (NER) for detection of possible spelling errors in the text.  ...  and difficulty in segmenting characters in a word.  ...  We would also like to thank the creators of publicly available Hindi datasets which were used extensively in our research.  ... 
arXiv:2012.07652v1 fatcat:u55lueliknbrhn3m4uvhxrw4qi

A post-processor for Gurmukhi OCR

G. S. Lehal, Chandan Singh
2002 Sadhana (Bangalore)  
A post-processing system for OCR of Gurmukhi script has been developed.  ...  Statistical information of Punjabi language syllable combinations, corpora look-up and certain heuristics based on Punjabi grammar rules have been combined to design the post-processor.  ...  Bansal & Sinha (1999) have developed a partitioned word dictionary for correcting optically read Devanagari character strings.  ... 
doi:10.1007/bf02703315 fatcat:zviej6qw2fgibcobwzju54uvxi

The Unicode Standard

Joan M. Aliprand
2000 Library resources & technical services  
The Unicode Character Database and other files are provided as-is by Unicode®, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied.  ...  and/or reading of the given text.  ... 
doi:10.5860/lrts.44n3.160 fatcat:t5yh4d5gtzfp7dp52ex234xc7m

Text Detection and Recognition in Imagery: A Survey

Qixiang Ye, David Doermann
2015 IEEE Transactions on Pattern Analysis and Machine Intelligence  
The categories and sub-categories of text are illustrated, benchmark datasets are enumerated, and the performance of the most representative approaches is compared.  ...  Special issues associated with the enhancement of degraded text and the processing of video text, multi-oriented, perspectively distorted and multilingual text are also addressed.  ...  Jie Chen, the Associate Editor and the reviewers for their comments and suggestions.  ... 
doi:10.1109/tpami.2014.2366765 pmid:26352454 fatcat:cuz3qhkglnahdebxqptbsgpjmm

Document image analysis: A primer

Rangachar Kasturi, Lawrence O'Gorman, Venu Govindaraju
2002 Sadhana (Bangalore)  
OCR makes it possible for the user to edit or search the document's contents. In this paper we briefly describe various components of a document analysis system.  ...  A well-known document image analysis product is the Optical Character Recognition (OCR) software that recognizes characters in a scanned document.  ...  OCR for Indian languages Proceedings of the Fifth International Conference on Document Analysis and Recognition (ICDAR 1999) has several papers dealing with OCR for Devanagari (such as Karnik 1999) .  ... 
doi:10.1007/bf02703309 fatcat:yfgn35ljn5bctbz2uc6d5zzcfu

An overview of character recognition focused on off-line handwriting

N. Arica, F.T. Yarman-Vural
2001 IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews)  
This material serves as a guide and update for the readers, working in the Character Recognition area. First, an overview of CR systems and their evolution over time is presented.  ...  Then, the available CR techniques with their superiorities and weaknesses are reviewed. Finally, the current status of CR is discussed and directions for future research are suggested.  ...  Paul Henry Leech for his suggestions and to the referees for their comments.  ... 
doi:10.1109/5326.941845 fatcat:jb3bjig3p5abtmgjz3bowuhdfu

Real-time character reading system for Marathi script using Raspberry Pi

S. Shelke, S. Apte
2016 3rd International Conference on Electrical, Electronics, Engineering Trends, Communication, Optimization and Sciences (EEECOS 2016)   unpublished
Development of real time recognition systems for character recognition for Indian scripts is a challenging task. This paper presents a novel real time character reading system for Marathi script.  ...  The system is developed using Raspberry Pi and OpenCV Python. The reading system is composed of two modules, namely, image acquisition module and character recognition module.  ...  "Partitioning and searching dictionary for correction of optically read Devanagari character strings", 5th International Conference on  ... 
doi:10.1049/cp.2016.1552 fatcat:4rnbff4kgvacliefpvjj3fagzm

Text Recognition in the Wild: A Survey [article]

Xiaoxue Chen, Lianwen Jin, Yuanzhi Zhu, Canjie Luo, Tianwei Wang
2020 arXiv   pre-print
It provides a comprehensive reference for people entering this field, and could be helpful to inspire future research.  ...  In recent years, with the rise and development of deep learning, numerous methods have shown promise in terms of innovation, practicality, and efficiency.  ...  Various shapes of text increase the difficulty of recognizing characters and predicting text strings.  ... 
arXiv:2005.03492v3 fatcat:rmzmavxylnf6rbp52lje2mrgiy

Intelligent Character Recognition System Using Convolutional Neural Network

S. Suriya, Dhivya S, Balaji M
2020 EAI Endorsed Transactions on Cloud Systems  
the equality rate after data augmenting to increase the efficiency of the system in learning and recognizing the character.  ...  the presence of low resolution, substantial blur, low contrast, and other distortions.  ...  [6] presented a review of Optical Character Recognition; this paper reviews research on English, Arabic and Devanagari characters, the methodology used, and the challenges  ... 
doi:10.4108/eai.16-10-2020.166659 fatcat:rrv3tyk2ezegdhcwsvuvvkgbrq

Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese [article]

Marek Rychlik, Dwight Nwaigwe, Yan Han, Dylan Murphy
2020 arXiv   pre-print
We report upon the results of a research and prototype building project Worldly OCR dedicated to developing new, more accurate image-to-text conversion software for several languages and writing systems  ...  We also describe approaches geared towards Traditional Chinese, which is non-cursive, but features an extremely large character set of 65,000 characters.  ...  The definition of OCR and its advantages Image-to-text conversion, also called Optical Character Recognition (OCR), is a crucial technology for Digital Humanities, connecting historical documents written  ... 
arXiv:2005.08650v1 fatcat:3nmbzaz72vgwnab2ts7iz6ugly