A Comparative study on Term Weighting Methods for Automated Telugu Text Categorization with Effective Classifiers

Vishnu Murthy G, Vishnu Vardhan B, Sarangam K, Vijay pal Reddy P
2013 International Journal of Data Mining & Knowledge Management Process  
This paper investigates the performance of different classification approaches using different term weighting approaches in order to decide the most applicable one to Telugu text classification problem  ...  We have investigated on different term weighting methods for Telugu corpus in combination with Naive Bayes (NB), Support Vector Machine (SVM) and k Nearest Neighbor (kNN) classifiers.  ...  In this paper, we considered the SVM, KNN and NB classification approaches for Telugu Text Categorization.  ... 
doi:10.5121/ijdkp.2013.3606 fatcat:6zi7iupmmzegzadym2imufrene

Influence of Lexical, Syntactic and Structural Features and their Combination on Authorship Attribution for Telugu Text

S. Naga Prasad, V.B. Narsimha, P. Vijayapal Reddy, A. Vinaya Babu
2015 Procedia Computer Science  
AA is based on the classification of documents on author writing style rather than the topic of the text.  ...  In this paper experimental evaluations were carried out on Telugu text for Authorship Attribution using various types of features and their combinations.  ...  It can also be experimented with various feature selection approaches for feature vector dimensionality reduction.  ... 
doi:10.1016/j.procs.2015.04.110 fatcat:tyf4rgt5fzbyhkkvx6u6rgouqi

Dimension Reduction for Script Classification - Printed Indian Documents

Hamsaveni L, Pradeep C, Chethan H K
2017 International Journal of Advanced Information Technology  
Automatic identification of a script in a given document image facilitates many important applications such as automatic archiving of multilingual documents, searching online archives of document images  ...  the relative performance of classification procedures incorporating those methods.  ...  PCA, PLS and SIR are three of such methods for dimension reduction.  ... 
doi:10.5121/ijait.2017.7301 fatcat:3tvy3bop3vgwdacqvf7nqoqo4i

Review of offline handwritten text recognition in south Indian languages

A. T. Anju, Binu P. Chacko, K. P. Mohammad Basheer
2021 Malaya Journal of Matematik  
In this article, offline handwriting recognition methods performed in south Indian languages including Telugu, Tamil, Kannada and Malayalam are presented.  ...  Offline Handwritten character recognition is a popular and challenging area of research under pattern recognition and image processing.  ...  Mainly two approaches are used for the recognition process such as analytical approach and holistic approach.  ... 
doi:10.26637/mjm0901/0132 fatcat:6ckkcuiakbcivljp45occntya4

PCA plus LDA on Wavelet Co-occurrence Histogram Features for Texture Classification and its Applications

Shivashankar S., Hiremath P. S.
2013 International Journal of Computer Applications  
A combination of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) is applied on WCH feature vector for dimensionality reduction and enhancement of the class separability respectively  ...  In this paper, we propose a combined approach, namely, PCA plus LDA on Wavelet Co-occurrence Histogram Features (WCHF) for texture classification.  ...  Further these features are used for dimensionality reduction.  ... 
doi:10.5120/10432-5109 fatcat:vk3kt6ekcbc23odhckzvxhv3na

Online Handwritten Text Recognition for Indian Scripts

Ravneet Kaur, Dharam Veer Sharma
2017 IOSR Journal of VLSI and Signal processing  
The above discussed approaches have their own applicability, but they are having limited domain.  ...  In this paper a review of online HCR work on almost all popular Indian scripts such as Devanagari, Gurmukhi, Bangla, Tamil, Telugu, Malayalam, Urdu, Kannada, Oriya, and Gujarati is presented.  ...  Feature combinations extracted from the size normalized characters are fed to 2D-LDA for dimensionality reduction and nearest neighbor classifier is used for classification.  ... 
doi:10.9790/4200-0704013950 fatcat:ingqu5rnx5cznnwmtsqqgzvpvy

Sparse Concept Coded Tetrolet Transform for Unconstrained Odia Character Recognition [article]

Kalyan S Dash, N B Puhan, G Panda
2020 arXiv pre-print
In this regard, we propose a new image representation approach for unconstrained handwritten alphanumeric characters using sparse concept coded Tetrolets.  ...  The sparse concept coding of low entropy Tetrolet representation is found to extract the important hidden information (concept) for superior pattern discrimination.  ...  reduction. 3) Extensive performance evaluation is carried out in ten databases comprising of English, Arabic, Bangla, Devanagari, Odia and Telugu scripts.  ... 
arXiv:2004.01551v1 fatcat:paj4fpnatngddmy7eogk6fmfve

Empirical Evaluation of Character Classification Schemes

Neeba N.V, C.V. Jawahar
2009 2009 Seventh International Conference on Advances in Pattern Recognition  
In this paper, we empirically study the performance of a set of pattern classification schemes for character classification problems.  ...  Scope of this study include (a) applicability of a spectrum of classifiers and features (b) scalability of classifiers (c) sensitivity of features to degradation (d) generalization across fonts and (e)  ...  In both these schemes, the feature extraction scheme is derived out of the data covariance. Random projection is a data independent method for dimensionality reduction.  ... 
doi:10.1109/icapr.2009.41 dblp:conf/icapr/NeebaJ09 fatcat:4jt7ed3u3zfhtgxcrlptnxreke

Zone-based hybrid feature extraction algorithm for handwritten numeral recognition of four Indian scripts

S.V. Rajashekararadhya, Vanaja P Ranjan
2009 2009 IEEE International Conference on Systems, Man and Cybernetics  
The nearest neighbor and support vector machine classifiers are used for subsequent classification and recognition purposes.  ...  We obtained 97.85%, 96.8%, 95.1% and 95% recognition rates for Kannada, Telugu, Tamil and Malayalam numerals respectively, using support vector machine.  ...  Nearest neighbor classifier for classification and recognition: For large-scale pattern matching, a long-employed approach is the NNC.  ... 
doi:10.1109/icsmc.2009.5346007 dblp:conf/smc/RajashekararadhyaR09 fatcat:62nundzcx5d5tifhcdgynilj5i

Genre Classification of Telugu and English Movie Based on the Hierarchical Attention Neural Network

Kumar Govindaswamy (Bharathiar University), Shriram Ragunathan (Bharathiar University)
2021 International Journal of Intelligent Engineering and Systems  
Twitter data related to the Telugu and English movies are collected and applied to HANN for movie's genre classification. IMDB data are used to evaluate the performance of the proposed HANN method.  ...  Genre Classification of movies is useful in the movie recommendation system for video streaming applications like Amazon, Netflix, etc.  ...  Conflicts of Interest The authors declare no conflict of interest.  ... 
doi:10.22266/ijies2021.0228.06 fatcat:m7ckrx5enfdflk4gi75fzewjwa

Tools for Developing OCRs for Indian Scripts

M N S S K Pavan Kumar, S S Ravi Kiran, Abhishek Nayani, C V Jawahar, P J Narayanan
2003 2003 Conference on Computer Vision and Pattern Recognition Workshop  
An integrated approach to the design of OCRs for all Indian scripts has great benefits.  ...  We are building OCRs for all Indian languages following this approach as part of a system to provide tools to create content in them.  ...  In the feature selection, dimensionality reduction was performed using PCA analysis and the reduced dimension vectors were used as feature vectors.  ... 
doi:10.1109/cvprw.2003.10023 dblp:conf/cvpr/KumarKNJN03 fatcat:7rioppeaqjdc3fqbfhx4c4sn6m

Script Recognition—A Review

D Ghosh, T Dube, A P Shivaprasad
2010 IEEE Transactions on Pattern Analysis and Machine Intelligence  
It is noted that the research in this field is relatively thin and still more research is to be done, particularly in case of handwritten documents.  ...  In view of this, several methods for automatic script identification have been developed so far. They mainly belong to two broad categories: structure-based and visual appearance-based techniques.  ...  This may result in the curse of dimensionality.  ... 
doi:10.1109/tpami.2010.30 pmid:20975114 fatcat:3ysyxmflovej5f7gy2z7ogffou

Preferred Computational Approaches for the Recognition of different Classes of Printed Malayalam Characters using Hierarchical SVM Classifiers

Bindu Philip, R. D. Sudhaker Samuel
2010 International Journal of Computer Applications  
Characterization of matrices for efficient classification has several options. There are various alternatives depending on the structure of the matrix.  ...  The proposed algorithms have been tested on a variety of printed Malayalam documents. Recognition rates between 97.72% and 98.78% have resulted.  ...  The value of p is chosen by defining a value of ξ for reliable and robust classification. Five dominant singular values are selected as features for distinct classification of the characters.  ... 
doi:10.5120/350-530 fatcat:z2nymwlik5emxoezzfb5dungia

A Semi-automatic Adaptive OCR for Digital Libraries [chapter]

Sachin Rawat, K. S. Sesh Kumar, Million Meshesha, Indraneel Deb Sikdar, A. Balasubramanian, C. V. Jawahar
2006 Lecture Notes in Computer Science  
This paper presents a novel approach for designing a semi-automatic adaptive OCR for large document image collections in digital libraries.  ...  Applicability of our design for the recognition of Indian Languages is demonstrated. Recognition errors are used to train the OCR again so that it adapts and learns for improving its accuracy.  ...  This work was partially supported by the MCIT, Government of India for Digital Libraries Activities.  ... 
doi:10.1007/11669487_2 fatcat:t7kxm66ohjhwpl5kmyrdlgys6i

Mining of Bilingual Indian Web Documents

Kolla Bhanu Prakash, Arun Rajaraman
2016 Procedia Computer Science  
The present paper focuses on content extraction of such documents through a generic approach using pixel-based approach and mining through classification.  ...  Web and mobile communication are growing in popularity globally and regionally catering to different ways of information dissemination, rendering complex web documents having script, language and media  ...  Acknowledgement The author sincerely thanks the Chirala Engineering College Management for their kind support in providing resources for doing this research work.  ... 
doi:10.1016/j.procs.2016.06.103 fatcat:kluibz2ybjdglkdoeoj6jz2syq