An Empirical Comparison of Text Categorization Methods [chapter]

Ana Cardoso-Cachopo, Arlindo L. Oliveira
2003 Lecture Notes in Computer Science  
In this paper we present a comprehensive comparison of the performance of a number of text categorization methods in two different data sets.  ...  We argue that this evaluation measure is also very well suited for text categorization tasks.  ...  A number of approaches to text categorization has been proposed. The goal of text categorization methods is to associate one (or more) of a given set of categories to a particular document.  ... 
doi:10.1007/978-3-540-39984-1_14 fatcat:kcpemgdt4nczbbcfh7qdi2fwey

An empirical comparison of min–max-modular k-NN with different voting methods to large-scale text categorization

Ke Wu, Bao-Liang Lu, Masao Utiyama, Hitoshi Isahara
2007 Soft Computing - A Fusion of Foundations, Methodologies and Applications  
Text categorization refers to the task of assigning the pre-defined classes to text documents based on their content. k-NN algorithm is one of top performing classifiers on text data.  ...  However, there is little research work on the use of different voting methods over text data.  ...  Text categorization has become one of the most important techniques to handle the problem. Text categorization aims to automatically assign documents into some predefined categories.  ... 
doi:10.1007/s00500-007-0242-3 fatcat:vu7gbrhnqfeoboy6qkajd2es5u

An Empirical Study of Category Skew on Feature Selection for Text Categorization [chapter]

Mondelle Simeon, Robert Hilderman
2009 Lecture Notes in Computer Science  
In this paper, we present an empirical comparison of the effects of category skew on six feature selection methods.  ...  The methods were evaluated on 36 datasets generated from the 20 Newsgroups, OHSUMED, and Reuters-21578 text corpora.  ...  Our Contribution In this paper, we present an extensive empirical comparison of six feature selection methods.  ... 
doi:10.1007/978-3-642-01818-3_35 fatcat:eyjm3nfkmrhoriesufxbjtnksq

Variance based classifier comparison in text categorization (poster session)

Atsuhiro Takasu, Kenro Aihara
2000 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '00  
Yang conducted comprehensive study of comparison of text categorization and reported that k nearest neighbor and support vector machines works well for text categorization [4].  ...  Text categorization is one of the key functions for utilizing vast amount of documents.  ... 
doi:10.1145/345508.345618 dblp:conf/sigir/TakasuA00 fatcat:gaqwmdchmvdf7jx6uhod4bjhbm

An empirical evaluation of text classification and feature selection methods

Muazzam Ahmed Siddiqui
2016 Artificial intelligence research  
An extensive empirical evaluation of classifiers and feature selection methods for text categorization is presented.  ...  We found statistically significant difference between the performance of Support Vector Machine and other classifiers on text categorization problems.  ...  INTRODUCTION This paper presents an empirical evaluation of text categorization and feature selection methods applied on five benchmark corpora.  ... 
doi:10.5430/air.v5n2p70 fatcat:utpb25jxhreiflpge5dsvjetmm

IGICA: A Hybrid Feature Selection Approach in Text Categorization

Mohammad Mojaveriyan, Hossein Ebrahimpour-komleh, Seyed jalaleddin Mousavirad
2016 International Journal of Intelligent Systems and Applications  
In the text categorization, there are many features which most of them are redundant.  ...  Indeed, feature selection is a method to select an appropriate subset of features for increasing the performance of learning algorithms.  ...  Each of these groups has various methods which a few ones are used for large text categorization problems [2].  ... 
doi:10.5815/ijisa.2016.03.05 fatcat:huwq5vud2vag7hhch4ppulzsjy

Empirical Evaluations of Automatic Forum Selector

Chen-Huei Chou
2012 International Journal of Computer and Communication Engineering  
In this study, we propose the use of text categorization approach to automatically select a target forum category. The empirical evaluations demonstrate the utility of text categorization approach.  ...  Online discussion forums are common methods used in electronic Customer Relationship Management.  ...  First we review the text categorization techniques. We then describe the details of empirical evaluations and discuss the results.  ... 
doi:10.7763/ijcce.2012.v1.16 fatcat:b74drzjo3bclfe5r6zgysxgu6q

An application of Expert Network to clinical classification and MEDLINE indexing

Y Yang, C G Chute
1994 Proceedings. Symposium on Computer Applications in Medical Care  
ExpNet predicts the related categories of an arbitrary text based on a search of its nearest neighbors in a set of training texts, and a reasoning from the expert-assigned categories of these neighbors  ...  An effective and efficient learning method, Expert Network (ExpNet), is introduced in this paper.  ...  METHOD Expert Network is designed to predict the category or categories of an arbitrary text ("the request") based on previously categorized texts.  ... 
pmid:7949911 pmcid:PMC2247915 fatcat:nwazt6xymvbfhdnunz4c3bvzae

A Lexical Approach for Text Categorization of Medical Documents

Rajni Jindal, Shweta Taneja
2015 Procedia Computer Science  
This research proposes a novel lexical approach to text categorization in the bio-medical domain.  ...  It automatically classifies journal articles of medical domain into specific categories.  ...  Background Work A lot of work has been done in the field of Text Categorization of medical documents. Authors have proposed different methods for categorizing the documents.  ... 
doi:10.1016/j.procs.2015.02.026 fatcat:ldvmk3flebgxvjwp6rdubpyk6i

HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization

Kazuya Shimura, Jiyi Li, Fumiyo Fukumoto
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
We focus on the multi-label categorization task for short texts and explore the use of a hierarchical structure (HS) of categories.  ...  The lower the HS level, the worse the categorization performance. Because lower categories are fine-grained and the amount of training data per category is much smaller than that in an upper level.  ...  This work is supported in part by Support Center for Advanced Telecommunications Technology Research, Foundation and the Grant-in-aid for the Japan Society for the Promotion of Science, No.17K00299.  ... 
doi:10.18653/v1/d18-1093 dblp:conf/emnlp/ShimuraLF18 fatcat:abqfqna6brcxraklvubvrfbm6m

Support Vector Machines based Arabic Language Text Classification System: Feature Selection Comparative Study [chapter]

Abdelwadood. Moh'd. Mesleh
2008 Advances in Computer and Information Sciences and Engineering  
This paper investigates the effectiveness of six commonly used feature selection methods. Evaluation used an in-house collected Arabic text classification corpus, and classification is based on Support  ...  Feature selection is essential for effective and accurate text classification systems.  ...  It is quite hard to fairly compare the effectiveness of these approaches because of the following reasons: [29, 30] have presented an extensive empirical study of many FS methods with kNN and SVMs, it  ... 
doi:10.1007/978-1-4020-8741-7_3 dblp:conf/cisse/Mesleh07 fatcat:o7zi45wsnvc6ddxxj3qlr3s5b4

The effectiveness of homogenous ensemble classifiers for Turkish and English texts

Zeynep Hilal Kilimci, Selim Akyokus, Sevinc Ilhan Omurca
2016 2016 International Symposium on INnovations in Intelligent SysTems and Applications (INISTA)  
A wide range of comparative and extensive empirical studies are conducted on four widely-used datasets in text categorization domain in both Turkish and English.  ...  We perform a comparative analysis of the impact of the ensemble techniques for text categorization domain.  ...  An excellent review of text categorization algorithms is given in papers [1, 2].  ... 
doi:10.1109/inista.2016.7571854 dblp:conf/inista/KilimciAO16 fatcat:gg2xmmikhrbufmylbklspkrcfi

A Survey Report on Text Classification with Different Term Weighing Methods and Comparison between Classification Algorithms

Anuradha Patra, Divakar Singh
2013 International Journal of Computer Applications  
This paper surveys text classification, the process of text classification, different term weighing methods, and comparisons between different classification algorithms.  ...  Text categorization is the task of assigning predefined categories to documents.  ...  INTRODUCTION Text classification is an important part of text mining. Current research of text classification aims to improve the quality of text representation and develop high quality classifiers.  ... 
doi:10.5120/13122-0472 fatcat:fq53obkuzffzvfuj33lbd6tyw4

A multimedia information fusion framework for web image categorization

Wenting Lu, Lei Li, Jingxuan Li, Tao Li, Honggang Zhang, Jun Guo
2012 Multimedia tools and applications  
Previous approaches on image categorization focus on either adopting text or image features, or simply combining these two types of information together.  ...  With the rapid development of technologies for fast Internet access and the popularization of digital cameras, an enormous number of digital images are posted and shared online everyday.  ...  Acknowledgements This work is partially supported by the Army Research Office under grant number W911NF-10-1-0366, the National Natural Science Foundation of China under Grant No.61175011, and the China  ... 
doi:10.1007/s11042-012-1165-2 fatcat:xw7k6jvep5dohl4flwyoeqfu3i


Yiming Yang
2012 Information retrieval (Boston)  
This paper focuses on a comparative evaluation of a wide-range of text categorization methods, including previously published results on the Reuters corpus and new results of additional experiments.  ...  Analysis and empirical evidence suggest that the evaluation results on some versions of Reuters were significantly affected by the inclusion of a large portion of unlabelled documents, making those results  ...  Ault for many valuable suggestions for improving the writing of this paper.  ... 
doi:10.1023/a:1009982220290 fatcat:xhkofbrstnhvtfhd6f3k7vzqni