Interactions Between Term Weighting and Feature Selection Methods on the Sentiment Analysis of Turkish Reviews [chapter]

Tuba Parlar, Selma Ayşe Özel, Fei Song
2018 Lecture Notes in Computer Science  
In this study, our aim is to examine the effects of term weighting methods on newly proposed Query Expansion Ranking (QER) feature selection method and also compare the classification results with one  ...  The experimental results show that when QER feature selection method is used with tf*idf term weighting method, the classification performance improves in terms of F-score.  ...  Term Weighting Methods It is important to create feature vectors in text classification.  ... 
doi:10.1007/978-3-319-75487-1_26 fatcat:er46dvr56fbzhpq54nq5fs52my

Nitelik Çıkarımı Yöntemlerinin Türkçe Metinlerin Sınıflandırılmasına Etkisi [The Effect of Feature Extraction Methods on the Classification of Turkish Texts]

Özge AKDOĞAN, Selma Ayşe ÖZEL
2019 Çukurova Üniversitesi Mühendislik-Mimarlık Fakültesi Dergisi  
Feature extraction is the most important preprocessing step of text classification task. Effects of preprocessing techniques on text mining for English have been extensively studied.  ...  In this study, we investigate the effects of feature extraction techniques on four different Turkish text classification problems including news classification, spam e-mail detection, sentiment analysis  ...  Term Weighting Methods Term weighting is an important preprocessing step in text classification, and in this step, we assign weights to terms with respect to their importance in the documents.  ... 
doi:10.21605/cukurovaummfd.637643 fatcat:zu5lnqef3bg3vheokfdbxzmzse
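
The term weighting step mentioned in the snippet above can be illustrated with a minimal sketch. It assumes the common tf*idf variant (raw term frequency times the natural log of N/df); the paper's exact formula is not shown in the snippet, and the function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = tf * ln(N / df), where tf is the raw count
    of the term in the document, N is the number of documents, and df is
    the number of documents containing the term.
    """
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

# Hypothetical toy corpus: "film" occurs in every document, so its weight is 0.
toy_docs = [["good", "good", "film"], ["bad", "film"]]
toy_weights = tf_idf(toy_docs)
```

Terms that appear in every document receive zero weight under this variant, which is why smoothed idf forms are often preferred in practice.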

The Evaluation of Accuracy Performance in an Enhanced Embedded Feature Selection for Unstructured Text Classification

Nur Syafiqah Mohd Nafis, Suryanti Awang
2020 Iraqi Journal of Science  
Text documents are unstructured and high dimensional. Effective feature selection is required to select the most important and significant feature from the sparse feature space.  ...  Thus, this paper proposed an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured  ...  It is successfully applied for feature weighting technique using document frequency (DF) and term frequency (TF) based feature selection in text classification.  ... 
doi:10.24996/ijs.2020.61.12.28 fatcat:ou6qvfbxrzhvvfbgd3gesxa6bi

PERFECTION OF CLASSIFICATION ACCURACY IN TEXT CATEGORIZATION

Rajeev Tripathi
2021 Zenodo  
The type of words utilised in the corpus and the type of features produced for classification have a big impact on the performance of a text classification model.  ...  When dealing with large amounts of text data, however, the model's performance and accuracy become a difficulty.  ...  CHI square statistics In text classification, CHI square statistics [4] is a helpful feature selection approach since it can quantify the connection between feature and class.  ... 
doi:10.5281/zenodo.5554295 fatcat:wfzfhxwoofaknfcsokidcn5lxe
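
As a rough illustration of how the chi-square statistic quantifies the connection between a feature and a class, the sketch below uses the standard 2x2 contingency-table form. It is not taken from the paper; the function names and the toy documents are hypothetical.

```python
def term_class_counts(docs, term, cls):
    """Count the 2x2 contingency cells for one term and one class.

    docs is a list of (token_set, label) pairs. Returns (a, b, c, d):
    a = docs in cls containing term, b = docs outside cls containing term,
    c = docs in cls without term,    d = docs outside cls without term.
    """
    a = b = c = d = 0
    for tokens, label in docs:
        has_term = term in tokens
        in_cls = label == cls
        if has_term and in_cls:
            a += 1
        elif has_term:
            b += 1
        elif in_cls:
            c += 1
        else:
            d += 1
    return a, b, c, d

def chi_square(a, b, c, d):
    """Chi-square score N(ad - cb)^2 / ((a+c)(b+d)(a+b)(c+d))."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # a marginal is empty, so the score is undefined; treat as 0
    return n * (a * d - c * b) ** 2 / denom

# Hypothetical toy data: "good" tracks the class exactly, "film" does not.
toy = [({"good", "film"}, "pos"), ({"good"}, "pos"),
       ({"bad", "film"}, "neg"), ({"bad"}, "neg")]
score_good = chi_square(*term_class_counts(toy, "good", "pos"))
score_film = chi_square(*term_class_counts(toy, "film", "pos"))
```

A higher score means the term's presence depends more strongly on the class, so feature selection keeps the top-scoring terms.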

Text Classification Based on Weighted Extreme Learning Machine

Hayder Mahmood Salman
2019 Ibn AL-Haitham Journal For Pure and Applied Science  
The huge amount of documents in the internet led to the rapid need of text classification (TC). TC is used to organize these text documents.  ...  These feature weights with the extracted features introduced as an input to the ELM that produced weighted Extreme Learning Machine (WELM).  ...  This phenomenon which made the importance of text classification begins to spring up. Text classification (TC) is the way toward assigning a document to a class by assessing its content segments.  ... 
doi:10.30526/32.1.1978 fatcat:izg5kgjblve4zpg4ba5z5ig6ke

N-grams based feature selection and text representation for Chinese Text Classification

Zhihua Wei, Duoqian Miao, Jean-Hugues Chauchat, Rui Zhao, Wen Li
2009 International Journal of Computational Intelligence Systems  
In this paper, text representation and feature selection strategies for Chinese text classification based on n-grams are discussed.  ...  Two steps feature selection strategy is proposed which combines the preprocess within classes with the feature selection among classes.  ...  In the situation with more than 3000 features, results in all cases with both methods are similar. • Feature selection based on n-gram frequency produces denser "text*feature" matrices than the ones based  ... 
doi:10.2991/ijcis.2009.2.4.5 fatcat:vzbja3er2zcorb4w5utd4prnya
doi:10.1080/18756891.2009.9727668 fatcat:junxr4o2pndkfi2xn4fydq6x34

Design and analysis of a general vector space model for data classification in Internet of Things

Jinguo Sang, Shanchen Pang, Yang Zha, Fan Yang
2019 EURASIP Journal on Wireless Communications and Networking  
This algorithm improves the feature selection and weighting methods by introducing synonym replacement to traditional text classification algorithms.  ...  This paper proposes a new text classification algorithm based on vector space model.  ...  In the text classification phase, the processing objects were the texts to be classified, and the frequency-based feature selection method and the TF weighting algorithm were used.  ... 
doi:10.1186/s13638-019-1581-3 fatcat:g6uhv3igvbhhdkgqtypdg7jhui

A New Fine-Grained Weighting Method in Multi-Label Text Classification

Chang-Hwan Lee
2014 Midwest Artificial Intelligence and Cognitive Science Conference  
Multi-label classification is one of the important research areas in data mining. In this paper, a new multilabel classification method using multinomial naive Bayes is proposed.  ...  We use a new fine-grained weighting method for calculating the weights of feature values in multinomial naive Bayes.  ...  Acknowledgments This work was supported in part by National Research Foundation of Korea (NRF) (Grant number: 2011-0023296).  ... 
dblp:conf/maics/Lee14 fatcat:jzznjj2lnfdojl6vri7ppth3ae

Relevance popularity: A term event model based feature selection scheme for text classification

Guozhong Feng, Baiguo An, Fengqin Yang, Han Wang, Libiao Zhang, Quan Zou
2017 PLoS ONE  
In traditional feature selection methods such as information gain and chi-square, the number of documents that contain a particular term (i.e. the document frequency) is often used.  ...  However, the frequency of a given term appearing in each document has not been fully investigated, even though it is a promising feature to produce accurate classifications.  ...  terms with more details and important (high frequency within the documents) information.  ... 
doi:10.1371/journal.pone.0174341 pmid:28379986 pmcid:PMC5381872 fatcat:qnixno3o75gkrjtw5zat5llzou

A feature selection method based on synonym merging in text classification system

Haipeng Yao, Chong Liu, Peiying Zhang, Luyao Wang
2017 EURASIP Journal on Wireless Communications and Networking  
As an important step in natural language processing (NLP), text classification system has been widely used in many fields, like spam filtering, news classification, and web page detection.  ...  In this paper, a feature selection algorithm based on synonym merging named SM-CHI is proposed.  ...  The difference with [19] is that we merge synonyms after feature selection based on CHI and we propose three improved weighting method for the merged feature words.  ... 
doi:10.1186/s13638-017-0950-z fatcat:f6r62dxperehrniawyz4rlwi4q

New Feature Selection and Weighting Methods Based on Category Information [chapter]

Gongshen Liu, Jianhua Li, Xiang Li, Qiang Li
2004 Lecture Notes in Computer Science  
The traditional methods of feature selection and weighting make the best of document information, but despise or ignore the category information.  ...  It is proved by the experiment that four famous classifiers based on new feature selection and weighting methods are more effective than those based on traditional methods.  ...  As table 1 shown, the precision and recall of every classifier are all improved with new feature selection and feature weighting methods.  ... 
doi:10.1007/978-3-540-30544-6_35 fatcat:rbkr7ob3qnbv7kqolormhtp7fe

A Feature Weight Algorithm for Text Classification Based on Class Information

Yong Fei Li
2013 Advanced Materials Research  
TFIDF algorithm was used for feature weighting in text classification. But the result of classification was not very well because of lack of class information in feature weighting.  ...  Class distinction ability and class description ability were introduced, respectively expressed by inverse class frequency and term frequency in class, document frequency in class.  ...  The feature weight was calculated based on term frequency (TF), inverse class frequency (ICF), term frequency in class (TFc), and document frequency in class (DFc).  ... 
doi:10.4028/www.scientific.net/amr.756-759.3419 fatcat:ptndm3lllfat7dpkj6jq6xex2u

Two new feature selection metrics for text classification

Durmuş Özkan Şahin, Erdal Kılıç
2019 Automatika  
The first recommended metric is Relevance Frequency Feature Selection metric that was obtained by adding new parameters to Relevance Frequency method that is used for term weighting in text classification  ...  The main problem about text classification is the increase in the required time and a decrease in the success of classification because of data size.  ...  Relevance Frequency Feature Selection (RFFS) RF is a method that was proposed by Lan for term weighting in text classification [14] .  ... 
doi:10.1080/00051144.2019.1602293 fatcat:knc3nmkr5rbffh4z5qckc6etg4

A Novel Document Representation Approach for Authorship Attribution

Sreenivas Mekala, Raghunadha Tippireddy, Vishnu Bulusu
2018 International Journal of Intelligent Engineering and Systems  
In this paper, the experimentation carried out with various stylistic features, feature selection measures and term weight measures identified in various text processing domains to predict the author of  ...  In the proposed approach the documents were represented with the weights of the documents specific to author group of documents.  ...  Discriminative feature selection term weight (DFSTW) measure DFS measure allocate more weight to the terms that are having high average term frequency in class cj and the terms with high occurrence rate  ... 
doi:10.22266/ijies2018.0630.28 fatcat:c33lu4hqyjb57ev3u6kocjmizm