8,315 Hits in 6.5 sec

Supervised classification with conditional Gaussian networks: Increasing the structure complexity from naive Bayes

Aritz Pérez, Pedro Larrañaga, Iñaki Inza
2006 International Journal of Approximate Reasoning  
This work shows how discrete classifier induction algorithms can be adapted to the conditional Gaussian network paradigm to deal with continuous variables without discretizing them.  ...  The study suggests that semi-naive Bayes structure-based classifiers and, especially, the novel wrapper condensed semi-naive Bayes backward, outperform the rest of the presented classifiers  ...  We also thank the anonymous reviewers, whose comments helped us improve the quality of this work.  ... 
doi:10.1016/j.ijar.2006.01.002 fatcat:che3jvwycveorot62n6kkzpjni

Search-based class discretization [chapter]

Luís Torgo, João Gama
1997 Lecture Notes in Computer Science  
We present a methodology that enables the use of classification algorithms on regression tasks.  ...  The transformation consists of mapping a continuous variable into an ordinal variable by grouping its values into an appropriate set of intervals.  ...  The use of this iterative approach to estimate a parameter of a learning algorithm can be described as follows: the two main components of the wrapper approach are the way new  ... 
doi:10.1007/3-540-62858-4_91 fatcat:tv7iyzbc7jc2bmxdgl2wvqgwpu
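The Torgo and Gama snippet describes mapping a continuous variable to an ordinal one by grouping its values into intervals. As a rough illustration only: the equal-frequency cut points below are an assumed discretization scheme, not the paper's search-based method, and `equal_frequency_bins` is a hypothetical helper name.

```python
def equal_frequency_bins(values, k):
    """Map a continuous variable to k ordinal classes with roughly
    equal numbers of observations per interval (equal-frequency binning)."""
    ordered = sorted(values)
    n = len(ordered)
    # Lower boundary of each of the last k-1 intervals.
    cuts = [ordered[(i * n) // k] for i in range(1, k)]

    def to_class(v):
        for c, cut in enumerate(cuts):
            if v < cut:
                return c
        return k - 1

    return [to_class(v) for v in values]
```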

Speeding Up the Wrapper Feature Subset Selection in Regression by Mutual Information Relevance and Redundancy Analysis [chapter]

Gert Van Dijck, Marc M. Van Hulle
2006 Lecture Notes in Computer Science  
First, features are filtered by means of a relevance and redundancy filter using mutual information between regression and target variables.  ...  Second, a wrapper searches for good candidate feature subsets by taking the regression model into account. The advantage of a hybrid approach is threefold.  ...  Finally, it was shown that the filter preprocessing increases the speed of the wrapper approach in the feature subset search.  ... 
doi:10.1007/11840817_4 fatcat:awcgkuz7zrcjnhyfdqdlbwydh4
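The Van Dijck and Van Hulle entry describes filtering features by mutual information with the target before running the wrapper. A minimal sketch of that relevance step, assuming discretized inputs; `mutual_information` and `relevance_filter` are hypothetical names, and the paper's redundancy analysis is not reproduced here.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def relevance_filter(features, target, threshold):
    """Keep indices of feature columns whose MI with the target exceeds threshold."""
    return [i for i, col in enumerate(features)
            if mutual_information(col, target) > threshold]
```

Only the surviving feature subset would then be handed to the (much more expensive) wrapper search.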

Medical Datamining with a New Algorithm for Feature Selection and Naive Bayesian Classifier

Ranjit Abraham, Jay B. Simha, S. S. Iyengar
2007 10th International Conference on Information Technology (ICIT 2007)  
The proposed algorithm utilizes discretization and simplifies the 'wrapper' approach based feature selection by reducing the feature dimensionality through the elimination of irrelevant and least relevant  ...  Much research work in datamining has gone into improving the predictive accuracy of statistical classifiers by applying the techniques of discretization and feature selection.  ...  One approach referred to as the 'wrapper' employs as a subroutine a statistical resampling technique such as cross validation using the actual target learning algorithm to estimate the accuracy of feature  ... 
doi:10.1109/icoit.2007.4418266 fatcat:ifr6lkcasvexveulvvp5kgl2ii

Effective Discretization and Hybrid feature selection using Naïve Bayesian classifier for Medical datamining

Ranjit Abraham, Jay B. Simha, S. Sitharama Iyengar
2009 International Journal of Computational Intelligence Research  
The proposed algorithm, which is a multi-step process, utilizes discretization, filters out irrelevant and least relevant features and finally uses a greedy algorithm such as best-first search or wrapper  ...  We propose a Hybrid feature selection algorithm (CHI-WSS) that helps in achieving dimensionality reduction by removing irrelevant data, increasing learning accuracy and improving result comprehensibility  ...  Hence probability density estimation is used, assuming that values of X within class c are drawn from a normal (Gaussian) distribution, where σ_c is the standard deviation and μ_c is the mean of the attribute  ... 
doi:10.5019/j.ijcir.2009.175 fatcat:k6r7ibls2jce3mqmji5a2uuuli
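The snippet above invokes the standard Gaussian assumption for continuous attributes in naive Bayes: within class c, attribute X is modeled as normal with mean μ_c and standard deviation σ_c. A minimal sketch under that assumption; `gaussian_pdf` and `class_score` are illustrative names, not the authors' implementation.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density N(x; mu, sigma) for a continuous attribute conditioned
    on class c, with mu = class mean and sigma = class standard deviation."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def class_score(x_vec, prior, params):
    """Naive Bayes score for one class: prior times the product of
    per-attribute densities; params holds one (mu_c, sigma_c) per attribute."""
    score = prior
    for x, (mu, sigma) in zip(x_vec, params):
        score *= gaussian_pdf(x, mu, sigma)
    return score
```

The predicted class is then the one maximizing this score over all classes.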

The ties problem resulting from counting-based error estimators and its impact on gene selection algorithms

Xin Zhou, K. Z. Mao
2006 Computer applications in the biosciences : CABIOS  
Our analysis finds that the ties problem is caused by the discrete nature of counting-based error estimators and could be avoided by using continuous evaluation criteria instead.  ...  The website contains (1) the source code of all the gene selection algorithms and (2) the complete set of tables and figures of experiments.  ...  Conflict of Interest: none declared.  ... 
doi:10.1093/bioinformatics/btl438 pmid:16908500 fatcat:xk5aycveyjfs3ns3z2axq2rx4i

An Improved Feature Selection Algorithm Based on Parzen Window and Conditional Mutual Information

Deng Chao He, Wen Ning Hao, Gang Chen, Da Wei Jin
2013 Applied Mechanics and Materials  
In this paper, an improved feature selection algorithm based on conditional mutual information with Parzen windows was proposed, which adopted conditional mutual information as an evaluation criterion for feature selection in order to overcome the problem of feature redundancy, and used Parzen windows to estimate the probability density functions and calculate the conditional mutual information of continuous variables  ...  Its basic idea is to estimate the overall density function by the mean value of the density at every point in a certain domain.  ... 
doi:10.4028/www.scientific.net/amm.347-350.2614 fatcat:vyxnqipjevfwxerlr4rmzlbovi
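The "basic idea" stated in the snippet — estimating a density as the mean of kernel contributions centred on every sample point — can be sketched directly. This is a generic one-dimensional Parzen-window estimator with a Gaussian kernel, assumed for illustration; `parzen_density` is a hypothetical name and the bandwidth h is left to the caller.

```python
import math

def parzen_density(x, samples, h):
    """Parzen-window estimate of the density at x: the average, over all
    sample points, of a Gaussian kernel of bandwidth h centred on each point."""
    n = len(samples)
    kernel = lambda u: math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
    return sum(kernel((x - s) / h) for s in samples) / (n * h)
```

The resulting smooth estimates are what make mutual information computable for continuous variables without discretization.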

A Novel Two-Stage Selection of Feature Subsets in Machine Learning

R. F. Kamala, P. R. J. Thangaiah
2019 Engineering, Technology & Applied Science Research  
The results of this method include improvements in the performance measures like efficiency, accuracy, and scalability of machine learning algorithms.  ...  In feature subset selection the variable selection procedure selects a subset of the most relevant features. Filter and wrapper methods are categories of variable selection methods.  ...  For categorical data the similarity measure CHI was acknowledged as a goodness-of-fit test [12], with an estimated CHI distribution.  ... 
doi:10.48084/etasr.2735 fatcat:nrjh4qqdj5anlj5hn2bvfpalnm

Feature selection in Bayesian classifiers for the prognosis of survival of cirrhotic patients treated with TIPS

Rosa Blanco, Iñaki Inza, Marisa Merino, Jorge Quiroga, Pedro Larrañaga
2005 Journal of Biomedical Informatics  
The estimated accuracies obtained tally with the results of previous studies.  ...  Moreover, the medical significance of the subset of variables selected by the classifiers along with the comprehensibility of Bayesian models is greatly appreciated by physicians.  ...  This work is partially supported by the Ministry of Science and Technology, by the Fondo  ... 
doi:10.1016/j.jbi.2005.05.004 pmid:15967731 fatcat:ubf6ljmodfchjh5vqjl5owtjie

Towards reconstruction of gene networks from expression data by supervised learning

Lev A Soinov, Maria A Krestyaninova, Alvis Brazma
2003 Genome Biology  
We present algorithms that work for continuous expression levels and do not require a priori discretization. We apply our method to publicly available data for the budding yeast cell cycle.  ...  We use a supervised learning approach to address this question by building decision-tree-related classifiers, which predict gene expression from the expression data of other genes.  ...  Acknowledgements L.A.S. is supported by a grant from AstraZeneca. The WEKA package distributed under the GNU General Public License was used for classifier creation and accuracy estimation.  ... 
pmid:12540298 pmcid:PMC151290 fatcat:z77o4mxnkjfenctun5xp3awrqy

Comparison of classification methods for detecting associations between SNPs and chick mortality

Nanye Long, Daniel Gianola, Guilherme JM Rosa, Kent A Weigel, Santiago Avendaño
2009 Genetics Selection Evolution  
This was done by categorizing mortality rates and using a filter-wrapper feature selection procedure in each of the classification methods evaluated.  ...  Further, an alternative categorization scheme, which used only two extreme portions of the empirical distribution of mortality rates, was considered.  ...  Hill is thanked for suggesting the permutation test employed for generating the null distribution of PRESS values.  ... 
doi:10.1186/1297-9686-41-18 pmid:19284707 pmcid:PMC3225888 fatcat:ze6dpjl4m5a4zfarffw32pwpz4

The EMADS Extendible Multi-Agent Data Mining Framework [chapter]

Kamal Ali Albashiri, Frans Coenen
2009 Data Mining and Multi-agent Integration  
The EMADS vision is that of a community of data mining agents, contributed by many individuals and interacting under decentralized control, to address data mining requests.  ...  A full description of EMADS is presented.  ...  Data mining is carried out by means of local data mining agents (for reasons of privacy preservation).  ... 
doi:10.1007/978-1-4419-0522-2_13 fatcat:hry2xvuhb5ehdkrnsyvolpkbci

Feature Subset Selection, Class Separability, and Genetic Algorithms [chapter]

Erick Cantú-Paz
2004 Lecture Notes in Computer Science  
The performance of classification algorithms in machine learning is affected by the features used to describe the labeled examples presented to the inducers.  ...  This paper describes a hybrid of a simple genetic algorithm and a method based on class separability applied to the selection of feature subsets for classification problems.  ...  Table 4 shows the mean number of feature subsets examined by each algorithm.  ... 
doi:10.1007/978-3-540-24854-5_96 fatcat:chzdgavfv5a35jcxurwfekx7bi

Feature Selection Filters Based on the Permutation Test [chapter]

Predrag Radivojac, Zoran Obradovic, A. Keith Dunker, Slobodan Vucetic
2004 Lecture Notes in Computer Science  
To estimate the p-values, we use Fisher's permutation test combined with the four simple filtering criteria in the roles of test statistics: sample mean difference, symmetric Kullback-Leibler distance,  ...  We investigate the problem of supervised feature selection within the filtering framework.  ...  Acknowledgement This study was supported by NIH grant 1R01 LM07688 awarded to AKD and ZO and by NSF grant IIS-0219736 awarded to ZO and SV. We thank Nitesh V. Chawla and Robert C.  ... 
doi:10.1007/978-3-540-30115-8_32 fatcat:x3yddjkihzfs5lg52uykyvpihy
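The Radivojac et al. entry pairs Fisher's permutation test with simple statistics such as the sample mean difference. A minimal sketch of that combination, with the labels-are-exchangeable null: shuffle the pooled values, resplit, and count how often the permuted statistic is at least as extreme as the observed one. `permutation_pvalue`, the permutation count, and the +1 smoothing are illustrative choices, not the paper's exact procedure.

```python
import random

def permutation_pvalue(pos, neg, n_perm=2000, seed=0):
    """Permutation-test p-value for the absolute mean difference between
    two groups, under the null that group labels are exchangeable."""
    rng = random.Random(seed)
    observed = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    pooled = list(pos) + list(neg)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:len(pos)], pooled[len(pos):]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Used as a filter, each feature's p-value replaces the raw statistic, putting differently scaled criteria on a common footing.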
Showing results 1 — 15 out of 8,315 results