
A Swift Clustering based Algorithm to Explore Different Correlation Measures

G. Julie Priyadharsana
2018 International Journal for Research in Applied Science and Engineering Technology  
The Minimum Spanning Tree (MST) eliminates redundancy using Kruskal's algorithm.  ...  Feature selection identifies a subset of features that produces the same result as the full feature set. The novel Swift Clustering based algorithm removes both irrelevant and redundant features.  ...  A second purpose of feature selection is to increase classification accuracy by eliminating noisy features [9].  Clustering analysis is a technique used in statistical data analysis.  ... 
doi:10.22214/ijraset.2018.3204 fatcat:mwxg7ygirnhilcstrhlwgyyjve
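The MST-based redundancy elimination this abstract describes can be sketched with Kruskal's algorithm over a feature-correlation graph. The correlation values, the feature count, and the `1 - |corr|` edge weighting below are illustrative assumptions, not taken from the paper:

```python
# Illustrative sketch: Kruskal's algorithm over a feature-correlation
# graph. Using weight 1 - |corr| means highly correlated (redundant)
# feature pairs are connected by the lightest edges first.

def kruskal_mst(n_features, edges):
    """edges: list of (weight, u, v) tuples. Returns the MST edge list."""
    parent = list(range(n_features))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):      # lightest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                   # edge joins two components
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

# Hypothetical pairwise correlations among four features (ids 0..3).
corr = {(0, 1): 0.95, (0, 2): 0.10, (0, 3): 0.05,
        (1, 2): 0.20, (1, 3): 0.15, (2, 3): 0.90}
edges = [(1 - abs(c), u, v) for (u, v), c in corr.items()]
mst = kruskal_mst(4, edges)
```

An MST on n features always has n - 1 edges, and strongly correlated pairs such as (0, 1) and (2, 3) end up adjacent in the tree, which is what makes tree-based pruning of redundancy possible.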

Efficient feature subset selection model for high dimensional data

Chinnu C Georgel, Abdul Ali
2016 International Journal on Cybernetics & Informatics  
Splitting the minimum spanning tree based on the dependency between features leads to the generation of forests.  ...  This paper proposes a new method that aims to reduce the size of high dimensional datasets by identifying and removing irrelevant and redundant features.  ...  ACKNOWLEDGEMENTS: I wish to thank the Management, the Principal and the Head of the Department (CSE) of ICET for their support and help in completing this work.  ... 
doi:10.5121/ijci.2016.5217 fatcat:3kkjtcmydfabxizmch7oterpke

Optimizing Storage Space for Higher-Dimensional Data Using Feature Subset Selection Approach

Donia Augustine
2018 International Journal of Emerging Research in Management and Technology  
An N-dimensional feature selection algorithm, NDFS, is used for identifying the subset of relevant features.  ...  The clustering based strategy of NDFS has a high probability of producing a subset of useful and independent features.  ...  CONCLUSION: NDFS proposes a feature selection algorithm for high dimensional data.  ... 
doi:10.23956/ijermt.v6i6.241 fatcat:hdj5y3g32vfjdlep76djkigp3a

FC-MST: Feature correlation maximum spanning tree for multimedia concept classification

Hsin-Yu Ha, Shu-Ching Chen, Min Chen
2015 Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015)  
Given the explosive growth of high-dimensional multimedia data, a well-designed feature selection method can be leveraged in classifying multimedia contents into high-level semantic concepts.  ...  In this paper we present a multi-phase feature selection method using a maximum spanning tree built from feature correlations among multiple modalities (FC-MST).  ...  Then, a maximum spanning tree is built using the correlations, and irrelevant and redundant features are eliminated by pruning the tree.  ... 
doi:10.1109/icosc.2015.7050820 dblp:conf/semco/HaCC15 fatcat:a25capp4e5bcffa675uozsbu2u

A Novel Feature Subset Selection Algorithm for Software Defect Prediction

Reena P, Binu Rajan
2014 International Journal of Computer Applications  
The proposed clustering based algorithm for feature selection uses a minimum spanning tree based method to cluster features.  ...  A clustering based feature subset selection algorithm has been applied over software defect prediction data sets.  ...  The well-known Prim's algorithm is used for constructing the minimum spanning tree. Tree Partitioning and Elimination of Redundant Features: the minimum spanning tree is partitioned in the second step.  ... 
doi:10.5120/17618-8315 fatcat:v5yez23egfbdndjczdrlggpuse
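The two steps this abstract names — building an MST with Prim's algorithm and then partitioning the tree to form feature clusters — can be sketched as follows. The distance values and the cut threshold are illustrative assumptions, not the paper's actual parameters:

```python
import heapq

# Sketch: Prim's algorithm over a complete feature-distance graph,
# followed by cutting MST edges heavier than a threshold; each
# surviving connected component is one feature cluster.

def prim_mst(weights):
    """weights: dict {(u, v): w} on vertices 0..n-1 (complete graph)."""
    n = max(max(e) for e in weights) + 1
    w = lambda u, v: weights.get((u, v), weights.get((v, u)))
    visited = {0}
    heap = [(w(0, v), 0, v) for v in range(1, n)]
    heapq.heapify(heap)
    mst = []
    while len(visited) < n:
        cost, u, v = heapq.heappop(heap)
        if v in visited:
            continue                      # stale entry, skip
        visited.add(v)
        mst.append((cost, u, v))
        for x in range(n):
            if x not in visited:
                heapq.heappush(heap, (w(v, x), v, x))
    return mst

def partition(mst, threshold):
    """Drop MST edges heavier than threshold; survivors define clusters."""
    return [e for e in mst if e[0] <= threshold]

# Hypothetical pairwise feature distances (low = strongly related).
dist = {(0, 1): 0.1, (0, 2): 0.9, (0, 3): 0.8,
        (1, 2): 0.7, (1, 3): 0.95, (2, 3): 0.2}
mst = prim_mst(dist)
cluster_edges = partition(mst, 0.5)
```

With these toy weights the cut removes the single heavy bridge edge, leaving two clusters, {0, 1} and {2, 3}; a representative feature per cluster would then be kept and the rest discarded as redundant.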

Feature Subset Selection for High Dimensional Data

Pavan Mallya P, Roopa C. K
2015 International Journal of Engineering Research and Technology  
This paper considers feature selection for data classification in the presence of a huge number of irrelevant and redundant features.  ...  It is capable of processing many thousands of features within minutes on a personal computer while maintaining a very high accuracy that is nearly insensitive to a growing number of irrelevant features  ...  There is also a need to remove redundant features in the context of feature selection for high dimensional data.  ... 
doi:10.17577/ijertv4is050376 fatcat:otgigqwzkvh7tkhkkcvk3uy2gy

Fuzzy C Means Clustering Algorithm for High Dimensional Data Using Feature Subset Selection Technique

N. Manjula, S. Pandiarajan, J. Jagadeesan
2014 IOSR Journal of Computer Engineering  
feature selection algorithms, namely, FCBF, ReliefF, CFS, Consist, and FOCUS-SF, together with four types of well-known classifiers, namely, the probability-based Naive Bayes, the tree-based C4.5, the  ...  Features in different clusters are relatively independent; the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features.  ...  Index Terms: feature subset selection, relevance, redundancy and high dimensionality.  ... 
doi:10.9790/0661-16226469 fatcat:25er6px2ovffzgzglxgtvgpjae

Clustering Based Attribute subset Selection using Fast Algorithm

Suresh Laxman Ushalwar, Nagori M.B
2015 International Journal on Cybernetics & Informatics  
It provides privacy for data and reduces the dimensionality of the data.  ...  In machine learning and data mining, attribute selection is the practice of selecting a subset of the most relevant attributes for use in model construction.  ...  We propose a FAST algorithm, which is primarily used to remove redundant and repeated data and additionally reduces the dimensionality of the data.  ... 
doi:10.5121/ijci.2015.4220 fatcat:2pel6j3uzrdolit4lc3pht4aba

Feature Selection Using Maximum Feature Tree Embedded with Mutual Information and Coefficient of Variation for Bird Sound Classification

Haifeng Xu, Yan Zhang, Jiang Liu, Danjv Lv, Paolo Spagnolo
2021 Mathematical Problems in Engineering  
Then a method named ERMFT (Eliminating Redundancy based on Maximum Feature Tree), which uses two neighborhoods to eliminate redundancy and optimize features, is explored.  ...  Although extracting features from multiple perspectives helps to fully describe the target information, it is urgent to deal with the enormous dimension of features and the curse of dimensionality.  ...  Acknowledgments: This research was funded by the National Natural Science Foundation of China under Grants nos. 61462078, 31960142, and 31860332.  ... 
doi:10.1155/2021/8872248 fatcat:n4jt4gvdorcnhg7s2lqxo4bik4

An fMRI Feature Selection Method Based on a Minimum Spanning Tree for Identifying Patients with Autism

Chunlei Shi, Jiacai Zhang, Xia Wu
2020 Symmetry  
Here, we proposed a novel feature selection method based on the minimum spanning tree (MST) to seek neuromarkers for ASD. First, we constructed an undirected graph with nodes of candidate features.  ...  Third, the sum of the edge weights of all connected nodes was sorted for each node in the MST.  ...  The correlation coefficient is a statistical measure of the direction and strength of the relationship between the changing trends of two features, and its value ranges from -1 to +1.  ... 
doi:10.3390/sym12121995 fatcat:wyccgsil4vh3he2ryskvuxbl64
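The ranking step this abstract describes — for each MST node, summing the weights of its incident edges and sorting by that sum — can be sketched in a few lines. The edge weights below are illustrative, not the paper's fMRI correlations:

```python
from collections import defaultdict

# Sketch of the node-ranking step: score each MST node by the total
# weight of the edges connected to it, then sort nodes by that score.

mst_edges = [(0, 1, 0.9), (1, 2, 0.4), (2, 3, 0.7)]  # (u, v, weight)

strength = defaultdict(float)
for u, v, w in mst_edges:
    strength[u] += w          # each edge contributes to both endpoints
    strength[v] += w

# Rank features by total incident edge weight, highest first.
ranking = sorted(strength, key=strength.get, reverse=True)
```

Interior nodes accumulate weight from several edges, so a node that the MST connects to many strongly weighted neighbors rises to the top of the ranking; the top-ranked features would then be kept as candidate markers.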

Streaming feature selection algorithms for big data: A survey

Noura AlNuaimi, Mohammad Mehedy Masud, Mohamed Adel Serhani, Nazar Zaki
2019 Applied Computing and Informatics  
In machine learning, streaming feature selection has always been considered a superior technique for selecting a relevant subset of features from highly dimensional data and thus reducing learning complexity  ...  In the relevant literature, streaming feature selection refers to features that arrive consecutively over time; despite the lack of an exact figure for the number of features, the number of instances is well established  ...  high dimensional data.  ... 
doi:10.1016/j.aci.2019.01.001 fatcat:6rk437oqlzehzg2qoatyulnafe
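The streaming setting this survey covers — features arriving one at a time, each accepted or discarded on arrival — can be sketched with a simple relevance/redundancy filter. The scores, thresholds, and feature names below are hypothetical, and real streaming selectors (e.g. the alpha-investing or OSFS families the literature describes) use more principled criteria:

```python
# Minimal sketch of online feature selection: accept an arriving
# feature only if it looks relevant and is not too correlated with
# any feature already kept.

def stream_select(features, relevance, corr, rel_min=0.3, red_max=0.8):
    """features: ids in arrival order; relevance: id -> score;
    corr: (kept_id, new_id) -> correlation score."""
    selected = []
    for f in features:
        if relevance[f] < rel_min:
            continue                      # irrelevant: discard on arrival
        if any(corr.get((s, f), 0.0) > red_max for s in selected):
            continue                      # redundant with a kept feature
        selected.append(f)
    return selected

# Hypothetical stream of four features.
rel = {"f1": 0.9, "f2": 0.1, "f3": 0.8, "f4": 0.7}
cor = {("f1", "f3"): 0.95, ("f1", "f4"): 0.2, ("f3", "f4"): 0.1}
kept = stream_select(["f1", "f2", "f3", "f4"], rel, cor)
```

Here "f2" is dropped as irrelevant and "f3" as redundant with the already-kept "f1", leaving `["f1", "f4"]` — the one-pass, no-lookahead behavior that distinguishes streaming selection from batch methods.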

Using Feature Clustering for GP-Based Feature Construction on High-Dimensional Data [chapter]

Binh Tran, Bing Xue, Mengjie Zhang
2017 Lecture Notes in Computer Science  
We propose a cluster-based GP feature construction method called CGPFC which uses feature clustering to improve the performance of GP for feature construction on high-dimensional data.  ...  Genetic programming (GP) has been shown to be a prominent technique for this task. However, applying GP to high-dimensional data is still challenging due to the large search space.  ...  Two features are redundant if their mutual information or their SU is high [28] . For example, in [23] , SU is combined with minimum spanning tree (MST) to group features.  ... 
doi:10.1007/978-3-319-55696-3_14 fatcat:jf5cms7aw5gizdb2dvn2stw5sq
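The symmetric uncertainty (SU) measure this chapter uses to flag redundant features is a normalized mutual information, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). A minimal sketch for discrete features, with illustrative toy data:

```python
import math
from collections import Counter

# Sketch of symmetric uncertainty for discrete feature vectors.

def entropy(xs):
    n = len(xs)
    return -sum(c / n * math.log2(c / n) for c in Counter(xs).values())

def mutual_info(xs, ys):
    # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def symmetric_uncertainty(xs, ys):
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0                      # both features are constant
    return 2 * mutual_info(xs, ys) / (hx + hy)

a = [0, 0, 1, 1]
b = [0, 0, 1, 1]   # identical to a -> fully redundant, SU = 1.0
c = [0, 1, 0, 1]   # independent of a -> SU = 0.0
```

SU lies in [0, 1], so a single high-SU threshold marks redundant pairs, and because it is normalized by the two entropies it does not favor many-valued features the way raw mutual information does — which is why it is a common MST edge weight in these clustering-based selectors.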

FSEFST:Feature Selection and Extraction using Feature Subset Technique in High Dimensional Data

2019 International Journal of Innovative Technology and Exploring Engineering, Volume-8 Issue-10, August 2019, Regular Issue  
An efficient algorithm for Feature Selection and Extraction using the Feature Subset Technique in High Dimensional Data (FSEFST) has been proposed in order to select and extract efficient features by using  ...  Feature selection and feature extraction are among the methods used to reduce dimensionality.  ...  The major challenges faced by researchers in high dimensional data are subset feature selection and classification [2].  ... 
doi:10.35940/ijitee.b6907.129219 fatcat:3wycvghewngcbbpprtbmvaim4i

Clustering Categorical Sequences with Variable-Length Tuples Representation [chapter]

Liang Yuan, Zhiling Hong, Lifei Chen, Qiang Cai
2016 Lecture Notes in Computer Science  
..., in terms of the entropy-based measure evaluating the redundancy of tuples.  ...  The variable-length tuples are obtained using a pruning method applied to delete the redundant tuples from the suffix tree, which is created for the fixed-length tuples with a large memory-length of sequences  ...  The popular approach to adapting the algorithms to high-dimensional data is to eliminate these features by combining feature selection techniques, for example, by removing those tuples whose frequency  ... 
doi:10.1007/978-3-319-47650-6_2 fatcat:aor6jg2dk5gjfct6ayxdkm7jq4

A survey of dimension reduction and classification methods for RNA-Seq data on malaria vector

Micheal Olaolu Arowolo, Marion Olubunmi Adebiyi, Charity Aremu, Ayodele A. Adebiyi
2021 Journal of Big Data  
This study reviews various works on dimensionality reduction techniques for reducing sets of features that group data effectively with less computational processing time, and classification methods that  ...  Abstract: Recently, unique spans of genetic data have been produced by researchers; there is a trend in genetic exploration using machine learning integrated analysis and virtual combination of adaptive data into  ...  The authors in [3] worked on feature selection based on One-Way-ANOVA for microarray data classification, combining Analysis of Variance (ANOVA) for feature selection to diminish high data dimensionality  ... 
doi:10.1186/s40537-021-00441-x fatcat:zwdbm44x5bak5bzror2ajegfzq
Showing results 1 — 15 out of 10,271 results