Clustering Heterogeneous Data with Mutual Semi-supervision [chapter]

Artur Abdullin, Olfa Nasraoui
2012 Lecture Notes in Computer Science  
Algorithms for Mixed Data Attributes: k-prototypes [Huang, 1998] integrates k-means and k-modes.  ...  We propose a new approach to cluster diverse representations or types of data. Naturally, it can also cluster different sources of data. Our approach is rooted in Semi-Supervised Learning (SSL). Note  ... 
doi:10.1007/978-3-642-34109-0_4 fatcat:usb2atelnrgffktedxgx5yo4pa

A Discretization Algorithm of Continuous Attributes Based on Supervised Clustering

Haiyang Hua, Huaici Zhao
2009 2009 Chinese Conference on Pattern Recognition  
This paper describes such an algorithm, called SX-means (Supervised X-means), a new algorithm for supervised discretization of continuous attributes based on clustering. The algorithm modifies clusters  ...  Experimental evaluation of several discretization algorithms on six artificial data sets shows that the proposed algorithm is more efficient and can generate a better discretization schema.  ...  clustering method for discretization.  ... 
doi:10.1109/ccpr.2009.5344142 fatcat:rvqx3anlnbdhpp5uyao7eexiza

A Hybrid Recommendation System based on Fuzzy C-Means Clustering and Supervised Learning

2021 KSII Transactions on Internet and Information Systems  
The algorithm constructs the user and item membership degree feature vector, and adopts the data representation form of the scoring matrix to the supervised learning algorithm, as well as by combining  ...  To address this challenge, this paper proposes a highly efficient hybrid recommendation system based on Fuzzy C-Means (FCM) clustering and supervised learning models.  ...  and CF recommendation algorithms are mixed with the idea of clustering.  ... 
doi:10.3837/tiis.2021.07.006 fatcat:466yl23povanvcxtbuvhbdki6a

Generalized clustering, supervised learning, and data assignment

Annaka Kalton, Pat Langley, Kiri Wagstaff, Jungsoon Yoo
2001 Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '01  
This framework views clustering as a general process of iterative optimization that includes modules for supervised learning and instance assignment.  ...  Clustering algorithms have become increasingly important in handling and analyzing data. Considerable work has been done in devising effective but increasingly specific clustering algorithms.  ...  Second, we anticipated that, across data sets, high (low) predictive accuracy by a supervised method would be associated with high (low) accuracy for the corresponding clustering algorithm.  ... 
doi:10.1145/502512.502555 fatcat:bss7tia42rfxlput6tfzsux65u

An Incremental Classification Algorithm for Mining Data with Feature Space Heterogeneity

Yu Wang
2014 Mathematical Problems in Engineering  
In this paper, we develop an incremental classification algorithm, Supervised Clustering for Classification with Feature Space Heterogeneity (SCCFSH), to address this problem.  ...  In our approach, supervised clustering is implemented to obtain a number of clusters such that samples in each cluster are from the same class.  ...  Acknowledgments The author is grateful to the editor and the anonymous reviewer for providing many helpful comments and suggestions, which have significantly improved the exposition and focus of this paper  ... 
doi:10.1155/2014/327142 fatcat:ds6itqzkxffdnprs3s2ult6xia

Semi-supervised learning on closed set lattices

Mahito Sugiyama, Akihiro Yamamoto
2013 Intelligent Data Analysis  
We propose a new approach for semi-supervised learning using closed set lattices, which have been recently used for frequent pattern mining within the framework of the data analysis technique of Formal  ...  From both labeled and unlabeled data, SELF constructs a closed set lattice, which is a partially ordered set of data clusters with respect to subset inclusion, via FCA together with discretizing continuous  ...  Acknowledgment We would like to thank Marco Cuturi for his helpful comments. This work was partly supported by Grant-in-Aid for Scientific Research (A) 22240010 and for JSPS Fellows 22·5714.  ... 
doi:10.3233/ida-130586 fatcat:h37hp24t35boxb56zawafos6pu

Combined Clustering Methods for Microarray Data Analysis

Raul Malutan, Pedro Gómez Vilda, Monica Borda
2013 Advanced Engineering Forum  
In this paper, the Gene Shaving algorithm was used for a previous supervised classification and, once the cluster information was obtained, data was classified again with supervised algorithms like Support Vector  ...  The algorithms were run on several data sets, observing that the quality of the obtained clusters is dependent on the number of clusters specified.  ...  Later, the gene shaving supervised method classified the data according to the number of clusters.  ... 
doi:10.4028/ fatcat:i4gib3a57feptciojvrgyiwru4

Fuzzy clustering and fuzzy entropy based classification model

Muhammad A. Khan, Muhammad Nazir, Arfan Jaffar, Anwar M. Mirza
2010 2010 6th International Conference on Emerging Technologies (ICET)  
Once the data set is pre-processed, we can use a support vector machine (SVM) for classification.  ...  The Fuzzy C-Means clustering algorithm was applied to classify the range of the attribute. By using the Fuzzy C-Means clustering algorithm, we cluster the data of attributes with a large range.  ...  Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression.  ... 
doi:10.1109/icet.2010.5638379 fatcat:75fgtgs4c5aahag6krvprmci74

A semi-supervised regression model for mixed numerical and categorical variables

Michael K. Ng, Elaine Y. Chan, Meko M.C. So, Wai-Ki Ching
2007 Pattern Recognition  
In this paper, we develop a semi-supervised regression algorithm to analyze data sets which contain both categorical and numerical attributes.  ...  This algorithm partitions the data sets into several clusters and at the same time fits a multivariate regression model to each cluster.  ...  After a data set is clustered by the semi-supervised clustering algorithm, a new cluster variable is added to the data set to indicate the cluster each data point is assigned to.  ... 
doi:10.1016/j.patcog.2006.06.018 fatcat:ywsvtit7tvbobgykkc2g6yuybm

Review on Classification and Clustering using Fuzzy Neural Networks

Suprit Kulkarni, Kishore Honwadkar
2016 International Journal of Computer Applications  
In general, in classification the classifier assigns a class label from a set of predefined classes to a new input object.  ...  Whereas, given a set of objects, clustering creates different groups of these objects using some similarity measure.  ...  Algorithm is suitably modified for working with mixed data. The probability of intersection of hyperline segments decreases with the increase in the dimension of pattern space.  ... 
doi:10.5120/ijca2016908456 fatcat:t5ihmtoz2vda5m6do6jma6eizy

Clustering mixed data using an Artificial Bee Colony

technique for similarity functions (KMSF) and a newly developed algorithm with dissimilarity based clustering technique (AD2011).  ...  The proposed clustering technique is able to suitably handle mixed and incomplete data types in such a way that the original characteristics of the data are preserved.  ...  Perhaps the first modification of k-means for mixed and incomplete data is the k-prototypes (KP) algorithm [40]. The KP algorithm uses a new method to obtain cluster centers.  ... 
doi:10.35940/ijitee.a4861.129219 fatcat:pge4iwvyejhenjnplyyuppvnhm

A probabilistic framework for relational clustering

Bo Long, Zhongfei Mark Zhang, Philip S. Yu
2007 Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '07  
In this paper, we propose a probabilistic model for relational clustering, which also provides a principal framework to unify various important clustering tasks including traditional attributes-based clustering  ...  The algorithms are applicable to relational data of various structures and at the same time unify a number of state-of-the-art clustering algorithms: co-clustering algorithms, the k-partite graph clustering  ...  Semi-supervised Clustering Recently, semi-supervised clustering has become a topic of significant interest [4, 46], which seeks to cluster a set of data points with a set of pairwise constraints.  ... 
doi:10.1145/1281192.1281244 dblp:conf/kdd/LongZY07 fatcat:q2sy5rjm5jbybai3talxp2v7nm


M.Phil. Mr.C.Mani M.C.A., C. Mehala
2022 Zenodo  
Data mining is used to gather information from huge sets of data. Clustering is a grouping task for a set of objects.  ...  One of the most important algorithms for clustering heterogeneous types of data is the K-Prototype algorithm. This algorithm is very useful for clustering large data sets.  ...  WuSen et al. (2013) proposed a K-Prototype clustering algorithm for deficient datasets with mixed numeric and categorical attributes.  ... 
doi:10.5281/zenodo.6410014 fatcat:pdskqxda5fhdxgthbg6z5vg64y

A supervised clustering algorithm for computer intrusion detection

Xiangyang Li, Nong Ye
2005 Knowledge and Information Systems  
We previously developed a clustering and classification algorithm-supervised (CCAS) to learn patterns of normal and intrusive activities and to classify observed system activities.  ...  This robust CCAS adds data redistribution, a supervised hierarchical grouping of clusters and removal of outliers as the postprocessing steps.  ...  The cluster structure represents the patterns of normal and intrusive activities. We classify a new data point by comparing new data points with these clusters.  ... 
doi:10.1007/s10115-005-0195-8 fatcat:gssqtwazvzfz7bjs3l6nsdw6ce

The k-means Algorithm: A Comprehensive Survey and Performance Evaluation

Mohiuddin Ahmed, Raihan Seraj, Syed Mohammed Shamsul Islam
2020 Electronics  
Additionally, such a clustering algorithm requires the number of clusters to be defined beforehand, which is responsible for different cluster shapes and outlier effects.  ...  The k-means clustering algorithm is considered one of the most powerful and popular data mining algorithms in the research community.  ...  [61] offered a cost function and distance measure for clustering datasets with mixed data (datasets with numerical and categorical data) based on co-occurrences of values.  ... 
doi:10.3390/electronics9081295 fatcat:63g2t367uzbf7dohoxf3rzfjnm