Context-Based Distance Learning for Categorical Data Clustering [chapter]

Dino Ienco, Ruggero G. Pensa, Rosa Meo
2009 Lecture Notes in Computer Science  
In this paper, we propose a method to learn a context-based distance for categorical attributes.  ...  Clustering data described by categorical attributes is a challenging task in data mining applications.  ...  Periklis Andritsos who provided the implementation of LIMBO, and Elena Roglia for stimulating discussions. Ruggero G. Pensa is co-funded by Regione Piemonte.  ... 
doi:10.1007/978-3-642-03915-7_8 fatcat:mdcy5zasgnhhdahrfh4zclf4w4

Context-Based Geodesic Dissimilarity Measure for Clustering Categorical Data

Changki Lee, Uk Jung
2021 Applied Sciences  
This study proposes a new method to measure the dissimilarity between two categorical observations, called a context-based geodesic dissimilarity measure, for the categorical data clustering problem.  ...  The dissimilarity or distance computation has been a manageable problem for continuous data because many numerical operations can be successfully applied.  ...  Acknowledgments: The authors would like to thank the anonymous reviewers for their constructive comments. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app11188416 fatcat:o6sk4phj3ffanfhhu6vz3dzq4e

An Approach of Context Ontology for Robust Face Recognition Against Illumination Variations

M.Rezaul Bashar, Yan Li, Phill Kyu Rhee
2007 2007 International Conference on Information and Communication Technology  
Context ontology is built using context acquisition, context learning and context categorization.  ...  , learning, and recognition.  ...  Context Categorization Cosine distance is a popular distance measure for comparing documents in the information retrieval literature.  ... 
doi:10.1109/icict.2007.375351 fatcat:2x6cd2zg4vfu7hqjgdbkxdkvre

From Context to Distance

Dino Ienco, Ruggero G. Pensa, Rosa Meo
2012 ACM Transactions on Knowledge Discovery from Data  
In this paper, we propose a framework to learn a context-based distance for categorical attributes.  ...  Clustering data described by categorical attributes is a challenging task in data mining applications.  ...  Periklis Andritsos who provided the implementation of LIMBO, and Elena Roglia for stimulating discussions. We want to thank Regione Piemonte which co-funds Ruggero G. Pensa.  ... 
doi:10.1145/2133360.2133361 fatcat:z2wdlwi3gbf7rgjelbew5aoh2m

A Review on Outlier Detection Approaches

Monika R. Bankar
2019 International Journal for Research in Applied Science and Engineering Technology  
It is difficult to define a distance between two categorical attributes because the values are not ordered, and hence the outlier detection strategy differs for numerical and categorical attributes.  ...  A new strategy for outlier detection is proposed.  ...  For categorical data, the system uses context-based distance learning. The distance is evaluated using the distribution of attribute values over data objects.  ... 
doi:10.22214/ijraset.2019.3345 fatcat:hc6vputvczgpzpxm37bz5a2ijm

Context Aware Clustering Using Glove and K-Means

Pulkit Juneja, Hemant Jain, Tanay Deshmukh, Siddhant Somani, Tripathy B.K
2017 International Journal of Software Engineering & Applications  
Several methods exist which can cluster categorical data, but our approach is unique in that we use recent text-processing and machine learning advancements like GloVe and t-SNE to develop a context-aware  ...  In this paper we propose a novel method to cluster categorical data while retaining their context. Typically, clustering is performed on numerical data.  ...  Manning of Stanford University for their work on Global Vectors (GloVe) for Word Representations. Without their research our work would not have been possible.  ... 
doi:10.5121/ijsea.2017.8403 fatcat:3arcmsxdg5azhn44dyg24o2txm

ConDist: A Context-Driven Categorical Distance Measure [chapter]

Markus Ring, Florian Otto, Martin Becker, Thomas Niebler, Dieter Landes, Andreas Hotho
2015 Lecture Notes in Computer Science  
A distance measure between objects is a key requirement for many data mining tasks like clustering, classification or outlier detection.  ...  We compare our new distance measure to existing categorical distance measures and evaluate on different data sets from the UCI machine-learning repository.  ...  This work is funded by the Bavarian Ministry for Economic affairs through the WISENT project (grant no. IUK 452/002) and by the DFG through the PoSTS II project (grant no. STR 1191/3-2).  ... 
doi:10.1007/978-3-319-23528-8_16 fatcat:izqbhcdshrg3rh7kmfkk7pr2xa

Innovative Teaching-Learning Process: Categorical Clustering Data

K. Sree Vani
2020 Journal of Engineering Education Transformations  
The present study is intended to explore the categorical clustering data.  ...  The results revealed that there is a statistically significant difference in categorical data clustering with reference to gender as well as management. Implications and suggestions for further research  ...  TEACHING-LEARNING PROCESS: CATEGORICAL CLUSTERING DATA 1.1 Objectives of the study 1.  ... 
doi:10.16920/jeet/2020/v33i0/150207 fatcat:hkvgovuof5gh5hiwutb7dx5e7u

Exploiting Mobile Ad Hoc Networking and Knowledge Generation to Achieve Ambient Intelligence

Anna Lekova
2012 Applied Computational Intelligence and Soft Computing  
EFMF employs unsupervised online one-pass fuzzy clustering method to recognize nodes' mobility context from social scenario traces and ubiquitously learn "friends" and "strangers" indirectly and anonymously  ...  The contribution of the present study is a distributed evolving fuzzy modeling framework (EFMF) to observe and categorize relationships and activities in the user and application level and based on that  ...  Therefore, we apply an unsupervised approach to learn context from data in a passive (nonintrusive) mode without a priori knowledge and focus our study on context-awareness for routing services.  ... 
doi:10.1155/2012/262936 fatcat:32r5hqd7yzaqtfb5uebglqglv4

Clustering Unknown IoT Devices in a 5G Mobile Network Security Context via Machine Learning

Tony Hammainen, Julen Kahles
2021 2021 17th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob)  
We propose a novel machine-learning pipeline for clustering unknown IoT devices in an industrial 5G mobile network setting.  ...  More specifically, we develop feature engineering methods that transform IP-flows into device-level data points, define distance metrics between the data points, and apply the DBSCAN algorithm on them.  ...  Results for reference method 1: Numerical aggregations / Euclidean distance. Fig. 3. Results for reference method 2: Categorical aggregations / Jaccard distance. Fig. 4.  ... 
doi:10.1109/wimob52687.2021.9606307 fatcat:so7zvj3zmnaklmtkrzmxabwh2a

A space-structure based dissimilarity measure for categorical data

Kevin Alejandro Hernández, D. Cárdenas Peña, Álvaro A. Orozco
2021 International Journal of Power Electronics and Drive Systems (IJPEDS)  
For this reason, we propose a new distance metric for categorical data.  ...  Therefore, determining a dissimilarity measure for categorical data is one of the most attractive and recent challenges in data mining problems.  ...  project "Desarrollo de una metodología para la identificación de perfiles de los consumidores del servicio público utilizando técnicas de aprendizaje de máquina" (number 2-20-8) funded by Vice-Rectory for  ... 
doi:10.11591/ijece.v11i1.pp620-627 fatcat:3n65lttyjjhy3guszrxfszhece

A NOVEL CLASSIFICATION APPROACH OF TRAVEL REVIEW DATASET BASED ON ENTERTAINMENT

Dr. Ayyappan G., Dr. Kumaravel A.
2019 Indian Journal of Computer Science and Engineering  
A possible solution is to adopt clustering techniques to limit the data to be considered for recommendation process.  ...  In tourism context, based on social media interactions like reviews, forums, blogs, feedbacks, etc. travelers can be clustered to form different interest groups.  ...  Based on the data under consideration appropriate distance measure can be chosen for clustering.  ... 
doi:10.21817/indjcse/2019/v10i3/191003012 fatcat:7plpac3aczdi7n3qi2s222ryk4

Clustering Analysis with Embedding Vectors: An Application to Real Estate Market Delineation

Changro Lee
2021 Advances in Technology Innovation  
Although clustering analysis is a popular tool in unsupervised learning, it is inefficient for the datasets dominated by categorical variables, e.g., real estate datasets.  ...  Three variants of a clustering algorithm, i.e., the clustering based on the traditional Euclidean distance, the Gower distance, and the embedding vectors, are applied to the land sales records to delineate  ...  For instance, categorical data such as the color of products with each element being black, blue, and red cannot be clustered based on the distance between the three colors.  ... 
doi:10.46604/aiti.2021.8492 fatcat:isk4oah2hbe4pecqnoxk77encq

Symbolic Distance Measurements Based on Characteristic Subspaces [chapter]

Marcus-Christopher Ludl
2003 Lecture Notes in Computer Science  
We introduce the subspace difference metric, a novel heterogeneous distance metric for calculating distances between points with both continuous and (unordered) categorical attributes.  ...  Our approach is based on the computation and comparison of characteristic subspaces (i.e. contexts) for each of the symbols and can be viewed as a generalization of the well-known value difference metric  ...  The Austrian Research Institute for Artificial Intelligence acknowledges basic financial support by the Austrian Federal Ministry for Education, Science, and Culture.  ... 
doi:10.1007/978-3-540-39804-2_29 fatcat:wzheh6vzkbealj5yjcwadetbum

Mining entity attribute synonyms via compact clustering

Yanen Li, Bo-June Paul Hsu, ChengXiang Zhai, Kuansan Wang
2013 Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM '13  
In this work, we propose a novel compact clustering framework to jointly identify synonyms for a set of attribute values.  ...  Extensive experiments across multiple domains demonstrate the effectiveness of our clustering framework for mining entity attribute synonyms.  ...  By mining this context, we are able to discover categorical patterns, which would otherwise be impossible had we looked for synonyms one attribute value at a time due to data sparseness.  ... 
doi:10.1145/2505515.2505608 dblp:conf/cikm/LiHZW13 fatcat:g67oqzuhofcn7ddgpadjezwusi