Outlier Detection in Arbitrarily Oriented Subspaces

Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek
2012 2012 IEEE 12th International Conference on Data Mining  
Our model enables to search for outliers in arbitrarily oriented subspaces of the original feature space.  ...  In this paper, we propose a novel outlier detection model to find outliers that deviate from the generating mechanisms of normal instances by considering combinations of different subsets of attributes  ...  CONCLUSIONS We proposed a local model that takes correlations among varying subsets of attributes into account in order to find outliers in arbitrarily oriented subspaces.  ... 
doi:10.1109/icdm.2012.21 dblp:conf/icdm/KriegelKSZ12 fatcat:3qanbuufujclneckcs6oad5qf4

Compound Segmentation via Clustering on Mol2Vec-based Embeddings

Daniyal Kazempour, Anna Beer, Melanie Oelker, Peer Kröger, Thomas Seidl
2021 2021 IEEE 17th International Conference on eScience (eScience)  
Preliminary results from subspace clusterings indicate that a compression of the vector representations seems viable.  ...  Further, we investigate how far subspace clustering can be utilized to compress the data by reducing the dimensionality of the compounds vector representation.  ...  PCA is a linear dimensionality reduction technique that yields a single arbitrarily oriented subspace.  ... 
doi:10.1109/escience51609.2021.00016 fatcat:enq7c3iigza4vacekzlfxrrx24

Robust Clustering in Arbitrarily Oriented Subspaces [chapter]

Elke Achtert, Christian Böhm, Jörn David, Peer Kröger, Arthur Zimek
2008 Proceedings of the 2008 SIAM International Conference on Data Mining  
In this paper, we propose an efficient and effective method to find arbitrarily oriented subspace clusters by mapping the data space to a parameter space defining the set of possible arbitrarily oriented  ...  In contrast to existing approaches, our method can find subspace clusters of different dimensionality even if they are sparse or are intersected by other clusters within a noisy environment.  ...  In this paper we focus on the generalized problem of finding arbitrarily oriented subspace clusters.  ... 
doi:10.1137/1.9781611972788.69 dblp:conf/sdm/AchtertBDKZ08 fatcat:5oqy3obfzbgoppcvsv5kfotejm

Detecting Arbitrarily Oriented Subspace Clusters in Data Streams Using Hough Transform [chapter]

Felix Borutta, Daniyal Kazempour, Felix Mathy, Peer Kröger, Thomas Seidl
2020 Lecture Notes in Computer Science  
In this work, we present a novel oriented subspace clustering algorithm that is able to deal with such issues and detects arbitrarily oriented subspace clusters in high-dimensional data streams.  ...  We therefore propose the CashStream algorithm that unites state-of-the-art stream processing techniques and additionally relies on the Hough transform to detect arbitrarily oriented subspace clusters.  ...  In this work, we tackle this problem and present a novel oriented subspace clustering algorithm that is able to detect arbitrarily oriented subspace clusters in data streams.  ... 
doi:10.1007/978-3-030-47426-3_28 fatcat:ptajh7o4nndhfpu6zypm3mfiuu

Subspace Nearest Neighbor Search - Problem Statement, Approaches, and Discussion [chapter]

Michael Hund, Michael Behrisch, Ines Färber, Michael Sedlmair, Tobias Schreck, Thomas Seidl, Daniel Keim
2015 Lecture Notes in Computer Science  
In this position paper, we frame a new research problem, called subspace nearest neighbor search, aiming at multiple query-dependent subspaces for nearest neighbor search.  ...  More specifically, the relevance of dimensions may depend on the query object itself, and in general, different dimension sets (subspaces) may be appropriate for a query.  ...  like to thank the German Research Foundation (DFG) for financial support within the projects A03 of SFB/Transregio 161 "Quantitative Methods for Visual Computing" and DFG-664/11 "SteerSCiVA: Steerable Subspace  ... 
doi:10.1007/978-3-319-25087-8_29 fatcat:tugiqueqxncrnmnatl6xbmyfiq

Global Correlation Clustering Based on the Hough Transform

Elke Achtert, Christian Böhm, Jörn David, Peer Kröger, Arthur Zimek
2008 Statistical analysis and data mining  
In this article, we propose an efficient and effective method for finding arbitrarily oriented subspace clusters by mapping the data space to a parameter space defining the set of possible arbitrarily  ...  oriented subspaces.  ...  In this article, we focus on the generalized problem of finding arbitrarily oriented subspace clusters.  ... 
doi:10.1002/sam.10012 fatcat:viiartbyrbbtxktm55dpscmis4

Local Subspace-Based Outlier Detection using Global Neighbourhoods [article]

Bas van Stein, Matthijs van Leeuwen, Thomas Bäck
2016 arXiv pre-print
Outlier detection in high-dimensional data is a challenging yet important task, as it has applications in, e.g., fraud detection and quality control.  ...  We therefore introduce GLOSS, an algorithm that performs local subspace outlier detection using global neighbourhoods.  ...  Subspace Outlier Detection (SOD) [7] is an algorithm that searches for outliers in meaningful subspaces of the data space or even in arbitrarily-oriented subspaces [8].  ... 
arXiv:1611.00183v1 fatcat:tmsg74rc25hr3dhda5wsik6dii

Discriminative features for identifying and interpreting outliers

Xuan Hong Dang, Ira Assent, Raymond T. Ng, Arthur Zimek, Erich Schubert
2014 2014 IEEE 30th International Conference on Data Engineering  
We propose an algorithm that uncovers outliers in subspaces of reduced dimensionality in which they are well discriminated from regular objects while at the same time retaining the natural local structure  ...  We consider the problem of outlier detection and interpretation.  ...  seeking local outliers from varying density data; (2) SOD [23] which seeks outliers in axis-parallel subspaces; (3) COP [25] which finds outliers in arbitrarily oriented subspaces; (4) ABOD [26]  ... 
doi:10.1109/icde.2014.6816642 dblp:conf/icde/DangANZS14 fatcat:r76u7u4vlbf5zcczmfqhtlsc7i

A Study on Clustering High Dimensional Data Using Hubness Phenomenon

V Suganthi, S. Tamilarasi
2014 IOSR Journal of Computer Engineering  
To overcome this problem, the proposed system is to use hub based clustering technique to improve the quality of cluster in terms of effectiveness and accuracy, and to avoid only detecting hyper-spherical  ...  In recent years, data repository has a high dimensional data, which makes a complete search in most of the data mining problems leads computationally infeasible.  ...  Clustering and outlier detection are the tasks that takes a vital role in hubness phenomenon.  ... 
doi:10.9790/0661-16282230 fatcat:3x4wcv3we5g7jn5fojr4mpb7jm

Dynamic Sparse Subspace Clustering for Evolving High-Dimensional Data Streams

Jinping Sui, Zhen Liu, Li Liu, Alexander Jung, Xiang Li
2020 IEEE Transactions on Cybernetics  
In addition, the subspace evolution detection model based on the Page-Hinkley test is proposed where the appearing, disappearing, and recurring subspaces can be detected and adapted.  ...  It has been observed that high-dimensional data are usually distributed in a union of low-dimensional subspaces.  ...  ACKNOWLEDGMENT The authors would like to acknowledge Timo Huuhtanen for his help in polishing this article and Zhang et al. [7], Hyde et al. [9], and Peng et al.  ... 
doi:10.1109/tcyb.2020.3023973 pmid:33232249 fatcat:jqjwgmevkffpzcn2gqtvla2ome

Projective clustering by histograms

E.K.K. Ng, A.W.-C. Fu, R.C.-W. Wong
2005 IEEE Transactions on Knowledge and Data Engineering  
The histograms help to generate "signatures", where a signature corresponds to some region in some subspace, and signatures with a large number of data objects are identified as the regions for subspace  ...  onto the subspaces.  ...  Acknowledgements We thank Lai Mei Chiu for her help in the data generations and experiments. We thank the anonymous reviewers for their valuable comments and suggestions.  ... 
doi:10.1109/tkde.2005.47 fatcat:wsv24woa75fflhbr6urbjxrfri

Weighted and robust incremental method for subspace learning

Skocaj, Leonardis
2003 Proceedings Ninth IEEE International Conference on Computer Vision  
In this paper we present a method for subspace learning, which takes these considerations into account.  ...  novel incremental, weighted and robust method for subspace learning.  ...  Consequently, the reconstruction error in outliers is large, which makes their detection easier.  ... 
doi:10.1109/iccv.2003.1238667 dblp:conf/iccv/SkocajL03 fatcat:uqubdhq64ba2lkvlojdtdbf5ve

Local Outlier Detection with Interpretation [chapter]

Xuan Hong Dang, Barbora Micenková, Ira Assent, Raymond T. Ng
2013 Lecture Notes in Computer Science  
In LODI, we develop an approach that explores the quadratic entropy to adaptively select a set of neighboring instances, and a learning method to seek an optimal subspace in which an outlier is maximally  ...  Outlier detection aims at searching for a small set of objects that are inconsistent or considerably deviating from other objects in a dataset.  ...  A similar approach is adopted in [16] where the subspace can be arbitrarily oriented (not only axis-parallel) and a form of outlier characterization based on vector directions have been proposed.  ... 
doi:10.1007/978-3-642-40994-3_20 fatcat:wya2mjnxibgkxatcyaozsbmney

Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces

Kaushik Chakrabarti, Sharad Mehrotra
2000 Very Large Data Bases Conference  
In practice, datasets are often not globally correlated.  ...  In such cases, reducing the data dimensionality using GDR causes significant loss of distance information resulting in a large number of false positives and hence a high query cost.  ...  We thank Corel Corporation for making the large collection of images used in the COL-HIST dataset available to us.  ... 
dblp:conf/vldb/ChakrabartiM00 fatcat:tgzai3lsgzfmtfan6wq5d2hke4

Cluster Ensemble Approach for High Dimensional Data

2018 Australian Journal of Basic and Applied Sciences  
In this paper, we address the problem of combining multiple weighted clusters which belong to different subspaces of the input space.  ...  It plays a crucial and initial role in machine learning, data mining and information retrieval.  ...  It is also possible to determine the outliers in arbitrarily oriented subspaces of the data (Song, Q., et al., 2011).  ... 
doi:10.22587/ajbas.2018.12.1.9 fatcat:cnsct4lqarhp7ar66ggi53b4mq