J. Basic. Appl. Sci. Res., 5(2)31-38, 2015.pdf

Ajitha Padmanabhan
2020 figshare.com  
Distributed Data Mining-Survey on outliers  ...  RBRP (Recursive Binning and Re-Projection) RBRP [15] algorithm is a fast algorithm for mining outlier detection specifically for high dimensional data sets.  ...  Our contributions This survey offers the distributed strategies that exist for detecting outliers in large and high dimensional data sets.  ... 
doi:10.6084/m9.figshare.13325975.v1 fatcat:bwunerjp4vei3azoec2bfvflfy

A comprehensive survey of anomaly detection techniques for high dimensional big data

Srikanth Thudumu, Philip Branch, Jiong Jin, Jugdutt (Jack) Singh
2020 Journal of Big Data  
They called the approach as "fast distributed" and intended for mixed-attribute data sets that deal with sparse high-dimensional data.  ...  [140] proposed a fast parallel anomaly detection approach that is dependent on the attribute value frequency approach, a scalable, high-speed outlier detection process for categorical data that is effortless  ...  Authors' contributions ST conducted the systematic literature review and examined various techniques related to the problems of anomaly detection in high-dimensional big data.  ... 
doi:10.1186/s40537-020-00320-x fatcat:nrx7fnuzbvf65edoisv65by4s4

Outlier detection special issue

Sanjay Chawla, David Hand, Vasant Dhar
2010 Data mining and knowledge discovery  
Here is a simple example: Let D be a multi-variate data set and the objective is to discover whether there are  ...  The objective of Outlier Detection in Data Mining is in similar vein: outliers often embody new information, which is often hard to explain in the context of existing knowledge and results in a re-evaluation  ...  Koufakou and Georgiopoulos in "A Fast Outlier Detection Strategy for Distributed High-Dimensional Data Sets with Mixed Attributes," present algorithms for detecting outliers in situations where data  ... 
doi:10.1007/s10618-009-0163-0 fatcat:oaekiiwzzndztmcirca6ykggrq

Fast Mining of Distance-Based Outliers in High-Dimensional Datasets [chapter]

Amol Ghoting, Srinivasan Parthasarathy, Matthew Eric Otey
2006 Proceedings of the 2006 SIAM International Conference on Data Mining  
In this paper, we present RBRP, a fast algorithm for mining distance-based outliers, particularly targeted at high-dimensional data sets.  ...  Existing algorithms for mining distance-based outliers do not scale to large, high-dimensional data sets.  ...  We now present RBRP (Recursive Binning and Re-Projection), a two-phase algorithm for fast mining of distance-based outliers in high dimensional data sets.  ... 
doi:10.1137/1.9781611972764.70 dblp:conf/sdm/GhotingPO06 fatcat:yxlydqc2r5fj3b3g6lplwdv2fu

Fast mining of distance-based outliers in high-dimensional datasets

Amol Ghoting, Srinivasan Parthasarathy, Matthew Eric Otey
2008 Data mining and knowledge discovery  
In this paper, we present RBRP, a fast algorithm for mining distance-based outliers, particularly targeted at high-dimensional data sets.  ...  Existing algorithms for mining distance-based outliers do not scale to large, high-dimensional data sets.  ...  We now present RBRP (Recursive Binning and Re-Projection), a two-phase algorithm for fast mining of distance-based outliers in high dimensional data sets.  ... 
doi:10.1007/s10618-008-0093-2 fatcat:cbm52bnsdvefdbkm7klglwgbzm

A Scalable and Efficient Outlier Detection Strategy for Categorical Data

Anna Koufakou, Enrique G. Ortiz, Michael Georgiopoulos, Georgios C. Anagnostopoulos, Kenneth M. Reynolds
2007 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007)  
In this paper, we introduce Attribute Value Frequency (AVF), a fast and scalable outlier detection strategy for categorical data.  ...  AVF scales linearly with the number of data points and attributes, and relies on a single data scan.  ...  In this paper, we introduce an outlier detection strategy for categorical data, called Attribute Value Frequency (AVF).  ... 
doi:10.1109/ictai.2007.125 dblp:conf/ictai/KoufakouOGAR07 fatcat:cauqplj4zvdtrar2pa6anwji5y

Outlier Detection on Mixed-Type Data: An Energy-based Approach [article]

Kien Do, Truyen Tran, Dinh Phung, Svetha Venkatesh
2016 arXiv   pre-print
In this paper, we propose a new unsupervised outlier detection method for mixed-type data based on Mixed-variate Restricted Boltzmann Machine (Mv.RBM).  ...  However, real world data is increasingly heterogeneous, where a data point can have both discrete and continuous attributes. Handling mixed-type data in a disciplined way remains a great challenge.  ...  Acknowledgments This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning.  ... 
arXiv:1608.04830v1 fatcat:ibug7uuvl5hfniuyty6yksl5w4

ELKI: A large open-source library for data analysis - ELKI Release 0.7.5 "Heidelberg" [article]

Erich Schubert, Arthur Zimek
2019 arXiv   pre-print
The focus of ELKI is research in algorithms, with an emphasis on unsupervised methods in cluster analysis and outlier detection.  ...  We will first outline the motivation for this release, the plans for the future, and then give a brief overview over the new functionality in this version.  ...  COP: Correlation Outlier Probability Outlier detection using variance analysis on angles, especially for high dimensional data sets.  ... 
arXiv:1902.03616v1 fatcat:ws3f5ymembeg3dlwawlz6njmvq

Multivariate Computing and Robust Estimating for Outlier and Novelty in Data and Imaging Sciences [chapter]

Michelle Yongmei Wang, Chris E. Zwilling
2015 Advances in Bioengineering  
Acknowledgements This work is supported in part by a grant from the National Institute of Health, K25AG033725.  ...  In [48], several extensions to the classical outlier detection framework are proposed to handle high-dimensional imaging data.  ...  Procedures using the classical MCD estimator are not well-suited for such high-dimensional data.  ... 
doi:10.5772/59750 fatcat:6ybdtwliozhpfpcapxay6zssky

FP-outlier: Frequent pattern based outlier detection

Zengyou He, Xiaofei Xu, Zhexue Huang, Shengchun Deng
2005 Computer Science and Information Systems  
In this paper, we present a new method to detect outliers by discovering frequent patterns (or frequent itemsets) from the data set.  ...  An outlier in a dataset is an observation or a point that is considerably dissimilar to or inconsistent with the remainder of the data.  ...  The High Technology Research and Development Program of China (Grant No. 2002AA413310, Grant No. 2003AA4Z2170, Grant No. 2003AA413021), the National Nature Science Foundation of China (Grant No. 40301038  ... 
doi:10.2298/csis0501103h fatcat:yvnwv3zelfelbmixxvlpmv5a7y

Outlier Detection Strategies for WSNs: A Survey

Bhanu Chander, G. Kumaravelan
2021 Journal of King Saud University: Computer and Information Sciences  
Furthermore, each aforementioned outlier detection approach is presented with detailed discussions and future scope for developments.  ...  Thus, detecting outliers in WSNs using data-driven approaches becomes a novel technique among the Machine Learning (ML) communities.  ...  High dimensional data WSNs contain a large amount of sensed data points, each data point holds several attributes.  ... 
doi:10.1016/j.jksuci.2021.02.012 fatcat:rpgswasszzbgdbkziskhqrqjam

Outlier Detection: A Research and Modified Method using Fuzzy Clustering

2020 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
Outlier detection is studied extensively in data mining; some techniques are developed for certain application domains, while others are generic in nature.  ...  This paper briefly provides a survey on outlier detection and a modified approach to detect outliers using fuzzy clustering.  ...  Madhubala for her continuous support and encouragement in making this research work successful.  ... 
doi:10.35940/ijitee.c1091.0193s20 fatcat:x5fid7jilncw3kz6eouieay5ve

An Optimized Computational Framework for Isolation Forest

Zhen Liu, Xin Liu, Jin Ma, Hui Gao
2018 Mathematical Problems in Engineering  
Yet, in the model setting, it is mainly based on the technique of randomization and, as a result, it is not clear how to select a proper attribute and how to locate an optimized split point on a given  ...  According to the experimental results, the proposed model is able to achieve overall better performance in the accuracy of outlier detection compared with the original model and its related variants.  ...  a graph and detect the nodes as outliers in the graph with low degrees.  ... 
doi:10.1155/2018/2318763 fatcat:zvjqmngnk5ckjmesszj5fgwa2m

A review of novelty detection

Marco A.F. Pimentel, David A. Clifton, Lei Clifton, Lionel Tarassenko
2014 Signal Processing  
[162] propose a distance measure for data containing a mix of categorical and continuous attributes.  ...  Many of these algorithms are unable to deal with high-dimensional data sets efficiently.  ... 
doi:10.1016/j.sigpro.2013.12.026 fatcat:ha6kc4bzhbajxbo2mdyh5cw5hu

A Supervised Approach for Detection of Outliers in Healthcare Claims Data

P Naga Jyothi, D Rajya Lakshmi, K.V.S.N.Rama Rao
2020 Journal of Engineering Science and Technology Review  
Outlier detection is a fast-moving method in healthcare data and it is the major concern for the health insurance providers. Most of the Medicare data is related to real-world data.  ...  The paper presents a model-based approach in which outliers are detected and they were assigned with labels.  ...  This is an Open Access article distributed under the terms of the Creative Commons Attribution License  ... 
doi:10.25103/jestr.131.25 fatcat:lrbic3gszjeyxhzc4aon4yqjza