Outlier Detection for Improving Data Robust by ODAD Clustering Technique

Deepti Mishra, NIU, Greater Noida, India
2019 International Journal of Advanced Trends in Computer Science and Engineering  
This paper presents the concept of outliers and a new approach to detecting them. Outliers are the odd-one-out data points in a dataset, and their detection falls within the domain of data mining, a field that is growing rapidly because of its ability to handle large amounts of data. The paper identifies outliers in a dataset with an algorithm named Outlier Detection based on Angle and Distance (ODAD), a clustering-based technique that combines the angle-based and distance-based approaches. The algorithm comprises five basic steps: density calculation, cluster identification, angle calculation, Euclidean distance calculation, and finally outlier identification and detection. It first calculates the density between the data points to identify the "clusters". Then, to single out the outliers, the distance-based method is applied together with the angle-based method to calculate the distance between the data points. The algorithm assigns a rank value to the topmost outlier candidates, and the data points with the highest rank values are considered outliers. ODAD is implemented in both R and MSSQL.
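The five steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration and not the author's R or MSSQL implementation: density is taken as a neighbor count within a hypothetical `radius`, clusters are the points with at least `min_neighbors` neighbors, the angle step uses the variance of cosines between difference vectors (as in common angle-based outlier methods), and the rank is the mean Euclidean distance to cluster members divided by that variance.

```python
import math
from itertools import combinations


def density(points, radius):
    """Step 1 (assumed form): count the other points within `radius` of each point."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def angle_variance(points, i):
    """Step 3 (assumed form): variance of the cosines of the angles at point i
    formed by every pair of other points. A small value means the rest of the
    data lies in roughly one direction, which is typical for an outlier."""
    p = points[i]
    vectors = [[a - b for a, b in zip(q, p)] for j, q in enumerate(points) if j != i]
    cosines = []
    for u, v in combinations(vectors, 2):
        nu, nv = math.hypot(*u), math.hypot(*v)
        if nu == 0 or nv == 0:
            continue  # skip duplicates of point i
        cosines.append(sum(a * b for a, b in zip(u, v)) / (nu * nv))
    if not cosines:
        return 0.0
    mean = sum(cosines) / len(cosines)
    return sum((c - mean) ** 2 for c in cosines) / len(cosines)


def rank_outliers(points, radius, min_neighbors, eps=1e-9):
    """Steps 2, 4 and 5 (assumed form): pick cluster members by density, measure
    the mean Euclidean distance to them, and rank points by distance divided by
    angle variance. Returns (index, score) pairs, highest score first."""
    dens = density(points, radius)
    members = [i for i, d in enumerate(dens) if d >= min_neighbors]
    scores = []
    for i, p in enumerate(points):
        refs = [j for j in members if j != i] or [j for j in range(len(points)) if j != i]
        mean_dist = sum(math.dist(p, points[j]) for j in refs) / len(refs)
        scores.append((i, mean_dist / (angle_variance(points, i) + eps)))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical data: a tight cluster near the origin plus one distant point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
ranking = rank_outliers(sample, radius=2.0, min_neighbors=3)
print(ranking[0][0])  # index of the top-ranked outlier candidate
```

In this sketch the distant point ranks first because it is far from every cluster member and sees all of them within a narrow range of angles. The pairwise angle computation is cubic in the number of points, so it is suitable only for small illustrative datasets.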
doi:10.30534/ijatcse/2019/130862019 fatcat:z7pmxmn735czhpknlmgruchbfu