MR-DBIFOA: a parallel Density-based Clustering Algorithm by Using Improve Fruit Fly Optimization

Wei Liu, Jiaxin Wang, Xiaopan Su, Yimin Mao
2022 Diànnǎo xuékān  
Clustering is an important technique for data analysis and knowledge discovery. In the context of big data, density-based clustering algorithms face three challenging problems: unreasonable division of the data grid, poor parameter optimization ability, and low parallelization efficiency. In this study, a density-based clustering algorithm using improved fruit fly optimization based on MapReduce (MR-DBIFOA) is proposed to tackle these three problems. Firstly, based on KD-Tree, a ... ion strategy (KDG) is proposed to divide the grid cells adaptively. Secondly, an improved fruit fly optimization algorithm (IFOA), which uses a step strategy based on knowledge learning (KLSS) and a clustering criterion function (CFF), is designed. In addition, based on IFOA, the optimal parameters of local clustering are selected dynamically, which improves the local clustering results. Meanwhile, to improve parallel efficiency, the density-based clustering algorithm using IFOA (MR-QRMEC) is proposed to compute the local clusters in parallel. Finally, based on QR-Tree and MapReduce, a cluster merging algorithm (MR-QRMEC) is proposed to obtain the clustering result more quickly, which improves the efficiency of merging core clusters in density-based clustering. The experimental results show that MR-DBIFOA produces better clustering results and parallelizes better on big data.
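For readers unfamiliar with fruit fly optimization, the following is a minimal sketch of the basic (unimproved) fruit fly optimization loop that IFOA builds on. It is not the paper's IFOA: the KLSS step strategy and the CFF criterion are not specified in the abstract, so a fixed step size and a toy objective stand in for them. The function name `basic_foa`, its parameters, and the toy objective are all hypothetical and chosen only for illustration.

```python
import math
import random


def basic_foa(objective, n_flies=20, n_iter=200, step=1.0, seed=0):
    """Minimise objective(s) for s > 0 with the basic fruit fly optimization loop.

    Illustrative only: fixed step size, toy objective, no KLSS or CFF.
    """
    rng = random.Random(seed)
    # Random initial swarm location (range is an arbitrary assumption).
    x_axis, y_axis = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
    best_s, best_smell = None, math.inf
    best_x, best_y = x_axis, y_axis
    for _ in range(n_iter):
        for _ in range(n_flies):
            # Each fly searches in a random direction around the swarm location.
            x = x_axis + rng.uniform(-step, step)
            y = y_axis + rng.uniform(-step, step)
            dist = math.hypot(x, y)
            if dist == 0.0:
                continue  # avoid division by zero at the origin
            s = 1.0 / dist        # smell concentration judgment value
            smell = objective(s)  # smell concentration (lower is better here)
            if smell < best_smell:
                best_s, best_smell = s, smell
                best_x, best_y = x, y
        # The swarm flies to the best location found so far.
        x_axis, y_axis = best_x, best_y
    return best_s, best_smell


# Toy objective standing in for a clustering criterion: minimum at s = 0.5.
best_s, best_smell = basic_foa(lambda s: (s - 0.5) ** 2)
```

In a clustering setting, the candidate value `s` would be mapped to a clustering parameter and the objective would score the resulting local clustering; that mapping is an assumption here, since the abstract does not describe it.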
doi:10.53106/199115992022023301010 fatcat:gzehehxvwngipg3vhdeeiuuk4u