An Evolving Fuzzy Model to Determine an Optimal Number of Data Stream Clusters

Hussein A. A. Al-Khamees, Nabeel Al-A'araji, Eman S. Al-Shamery
2022 International Journal of Fuzzy Logic and Intelligent Systems  
Data streams are a modern type of data that differ from traditional data in several characteristics: their indefinite size, high access, and concept drift due to their origin in non-stationary environments. Data stream clustering aims to split these data samples into significant clusters according to their similarity. The main drawback of data stream clustering algorithms is the large number of clusters they produce. Therefore, determining an optimal number of clusters is an important ... for these algorithms. In practice, evolving models can change their general structure by implementing different mechanisms. This paper presents a fuzzy model that mainly consists of an evolving Cauchy clustering algorithm, which is updated through a specific membership function and determines the optimal number of clusters by implementing two evolving mechanisms: adding and splitting clusters. The proposed model was tested on six different streaming datasets, namely, power supply, sensor, HuGaDB, UCI-HAR, Luxembourg, and keystrokes. The results demonstrated that the proposed model outperforms previous models in producing an optimal number of clusters for each dataset.
doi:10.5391/ijfis.2022.22.3.267 fatcat:5lxsn4tjcngjreqiv43l4qllwi
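
The abstract names an evolving Cauchy clustering algorithm with a cluster-adding mechanism but gives no formulas. The sketch below is only an illustration of that general idea, not the authors' method. It assumes a standard Cauchy-type membership 1 / (1 + d^2 / sigma^2), a fixed width `sigma`, and a hypothetical `add_threshold` below which a new cluster is created. The class and function names are made up for this example. The splitting mechanism is left out because the abstract does not state its criterion.

```python
def cauchy_membership(x, center, sigma):
    """Cauchy-type membership: 1 / (1 + squared distance / sigma^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return 1.0 / (1.0 + d2 / sigma ** 2)


class EvolvingClusters:
    """Illustrative one-pass clustering with an 'add cluster' rule only."""

    def __init__(self, sigma=1.0, add_threshold=0.5):
        self.sigma = sigma                  # assumed fixed cluster width
        self.add_threshold = add_threshold  # hypothetical parameter
        self.centers = []                   # one center per cluster
        self.counts = []                    # samples absorbed per cluster

    def update(self, x):
        """Process one streaming sample and return its cluster index."""
        x = list(x)
        if self.centers:
            memberships = [
                cauchy_membership(x, c, self.sigma) for c in self.centers
            ]
            best = max(range(len(memberships)), key=memberships.__getitem__)
            if memberships[best] >= self.add_threshold:
                # Absorb the sample: incremental mean update of the center.
                self.counts[best] += 1
                n = self.counts[best]
                self.centers[best] = [
                    c + (a - c) / n for c, a in zip(self.centers[best], x)
                ]
                return best
        # No cluster is similar enough (or none exists yet): add a new one.
        self.centers.append(x)
        self.counts.append(1)
        return len(self.centers) - 1


# Usage on a tiny synthetic stream (not one of the paper's datasets).
model = EvolvingClusters(sigma=1.0, add_threshold=0.5)
stream = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, -0.1)]
labels = [model.update(sample) for sample in stream]
print(labels, len(model.centers))
```

In this sketch the number of clusters grows only when a sample's best membership falls below the threshold, so `sigma` and `add_threshold` together control how many clusters are produced.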