Implementation of the Fuzzy C-Means Clustering Algorithm in Meteorological Data

Yinghua Lu, Tinghuai Ma, Changhong Yin, Xiaoyu Xie, Wei Tian, Shui Ming Zhong
2013 International Journal of Database Theory and Application  
This paper proposes an improved fuzzy c-means (FCM) algorithm, built on the traditional FCM algorithm, and applies it to meteorological data. The proposed algorithm adopts a novel strategy for selecting the initial cluster centers, addressing the difficulty the traditional FCM clustering algorithm has in choosing them. The paper also introduces the features and the mining process of the open-source data mining platform WEKA, which does not implement the FCM algorithm. To address this shortcoming of WEKA, we implement both the FCM algorithm and the improved FCM algorithm using the basic classes in WEKA. Finally, experimental clustering results on meteorological data are given, which show that the proposed algorithm generates better clustering results than the K-Means algorithm and the traditional FCM algorithm.

... probability of the observation object being a part of different groups, which reduces the effectiveness of hard clustering methods in many real situations. To address this, fuzzy clustering methods, which incorporate fuzzy set theory [5], have emerged. Fuzzy clustering methods [6-8] quantitatively determine the affinities of different objects with mathematical methods, described by a membership function, to divide types objectively. Among fuzzy clustering methods, the fuzzy c-means (FCM) algorithm [9] is the best known because it is robust to ambiguity and retains much more information than any hard clustering method. The algorithm is an extension of the classical crisp k-means clustering method to the fuzzy set domain. It is widely studied and applied in pattern recognition, image segmentation and image clustering [10-12], data mining [13], wireless sensor networks [14] and so on.

WEKA (Waikato Environment for Knowledge Analysis), based on the Java environment, is a free, non-commercial, open-source platform for machine learning and data mining. It implements several well-known data mining algorithms, and users can call the appropriate algorithm for their purposes. However, the FCM algorithm is not integrated into WEKA.
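For reference, the classical FCM algorithm described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' WEKA (Java) implementation, and it does not include the improved initial-center strategy, which this excerpt does not specify. The function name `fcm`, its parameters, the random initialization of memberships, and the stopping rule are assumptions made for the example.

```python
import random


def fcm(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Classical fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, u), where u[i][j] is the membership of point i
    in cluster j. Illustrative sketch only; all names are assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    power = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Center j is the mean of all points weighted by u[i][j] ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            for j in range(c):
                new = 1.0 / sum((dist[j] / dist[k]) ** power for k in range(c))
                shift = max(shift, abs(new - u[i][j]))
                u[i][j] = new
        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, memberships = fcm(data, c=2)
    print(centers)
```

Unlike k-means, every point keeps a degree of membership in every cluster, and the fuzzifier `m` (assumed to be 2.0 here) controls how soft the assignment is. Because the starting memberships are random, results can depend on the seed, which is the sensitivity to initialization that motivates the improved algorithm.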
In this paper, we implement the FCM algorithm and integrate it into WEKA to extend the functionality of the open-source platform, so that users can call the FCM algorithm directly for fuzzy clustering analysis. In addition, considering the shortcoming of the classical FCM algorithm in selecting the initial cluster centers, we present an improved FCM algorithm that adopts a new strategy to optimize the selection of the initial cluster centers. The structure of this paper is as follows. In the next section, we give a brief review of WEKA and the FCM algorithm. Section 3 describes the main ideas of the traditional FCM algorithm. In Section 4, we present our proposed algorithm based on the traditional FCM algorithm. Experimental results on meteorological data are shown in Section 5. Finally, conclusions and future work are summarized.

Copyright ⓒ 2013 SERSC

... priori knowledge. So the conclusion is that the improved FCM algorithm is better than the traditional K-means algorithm from the perspective of a priori meteorological knowledge.
doi:10.14257/ijdta.2013.6.6.01 fatcat:wfaon726frf3feup6jziw3qine