Impact of Load Demand Dataset Characteristics on Clustering Validation Indices [article]

Mayank Jain, Mukta Jain, Tarek AlSkaif, Soumyabrata Dev
2021 arXiv   pre-print
With the inclusion of smart meters, electricity load consumption data can be fetched for individual consumer buildings at high temporal resolutions. Availability of such data has made it possible to study daily load demand profiles of the households. Clustering households based on their demand profiles is one of the primary, yet a key component of such analysis. While many clustering algorithms/frameworks can be deployed to perform clustering, they usually generate very different clusters. In
more » ... der to identify the best clustering results, various cluster validation indices (CVIs) have been proposed in the literature. However, it has been noticed that different CVIs often recommend different algorithms. This leads to the problem of identifying the most suitable CVI for a given dataset. Responding to the problem, this paper shows how the recommendations of validation indices are influenced by different data characteristics that might be present in a typical residential load demand dataset. Furthermore, the paper identifies the features of data that prefer/prohibit the use of a particular cluster validation index.
arXiv:2108.01433v1 fatcat:3ffj55ltrvdgxhopuifkdn6z4u