Clustering Commits for Understanding the Intents of Implementation

Kenji Yamauchi, Jiachen Yang, Keisuke Hotta, Yoshiki Higo, Shinji Kusumoto
2014 IEEE International Conference on Software Maintenance and Evolution
This paper proposes a novel technique for clustering commits to help developers understand the intents behind an implementation. Such a classification should help developers understand the commits related to a particular requirement, for example, how and why a function was implemented, or whether it has suffered from any bugs. Our technique applies a clustering algorithm to the identifier names related to the changes in each commit. This approach lets us take the semantics of each commit into account without relying on commit messages, so it is robust when some commits lack accurate descriptions. We conducted a pilot study to confirm that the idea meets our objective. The pilot study found some good examples that showed the usefulness of our approach, along with some undesirable results that suggested ways to improve it.
Hindle et al. proposed an automated technique to classify commits into maintenance categories, namely Corrective, Adaptive, Perfective, Feature Addition, and Non Functional [1]. Their technique trains machine learning models on the word distribution of commit messages, the names of committers, and the number of changed files per directory.
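The idea of clustering commits by the identifier names touched in their changes can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the similarity measure, so everything below is an assumption made for illustration: identifiers are split into words on camelCase and snake_case boundaries, commits are compared with Jaccard similarity over those words, and commits are grouped by a simple single-link rule with a hypothetical `threshold` parameter. The function names and the sample commits are likewise hypothetical and are not taken from the paper.

```python
import re
from itertools import combinations


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def commit_terms(identifiers):
    """Collect the set of words found in the identifiers changed by one commit."""
    terms = set()
    for ident in identifiers:
        terms.update(split_identifier(ident))
    return terms


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_commits(commits, threshold=0.3):
    """Group commits whose term sets are similar (single-link, assumed rule).

    `commits` maps a commit id to the identifier names its change touches.
    Two commits are linked when their Jaccard similarity is at least
    `threshold`; clusters are the connected groups of linked commits.
    """
    terms = {cid: commit_terms(ids) for cid, ids in commits.items()}
    parent = {cid: cid for cid in commits}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(commits, 2):
        if jaccard(terms[a], terms[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for cid in commits:
        clusters.setdefault(find(cid), []).append(cid)
    return sorted(sorted(group) for group in clusters.values())


# Hypothetical commits; no commit messages are used at any point.
example_commits = {
    "c1": ["parseHttpHeader", "HttpRequest"],
    "c2": ["HttpHeader", "parse_request"],
    "c3": ["renderButton", "buttonColor"],
    "c4": ["ButtonStyle", "render_button"],
}
example_clusters = cluster_commits(example_commits)
```

In this toy input, the two HTTP-related commits share all of their words and the two button-related commits share half of theirs, so each pair would be expected to fall into its own cluster. A real study would likely need term weighting and a more careful choice of algorithm and threshold, which this sketch does not attempt.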
doi:10.1109/icsme.2014.63 dblp:conf/icsm/YamauchiYHHK14 fatcat:wjqpj45dn5hdvenl6dsupje7ky