Guest editorial: mining software repositories

Martin Pinzger, Sunghun Kim
2016 Empirical Software Engineering  
The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects. Thanks to the ready availability of software configuration management, mailing list, and bug tracking repositories from open source projects, it has gained popularity since 2004 and continues to be one of the fastest-growing fields in software engineering. Researchers in this field empirically
explore a range of software engineering questions using software repository data as the primary source of information. Commonly explored areas include software evolution, models of software development processes, characterization of developers and their activities, prediction of future software qualities, use of machine learning techniques on software project data, software bug prediction, analysis of software change patterns, and analysis of code clones. This special issue presents five recent MSR papers, which are briefly discussed below. The paper "An In-Depth Study of the Promises and Perils of Mining GitHub" by Kalliamvakou, Gousios, Blincoe, Damian, Singer, and German reports the characteristics of repositories and users on GitHub, including how users take advantage of GitHub's main features and how their activities are tracked on GitHub and in related datasets, in order to point out misalignments between the real and mined data. The results indicate that while GitHub provides a rich source of data on software development, mining GitHub for research purposes should take various potential perils into account. In the paper "Studying Just-In-Time Defect Prediction Using Cross-Project Models" by Kamei, Fukushima, McIntosh, Yamashita, Ubayashi, and Hassan, the cold start problem
doi:10.1007/s10664-016-9450-8 fatcat:yu2pzpbdp5gbhjnw67i3om56cy