A hosting service of multi-language historage repositories

Kyohei Uemura, Yusuke Saito, Shin Fujiwara, Daiki Tanaka, Kenji Fujiwara, Hajimu Iida, Kenichi Matsumoto
2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS)
In Mining Software Repositories research, source code repositories are one of the core data sources, since they contain both the product and the process of software development. A source code repository stores the versions of files and makes it possible to browse the histories of files, such as modification dates, authors, messages, and so on. Although such rich information about file histories is easily available, extracting the histories of methods/functions, which are elements of source code files, is not easy from general code repositories. To tackle this difficulty, we have developed Historage, a fine-grained version control system. A Historage repository is a Git repository built upon an original Git repository. Therefore, mining techniques for general Git repositories are also applicable to Historage repositories. We have also developed Kataribe, a hosting service for Historage repositories, which contains hundreds of Historage repositories constructed from GitHub repositories written in C#, Java, Python, and Ruby. The list of all Historage and original repositories is available at http://kataribe.naist.jp/public. With this dataset, we will promote in-depth, fine-grained software evolution research across a diversity of programming languages.
doi:10.1109/icis.2016.7550864 dblp:conf/ACISicis/UemuraSFTFIM16 fatcat:vnwfzfhd4jazhlxurhn3eretli