A graph-based dataset of commit history of real-world Android apps

Franz-Xaver Geiger, Ivano Malavolta, Luca Pascarella, Fabio Palomba, Dario Di Nucci, Alberto Bacchelli
2018 Proceedings of the 15th International Conference on Mining Software Repositories - MSR '18  
Obtaining a good dataset to conduct empirical studies on the engineering of Android apps is an open challenge. To start tackling this challenge, we present AndroidTimeMachine, the first self-contained, publicly available dataset weaving spread-out data sources about real-world, open-source Android apps. Encoded as a graph-based database, AndroidTimeMachine concerns 8,431 real open-source Android apps and contains: (i) metadata about the apps' GitHub projects, (ii) Git repositories with full commit history, and (iii) metadata extracted from the Google Play store, such as app ratings and permissions.
CCS CONCEPTS • Software and its engineering → Maintaining software
KEYWORDS Android, Mining Software Repositories, Dataset
MATCH (c:Commit) WHERE c.message CONTAINS 'performance' SET c:PerformanceFix
Also, given these additional labels, performance-related fixes can then be used in any kind of query via the following snippet.
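The snippet that sentence refers to is not reproduced in this record. As an illustration only, and not the authors' query, the following minimal Cypher sketch assumes the PerformanceFix label has already been set by the MATCH ... SET statement above and uses only the label and the message property named there. It counts the labeled commits and lists a few of their messages.

```cypher
// Illustrative sketch, not taken from the paper: count the commits that
// carry the PerformanceFix label added by the SET statement above.
MATCH (c:PerformanceFix)
RETURN count(c) AS performanceFixes;

// List a few labeled commit messages (the LIMIT value is arbitrary).
MATCH (c:PerformanceFix)
RETURN c.message AS message
LIMIT 10;
```

Because CONTAINS is a case-sensitive substring match, commits whose message spells the word differently (for example with a capital P) would not have been labeled and so would not appear in these results.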
doi:10.1145/3196398.3196460 dblp:conf/msr/GeigerMPPNB18 fatcat:hyhvjuvy7fbzhpxd55gpinbavq