An in-depth study of the promises and perils of mining GitHub

Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M. German, Daniela Damian
2015 Empirical Software Engineering  
We document the results of an empirical study aimed at understanding the characteristics of the repositories and users in GitHub; we see how users take advantage of GitHub's main features and how their  ...  Researchers mine the information stored in GitHub's event logs to understand how its users employ the site to collaborate on software, but so far there have been no studies describing the quality and properties  ...  Acknowledgments We would like to thank the authors of Padhye et al. (2014) and Matragkas et al. (2014) for their valuable feedback regarding the evaluation of the impact of these perils on their research  ... 
doi:10.1007/s10664-015-9393-5 fatcat:hoiypztavrg33n3stbn3a7vuwm

Guest editorial: mining software repositories

Martin Pinzger, Sunghun Kim
2016 Empirical Software Engineering  
The paper "An In-Depth Study of the Promises and Perils of Mining GitHub" by Kalliamvakou, Gousios, Blincoe, Damian, Singer, and German reports the characteristics of the repositories and users on GitHub  ...  The results indicate that while GitHub provides a rich source of data on software development, mining GitHub for research purposes should take various potential perils into account.  ...  Acknowledgments We are grateful for the continuous support and encouragement offered by the editorial board for the Journal of Empirical Software Engineering and by the Editor-in-Chief Lionel Briand and  ... 
doi:10.1007/s10664-016-9450-8 fatcat:yu2pzpbdp5gbhjnw67i3om56cy

The promises and perils of mining GitHub

Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M. German, Daniela Damian
2014 Proceedings of the 11th Working Conference on Mining Software Repositories - MSR 2014  
We document the results of an empirical study aimed at understanding the characteristics of the repositories in GitHub and how users take advantage of GitHub's main features, namely commits, pull requests  ...  However, so far there have been no studies describing the quality and properties of the data available from GitHub.  ...  We therefore formulated the following research question to address with this study: RQ: What are the promises and perils of mining GitHub for software engineering research?  ... 
doi:10.1145/2597073.2597074 dblp:conf/msr/KalliamvakouGBSGD14 fatcat:2oo6n7yu2zeajg34whqtxbyike

An in‐depth study of the effects of methods on the dataset selection of public development projects

Can Cheng, Bing Li, Zengyang Li, Peng Liang, Xu Yang
2021 IET Software  
methods under 18 configurations were tested to identify the best configurations in precision and F-measure for selecting PDPs and DPDPs.  ...  Public development projects (PDPs) and documented public development projects (DPDPs) are two types of projects that can provide valuable information on how developers and users participate in OSS projects  ...  To solve this problem, Kalliamvakou et al. investigated different perils of mining GitHub when using automated sample selection methods and suggested strategies to avoid these perils [3].  ... 
doi:10.1049/sfw2.12050 fatcat:yu2uw6rrerdmpemqtxnypjzgw4

Replication Can Improve Prior Results: A GitHub Study of Pull Request Acceptance [article]

Di Chen, Kathryn Stolee, Tim Menzies
2019 arXiv   pre-print
For example, in a primary study, other researchers explored factors influencing the fate of GitHub pull requests using an extensive qualitative analysis of 20 pull requests.  ...  To test the generality of this approach, the next step in future work is to conduct other studies that extend qualitative studies with crowdsourcing and data mining.  ...  ACKNOWLEDGEMENTS The work is partially funded by NSF awards #1506586, #1302169, and #1645136.  ... 
arXiv:1902.04060v1 fatcat:dsg2yqruljbbnlqhipmxha5rgu

Replicating and Scaling up Qualitative Analysis using Crowdsourcing: A Github-based Case Study [article]

Di Chen, Kathryn T. Stolee, Tim Menzies
2017 arXiv   pre-print
That said, they can guide and define the goals of scalable secondary studies that use (e.g.) crowdsourcing+data mining.  ...  Due to the difficulties in replicating and scaling up qualitative studies, such studies are rarely verified.  ...  The promises and perils of mining GitHub  ...  Table 2 summarizes the most representative features these studies state are relevant to determining the fate of a pull request.  ... 
arXiv:1702.08571v2 fatcat:2uz4ww3vwbfmflmhpjztnfaa6e

Are Game Engines Software Frameworks? A Three-perspective Study [article]

Cristiano Politowski, Fabio Petrillo, João Eduardo Montandon, Marco Tulio Valente, Yann-Gaël Guéhéneuc
2020 arXiv   pre-print
Second, we compare the characteristics of the 282 most popular engines and the 282 most popular frameworks in GitHub.  ...  We report that: (1) Game engines are not well-studied in software-engineering research with few studies having engines as object of research. (2) Open-source game engines are slightly larger in terms of  ...  Acknowledgement The authors thank all the anonymous developers for their time. The authors were partly supported by the NSERC Discovery Grant and Canada Research Chairs programs.  ... 
arXiv:2004.05705v3 fatcat:lgiic75lqvd5nhu5kcyv46hmby

STYLE-ANALYZER: fixing code style inconsistencies with interpretable unsupervised algorithms [article]

Vadim Markovtsev, Waren Long, Hugo Mougard, Konstantin Slavnov, Egor Bulychev
2019 arXiv   pre-print
We release STYLE-ANALYZER as a reusable and extendable open source software package on GitHub for the benefit of the community.  ...  showing that it yields promising results in fixing real style mistakes.  ...  on the LOOKOUT framework.  ... 
arXiv:1904.00935v1 fatcat:47ovl7il2vfddazoq4myojuncy

GitHub Projects. Quality Analysis of Open-Source Software [chapter]

Oskar Jarczyk, Błażej Gruszka, Szymon Jaroszewicz, Leszek Bukowski, Adam Wierzbicki
2014 Lecture Notes in Computer Science  
GitHub portal is an online social network that supports development of software by virtual teams of programmers.  ...  After developing the metrics we have gathered characteristics of several GitHub projects and analyzed their influence on the project quality using statistical regression techniques.  ...  In paper "Social coding in GitHub: transparency and collaboration in an open software repository" by Dabbish et al. (2012) a series of in-depth interviews with central and peripheral GitHub users was performed  ... 
doi:10.1007/978-3-319-13734-6_6 fatcat:xhcem7dmobalbk3doidvpvmcsu

On the analysis of non-coding roles in open source development

Javier Luis Cánovas Izquierdo, Jordi Cabot
2021 Empirical Software Engineering  
As a sample of projects for our study we have taken the 100 most popular projects in the ecosystem of NPM, a package manager for JavaScript.  ...  Our results validate the importance of dedicated non-coding contributors in OSS and the diversity of OSS communities as, typically, a contributor specializes in a specific subset of roles.  ...  In: Symposium on the foundations of software engineering, pp 70–80 Kalliamvakou E, Gousios G, Blincoe K, Singer L, Germán DM, Damian DE (2016) An in-depth study of the promises and perils of mining  ... 
doi:10.1007/s10664-021-10061-x fatcat:m2h4jobhtfgqxd4dfrmnqg6rxu

Empirical study on the usage of graph query languages in open source Java projects

Philipp Seifer, Johannes Härtel, Martin Leinberger, Ralf Lämmel, Steffen Staab
2019 Proceedings of the 12th ACM SIGPLAN International Conference on Software Language Engineering - SLE 2019  
In this paper, we present an empirical study on the usage of graph-based query languages in open-source Java projects on GitHub.  ...  We investigate the usage of SPARQL, Cypher, Gremlin and GraphQL in terms of popularity and their development over time.  ...  Acknowledgments The authors gratefully acknowledge the financial support of project LISeQ (LA 2672/1-1) by the German Research Foundation (DFG).  ... 
doi:10.1145/3357766.3359541 dblp:conf/sle/SeiferHLLS19 fatcat:k3yi2esz6rf5rhswtuqeymyb6q

Continuously mining distributed version control systems: an empirical study of how Linux uses Git

Daniel M. German, Bram Adams, Ahmed E. Hassan
2015 Empirical Software Engineering  
Finally, we discuss how continuous mining could be adopted by current D-VCS hosting services.  ...  Even services on top of D-VCSs, like Github, do not provide a way to know the set of all commits in a Super-repository  ...  Distributed version control systems (D-VCSs, such as git and mercurial) and their hosting services (such as Github and Bitbucket) have revolutionized the way in which developers collaborate by allowing  ...  They identified 9 promises and 7 perils of the then brand new D-VCS technology. git promised a larger and richer set of development data, and was able to distinguish between patch authors and committers  ... 
doi:10.1007/s10664-014-9356-2 fatcat:e7iyyh6vubel5deo6qmddfyipm

The Who, What, How of Software Engineering Research: A Socio-Technical Framework [article]

Margaret-Anne Storey, Neil A. Ernst, Courtney Williams, Eirini Kalliamvakou
2020 arXiv   pre-print
the research strategies used in the study (how we methodologically approach delivering relevant results given the who and what of our studies).  ...  We recommend that the framework should be used in the design of future studies in order to nudge software engineering research towards explicitly including human and social concerns in their designs, and  ...  We also thank Marian Petre and the anonymous reviewers for their insightful suggestions to improve our paper.  ... 
arXiv:1905.12841v3 fatcat:l6f4g4yjwzdhxjj4hbiydnt7r4

Future Directions of the Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Program [article]

Ritu Arora
2020 arXiv   pre-print
The main objectives of this workshop were to (1) understand the impact of the CSSI program on the community over the last 9 years, (2) engage workshop participants in identifying gaps and opportunities  ...  in the current CSSI landscape, (3) gather ideas on the cyberinfrastructure needs and expectations of the community with respect to the CSSI program, and (4) prepare a report summarizing the feedback gathered  ...  and mine new "guiding principles".  ... 
arXiv:2010.15584v1 fatcat:lpwo6c6dfjcd5hnns3et4b3v4u

D1.3 Scoping phase report: using new data to address R&I policy needs

Knut Blind
2018 Zenodo  
The specific objectives of the scoping phase are as follows: ● To identify the evidence needs of Research and Innovation (R&I) policymakers across the policy cycle, including agenda setting, policy design  ...  The objective of this work phase was to understand the demand for new innovation indicators and new opportunities to address them with new data sources and methods.  ...  explored in more depth.  ... 
doi:10.5281/zenodo.1932554 fatcat:ra3z6vtu45dmfcrhngzalgm5be