MapReduce as a general framework to support research in Mining Software Repositories (MSR)

Weiyi Shang, Zhen Ming Jiang, Bram Adams, Ahmed E. Hassan
2009 2009 6th IEEE International Working Conference on Mining Software Repositories  
Researchers continue to demonstrate the benefits of Mining Software Repositories (MSR) for supporting software development and research activities.  ...  In this paper, we propose the use of MapReduce, a distributed computing platform, to support research in MSR.  ...  In this paper, we propose to use MapReduce as a general framework to support research in MSR.  ... 
doi:10.1109/msr.2009.5069477 dblp:conf/msr/ShangJAH09 fatcat:ywcsgsasunbtpjznrzumasrrki

Detailed author index

2009 2009 6th IEEE International Working Conference on Mining Software Repositories  
Data Shang, Weiyi 21 MapReduce as a General Framework to Support Research in Mining Software Repositories (MSR) Shihab, Emad 107 On the Use of Internet Relay Chat (IRC) Meetings by Developers of the GNOME  ...  Using Association Rules to Study the Co-Evolution of Production & Test Code 2009 Detailed Author Index [Page 6 / 11] J Jiang, Zhen Ming 21 MapReduce as a General Framework to Support Research  ... 
doi:10.1109/msr.2009.5069464 fatcat:hbptjwwpvng4hebf6c7ni72siu

Using Pig as a data preparation language for large-scale mining software repositories studies: An experience report

Weiyi Shang, Bram Adams, Ahmed E. Hassan
2012 Journal of Systems and Software  
The Mining Software Repositories (MSR) field analyzes software repository data to uncover knowledge and assist development of ever growing, complex systems.  ...  In this paper, we report on our experience in using a web-scale platform (i.e., Pig) as a data preparation language to aid large-scale MSR studies.  ...  This explosive growth in the availability and size of software data has led to the formation of the Mining Software Repositories (MSR) field (Hassan, 2008).  ... 
doi:10.1016/j.jss.2011.07.034 fatcat:6lupcadplnaivf6flcm7olqs24

Table of contents

2009 2009 6th IEEE International Working Conference on Mining Software Repositories  
Support Research in Mining Software Repositories (MSR) (Weiyi Shang, Zhen Ming Jiang, Bram Adams, Ahmed E.  ...  German, Prem Devanbu) 11 Amassing and Indexing a Large Sample of Version Control Systems: Towards the Census of Public Source Code History (Audris Mockus) 21 MapReduce as a General Framework to  ... 
doi:10.1109/msr.2009.5069462 fatcat:gcfoz3tnqzhphlylybavdke7se

Boa: A language and infrastructure for analyzing ultra-large-scale software repositories

Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, Tien N. Nguyen
2013 2013 35th International Conference on Software Engineering (ICSE)  
In today's software-centric world, ultra-large-scale software repositories, e.g.  ...  However, systematic extraction of relevant data from these repositories and analysis of such data for testing hypotheses is hard, and best left for mining software repository (MSR) experts!  ...  ACKNOWLEDGMENT This work was supported in part by NSF grants CCF-11-17937, CCF-10-17334, CCF-10-18600, and CNS-12-23828.  ... 
doi:10.1109/icse.2013.6606588 dblp:conf/icse/0001NRN13 fatcat:jhx5nyqlxbekbfbz4ooiwmdhni

Boa

Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, Tien N. Nguyen
2015 ACM Transactions on Software Engineering and Methodology  
However, systematic extraction and analysis of relevant data from these repositories for testing hypotheses is hard, and best left for mining software repository (MSR) experts!  ...  In today's software-centric world, ultra-large-scale software repositories, e.g. SourceForge, GitHub, and Google Code, are the new library of Alexandria.  ...  Authors would also like to express our deepest gratitude for early users and supporters of the Boa infrastructure including but not limited to: Bram Adams, Jonathan Aldrich, Gogul Balakrishnan, Don Batory  ... 
doi:10.1145/2803171 fatcat:pdakpdgfurdidp3m5debjhhuzq

A Survey on Mining Software Repositories

Woosung JUNG, Eunjoo LEE, Chisu WU
2012 IEICE transactions on information and systems  
This paper presents fundamental concepts, overall process and recent research issues of Mining Software Repositories.  ...  The data sources such as source control systems, bug tracking systems or archived communications, data types and techniques used for general MSR problems are also presented.  ...  Shang et al. showed a framework to support MSR research using MapReduce [170] which is a framework to handle large volume of data [169].  ... 
doi:10.1587/transinf.e95.d.1384 fatcat:kfje3mzcufchzdj7qyt5smaaum

Built to Last or Built Too Fast? Evaluating Prediction Models for Build Times

Ekaba Bisong, Eric Tran, Olga Baysal
2017 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR)  
Automated builds are integral to the Continuous Integration (CI) software development practice. In CI, developers are encouraged to integrate early and often.  ...  This research focuses on finding a balance between integrating often and keeping developers productive. We propose and analyze models that can predict the build time of a job.  ...  R in our experience for this research project does not have a very mature parallelization framework.  ... 
doi:10.1109/msr.2017.36 dblp:conf/msr/BisongTB17 fatcat:hwore6w3k5co7dsb23fjywabx4

Conducting quantitative software engineering studies with Alitheia Core

Georgios Gousios, Diomidis Spinellis
2013 Empirical Software Engineering  
Keywords: quantitative software engineering · software repository mining. 1 Introduction. During the last decade, the availability of open source software (OSS) has changed not only the software development  ...  Quantitative empirical software engineering research benefits mightily from processing large open source software repository data sets.  ...  Reference Framework (NSRF) - Research Funding Program: Thalis - Athens University of Economics and Business - Software Engineering Research Platform.  ... 
doi:10.1007/s10664-013-9242-3 fatcat:o2awul5kpvaejbcekk5ogq4nd4

Rapid Multi-Purpose, Multi-Commit Code Analysis

Carol V. Alexandru, Harald C. Gall
2015 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering  
We present a novel approach, which can be used to analyze an arbitrary number of revisions of a software project simultaneously and which can be adapted for the analysis of mixed-language projects.  ...  Existing code- and software evolution studies typically operate on the scale of a few revisions of a small number of projects, mostly because existing tools are unsuited for performing large-scale studies  ...  This contrasts other research approaches in software engineering, such as Mining Software Repositories (MSR), where it is now common to analyze entire commit logs, mailing list communication and issue  ... 
doi:10.1109/icse.2015.211 dblp:conf/icse/AlexandruG15 fatcat:xk3rx5256zdnlh5z7euf56hfb4

World of Code: Enabling a Research Workflow for Mining and Analyzing the Universe of Open Source VCS data [article]

Yuxing Ma, Tapajit Dey, Chris Bogart, Sadika Amreen, Marat Valiev, Adam Tutko, David Kennard, Russell Zaretzki, Audris Mockus
2020 arXiv   pre-print
To evaluate its research potential and to create vignettes for its usage, we employ WoC in conducting several research tasks.  ...  Open source software (OSS) is essential for modern society and, while substantial research has been done on individual (typically central) projects, only a limited understanding of the periphery of the  ...  Acknowledgment This work was supported by the National Science Foundation NSF Awards 1633437, 1901102, and 1925615.  ... 
arXiv:2010.16196v1 fatcat:wonvbkuqtncttja2us5uiidvdy

The SEOSS 33 Dataset — Requirements, Bug Reports, Code History, and Trace Links for Entire Projects

Michael Rath, Patrick Mäder
2019 Data in Brief  
Enriched with additional metadata, such as time stamps, release versions, component information, and developer comments, the dataset is highly suitable for empirical research, e.g., in requirements and  ...  This paper provides a systematically retrieved dataset consisting of 33 open-source software projects containing a large number of typed artifacts and trace links between them.  ...  Transparency document Transparency document associated with this article can be found in the online version at https:// doi.org/10.1016/j.dib.2019.104005.  ... 
doi:10.1016/j.dib.2019.104005 pmid:31198827 pmcid:PMC6557728 fatcat:bg6xvcyiynbijcwbvm7pdh6nni

An exploratory study of the evolution of communicated information about the execution of large software systems

Weiyi Shang, Zhen Ming Jiang, Bram Adams, Ahmed E. Hassan, Michael W. Godfrey, Mohamed Nasser, Parminder Flora
2013 Journal of Software: Evolution and Process  
Substantial research in software engineering focuses on understanding the dynamic nature of software systems in order to improve software maintenance and program comprehension.  ...  In a case study on two large open source and one industrial software systems, we explore the evolution of CI by mining the execution logs of these systems and the logging statements in the source code.  ...  ACKNOWLEDGMENTS We would like to thank the WCRE 2011 reviewers for their valuable feedback. We are also grateful to Research In Motion (RIM) for providing access to the EA used in our case study.  ... 
doi:10.1002/smr.1579 fatcat:klrlvo7oh5gmjbcgdw7yiu5crq

An Exploratory Study of the Evolution of Communicated Information about the Execution of Large Software Systems

Weiyi Shang, Zhen Ming Jiang, Bram Adams, Ahmed E. Hassan, Michael W. Godfrey, Mohamed Nasser, Parminder Flora
2011 2011 18th Working Conference on Reverse Engineering  
Substantial research in software engineering focuses on understanding the dynamic nature of software systems in order to improve software maintenance and program comprehension.  ...  In a case study on two large open source and one industrial software systems, we explore the evolution of CI by mining the execution logs of these systems and the logging statements in the source code.  ...  ACKNOWLEDGMENTS We would like to thank the WCRE 2011 reviewers for their valuable feedback. We are also grateful to Research In Motion (RIM) for providing access to the EA used in our case study.  ... 
doi:10.1109/wcre.2011.48 dblp:conf/wcre/ShangJAHGNF11 fatcat:jqpzpdilbza4tiieqek5rhhsku

How different are different diff algorithms in Git?

Yusuf Sulistyo Nugroho, Hideaki Hata, Kenichi Matsumoto
2019 Empirical Software Engineering  
Thus, we strongly recommend using the Histogram algorithm when mining Git repositories to consider differences in source code. Empirical Software Engineering (2020) 25:790-823 791  ...  Automatic identification of the differences between two versions of a file is a common and basic task in several applications of mining code repositories.  ...  (ICSME) Rank = A International Conference on Mining Software Repositories (MSR) Rank = A International Symposium on Software Testing and Analysis (ISSTA) Rank = A Number of collected papers  ... 
doi:10.1007/s10664-019-09772-z fatcat:26anmxng2rejleexdzc35vi6em