
Using Source Code Density to Improve the Accuracy of Automatic Commit Classification into Maintenance Activities [article]

Sebastian Hönel, Morgan Ericsson, Welf Löwe, Anna Wingkvist
2020 arXiv   pre-print
We introduce source code density, a measure of the net size of a commit, and show how it improves the accuracy of automatic commit classification compared to previous size-based classifications.  ...  We also investigate how preceding generations of commits affect the class of a commit, and whether taking the code density of previous commits into account can improve the accuracy further.  ...  Acknowledgments We would like to thank the anonymous reviewers for their invaluable comments, which helped us to further improve this work.  ... 
arXiv:2005.13904v1 fatcat:mkjqhrs5pnaezlf3qjvvy5bxfu

Comparing Commit Messages and Source Code Metrics for the Prediction Refactoring Activities

Priyadarshni Suresh Sagar, Eman Abdullah AlOmar, Mohamed Wiem Mkaouer, Ali Ouni, Christian D. Newman
2021 Algorithms  
This paper investigates to what extent code metrics are good indicators for predicting refactoring activity in the source code.  ...  Understanding how developers refactor their code is critical to support the design improvement process of software.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/a14100289 fatcat:wd55rrhzmbafhnjsuihlnrtbve

Toward the Automatic Classification of Self-Affirmed Refactoring [article]

Eman Abdullah AlOmar, Mohamed Wiem Mkaouer, Ali Ouni
2020 arXiv   pre-print
We challenge our model using a total of 2,867 commit messages extracted from well-engineered open-source Java projects.  ...  Specifically, we combine the N-Gram TF-IDF feature selection with binary and multiclass classifiers to build a new model to automate the classification of refactorings based on their quality improvement  ...  Using code-density based classification, they achieved up to 89% accuracy for cross project commit classification using LogitBoost classifier.  ... 
arXiv:2009.09279v1 fatcat:hxmfbq3jnfc2flfczfxfsg5tri

On the Documentation of Refactoring Types [article]

Eman Abdullah AlOmar and Jiaqian Liu and Kenneth Addo and Mohamed Wiem Mkaouer and Christian Newman and Ali Ouni and Zhe Yu
2021 arXiv   pre-print
Yet, there is no systematic study that analyzes the extent to which the documentation of refactoring accurately describes the refactoring operations performed at the source code level.  ...  Commit messages are the atomic level of software documentation. They provide a natural language description of the code change and its purpose.  ...  Acknowledgments This material is based on work supported by the National Science Foundation under Grant No. 1757680.  ... 
arXiv:2112.01581v1 fatcat:3g55dn43bja5dilnubsf3o3nfa

What really changes when developers intent to improve their source code: a commit-level study of static metric value and static analysis warning changes [article]

Alexander Trautsch, Johannes Erbel, Steffen Herbold, Jens Grabowski
2022 arXiv   pre-print
We use the model to increase our data set to 125,482 commits.  ...  We manually classify a randomized sample of 2,533 commits from 54 Java open source projects as quality improving depending on the intent of the developer by inspecting the commit message.  ...  Acknowledgements We want to thank the GWDG Göttingen for providing us with computing resources within their HPC-Cluster.  ... 
arXiv:2109.03544v4 fatcat:qbr244fij5axfkcz2lq6q3qmkq

Leveraging Structural Properties of Source Code Graphs for Just-In-Time Bug Prediction [article]

Md Nadim, Debajyoti Mondal, Chanchal K. Roy
2022 arXiv   pre-print
We presented a method to convert the source codes of commit patches to equivalent graph representations and named it Source Code Graph (SCG).  ...  To understand and compare multiple source code graphs, we extracted several structural properties of these graphs, such as the density, number of cycles, nodes, edges, etc.  ...  Acknowledgements This work is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and by two Canada First Research Excellence Fund (CFREF) grants coordinated by Global  ... 
arXiv:2201.10137v1 fatcat:goza4e6pprckri4255oeqx5drm

Behind the Scenes: On the Relationship Between Developer Experience and Refactoring [article]

Eman Abdullah AlOmar, Anthony Peruma, Mohamed Wiem Mkaouer, Christian D. Newman, Ali Ouni
2021 arXiv   pre-print
Previous refactoring surveys have shown that code refactoring activities are mainly executed by developers who have sufficient knowledge of the system's design and disposing of leadership roles in their  ...  However, these surveys were mainly limited to specific projects and companies. In this paper, we explore the generalizability of the previous results by analyzing 800 open-source projects.  ...  ACKNOWLEDGMENTS We would like to thank the authors of Refactoring Miner for publicly providing it.  ... 
arXiv:2109.11089v1 fatcat:76yht4oql5fcplnwjmkppgfqke

A Survey on Mining Software Repositories

Woosung JUNG, Eunjoo LEE, Chisu WU
2012 IEICE transactions on information and systems  
The data sources such as source control systems, bug tracking systems or archived communications, data types and techniques used for general MSR problems are also presented.  ...  This paper presents fundamental concepts, overall process and recent research issues of Mining Software Repositories.  ...  Enslen et al. proposed the Samurai approach to automatically split an identifier into words using a scoring technique based on the word frequencies of the source codes [105].  ... 
doi:10.1587/transinf.e95.d.1384 fatcat:kfje3mzcufchzdj7qyt5smaaum

An empirical analysis of the impact of software development problem factors on software maintainability

Jie-Cherng Chen, Sun-Jen Huang
2009 Journal of Systems and Software  
For experiment purpose, we will use a software metric tool, called CCCC, in order to explore the attributes  ...  the relationship between various factors i.e. program size, ownership and developer quality and to  ...  Another source of difficulty and debate is in determining which metrics matter, and what they mean.  ...  As per the results, network based approach can easily recognize the relations using object oriented metrics and it can optimize the cost of different activities i.e. development, maintenance and can enhance  ... 
doi:10.1016/j.jss.2008.12.036 fatcat:3rp4ndo27vg5rg4xumrqp4jil4

An empirical study of fine-grained software modifications

Daniel M. German
2006 Empirical Software Engineering  
of a project, and how developers might interact between each other and the source code of a system.  ...  We used the information in the MRs to visualize what files are changed at the same time, and who are the people who tend to modify certain files.  ...  The author would like to thank the reviewers of this paper for their thoughtful comments that greatly improved the quality of this paper, and the Apache, Evolution, GNU gcc, Mozilla and PostgreSQL development  ... 
doi:10.1007/s10664-006-9004-6 fatcat:6my2k4c7yzee5e2ytmuzlibqjy

On the Security Cost of Using a Free and Open Source Component in a Proprietary Product [chapter]

Stanislav Dashevskyi, Achim D. Brucker, Fabio Massacci
2016 Lecture Notes in Computer Science  
We investigated publicly available factors (e.g., development activity such as commits, code size, or fraction of code size in different programming languages) to identify which one has the major impact  ...  The work presented in this paper is motivated by the need to estimate the security effort of consuming Free and Open Source Software (FOSS) components within a proprietary software supply chain of a large  ...  They used static code analysis tools to compute several source code metrics and tools for extracting dependency information from the source code, adding this information to the graphs that represent an  ... 
doi:10.1007/978-3-319-30806-7_12 fatcat:lj5xcdvy55b4hbfx2wb2f4xsm4

Understanding the Impact of Development Efforts in Code Quality

Ricardo Perez-Castillo, Mario Piattini
2021 Journal of universal computer science (Online)  
After applying a clustering algorithm, it is detected an inverse correlation in some cases where specific efforts were made to improve code quality.  ...  This study analyses how the evolution of the development effort (i.e., the number of developers and their contributions) influences the code quality (i.e., the number of bugs, code smells, cloning, etc  ...  This work is also part of the projects BIZDEVOPS-Global (RTI2018-098309-B-C31) and ECLIPSE (RTI2018-094283-B-C31) funded by Ministerio de Economía, Industria y Competitividad (MINECO) & Fondo Europeo de  ... 
doi:10.3897/jucs.72475 fatcat:qsh6y3rqbfdr3lq53ctf5qnzou

Categorization and Visualization of Issue Tickets to Support Understanding of Implemented Features in Software Development Projects

Ryo Ishizuka, Hironori Washizaki, Naohiko Tsuda, Yoshiaki Fukazawa, Saori Ouji, Shinobu Saito, Yukako Iimura
2022 Applied Sciences  
Aim: The purpose of this paper is to clarify the way of helping new members understand the implemented features of a project by using tickets.  ...  Our method estimates the number of categories and categorizes issue tickets (tickets) automatically. Moreover, it has two visualizations.  ...  These tickets were assigned to the software engineers. Following the descriptions of the assigned tickets, they implemented the source codes and then commit them to the version control system (VCS).  ... 
doi:10.3390/app12073222 fatcat:7mha5esfgfcwhpfvatvq4crgke

Multi-layered approach for recovering links between bug reports and fixes

Anh Tuan Nguyen, Tung Thanh Nguyen, Hoan Anh Nguyen, Tien N. Nguyen
2012 Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering - FSE '12  
This paper introduces MLink, a multi-layered approach that takes into account not only textual features but also source code features of the changed code corresponding to the commit logs.  ...  It is also capable of learning the association relations between the terms in bug reports and the names of entities/components in the changed source code of the commits from the established bug-to-fix  ...  Aiming to improve further bug-to-fix link recovery accuracy, we introduce MLink approach that explores not only textual features and traditional heuristics, but also source code features of the changed  ... 
doi:10.1145/2393596.2393671 dblp:conf/sigsoft/NguyenNNN12 fatcat:oaezog7xrjbetmgwf5s56kaobq

ComSum: Commit Messages Summarization and Meaning Preservation [article]

Leshem Choshen, Idan Amit
2021 arXiv   pre-print
We present ComSum, a data set of 7 million commit messages for text summarization. When documenting commits, software code changes, both a message and its summary are posted.  ...  Along with its growing size, practicality and challenging language domain, the data set benefits from the living field of empirical software engineering.  ...  Conclusions We present a text summarization data set, ComSum, of significant size, and a methodology to extract larger such data sets in the future.  ... 
arXiv:2108.10763v1 fatcat:vbyda6jb7rh73p6xcylgut5qqi