
With Registered Reports Towards Large Scale Data Curation [article]

Steffen Herbold
2020 arXiv   pre-print
., as by Herbold et al. [5] for defect prediction. Currently, such replication studies require a large amount of effort by a small group of researchers.  ... 
arXiv:2001.01972v1 fatcat:ppjm266rx5gftptcpitrnnxnja

Benchmarking cross-project defect prediction approaches with costs metrics [article]

Steffen Herbold
2018 arXiv   pre-print
We adopt three performance baselines from Herbold et al.  ...  Both the machine learning metric benchmark by Herbold et al.  ... 
arXiv:1801.04107v1 fatcat:7e74bef3wjditgc6zyoddxvheu

A systematic mapping study on cross-project defect prediction [article]

Steffen Herbold
2017 arXiv   pre-print
Herbold, 2015 Herbold (2015) proposed a tool for the benchmarking of CPDP techniques.  ...  Herbold, 2013 Approach. Herbold (2013) propose a relevancy filter based on the clustering of products using distributional characteristics.  ... 
arXiv:1705.06429v1 fatcat:tmw76rbj7bdgtlze2c45kfozre

The SmartSHARK Ecosystem for Software Repository Mining [article]

Alexander Trautsch, Fabian Trautsch, Steffen Herbold, Benjamin Ledel, Jens Grabowski
2020 arXiv   pre-print
Software repository mining is the foundation for many empirical software engineering studies. The collection and analysis of detailed data can be challenging, especially if data shall be shared to enable replicable research and open science practices. SmartSHARK is an ecosystem that supports replicable and reproducible research based on software repository mining.
arXiv:2001.01606v1 fatcat:r2go6pkzzvfxpbdqutd7k6fffu

MSR Mining Challenge: The SmartSHARK Repository Mining Data [article]

Alexander Trautsch, Fabian Trautsch, Steffen Herbold
2021 arXiv   pre-print
The SmartSHARK repository mining data is a collection of rich and detailed information about the evolution of software projects. The data is unique in its diversity and contains detailed information about each change, issue tracking data, continuous integration data, as well as pull request and code review data. Moreover, the data does not contain only raw data scraped from repositories, but also annotations in form of labels determined through a combination of manual analysis and heuristics, as well as links between the different parts of the data set. The SmartSHARK data set provides a rich source of data that enables us to explore research questions that require data from different sources and/or longitudinal data over time.
arXiv:2102.11540v2 fatcat:smznxa5bkbcmzf2d5gr5pfnspe

Correction of "A Comparative Study to Benchmark Cross-project Defect Prediction Approaches" [article]

Steffen Herbold, Alexander Trautsch, Jens Grabowski
2017 arXiv   pre-print
Unfortunately, the article "A Comparative Study to Benchmark Cross-project Defect Prediction Approaches" has a problem in the statistical analysis which was pointed out almost immediately after the pre-print of the article appeared online. While the problem does not negate the contribution of the article and all key findings remain the same, it does alter some rankings of approaches used in the study. Within this correction, we will explain the problem, how we resolved it, and present the updated results.
arXiv:1707.09281v1 fatcat:qu7ynbc45fgf3mp2asdc57pr7a

An Industrial Case Study on Shrinking Code Review Changesets through Remark Prediction [article]

Tobias Baum, Steffen Herbold, Kurt Schneider
2018 arXiv   pre-print
Change-based code review is used widely in industrial software development. Thus, research on tools that help the reviewer to achieve better review performance can have a high impact. We analyze one possibility to provide cognitive support for the reviewer: Determining the importance of change parts for review, specifically determining which parts of the code change can be left out from the review without harm. To determine the importance of change parts, we extract data from software repositories and build prediction models for review remarks based on this data. The approach is discussed in detail. To gather the input data, we propose a novel algorithm to trace review remarks to their triggers. We apply our approach in a medium-sized software company. In this company, we can avoid the review of 25% of the change parts and of 23% of the changed Java source code lines, while missing only about 1% of the review remarks. Still, we also observe severe limitations of the tried approach: Much of the savings are due to simple syntactic rules, noise in the data hampers the search for better prediction models, and some developers in the case company oppose the taken approach. Besides the main results on the mining and prediction of triggers for review remarks, we contribute experiences with a novel, multi-objective and interactive rule mining approach. The anonymized dataset from the company is made available, as are the implementations for the devised algorithms.
arXiv:1812.09510v1 fatcat:biwaz4fokzhdfdaj4aoukofxli

Broccoli: Bug localization with the help of text search engines [article]

Benjamin Ledel, Steffen Herbold
2021 arXiv   pre-print
However, the data for these projects are not identical, due to the different times of data collection and the manual validation by Herbold et al. [30] .  ... 
arXiv:2109.11902v2 fatcat:emr7qsiarbbhhatotyb6wjm2xi

A Multi-Objective Anytime Rule Mining System to Ease Iterative Feedback from Domain Experts [article]

Tobias Baum, Steffen Herbold, Kurt Schneider
2018 arXiv   pre-print
Data extracted from software repositories is used intensively in Software Engineering research, for example, to predict defects in source code. In our research in this area, with data from open source projects as well as an industrial partner, we noticed several shortcomings of conventional data mining approaches for classification problems: (1) Domain experts' acceptance is of critical importance, and domain experts can provide valuable input, but it is hard to use this feedback. (2) The evaluation of the model is not a simple matter of calculating AUC or accuracy. Instead, there are multiple objectives of varying importance, but their importance cannot be easily quantified. Furthermore, the performance of the model cannot be evaluated on a per-instance level in our case, because it shares aspects with the set cover problem. To overcome these problems, we take a holistic approach and develop a rule mining system that simplifies iterative feedback from domain experts and can easily incorporate the domain-specific evaluation needs. A central part of the system is a novel multi-objective anytime rule mining algorithm. The algorithm is based on the GRASP-PR meta-heuristic but extends it with ideas from several other approaches. We successfully applied the system in the industrial context. In the current article, we focus on the description of the algorithm and the concepts of the system. We provide an implementation of the system for reuse.
arXiv:1812.09746v1 fatcat:e3ugvdiuv5b3tn7xanfaohx2om

Autorank: A Python package for automated ranking of classifiers

Steffen Herbold
2020 Journal of Open Source Software  
Herbold, S., Trautsch, A., & Trautsch, F. (2020). Issues with szz: An empirical assessment of the state of practice of defect prediction data collection.  ...  Journal of Open Source Software, 5(48), 2173. Herbold, S., (2020). Autorank: A Python package for automated ranking of classifiers.  ...  Using Autorank In our research, we recently used autorank to compare differences between data generation methods for defect prediction research (Herbold, Trautsch, & Trautsch, 2020) .  ... 
doi:10.21105/joss.02173 fatcat:42kr3xizfjfgbmo7f2thqwdooy

Calculation and optimization of thresholds for sets of software metrics

Steffen Herbold, Jens Grabowski, Stephan Waack
2011 Empirical Software Engineering  
Steffen Herbold is a doctoral student at the Institute for Computer Science at the Georg-August University of Göttingen.  ... 
doi:10.1007/s10664-011-9162-z fatcat:pcpaboosvbfcpnqx5swbfy2iae

A systematic mapping study of developer social network research [article]

Steffen Herbold, Aynur Amirfallah, Fabian Trautsch, Jens Grabowski
2020 arXiv   pre-print
Developer social networks (DSNs) are a tool for the analysis of community structures and collaborations between developers in software projects and software ecosystems. Within this paper, we present the results of a systematic mapping study on the use of DSNs in software engineering research. We identified 255 primary studies on DSNs. We mapped the primary studies to research directions, collected information about the data sources and the size of the studies, and conducted a bibliometric assessment. We found that nearly half of the research investigates the structure of developer communities. Other frequent topics are prediction systems built using DSNs, collaboration behavior between developers, and the roles of developers. Moreover, we determined that many publications use a small sample size regarding the number of projects, which could be problematic for the external validity of the research. Our study uncovered several open issues in the state of the art, e.g., studying inter-company collaborations, using multiple information sources for DSN research, as well as a general lack of reporting guidelines or replication studies.
arXiv:1902.07499v2 fatcat:2yvfscpt65dm5i7qln7bcb5ggq

Model-based testing as a service

Steffen Herbold, Andreas Hoffmann
2017 International Journal on Software Tools for Technology Transfer (STTT)  
-The article "Combining usage-based and model-based testing for service-oriented architectures in the industrial practice" by Herbold et al.  ...  Details on the complete example are discussed within this special section of the articles by Herbold et al. [13] and Barcelona et al. [3] .  ... 
doi:10.1007/s10009-017-0449-2 fatcat:al6lrkeljbhzlkccu6dfzs25oi

A new perspective on the competent programmer hypothesis through the reproduction of bugs with repeated mutations [article]

Eike Stein, Steffen Herbold, Fabian Trautsch, Jens Grabowski
2021 arXiv   pre-print
The competent programmer hypothesis states that most programmers are competent enough to create correct or almost correct source code. Because this implies that bugs should usually manifest through small variations of the correct code, the competent programmer hypothesis is one of the fundamental assumptions of mutation testing. Unfortunately, it is still unclear if the competent programmer hypothesis holds and past research presents contradictory claims. Within this article, we provide a new perspective on the competent programmer hypothesis and its relation to mutation testing. We try to re-create real-world bugs through chains of mutations to understand if there is a direct link between mutation testing and bugs. The lengths of these paths help us to understand if the source code is really almost correct, or if large variations are required. Our results indicate that while the competent programmer hypothesis seems to be true, mutation testing is missing important operators to generate representative real-world bugs.
arXiv:2104.02517v1 fatcat:j3vsquj3urd2bgqamdrittso7u

On the feasibility of automated prediction of bug and non-bug issues

Steffen Herbold, Alexander Trautsch, Fabian Trautsch
2020 Empirical Software Engineering  
Steffen Herbold is interim professor at the Karlsruhe Institute of Technology, Germany.  ...  Consequently, Herbold et al. (2020) decided to ignore this small amount of noise, which we also do in this article, i.e., we assume that everything that is not labeled as bug in the data by Herbold  ... 
doi:10.1007/s10664-020-09885-w fatcat:34fb4yabqrgqdb33js3yu6uzji