Overfitting in semantics-based automated program repair

Xuan Bach D. Le, Ferdian Thung, David Lo, Claire Le Goues
2018 Empirical Software Engineering  
The primary goal of Automated Program Repair (APR) is to automatically fix buggy software, to reduce the manual bug-fix burden that presently rests on human developers. Existing APR techniques can be generally divided into two families: semantics-vs. heuristics-based. Semantics-based APR uses symbolic execution and test suites to extract semantic constraints, and uses program synthesis to synthesize repairs that satisfy the extracted constraints. Heuristic-based APR generates large populations
more » ... f repair candidates via source manipulation, and searches for the best among them. Both families largely rely on a primary assumption that a program is correctly patched if the generated patch leads the program to pass all provided test cases. Patch correctness is thus an especially pressing concern. A repair technique may generate overfitting patches, which lead a program to pass all existing test cases, but fails to generalize beyond them. In this work, we revisit the overfitting problem with a focus on semantics-based APR techniques, complementing previous studies of the overfitting problem in heuristics-based APR. We perform our study using IntroClass and Codeflaws benchmarks, two datasets well-suited for assessing repair quality, to Communicated by: Empir Software Eng systematically characterize and understand the nature of overfitting in semantics-based APR. We find that similar to heuristics-based APR, overfitting also occurs in semanticsbased APR in various different ways.
doi:10.1007/s10664-017-9577-2 fatcat:t52pday2kjc3vdgktv42c2ozb4