Search-based duplicate defect detection: An industrial experience

Mehdi Amoui, Nilam Kaushik, Abraham Al-Dabbagh, Ladan Tahvildari, Shimin Li, Weining Liu
2013 2013 10th Working Conference on Mining Software Repositories (MSR)  
Duplicate defects put extra overheads on software organizations, as the cost and effort of managing duplicate defects are mainly redundant. Due to the use of natural language and various ways to describe a defect, it is usually hard to investigate duplicate defects automatically. This problem is more severe in large software organizations with huge defect repositories and massive number of defect reporters. Ideally, an efficient tool should prevent duplicate reports from reaching developers by
more » ... utomatically detecting and/or filtering duplicates. It also should be able to offer defect triagers a list of top-N similar bug reports and allow them to compare the similarity of incoming bug reports with the suggested duplicates. This demand has motivated us to design and develop a search-based duplicate bug detection framework at BlackBerry. The approach follows a generalized process model to evaluate and tune the performance of the system in a systematic way. We have applied the framework on software projects at BlackBerry, in addition to the Mozilla defect repository. The experimental results exhibit the performance of the developed framework and highlight the high impact of parameter tuning on its performance.
doi:10.1109/msr.2013.6624025 dblp:conf/msr/AmouiKATLL13 fatcat:mjfhj43qpngsnmqpbwlltsqgqm