Regression testing minimization, selection and prioritization: a survey

S. Yoo, M. Harman
2012 Software testing, verification & reliability  
Regression testing is a testing activity that is performed to provide confidence that changes do not harm the existing behaviour of the software. Test suites tend to grow in size as software evolves, often making it too costly to execute entire test suites. A number of different approaches have been studied to maximise the value of the accrued test suite: minimisation, selection and prioritisation. Test suite minimisation seeks to eliminate redundant test cases in order to reduce the number of tests to run. Test case selection seeks to identify the test cases that are relevant to some set of recent changes. Test case prioritisation seeks to order test cases in such a way that early fault detection is maximised. This paper surveys each area of minimisation, selection and prioritisation techniques and discusses open problems and potential directions for future research.
Definition 1 Test Suite Minimisation Problem
Given: A test suite, T, a set of test requirements {r_1, ..., r_n} that must be satisfied to provide the desired 'adequate' testing of the program, and subsets of T, T_1, ..., T_n, one associated with each r_i, such that any one of the test cases t_j belonging to T_i can be used to achieve requirement r_i.
Problem: Find a representative set, T', of test cases from T that satisfies every r_i.
The testing criterion is satisfied when every test requirement in {r_1, ..., r_n} is satisfied. A test requirement, r_i, is satisfied by any test case, t_j, that belongs to T_i, a subset of T. Therefore, the representative set of test cases is a hitting set of the T_i. Furthermore, in order to maximise the effect of minimisation, T' should be the minimal hitting set of the T_i. The minimal hitting set problem is NP-complete, as is its dual, the minimal set cover problem [57].
While test case selection techniques also seek to reduce the size of a test suite, the majority of selection techniques are modification-aware. That is, the selection is not only temporary (i.e. specific to the current version of the program), but also focused on the identification of the modified parts of the program. Test cases are selected because they are relevant to the changed parts of the system under test (SUT), which typically involves a white-box static analysis of the program code. Throughout this survey, the meaning of 'test case selection problem' is restricted to this modification-aware problem.
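The hitting-set view of Definition 1 can be illustrated with a short sketch. Because the minimal hitting set problem is NP-complete, the code below uses the classical greedy heuristic instead of an exact search, so the returned set satisfies every requirement but is not guaranteed to be minimal. The function name `greedy_minimise`, the mapping `REQUIREMENT_TO_TESTS` and the test and requirement labels are hypothetical and do not come from the survey.

```python
from typing import Dict, List, Set


def greedy_minimise(requirement_to_tests: Dict[str, Set[str]]) -> List[str]:
    """Greedy hitting-set heuristic (an approximation, not an exact solver).

    requirement_to_tests maps each requirement r_i to its subset T_i, i.e.
    the test cases that can each satisfy r_i on their own.
    """
    for requirement, tests in requirement_to_tests.items():
        if not tests:
            raise ValueError(f"no test case can satisfy {requirement!r}")

    unsatisfied = set(requirement_to_tests)
    representative: List[str] = []
    while unsatisfied:
        # Count how many still-unsatisfied requirements each test would hit.
        gain: Dict[str, int] = {}
        for requirement in unsatisfied:
            for test in requirement_to_tests[requirement]:
                gain[test] = gain.get(test, 0) + 1
        # Pick the test with the largest gain; ties go to the smallest name.
        best = max(sorted(gain), key=lambda test: gain[test])
        representative.append(best)
        unsatisfied = {
            r for r in unsatisfied if best not in requirement_to_tests[r]
        }
    return representative


# Hypothetical example: four requirements and the tests that satisfy each.
REQUIREMENT_TO_TESTS = {
    "r1": {"t1", "t2"},
    "r2": {"t2", "t3"},
    "r3": {"t3", "t4"},
    "r4": {"t4"},
}
representative = greedy_minimise(REQUIREMENT_TO_TESTS)
print(representative)  # ['t2', 't4']
```

In this example the greedy choice happens to be optimal, since no single test case hits all four subsets. In general the heuristic can return a hitting set that is larger than the minimal one.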
The modification-aware selection problem is also often referred to as the Regression Test case Selection (RTS) problem. More formally, following Rothermel and Harrold [138], the selection problem is defined as follows (refer to Section 4 for more details on how the subset T' is selected):
Definition 2 Test Case Selection Problem
Given: The program, P, the modified version of P, P', and a test suite, T.
Problem: Find a subset of T, T', with which to test P'.
Finally, test case prioritisation concerns ordering test cases for early maximisation of some desirable properties, such as the rate of fault detection. It seeks to find the optimal permutation of the sequence of test cases. It does not involve selection of test cases, and assumes that all the test cases may be executed in the order of the permutation it produces, but that testing may be terminated at some arbitrary point during the testing process. More formally, the prioritisation problem is defined as follows:
Definition 3 Test Case Prioritisation Problem
Given: a test suite, T, the set of permutations of T, PT, and a function from PT to the real numbers, f : PT → ℝ.
Problem: to find T' ∈ PT such that (∀T'')(T'' ∈ PT)(T'' ≠ T')[f(T') ≥ f(T'')].
This survey focuses on papers that consider one of these three problems. Throughout the paper, these three techniques will be collectively referred to as 'regression testing techniques'.
Classification of Test Cases
Leung and White present the first systematic approach to regression testing by classifying types of regression testing and test cases [101]. Regression testing can be categorised into progressive regression testing and corrective regression testing. Progressive regression testing involves changes of specifications in P', meaning that P' should be tested against the changed specification, S'. On the other hand, corrective regression testing does not involve changes in specifications, but only in design decisions and actual instructions. It means that the existing test cases can be reused without changing their input/output relation.
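Definition 3 can be made concrete for a very small suite by enumerating every permutation and keeping one that maximises f. The sketch below is illustrative only: exhaustive search visits n! orderings and is impractical beyond a handful of test cases. The objective `early_detection_score` and the fault data in `DETECTS` are hypothetical assumptions, and the sketch assumes that the faults detected by each test are known in advance, which is not the case when a suite is actually being prioritised.

```python
from itertools import permutations
from typing import Callable, Dict, Sequence, Set, Tuple

Ordering = Tuple[str, ...]


def prioritise_exhaustively(
    tests: Sequence[str], f: Callable[[Ordering], float]
) -> Ordering:
    """Return a permutation whose f value is at least that of every other one."""
    return max(permutations(tests), key=f)


# Hypothetical fault-detection data: the faults that each test case reveals.
DETECTS: Dict[str, Set[str]] = {
    "t1": {"f1"},
    "t2": {"f1", "f2", "f3"},
    "t3": {"f4"},
}


def early_detection_score(order: Ordering) -> float:
    """Hypothetical f: minus the sum of positions at which faults are first found.

    Finding faults earlier gives smaller positions and therefore a larger score.
    """
    first_position: Dict[str, int] = {}
    for position, test in enumerate(order, start=1):
        for fault in DETECTS[test]:
            first_position.setdefault(fault, position)
    return -float(sum(first_position.values()))


best_order = prioritise_exhaustively(list(DETECTS), early_detection_score)
print(best_order)  # ('t2', 't3', 't1')
```

Running t2 first exposes three of the four faults immediately, which is why it leads the chosen ordering. If testing is cut short after one or two test cases, this ordering has still revealed as many faults as any other ordering could in the same number of steps.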
Leung and White categorise test cases into five classes. The first three classes consist of test cases that already exist in T.
• Reusable: reusable test cases only execute the parts of the program that remain unchanged between two versions, i.e. the parts of the program that are common to P and P'. It is unnecessary to execute these test cases in order to test P'; however, they are called reusable because they may still be retained and reused for the regression testing of the future versions of P.
• Retestable: retestable test cases execute the parts of P that have been changed in P'. Thus retestable test cases should be re-executed in order to test P'.
• Obsolete: test cases can be rendered obsolete because 1) their input/output relation is no longer correct due to changes in specifications, 2) they no longer test what they were designed to test due to modifications to the program, or 3) they are 'structural' test cases that no longer contribute to structural coverage of the program.
The remaining two classes consist of test cases that have yet to be generated for the regression testing of P'.
• New-structural: new-structural test cases test the modified program constructs, providing structural coverage of the modified parts in P'.
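The three classes of existing test cases described above can be sketched as a simple classification over coverage data. This is a minimal sketch under stated assumptions: each test is assumed to come with the set of program entities it executes, the set of changed entities is assumed to be known, and only the first cause of obsolescence (an input/output relation invalidated by a specification change) is modelled. The function `classify_existing_tests` and all data names are hypothetical.

```python
from typing import Dict, Set


def classify_existing_tests(
    coverage: Dict[str, Set[str]],
    changed: Set[str],
    spec_invalidated: Set[str],
) -> Dict[str, str]:
    """Label each existing test as 'obsolete', 'retestable' or 'reusable'.

    coverage         : test name -> program entities the test executes
    changed          : program entities that were modified
    spec_invalidated : tests whose expected output no longer matches the
                       specification (assumed to be given)
    """
    classes: Dict[str, str] = {}
    for test, executed in coverage.items():
        if test in spec_invalidated:
            classes[test] = "obsolete"
        elif executed & changed:
            # The test exercises at least one changed part: re-execute it.
            classes[test] = "retestable"
        else:
            # The test only touches unchanged parts: keep it for later.
            classes[test] = "reusable"
    return classes


# Hypothetical example with three tests and one changed function.
coverage = {
    "t1": {"parse"},
    "t2": {"parse", "render"},
    "t3": {"render"},
}
classes = classify_existing_tests(
    coverage, changed={"render"}, spec_invalidated={"t3"}
)
print(classes)  # {'t1': 'reusable', 't2': 'retestable', 't3': 'obsolete'}
```

The tests labelled retestable are the ones that would be re-executed against the modified program. The other two causes of obsolescence need information about test intent and structural coverage that this sketch does not represent.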
doi:10.1002/stv.430 fatcat:kg5sgywm4jfqjl5eiz6ols6x4a