Semi-Supervised Approach for Recovering Traceability Links in Complex Systems
2018 23rd International Conference on Engineering of Complex Computer Systems (ICECCS)
Building a complex system requires the collaboration of different stakeholders, who work together to model the system while keeping in mind the requirements described in specification documents. This complexity produces a large volume of requirements and models, i.e., artefacts that are subject to frequent changes during the project lifetime. Since the artefacts are correlated with each other, each change has to be rigorously propagated. Identifying traceability links between the system's artefacts is therefore a critical step towards this goal. In the Information Retrieval (IR) domain, many approaches have already been proposed to cope with traceability issues. Their main drawback is that they introduce a large number of false positive links, which makes the validation of traceability links time-consuming and error-prone. In this paper, we propose an approach that identifies traceability links with a reduced proportion of false positive links, ranging from 20% to 30%, while raising the proportion of true links identified to as much as 70%. The approach consists of three main steps: 1) we measure syntactic and semantic similarities between pairs of artefacts by combining four major IR techniques; 2) using these similarity measures, we identify the most likely true and false links and build the so-called training data set; 3) this training data set and the four IR techniques are used as input to a predictive model that classifies links as true or false, ultimately leading to a reduced number of false positives. The output is given in the form of a confidence measure that helps the modeller validate the traceability links. We evaluated our approach on four well-known public case studies. Each one comes with a clear identification of the true traceability links, which allowed us to compare them with the outcome of our approach and validate its effectiveness.
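The three steps can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not name the four IR techniques or the predictive model, so the four token-based measures (`jaccard`, `dice`, `overlap`, `cosine_tf`), the summed-score pseudo-labelling rule, the parameter `k`, and the hand-written logistic regression are hypothetical stand-ins, and the artefact texts are invented for illustration.

```python
import math
from collections import Counter

def tokens(text):
    return text.lower().split()

# Step 1: four stand-in similarity measures (NOT the paper's IR techniques).
def jaccard(a, b):
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def dice(a, b):
    sa, sb = set(tokens(a)), set(tokens(b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0

def overlap(a, b):
    sa, sb = set(tokens(a)), set(tokens(b))
    smallest = min(len(sa), len(sb))
    return len(sa & sb) / smallest if smallest else 0.0

def cosine_tf(a, b):
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

MEASURES = (jaccard, dice, overlap, cosine_tf)

def similarity_features(pairs):
    return [[m(a, b) for m in MEASURES] for a, b in pairs]

# Step 2: treat the k lowest-scoring pairs as likely false links (label 0)
# and the k highest-scoring pairs as likely true links (label 1).
def pseudo_label(features, k):
    order = sorted(range(len(features)), key=lambda i: sum(features[i]))
    return [(i, 0) for i in order[:k]] + [(i, 1) for i in order[-k:]]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Step 3: fit a logistic regression (assumed model) by stochastic gradient ascent.
def train_logistic(X, y, epochs=500, lr=0.5):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = target - sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def link_confidences(pairs, k=2):
    feats = similarity_features(pairs)
    labelled = pseudo_label(feats, k)
    w, b = train_logistic([feats[i] for i, _ in labelled],
                          [label for _, label in labelled])
    # One confidence value in (0, 1) per candidate link.
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in feats]

# Invented (requirement, model element) pairs, for illustration only.
PAIRS = [
    ("user login shall be logged", "login logger records user login"),
    ("passwords shall be encrypted", "encrypted passwords store"),
    ("reports shall be exported as pdf", "network packet router"),
    ("user login shall be logged", "pdf export of reports"),
    ("reports shall be exported as pdf", "pdf reports exporter"),
    ("passwords shall be encrypted", "network packet router"),
]
CONFIDENCES = link_confidences(PAIRS)
```

Calling `link_confidences(PAIRS)` returns one confidence per candidate link, which a modeller could sort to review the most doubtful links first. In this toy setting, pairs that share vocabulary receive a higher confidence than pairs that share none; a realistic setup would need proper text preprocessing and a validated choice of `k`.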