Causality in Requirements Artifacts: Prevalence, Detection, and Impact
Causal relations in natural language (NL) requirements convey strong, semantic information. Automatically extracting such causal information enables multiple use cases, such as test case generation, but it also requires to reliably detect causal relations in the first place. Currently, this is still a cumbersome task as causality in NL requirements is still barely understood and, thus, barely detectable. Objective: In our empirically informed research, we aim at better understanding the notion
... f causality and supporting the automatic extraction of causal relations in NL requirements. Method: In a first case study, we investigate 14.983 sentences from 53 requirements documents to understand the extent and form in which causality occurs. Second, we present and evaluate a tool-supported approach, called CiRA, for causality detection. We conclude with a second case study where we demonstrate the applicability of our tool and investigate the impact of causality on NL requirements. Results: The first case study shows that causality constitutes around 28% of all NL requirements sentences. We then demonstrate that our detection tool achieves a macro-F1 score of 82% on real-world data and that it outperforms related approaches with an average gain of 11.06% in macro-Recall and 11.43% in macro-Precision. Finally, our second case study corroborates the positive correlations of causality with features of NL requirements. Conclusion: The results strengthen our confidence in the eligibility of causal relations for downstream reuse, while our tool and publicly available data constitute a first step in the ongoing endeavors of utilizing causality in RE and beyond.