Mining patterns and rules for software specification discovery

David Lo, Siau-Cheng Khoo
2008 Proceedings of the VLDB Endowment  
Software specifications are often lacking, incomplete and outdated in the industry. Lack and incomplete specifications cause various software engineering problems. Studies have shown that program comprehension takes up to 45% of software development costs. One of the root causes of the high cost is the lack-of documented specification. Also, outdated and incomplete specification might potentially cause bugs and compatibility issues. In this paper, we describe novel data mining techniques to
more » ... or reverse engineer these specifications from the pool of software engineering data. A large amount of software data is available for analysis. One form of software data is program execution traces. A program trace can be viewed as a sequence of events collected when a program is run. A set of program traces in turn can be viewed as a sequence database. In this paper, we present some novel work in mining software specifications by employing novel pattern mining and rule mining techniques. Performance studies show the scalability of our technique. Case studies on traces of a real industrial application show the utility of our technique in recovering program specifications from execution traces.
doi:10.14778/1454159.1454234 fatcat:autgiue3m5aofb6wlpgogagjiu