Using Machine Learning and Information Retrieval Techniques to Improve Software Maintainability [chapter]

Anna Corazza, Sergio Di Martino, Valerio Maggio, Alessandro Moschitti, Andrea Passerini, Giuseppe Scanniello, Fabrizio Silvestri
2013 Communications in Computer and Information Science  
Software architecture plays a fundamental role in the comprehension and maintenance of large and complex systems. However, unlike classes or packages, this information is not explicitly represented in the code, which has given rise to different approaches for automatically recovering the original architecture of a system. Software architecture recovery (SAR) techniques aim at extracting architectural information from the source code, often by clustering program artifacts analyzed at different levels of abstraction (e.g., classes or methods). In this paper, we capitalize on our expertise in Machine Learning, Natural Language Processing and Information Retrieval to outline promising research lines in the field of automatic SAR. In particular, after presenting an extensive review of related work, we illustrate a concrete proposal for solving two main subtasks of SAR, i.e., (I) software clone detection and (II) clustering of functional modules according to their lexical semantics. One interesting aspect of our proposed research is the use of advanced approaches, such as kernel methods, for exploiting structural representations of source code.
doi:10.1007/978-3-642-45260-4_9 fatcat:ony773z7tzaw3obzch5ulcgxlm
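
The second subtask named in the abstract, clustering modules according to their lexical semantics, can be illustrated with a minimal sketch. This is not the authors' method: it assumes a plain TF-IDF weighting over identifier terms, cosine similarity, and single-link grouping above a fixed threshold. The class names, source snippets, and the 0.2 threshold are hypothetical values chosen only for illustration.

```python
import math
import re
from collections import Counter


def split_identifiers(source: str) -> list[str]:
    """Extract lowercase lexical terms, splitting camelCase and snake_case."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", source):
        # "parseHttpRequest" -> parse, Http, Request; "HTTPServer" -> HTTP, Server
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        terms.extend(part.lower() for part in parts)
    return terms


def tfidf_vectors(docs: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Build sparse TF-IDF vectors using a smoothed inverse document frequency."""
    n = len(docs)
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        vectors[name] = {
            t: count * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, count in tf.items()
        }
    return vectors


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors: dict[str, dict[str, float]], threshold: float) -> list[set[str]]:
    """Single-link grouping: artifacts joined by any similarity >= threshold."""
    names = list(vectors)
    parent = {name: name for name in names}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical class sources, used only to exercise the sketch.
classes = {
    "HttpRequestParser": "class HttpRequestParser { parseHeader(); parseBody(); httpRequest }",
    "HttpResponseWriter": "class HttpResponseWriter { writeHeader(); writeBody(); httpResponse }",
    "InvoiceCalculator": "class InvoiceCalculator { computeTax(); invoiceTotal; taxRate }",
    "TaxReport": "class TaxReport { invoiceList; taxRate; printReport() }",
}

docs = {name: split_identifiers(src) for name, src in classes.items()}
clusters = cluster(tfidf_vectors(docs), threshold=0.2)
for group in clusters:
    print(sorted(group))
```

In this toy input the grouping is driven entirely by shared vocabulary (for example `http`, `header`, `body` versus `invoice`, `tax`, `rate`). A realistic setting would need stop-word handling, stemming, and a principled choice of clustering algorithm and threshold.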