On the Use of Process Trails to Understand Software Development

Luigi Cerulo
13th Working Conference on Reverse Engineering (WCRE), 2006
This thesis investigates the usefulness of historical data stored in software repositories to support developers and managers in the maintenance of complex software systems. A set of data extraction and pre-analysis techniques for CVS and Bugzilla repositories is introduced and discussed. These techniques form a basis for analyzing and observing software engineering phenomena. The central part of the thesis is the introduction of three cases that take advantage of historical data, when ... able, showing the complementarity of historic analysis with respect to static analysis. The first case presents an impact analysis approach that exploits historic impact information stored within CVS and Bugzilla repositories. It uses an information retrieval model to infer the set of files impacted by a change by considering those that have been impacted by similar changes in the past. The second case presents a change request assignment approach that selects the developers best able to resolve a new change request. The approach observes what developers have done in the past and uses an information retrieval model to select the set of developers that have previously resolved similar change requests. The third case investigates the use of CVS commit transactions to infer the presence of crosscutting concerns in source code. The hypothesis is that developers usually perform logical transactions coupled in a concern. We show to what extent this information is useful to detect crosscutting concerns and how crosscutting concerns evolve in a software system. All presented approaches are validated empirically using data from several large open source systems and implemented in an Eclipse plug-in named Jimpa. This thesis highlights the benefits of historic analysis as a complement to static and dynamic analyses. Software repositories are useful to researchers, who can use them to understand software development empirically, and to practitioners, who can use them to predict and plan important aspects of their projects.
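The first case can be illustrated with a minimal sketch. The abstract does not name the information retrieval model used, so the TF-IDF weighting and cosine similarity below are assumptions, as are the function names and the toy history records. The idea shown is only the general one: score each file by how similar the new change request is to past requests that touched that file.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    # Smoothed IDF; an assumed weighting, not taken from the thesis.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: f * idf[t] for t, f in Counter(toks).items()} for toks in tokenized]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_impacted_files(history, new_request, top_k=3):
    """Rank files by summed similarity of the new request to past requests.

    history: list of (request_text, [files changed to resolve it]).
    """
    vecs, idf = tfidf_vectors([text for text, _ in history])
    counts = Counter(tokenize(new_request))
    # Terms never seen in the history carry no evidence and are dropped.
    query = {t: f * idf[t] for t, f in counts.items() if t in idf}
    scores = defaultdict(float)
    for vec, (_, files) in zip(vecs, history):
        sim = cosine(query, vec)
        for path in files:
            scores[path] += sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [path for path, score in ranked if score > 0][:top_k]


# Hypothetical history, invented for illustration only.
history = [
    ("crash when saving file with unicode name",
     ["io/FileWriter.java", "util/Encoding.java"]),
    ("toolbar icons not rendered on startup", ["ui/Toolbar.java"]),
    ("saving large file is slow", ["io/FileWriter.java", "io/Buffer.java"]),
]
ranked = rank_impacted_files(history, "error saving file with long name")
print(ranked)
```

Under the same assumptions, replacing each file list with the developers who resolved the request gives a rough analogue of the second case, the change request assignment.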
doi:10.1109/wcre.2006.40 dblp:conf/wcre/Cerulo06 fatcat:7yffx5cgajgj7cv4lvma7lecye