Analyzing and Forecasting Near-Miss Clones in Evolving Software: An Empirical Study

Minhaz F. Zibran, Ripon K. Saha, Muhammad Asaduzzaman, Chanchal K. Roy
2011 16th IEEE International Conference on Engineering of Complex Computer Systems
The effort required to develop and maintain large, complex software is believed to depend on the amount of duplicated code fragments (code clones) in the codebase. For example, clones must be maintained consistently, or refactored, to prevent accidental error propagation. It is therefore important to understand the proportion and evolution of clones in evolving software systems, for instance for cost estimation. This paper presents a study of the evolution of clones at the release level in medium to large open source software systems of different types (operating systems, database systems, editors, etc.) written in three programming languages: C, C#, and Java. Using the hybrid clone detector NiCad, we detected both exact and near-miss clones at different levels of similarity. Applying statistical methods, we investigated the evolution of both exact and near-miss clones along several dimensions and forecasted the amount of clones in future releases of the software systems. Our study offers significant insights into the existence and evolution of code clones and their relationships with programming language or paradigm and program size.
doi:10.1109/iceccs.2011.36 dblp:conf/iceccs/ZibranSAR11 fatcat:2i7sseiz3rh3tbewkuabh4js64