NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization

C.K. Roy, J.R. Cordy
2008 16th IEEE International Conference on Program Comprehension
This paper examines the effectiveness of a new language-specific, parser-based but lightweight clone detection approach. Exploiting a novel application of a source transformation system, the method accurately finds near-miss clones using an efficient text line comparison technique. The transformation system assists the method in three ways. First, using agile parsing, it provides user-specified flexible pretty-printing to remove noise, standardize formatting and break program statements into parts such that potential changes can be detected as simple linewise text differences. Second, it provides efficient flexible extraction of potential clones to be compared, using island grammars and agile parsing to select granularities and enumerate potential clones. Third, using transformation rules, it provides flexible code normalization to allow for local editing differences between similar code segments and to filter out uninteresting parts of potential clones. In this paper we introduce the theory and practice of the framework and demonstrate its use in finding function clones in C code. Early experiments indicate that the method is capable of finding near-miss clones with high precision and recall, and with reasonable performance.
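The normalize-then-compare-lines idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation, which relies on a source transformation system and real parsing. Here a regular expression stands in for code normalization, the input is assumed to be already pretty-printed one statement part per line, and lines are compared with a longest common subsequence. The function names, the keyword list, and the 0.3 dissimilarity threshold are all hypothetical choices made for illustration.

```python
import re

# Assumption: a tiny keyword list; a real tool would use the language grammar.
C_KEYWORDS = {"int", "char", "void", "float", "double",
              "if", "else", "for", "while", "return"}

def normalize_line(line: str) -> str:
    """Replace every non-keyword identifier with the placeholder 'id'."""
    def repl(match):
        word = match.group(0)
        return word if word in C_KEYWORDS else "id"
    return re.sub(r"\b[A-Za-z_]\w*\b", repl, line.strip())

def pretty_lines(fragment: str) -> list:
    """Stand-in for pretty-printing: normalized, non-empty lines."""
    return [normalize_line(l) for l in fragment.splitlines() if l.strip()]

def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence of two line lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def dissimilarity(a: list, b: list) -> float:
    """Fraction of lines in the larger fragment with no match in the other."""
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))

def is_near_miss_clone(f1: str, f2: str, threshold: float = 0.3) -> bool:
    """Hypothetical decision rule: clone if line dissimilarity <= threshold."""
    a, b = pretty_lines(f1), pretty_lines(f2)
    if not a or not b:
        return False
    return dissimilarity(a, b) <= threshold

SUM = """int sum(int a, int b) {
    int total = a + b;
    return total;
}"""

ADD_LOGGED = """int add(int x, int y) {
    int result = x + y;
    log(result);
    return result;
}"""

CLEAR = """void clear(char *buf) {
    while (*buf) {
        *buf++ = 0;
    }
}"""

print(is_near_miss_clone(SUM, ADD_LOGGED))  # renamed identifiers, one added line
print(is_near_miss_clone(SUM, CLEAR))       # structurally different function
```

Because the fragments are split into one statement part per line before comparison, a renamed variable disappears under normalization and an inserted statement shows up as a single unmatched line, so the comparison stays a plain linewise text difference.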
doi:10.1109/icpc.2008.41 dblp:conf/iwpc/RoyC08a fatcat:eqqzrazscbg3dngfspbjuorpsy