Instant code clone search

Mu-Woong Lee, Jong-Won Roh, Seung-won Hwang, Sunghun Kim
2010 Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering - FSE '10  
In this paper, we propose a scalable instant code clone search engine for large-scale software repositories. While there are commercial code search engines available, they treat software as text and often fail to find semantically related code. Meanwhile, existing tools for semantic code clone searches take a "post-mortem" approach involving the detection of clones "after" the code development is completed, and hence, fail to return the results instantly. In clear contrast, we combine the strength of these two lines of existing research, by supporting instant code clone detection. To achieve this goal, we propose scalable indexing structures on vector abstractions of code. Our proposed algorithms allow developers to detect clones of a given code segment among the 1.7 million code segments from 492 open source projects in subsecond response times, without compromising the accuracy obtained by a state-of-the-art tool.
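The idea of indexing vector abstractions of code can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the feature set `FEATURES`, the regex tokenizer, the `CloneIndex` class, and the bucketing scheme. Each code segment is mapped to a vector of token counts, and vectors are bucketed by their L1 norm, so a query with L1 radius r only needs to scan buckets whose norm differs from the query's by at most r.

```python
import re
from collections import Counter, defaultdict

# Hypothetical feature set: a few keywords and symbols, plus a catch-all for other words.
FEATURES = ("if", "for", "while", "return", "=", "(", "identifier")
# Toy tokenizer: words, "=" and "(" only (an assumption, not a real parser).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|[=(]")


def vectorize(code):
    """Map a code segment to a tuple of token counts, one entry per feature."""
    tokens = TOKEN_RE.findall(code)
    counts = Counter(t if t in FEATURES else "identifier" for t in tokens)
    return tuple(counts[f] for f in FEATURES)


class CloneIndex:
    """Buckets vectors by L1 norm so a search can skip most of the corpus."""

    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, name, code):
        vec = vectorize(code)
        self.buckets[sum(vec)].append((name, vec))

    def search(self, code, radius):
        """Return (distance, name) pairs within the given L1 radius, sorted."""
        query = vectorize(code)
        size = sum(query)
        hits = []
        # Vectors within L1 distance r must have norms within r of the query's norm.
        for norm in range(max(0, size - radius), size + radius + 1):
            for name, vec in self.buckets.get(norm, ()):
                dist = sum(abs(a - b) for a, b in zip(query, vec))
                if dist <= radius:
                    hits.append((dist, name))
        return sorted(hits)
```

Two segments that differ only in identifier names map to the same vector here, which is why a count-based abstraction can find clones that plain text search misses. A production system would use a syntax-aware abstraction and a more selective multidimensional index.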
doi:10.1145/1882291.1882317 dblp:conf/sigsoft/LeeRHK10 fatcat:5zr74gdhfnauhgv5zanspwx6ia