Tuning research tools for scalability and performance: The NiCad experience

James R. Cordy, Chanchal K. Roy
2014, Science of Computer Programming (Elsevier BV)
Clone detection is a research technique for analyzing software systems for similarities, with applications in software understanding, maintenance, evolution, license enforcement and many other issues. The NiCad near-miss clone detection method has been shown to yield highly accurate results in both precision and recall. However, its naive two-step method, involving a parsing first step to identify and normalize code fragments, followed by a text line-based second step using longest common subsequence (LCS) to compare fragments, has proven difficult to migrate to the efficiency and scalability required for large scale research applications. Rather than presenting the NiCad tool itself in detail, this paper focuses on our experience in migrating NiCad from an initial rapid prototype to a practical scalable research tool. The process has increased overall performance by a factor of up to 40 and clone detection speed by a factor of over 400, while reducing memory and processor requirements to fit on a standard laptop. We apply a sequence of four different kinds of performance optimizations and analyze the effect of each optimization in detail. We believe that the lessons of our experience in migrating NiCad from research prototype to production performance may be beneficial to others who are facing a similar problem.
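To make the second step of the method concrete, the following is a minimal Python sketch of a line-based LCS comparison between two already normalized code fragments. It is an illustration under stated assumptions, not NiCad's implementation: the function names, the difference measure (the share of lines in the longer fragment not covered by the LCS), and the default threshold of 0.3 are all hypothetical choices made for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences.

    Classic dynamic programming, keeping only the previous row, so time is
    O(len(a) * len(b)) and memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def difference_ratio(frag_a, frag_b):
    """Share of lines in the longer fragment not matched by the LCS.

    Assumption: this measure is chosen for illustration only.
    """
    lines_a = frag_a.splitlines()
    lines_b = frag_b.splitlines()
    longest = max(len(lines_a), len(lines_b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(lines_a, lines_b) / longest


def is_near_miss_clone(frag_a, frag_b, threshold=0.3):
    """Treat two fragments as near-miss clones if they differ by at most
    `threshold` (a hypothetical default, not a value from the paper)."""
    return difference_ratio(frag_a, frag_b) <= threshold
```

Comparing every pair of fragments this way costs quadratic time per pair and quadratic work in the number of fragments, which gives some intuition for why a naive version of this step is hard to scale.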
doi:10.1016/j.scico.2011.11.002