Performance characterization of data mining benchmarks

Vineeth Mekkat, Ragavendra Natarajan, Wei-Chung Hsu, Antonia Zhai
<span title="">2010</span> <i title="ACM Press"> Proceedings of the 2010 Workshop on Interaction between Compilers and Computer Architecture - INTERACT-14 </i> &nbsp;
Explosive growth in the availability of various kinds of data in both commercial and scientific domains has resulted in an unprecedented need to develop novel data-driven knowledge discovery techniques. Data mining is one such data-centric application. It consists of methods to discover interesting, nontrivial, and useful patterns hidden within massive amounts of data. Researchers from both academia and industry have recognized that the challenges of data mining applications will help shape the future of multi-core processor and parallelizing compiler designs. However, relatively little has been done to understand the performance characteristics of these applications on modern multi-core processors. The exponential growth of on-chip resources makes it critical to exploit parallelism at all granularities to improve the performance of data mining applications. In this paper, we examine the instruction-level, memory-level, and thread-level parallelism available in data mining applications. We observe that (i) data mining applications have a slightly different instruction mix from SPEC integer applications, and this difference can potentially lead to different ILP extraction; (ii) although many data mining applications suffer from data cache miss penalties, similar to SPEC integer applications, different techniques must be developed to enable effective prefetching due to the existence of complex and irregular data structures, such as hash tables; (iii) although data mining applications have a large amount of thread-level parallelism, efficient extraction of such parallelism depends on on-chip cache performance; and (iv) the performance characteristics of data mining applications can vary at runtime, and thus techniques that dynamically tune the applications to adapt to such variations are desired.
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/1739025.1739040">doi:10.1145/1739025.1739040</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qs7yslt5yjetvp5rsiybexjzve">fatcat:qs7yslt5yjetvp5rsiybexjzve</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20141023145907/http://www-users.cs.umn.edu:80/~natar/publications/interact2010-mekkat.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/c2/7a/c27a1b427134109eeb3681904dd07fd15a4ab793.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/1739025.1739040"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> acm.org </button> </a>