Benchmark synthesis for architecture and compiler exploration

Luk Van Ertvelde, Lieven Eeckhout
2010 IEEE International Symposium on Workload Characterization (IISWC'10)  
Second, the synthetic benchmarks hide proprietary information from the original workloads they are built after.  ...  First, it generates synthetic benchmarks in a high-level programming language (C in our case), in contrast to prior work in benchmark synthesis which generates synthetic benchmarks in assembly.  ...  We then generate a synthetic benchmark from this statistical profile.  ... 
doi:10.1109/iiswc.2010.5650208 dblp:conf/iiswc/ErtveldeE10 fatcat:izitmzshwbctjomtlknam5kyje

Workload generation for microprocessor performance evaluation

Luk Van Ertvelde, Lieven Eeckhout
2012 Proceedings of the third joint WOSP/SIPEW international conference on Performance Engineering - ICPE '12  
order of magnitude over state-of-the-art. (3) It presents a benchmark synthesis framework for generating synthetic benchmarks from a set of desired program statistics.  ...  This PhD thesis [1], awarded with the SPEC Distinguished Dissertation Award 2011, proposes and studies three workload generation and reduction techniques for microprocessor performance evaluation. (1)  ...  The methodology to generate these benchmarks comprises two key steps: (i) profiling a real-world (proprietary) application (that is compiled at a low optimization level) to measure its execution characteristics  ... 
doi:10.1145/2188286.2188313 dblp:conf/wosp/ErtveldeE12 fatcat:divo5bku7vbkddnnsqyumklziy

Distributed multi-layered workload synthesis for testing stream processing systems

Eric Bouillet, Parijat Dube, David George, Zhen Liu, Dimitrios Pendarakis, Li Zhang
2008 2008 Winter Simulation Conference  
We present a scalable framework for synthesis of distributed workload based on identifying different layers of workload corresponding to different time-scales.  ...  The workload should have realistic volumetric and contextual statistics at different levels: user level, application level, packet level etc.  ...  Note that a complete tree traversal from root to leaf and a content generation is executed at each atomic increment of the iterator.  ... 
doi:10.1109/wsc.2008.4736167 dblp:conf/wsc/BouilletDGLPZ08 fatcat:6d4tv3d4cjembgqgwikck225ty

Performance Cloning: A Technique for Disseminating Proprietary Applications as Benchmarks

Ajay Joshi, Lieven Eeckhout, Robert Bell, Lizy John
2006 2006 IEEE International Symposium on Workload Characterization  
Unlike previously proposed workload synthesis techniques, we only model microarchitecture-independent performance attributes into the synthetic clone.  ...  By using a set of embedded benchmarks from the MediaBench and MiBench suites, we demonstrate that the performance and power consumption of the synthetic clone correlates well with that of the original  ...  Figure 2 shows an example statistical flow graph that is generated by profiling the execution of a program.  ... 
doi:10.1109/iiswc.2006.302734 dblp:conf/iiswc/JoshiEBJ06 fatcat:hbxfvaxogjerxmslexs5delbqm

Distilling the essence of proprietary workloads into miniature benchmarks

Ajay Joshi, Lieven Eeckhout, Robert H. Bell, Lizy K. John
2008 ACM Transactions on Architecture and Code Optimization (TACO)  
embedded workloads considered in the original paper; (2) It characterizes the data locality of general-purpose and scientific benchmarks and shows that the memory access patterns of these programs can  ...  However, composing such representative workloads A preliminary version of this paper entitled "Performance Cloning: A Technique for Disseminating Proprietary Applications as Benchmarks," by A.  ...  The approach used in statistical simulation is to generate a short synthetic trace from a statistical profile of workload attributes, such as basic block size distribution, branch misprediction rate, data  ... 
doi:10.1145/1400112.1400115 fatcat:ne6kmguvufhvlaty3nowlknh54

Workloads of the Future

Jan M. Rabaey, Daniel Burke, Ken Lutz, John Wawrzynek
2008 IEEE Design & Test of Computers  
Sensor networks and distributed information-capture devices are fundamentally changing the nature of the Internet from download centric to upload rich (see Figure 1).  ...  The IT infrastructure is moving away from the desktop and laptop model to centralized servers, communicating with ubiquitously distributed (and often mobile) access devices.  ...  In fact, the system reliability is a statistical property, which results from the combination of the statistics of the individual components.  ... 
doi:10.1109/mdt.2008.118 fatcat:4uyaaojnsbajpj3dyzpcbdyx5y

Comprehensive and Efficient Workload Compression [article]

Shaleen Deep, Anja Gruenheid, Paraschos Koutris, Jeffrey Naughton, Stratis Viglas
2021 arXiv   pre-print
This work studies the problem of constructing a representative workload from a given input analytical query workload where the former serves as an approximation with guarantees of the latter.  ...  ., a representative workload, over time. To construct such a workload in a principled manner, we formalize the notions of workload representativity and coverage.  ...  numeric values that are derived from profiling statistics of the query.  ... 
arXiv:2011.05549v2 fatcat:owo3fie7zfaadap3o4bhceqf7q

Using cycle stacks to understand scaling bottlenecks in multi-threaded workloads

Wim Heirman, Trevor E. Carlson, Shuai Che, Kevin Skadron, Lieven Eeckhout
2011 2011 IEEE International Symposium on Workload Characterization (IISWC)  
As a subsequent step, we further extend the methodology to analyze sets of parallel workloads using statistical data analysis, and perform a workload characterization to understand behavioral differences  ...  We analyze the SPLASH-2, PARSEC and Rodinia benchmark suites and conclude that the three benchmark suites cover similar areas in the workload space.  ...  Statistical data analysis using principal component analysis (PCA) allows for analyzing general performance trends across workloads and system settings.  ... 
doi:10.1109/iiswc.2011.6114195 dblp:conf/iiswc/HeirmanCCSE11 fatcat:6rlbwuoia5g7rh3w5dufabf6cu

Parameterized Characterization of Bioinformatics Workload on SIMD Architecture

Naeem Z. Azeemi, A. Sultan, A Arshad Muhammad
2006 2006 International Conference on Information and Automation  
A set of sixteen widely used bioinformatics applications is selected as benchmark. Software monitoring techniques are used to collect execution traces.  ...  These profiles are compute intensive and offers a wide range of computation pattern ranging from data base searching applications to highly irregular phylogenetic trees.  ...  Section B describes the benchmark applications that are selected from widely use websites. A.  ... 
doi:10.1109/icinfa.2006.374110 fatcat:m3enylpalzas5dheluuvtnv7ou

Control-theoretic dynamic frequency and voltage scaling for multimedia workloads

Zhijian Lu, Jason Hein, Marty Humphrey, Mircea Stan, John Lach, Kevin Skadron
2002 Proceedings of the international conference on Compilers, architecture, and synthesis for embedded systems - CASES '02  
the same workload.  ...  the average frame delay within 10% of the target more than 90% of the time, whereas the change-point detection algorithm kept the average frame delay with 10% of the target only 70% or less of the time executing  ...  CCR-0105626, CCR-0133634, and a grant from Intel MRL. We would also like to thank T. Simunic, J. Pouwelse, and the anonymous reviewers for their helpful comments.  ... 
doi:10.1145/581630.581654 dblp:conf/cases/LuHHSLS02 fatcat:fugxnz4uu5cwnatboogiek5tly

Control-theoretic dynamic frequency and voltage scaling for multimedia workloads

Zhijian Lu, Jason Hein, Marty Humphrey, Mircea Stan, John Lach, Kevin Skadron
2002 Proceedings of the international conference on Compilers, architecture, and synthesis for embedded systems - CASES '02  
the same workload.  ...  the average frame delay within 10% of the target more than 90% of the time, whereas the change-point detection algorithm kept the average frame delay with 10% of the target only 70% or less of the time executing  ...  However, that technique uses profiling to predict energy per instruction and instructions per frame statistics.  ... 
doi:10.1145/581652.581654 fatcat:h2mr36m2bfgljc7tzwkjegc3f4

Data Motif-based Proxy Benchmarks for Big Data and AI Workloads [article]

Wanling Gao, Jianfeng Zhan, Lei Wang, Chunjie Luo, Zhen Jia, Daoyi Zheng, Chen Zheng, Xiwen He, Hainan Ye, Haibin Wang, Rui Ren
2018 arXiv   pre-print
We propose a data motif-based proxy benchmark generating methodology by means of machine learning method, which combine data motifs with different weights to mimic the big data and AI workloads.  ...  Moreover, the generated proxy benchmarks reflect consistent performance trends across different architectures.  ...  Synthetic benchmark is to generate assembly code or C code based on workload profiling [37], and can work on real hardware as well as execution-driven simulators.  ... 
arXiv:1810.09376v1 fatcat:ingrwyjpobavzhujbukj2gwfkm

Energy-Efficient Cluster Computing via Accurate Workload Characterization

S. Huang, W. Feng
2009 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid  
Using the NAS Parallel Benchmarks as our workload, we then evaluate our eco-friendly daemon on a cluster computer.  ...  This paper presents an eco-friendly daemon that reduces power and energy consumption while better maintaining high performance via an accurate workload characterization that infers "processor stall cycles  ...  time Figure 5. Performance loss on NAS parallel benchmarks. Figure 6. CPU energy savings on NAS parallel benchmarks. Figure 7. Overall energy savings on NAS parallel benchmarks. Table 1. Statistics  ... 
doi:10.1109/ccgrid.2009.88 dblp:conf/ccgrid/HuangF09 fatcat:zumw7o3ep5bi5nmv2ri5zzbq5m

TransPlant: A parameterized methodology for generating transactional memory workloads

J. Poe, C. Hughes, Tao Li
2009 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems  
In this work, we propose techniques to generate parameterized transactional memory benchmarks based on a feature set, decoupled from the underlying transactional model.  ...  Using principal component analysis, clustering, and raw transactional performance metrics, we show that TransPlant can generate benchmarks with features that lie outside the boundary occupied by these  ...  Benchmark Synthesis Statistical simulation [16] and workload synthesis [15] capture the underlying statistical behavior of a program and use this information to generate a trace or a new representative  ... 
doi:10.1109/mascot.2009.5366659 dblp:conf/mascots/PoeHL09 fatcat:rjzy6za7vfbvzidx2yxoacmwra

Resource Usage Estimation of Data Stream Processing Workloads in Datacenter Clouds [article]

Alireza Khoshkbarforoushha, Rajiv Ranjan, Raj Gaire, Prem P. Jayaraman, John Hosking, Ehsan Abbasnejad
2015 arXiv   pre-print
Recent work has explored the use of statistical techniques for resource estimation of SQL queries and OLTP workloads.  ...  We have validated the models using both the linear road benchmark and the TPC-H, observing high accuracy under a number of error metrics: mean-square error, continuous ranked probability score, and negative  ...  In addition, it has already been successfully applied in the other domains such as statistical parametric speech synthesis, finance, meteorology.  ... 
arXiv:1501.07020v1 fatcat:3zvy2tglgfc6dnfsnv4jwlwkra