
Efficient Sampling Startup for Sampled Processor Simulation [chapter]

Michael Van Biesbrouck, Lieven Eeckhout, Brad Calder
2005 Lecture Notes in Computer Science  
In this paper we examine efficient Sampling Startup techniques addressing two issues: how to represent the correct memory image during simulation, and how to deal with warmup.  ...  This paper presents several Sampling Startup techniques and compares them against previously proposed techniques.  ...  Acknowledgments We would like to thank the anonymous reviewers for providing helpful comments on this paper. This work was funded in part by NSF  ... 
doi:10.1007/11587514_5 fatcat:aha4kz4d3jfvdjfob43rbh7pdi

Energy-efficient communication for ad-hoc wireless sensor networks

R. Min, A. Chandrakasan
2001 Conference Record of Thirty-Fifth Asilomar Conference on Signals, Systems and Computers (Cat.No.01CH37256)  
Packet 32 Samples/Packet 64 Samples/Packet 128 Samples/Packet  ...  Subsystem n Low duty cycle n Buffered transmission Software/Applications n Energy-Quality Scalability n Power-Aware OS Node Implementation Processor Board Radio Processor DVS 53%  ... 
doi:10.1109/acssc.2001.986894 fatcat:rhfxmoi2hfarzfmc2ztjrkkfze

Efficient Sampling Startup for SimPoint

M. Van Biesbrouck, B. Calder, L. Eeckhout
2006 IEEE Micro  
Acknowledgments We thank the anonymous reviewers for their helpful comments on this article.  ...  By combining TMI with MHS, we can accurately and efficiently collect samples of simulated processor execution.  ...  This article proposes efficient and accurate sampling startup approaches.  ... 
doi:10.1109/mm.2006.68 fatcat:vxv3pkcagvf4pc5faso6ftshuu

GATE: Improving the computational efficiency

S. Staelens, J. De Beenhouwer, D. Kruecker, L. Maigne, F. Rannou, L. Ferrer, Y. D'Asseler, I. Buvat, I. Lemahieu
2006 Nuclear Instruments and Methods in Physics Research Section A : Accelerators, Spectrometers, Detectors and Associated Equipment  
Finally, an elaboration on the deployment of GATE on the Enabling Grids for E-Science in Europe (EGEE) grid will conclude the description of efficiency enhancement efforts.  ...  This manuscript describes three different techniques in order to improve the efficiency of those simulations.  ...  Acknowledgments This work was supported by the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT, Belgium), by the Fund for Scientific Research Flanders (FWO, Belgium  ... 
doi:10.1016/j.nima.2006.08.070 fatcat:v6kmt57tu5bjvalbg4qtiefqou

High-Performance Algorithms for Drift Avoidance and Fast Tracking in Solar MPPT System

A. Pandey, N. Dasgupta, A.K. Mukerjee
2008 IEEE transactions on energy conversion  
He is currently engaged in research and product development in high-availability and high-efficiency power electronics for power supplies, power quality, and nonconventional energy sources.  ...  Of the numerous algorithms for this purpose, perturb and observe (P&O) is a standard.  ...  Moreover, a variable sampling frequency will make scheduling of MPPT algorithms in processors difficult in a multitasking system.  ... 
doi:10.1109/tec.2007.914201 fatcat:d7d2mbyi35gazdapzrdurn7usi

Quantifying the impact of GPUs on performance and energy efficiency in HPC clusters

Jeremy Enos, Craig Steffen, Joshi Fullop, Michael Showerman, Guochun Shi, Kenneth Esler, Volodymyr Kindratenko, John E. Stone, James C. Phillips
2010 International Conference on Green Computing  
wall clock time for whole application to run • Power consumption measurements made over at least 20 sample runs • Removed power measurements from startup and shutdown phases of applications NOTE  ...  GPU codes • Sample set was "STMV" 1 million atom virus simulation • Performance measure is simulation time step per wall clock time • CPU-only: 6.6 seconds per timestep; 316 Watts • CPU+GPU:  ... 
doi:10.1109/greencomp.2010.5598297 dblp:conf/green/EnosSFSSEKSP10 fatcat:rv7gb2nw6fgf7akjusaq2wne7e

Implementation and performance of FDPS: a framework for developing parallel particle simulation codes

Masaki Iwasawa, Ataru Tanikawa, Natsuki Hosono, Keigo Nitadori, Takayuki Muranushi, Junichiro Makino
2016 Nippon Tenmon Gakkai obun kenkyu hokoku  
FDPS provides all of these necessary functions for efficient parallel execution of particle-based simulations as "templates", which are independent of the actual data structure of particles and the functional  ...  We present the basic idea, implementation, measured performance and performance model of FDPS (Framework for developing particle simulators).  ...  The efficiency of XC30 is a bit worse than that of the K computer. This difference comes from the difference of two processors. The Fujitsu processor showed higher efficiency.  ... 
doi:10.1093/pasj/psw053 fatcat:pm4p3wugbbbajkxlersnrz34ke

Guest Editors' Introduction: Computer Architecture Simulation and Modeling

T. Sherwood, Joshua J. Yi
2006 IEEE Micro  
Two of the articles we have selected, "SimFlex: Statistical Sampling of Computer System Simulation" and "Efficient Sampling Startup for SimPoint," attempt to address this problem through clever techniques  ...  In "Efficient Sampling Startup for SimPoint," Van  ...  Only when the ideas in these articles make the leap from academic enterprise to engineering solution will our  ... 
doi:10.1109/mm.2006.70 fatcat:hucjuetldzdrjagsw3wflozfzy

Leveraging the checkpoint-restart technique for optimizing CPU efficiency of ATLAS production applications on opportunistic platforms

D Cameron, J Elmsheuser, L Heinrich, W Lavrijsen, P Nilsson, V Tsulaia, M Vogel, ATLAS Collaboration
2018 Journal of Physics, Conference Series  
This allows us to checkpoint one job at the end of its configuration step and then use the generated checkpoint image for rapid startup of thousands of production jobs.  ...  ) and the usage of these images for running ATLAS Simulation production jobs on volunteer computing resources (ATLAS@Home) and on Supercomputers.  ...  Acknowledgments The authors would like to thank our volunteer testers in ATLAS@Home for running the jobs for us and providing useful feedback.  ... 
doi:10.1088/1742-6596/1085/3/032028 fatcat:yj7przfczfa6hjy6xmsv3p3amq

Asynchronous replica exchange for molecular simulations

Emilio Gallicchio, Ronald M. Levy, Manish Parashar
2008 Journal of Computational Chemistry  
In asynchronous replica exchange pairs of processors initiate and perform temperature replica exchanges independently from the other processors, thereby removing the need for processor synchronization  ...  Illustrative calculations on a molecular system are presented that show that asynchronous replica exchange, contrary to the synchronous implementation, is able to utilize at nearly top efficiency loosely  ...  Acknowledgments We thank Li Zhang for her work on the development of the Salsa communication framework.  ... 
doi:10.1002/jcc.20839 pmid:17876761 pmcid:PMC2977925 fatcat:otdkanawnjb55isks7e3gahs6y

A performance model of fast 2D-DCT parallel JPEG encoding using CUDA GPU and SMP-architecture

Mohammed K. Ali Shatnawi, Hussein Ali Shatnawi
2014 2014 IEEE High Performance Extreme Computing Conference (HPEC)  
To achieve maximal efficiency, we exploit the substantial parallelism to design an optimized version of JPEG based on thread model.  ...  The performance of image compression algorithms for big data can be enhanced using parallel computations.  ...  CONCLUSION We have presented two efficient designs for JPEG encoding for 24-bit BMP images.  ... 
doi:10.1109/hpec.2014.7040947 dblp:conf/hpec/ShatnawiS14 fatcat:4xhkjw6qpfbivjml3zaug7aemi

Opportunities and obstacles in low-power system-level CAD

Andrew Wolfe
1996 Proceedings of the 33rd annual conference on Design automation conference - DAC '96  
We detail the design of a low-power embedded system, a touchscreen interface device for a personal computer.  ...  Additionally, we highlight opportunities to use system-level design and analysis tools for low-power design and the obstacles that prevented using such tools in this design.  ...  Analytical solutions are often reasonably accurate for steady-state operation, but boundary conditions, like startup, are difficult to predict without simulation.  ... 
doi:10.1145/240518.240521 dblp:conf/dac/Wolfe96 fatcat:bvyxbifiqrcurl2lwlelmh5kau

The bulk-synchronous parallel random access machine

Alexandre Tiskin
1998 Theoretical Computer Science  
The two models are related by efficient simulations for a broad range of algorithms. We identify some properties of a BSPRAM algorithm that suffice for its optimal simulation in BSP.  ...  Consequently, much effort was put into the development of efficient methods for simulation of PRAM on more realistic models.  ...  As for PRAM simulation, some "extra parallelism" is necessary for efficient BSPRAM simulation on BSP.  ... 
doi:10.1016/s0304-3975(97)00197-7 fatcat:ut4vj4f76fdqppvqckztariedq

Energy-centric enabling technologies for wireless sensor networks

R. Min, M. Bhardwaj, Seong-Hwan Cho, N. Ickes, E. Shih, A. Sinha, Alice Wang, A. Chandrakasan
2002 IEEE wireless communications  
New levels of energy efficiency, attained through global system-level perspectives on node and network energy consumption, will enable a future where networks of hundreds, thousands, and eventually many  ...  In this article we advocate two particular enablers for energy conservation: the ability to trade off performance for energy savings within the node, and collaborative processing among nodes to reduce  ...  By removing the architectural overhead of decoding and processing general-purpose instructions, we will trade the flexibility of a general-purpose processor for a lean energy-efficient processor that supports  ... 
doi:10.1109/mwc.2002.1028875 fatcat:qamroxouxjeexaag3fbniti5wa

The CMS High Level Trigger: Commissioning and First Operation with LHC Beams [article]

Marta Felcini, Marco Zanetti
2009 arXiv pre-print
The Filter Farm has been equipped with 720 such processors, providing a computing power at least a factor two larger than expected to be needed at startup.  ...  The required computing power needed to process with no dead time a maximum HLT input rate of 50 kHz, as expected at startup, has been measured, using the most recent commercially available processors.  ...  Meschi for careful reading of the manuscript and valuable comments.  ... 
arXiv:0905.0714v1 fatcat:744dmnulnnfsjktr3ak2tsw6ve