
Compiler and System Techniques for SoC Distributed Reconfigurable Accelerators [chapter]

Joël Cambonie, Sylvain Guérin, Ronan Keryell, Loïc Lagadec, Bernard Pottier, Olivier Sentieys, Bernt Weber, Samar Yazdani
2004 Lecture Notes in Computer Science  
We propose a general framework for SoC architectures and software tools in which different kinds of processing units are programmed at high level.  ...  We show a reconfigurable unit suitable for this framework and we draw the outline of a supercompiler able to address such an architecture.  ...  These units can be powerful processors, micro-programmable architectures, reconfigurable data-paths, or fine-grain embedded FPGAs.  ... 
doi:10.1007/978-3-540-27776-7_31 fatcat:tfad6qofj5fxxcdp5tcvj75wvy

IMPROVING PERFORMANCE IN HPC SYSTEM UNDER POWER CONSUMPTIONS LIMITATIONS

Muhammad Usman Ashraf
2019 International Journal of Advanced Research in Computer Science  
The next breakthrough in the computing revolution is the Exascale level of performance, that is, 10^18 calculations per second, a remarkable achievement in computing that will have a fathomless influence  ...  Even though the Exascale performance can be achieved by multiplying the number of cores according to Exascale computing system constraints, the challenge of power consumption still persists.  ...  In this report, we have observed that a tri-level MOC (MPI+OpenMP+CUDA) model has achieved a tremendous performance by providing coarse-grained, fine-grained and finer granularity parallelism.  ... 
doi:10.26483/ijarcs.v10i2.6397 fatcat:k3l3lk5kuzhnldn5b2qzkh4eia

Frameworks for Multi-core Architectures: A Comprehensive Evaluation Using 2D/3D Image Registration [chapter]

Richard Membarth, Frank Hannig, Jürgen Teich, Mario Körner, Wieland Eckert
2011 Lecture Notes in Computer Science  
In an empirical study, a fine-grained data parallel and a coarse-grained task parallel parallelization approach are used to evaluate and estimate different aspects like usability, performance, and overhead  ...  In the beginning, programmers had to divide and distribute the work by hand to the available cores and to manage threads in order to use more than one core.  ...  The functionality is provided by a library and OpenCL programs are just-in-time compiled from the run-time environment like in RapidMind. The kernels are stored in strings, just like in OpenGL.  ... 
doi:10.1007/978-3-642-19137-4_6 fatcat:guuyi6d3ercqvbgyfs3v5p5lya

Are Coarse-Grained Overlays Ready for General Purpose Application Acceleration on FPGAs?

Abhishek Kumar Jain, Douglas L. Maskell, Suhaib A. Fahmy
2016 2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech)  
It is clear that the PAR complexity is significantly reduced by using coarse-grained architectures, thus reducing compilation time.  ...  This is motivated by the fact that programs can be written at a higher level of abstraction with compilation to the overlay being several orders of magnitude faster than for the fine-grained FPGA on which  ... 
doi:10.1109/dasc-picom-datacom-cyberscitec.2016.110 dblp:conf/dasc/JainMF16 fatcat:gmiz7uunpbaatjryzjiozj24om

Software transactional memory for multicore embedded systems

Jennifer Mankin, David Kaeli, John Ardini
2009 Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems - LCTES '09  
We find that we can meet or beat the performance of fine-grained locking over a range of application characteristics, including size of shared data, time spent in the critical section, and contention between  ...  We offer a full implementation of an embedded STM and test it against both coarse-grained and fine-grained locking mechanisms.  ...  Acknowledgments This work has been supported by Charles Stark Draper Laboratories, Inc. Jennifer Mankin was supported on a Draper Fellowship.  ... 
doi:10.1145/1542452.1542465 dblp:conf/lctrts/MankinKA09 fatcat:r6gm5sbzvvcddfhvtbtbpa7mem

A Run-Time System for Dynamic Grain Packing [chapter]

João Luís Sobral, Alberto José Proença
1999 Lecture Notes in Computer Science  
The SCOOPP (Scalable Object Oriented Parallel Programming) system is a hybrid compile and run-time system that extracts parallelism, supports explicit parallelism and dynamically serialises parallel  ...  tasks in excess, to dynamically scale applications through a wide range of target platforms.  ...  Static granularity control [1] [2] is usually applied to fine-grained tasks, whose behaviour is known at compile-time.  ... 
doi:10.1007/3-540-48311-x_119 fatcat:lu5uf5twijhnfnwmqc7xe7aqqm

Toward a Self-Aware Codelet Execution Model

Stephane Zuckerman, Aaron Landwehr, Kelly Livingston, Guang Gao
2014 2014 Fourth Workshop on Data-Flow Execution Models for Extreme Scale Computing  
This paper takes the position that a potential solution to solve the resource management issue at this scale is a hierarchical and distributed self-aware system leveraging the fine-grain event-driven codelet  ...  The Codelet Model is a fine-grain dataflow-inspired and event-driven program execution model which was designed to run parallel programs on a combination of such many-core chips into a supercomputer. Meanwhile  ...  inevitably lead to fine-grain resource management.  ... 
doi:10.1109/dfm.2014.12 fatcat:rq6qmxv5bnbr3jjhxztposhpju

Fine-grained modularity and reuse of virtual machine components

Christian Wimmer, Stefan Brunthaler, Per Larsen, Michael Franz
2012 Proceedings of the 11th annual international conference on Aspect-oriented Software Development - AOSD '12  
Among the novel use cases that will be enabled by our research are: VM extensions by third parties, support for multiple languages inside one VM, and a universal VM for mobile devices.  ...  We will split the VMs into fine-grained modules, define explicit interfaces and extension points for the modules, and finally re-connect them.  ...  Acknowledgments Parts of this effort have been sponsored by the National Science Foundation under grant CCF-1117162 and by Samsung Telecommunications America under Agreement No. 51070.  ... 
doi:10.1145/2162049.2162073 dblp:conf/aosd/WimmerBLF12 fatcat:ddkikcw3crhf7nr5ik2znw3zsa

On the adequacy of lightweight thread approaches for high-level parallel programming models

Adrián Castelló, Rafael Mayo, Kevin Sala, Vicenç Beltran, Pavan Balaji, Antonio J. Peña
2018 Future generations computer systems  
High-level parallel programming models (PMs) are becoming crucial in order to extract the computational power of current on-node multi-threaded parallelism.  ...  Our work reveals those scenarios where LWTs outperform pthread-based solutions and compares the performance between an ad hoc solution and a generic implementation.  ...  At this time, this PM is only supported by the Mercurium compiler [26] and the Nanos++ runtime.  ... 
doi:10.1016/j.future.2018.02.016 fatcat:pbo2kyo4sjgzppbxjo2cf7ofza

Evaluation of Power Consumption at Execution of Multiple Automatically Parallelized and Power Controlled Media Applications on the RP2 Low-Power Multicore [chapter]

Hiroki Mikami, Shumpei Kitaki, Masayoshi Mase, Akihiro Hayashi, Mamoru Shimaoka, Keiji Kimura, Masato Edahiro, Hironori Kasahara
2013 Lecture Notes in Computer Science  
compiler are executed simultaneously on RP2, an 8-core multicore processor developed by Renesas Electronics, Hitachi, and Waseda University.  ...  Finally, when a combination of a high computational load application program and an intermediate computational load application program are executed simultaneously, the consumed power was reduced by 21% by  ...  A part of this research has been supported by NEDO "Advanced Heterogeneous Multiprocessor", NEDO "Multi core processors for realtime consumer electronics" and STARC "Automatic Parallelizing Compiler Cooperative  ... 
doi:10.1007/978-3-642-36036-7_3 fatcat:hgyucsbr5fgsnkednsi2hp3f2i

The OpenTM Transactional Application Programming Interface

Woongki Baek, Chi Cao Minh, Martin Trautmann, Christos Kozyrakis, Kunle Olukotun
2007 Parallel Architecture and Compilation Techniques (PACT), Proceedings of the International Conference on  
The implementation builds upon the OpenMP support in the GCC compiler and includes a runtime for the C programming language. We evaluate the performance and programmability features of OpenTM.  ...  We show that it delivers the performance of fine-grain locks at the programming simplicity of coarse-grain locks.  ...  Woongki Baek is supported by an STMicroelectronics Stanford Graduate Fellowship and a Samsung Scholarship.  ... 
doi:10.1109/pact.2007.4336227 fatcat:nn7gbfngvrff5egt4jzfgurptm

Program-driven fine-grained power management for the reconfigurable mesh

Heiner Giefers, Marco Platzner
2009 2009 International Conference on Field Programmable Logic and Applications  
For one application we report a reduction in power and energy consumption by 21.09%.  ...  We extend our previous reconfigurable mesh architecture and the corresponding programming language ARMLang to support program-driven power management.  ...  Further, for our fine-grained power management of the reconfigurable mesh we need to support a low-power mode for the processing elements.  ... 
doi:10.1109/fpl.2009.5272527 dblp:conf/fpl/GiefersP09 fatcat:nk26rj33svbidhrk6mqabui6mu

Improving the design flow for parallel and heterogeneous architectures running real-time applications: The PHARAON FP7 project

Héctor Posadas, Alejandro Nicolás, Pablo Peñil, Eugenio Villar, Florian Broekaert, Michel Bourdelles, Albert Cohen, Mihai T. Lazarescu, Luciano Lavagno, Andrei Terechko, Miguel Glassee, Manuel Prieto
2014 Microprocessors and microsystems  
reduce power consumption in a transparent manner for applications.  ...  In this article, we present the work-in-progress of the EU FP7 PHARAON project, started in September 2011.  ...  Acknowledgments This work is being performed in the framework of the FP7-288307 project PHARAON.  ... 
doi:10.1016/j.micpro.2014.05.003 fatcat:6kbn3sgvkjglhgr6yjbdmhqkhu

Adaptive granularity memory systems

Doe Hyun Yoon, Min Kyu Jeong, Mattan Erez
2011 SIGARCH Computer Architecture News  
The evaluation shows that performance is improved by 61% without ECC and 44% with ECC in memory-intensive applications, while the reduction in memory power consumption (29% without ECC and 14% with ECC  ...  We propose adaptive granularity to combine the best of fine-grained and coarse-grained memory accesses.  ...  Fine-Grained Cache Management While orthogonal to our research on adaptive memory access granularity, work on cache architectures that support fine-grained data management is a necessary component of our  ... 
doi:10.1145/2024723.2000100 fatcat:kwb7uoqcdnh4bhwaznn46udryq

Adaptive granularity memory systems

Doe Hyun Yoon, Min Kyu Jeong, Mattan Erez
2011 Proceeding of the 38th annual international symposium on Computer architecture - ISCA '11  
The evaluation shows that performance is improved by 61% without ECC and 44% with ECC in memory-intensive applications, while the reduction in memory power consumption (29% without ECC and 14% with ECC  ...  We propose adaptive granularity to combine the best of fine-grained and coarse-grained memory accesses.  ...  Fine-Grained Cache Management While orthogonal to our research on adaptive memory access granularity, work on cache architectures that support fine-grained data management is a necessary component of our  ... 
doi:10.1145/2000064.2000100 dblp:conf/isca/YoonJE11 fatcat:ty3xafhgl5dxbj3kz5v2uor3da