Compiler and System Techniques for SoC Distributed Reconfigurable Accelerators
[chapter]
2004
Lecture Notes in Computer Science
We propose a general framework for SoC architectures and software tools in which different kinds of processing units are programmed at high level. ...
We show a reconfigurable unit suitable for this framework and we draw the outline of a supercompiler able to address such an architecture. ...
These units can be powerful processors, micro-programmable architectures, reconfigurable data-paths, or fine-grain embedded FPGAs. ...
doi:10.1007/978-3-540-27776-7_31
fatcat:tfad6qofj5fxxcdp5tcvj75wvy
IMPROVING PERFORMANCE IN HPC SYSTEM UNDER POWER CONSUMPTIONS LIMITATIONS
2019
International Journal of Advanced Research in Computer Science
The next breakthrough in the computing revolution is the Exascale level of performance, that is, 10^18 calculations per second, a remarkable achievement in computing that will have a fathomless influence ...
Even though the Exascale performance can be achieved by multiplying the number of cores according to Exascale computing system constraints, the challenge of power consumption still persists. ...
In this report, we have observed that a tri-level MOC (MPI+OpenMP+CUDA) model has achieved a tremendous performance by providing coarse-grained, fine-grained and finer granularity parallelism. ...
doi:10.26483/ijarcs.v10i2.6397
fatcat:k3l3lk5kuzhnldn5b2qzkh4eia
Frameworks for Multi-core Architectures: A Comprehensive Evaluation Using 2D/3D Image Registration
[chapter]
2011
Lecture Notes in Computer Science
In an empirical study, a fine-grained data parallel and a coarse-grained task parallel parallelization approach are used to evaluate and estimate different aspects like usability, performance, and overhead ...
In the beginning, programmers had to divide and distribute the work by hand to the available cores and to manage threads in order to use more than one core. ...
The functionality is provided by a library and OpenCL programs are just-in-time compiled from the run-time environment like in RapidMind. The kernels are stored in strings, just like in OpenGL. ...
doi:10.1007/978-3-642-19137-4_6
fatcat:guuyi6d3ercqvbgyfs3v5p5lya
Are Coarse-Grained Overlays Ready for General Purpose Application Acceleration on FPGAs?
2016
2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech)
It is clear that the PAR complexity is significantly reduced by using coarse-grained architectures, thus reducing compilation time. ...
This is motivated by the fact that programs can be written at a higher level of abstraction with compilation to the overlay being several orders of magnitude faster than for the fine grained FPGA on which ...
doi:10.1109/dasc-picom-datacom-cyberscitec.2016.110
dblp:conf/dasc/JainMF16
fatcat:gmiz7uunpbaatjryzjiozj24om
Software transactional memory for multicore embedded systems
2009
Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems - LCTES '09
We find that we can meet or beat the performance of fine-grained locking over a range of application characteristics, including size of shared data, time spent in the critical section, and contention between ...
We offer a full implementation of an embedded STM and test it against both coarse-grained and fine-grained locking mechanisms. ...
Acknowledgments This work has been supported by Charles Stark Draper Laboratories, Inc. Jennifer Mankin was supported on a Draper Fellowship. ...
doi:10.1145/1542452.1542465
dblp:conf/lctrts/MankinKA09
fatcat:r6gm5sbzvvcddfhvtbtbpa7mem
A Run-Time System for Dynamic Grain Packing
[chapter]
1999
Lecture Notes in Computer Science
The SCOOPP (Scalable Object Oriented Parallel Programming) system is a hybrid compile and run-time system that extracts parallelism, supports explicit parallelism and dynamically serialises parallel ...
tasks in excess, to dynamically scale applications through a wide range of target platforms. ...
Static granularity control [1] [2] is usually applied to fine grained tasks, whose behaviour is known at compile-time. ...
doi:10.1007/3-540-48311-x_119
fatcat:lu5uf5twijhnfnwmqc7xe7aqqm
Toward a Self-Aware Codelet Execution Model
2014
2014 Fourth Workshop on Data-Flow Execution Models for Extreme Scale Computing
This paper takes the position that a potential solution to solve the resource management issue at this scale is a hierarchical and distributed self-aware system leveraging the fine-grain event-driven codelet ...
The Codelet Model is a fine-grain dataflow-inspired and event-driven program execution model which was designed to run parallel programs on a combination of such many-core chips into a supercomputer. Meanwhile ...
inevitably lead to fine-grain resource management. ...
doi:10.1109/dfm.2014.12
fatcat:rq6qmxv5bnbr3jjhxztposhpju
Fine-grained modularity and reuse of virtual machine components
2012
Proceedings of the 11th annual international conference on Aspect-oriented Software Development - AOSD '12
Among the novel use cases that will be enabled by our research are: VM extensions by third parties, support for multiple languages inside one VM, and a universal VM for mobile devices. ...
We will split the VMs into fine-grained modules, define explicit interfaces and extension points for the modules, and finally re-connect them. ...
Acknowledgments Parts of this effort have been sponsored by the National Science Foundation under grant CCF-1117162 and by Samsung Telecommunications America under Agreement No. 51070. ...
doi:10.1145/2162049.2162073
dblp:conf/aosd/WimmerBLF12
fatcat:ddkikcw3crhf7nr5ik2znw3zsa
On the adequacy of lightweight thread approaches for high-level parallel programming models
2018
Future Generation Computer Systems
High-level parallel programming models (PMs) are becoming crucial in order to extract the computational power of current on-node multi-threaded parallelism. ...
Our work reveals those scenarios where LWTs overperform pthread-based solutions and compares the performance between an ad hoc solution and a generic implementation. ...
At this time, this PM is only supported by the Mercurium compiler [26] and the Nanos++ runtime. ...
doi:10.1016/j.future.2018.02.016
fatcat:pbo2kyo4sjgzppbxjo2cf7ofza
Evaluation of Power Consumption at Execution of Multiple Automatically Parallelized and Power Controlled Media Applications on the RP2 Low-Power Multicore
[chapter]
2013
Lecture Notes in Computer Science
compiler are executed simultaneously on RP2, an 8-core multicore processor developed by Renesas Electronics, Hitachi, and Waseda University. ...
Finally, when a combination of a high computational load application program and an intermediate computational load application program is executed simultaneously, the consumed power is reduced by 21% by ...
A part of this research has been supported by NEDO "Advanced Heterogeneous Multiprocessor", NEDO "Multi core processors for realtime consumer electronics" and STARC "Automatic Parallelizing Compiler Cooperative ...
doi:10.1007/978-3-642-36036-7_3
fatcat:hgyucsbr5fgsnkednsi2hp3f2i
The OpenTM Transactional Application Programming Interface
2007
Parallel Architecture and Compilation Techniques (PACT), Proceedings of the International Conference on
The implementation builds upon the OpenMP support in the GCC compiler and includes a runtime for the C programming language. We evaluate the performance and programmability features of OpenTM. ...
We show that it delivers the performance of fine-grain locks at the programming simplicity of coarse-grain locks. ...
Woongki Baek is supported by an STMicroelectronics Stanford Graduate Fellowship and a Samsung Scholarship. ...
doi:10.1109/pact.2007.4336227
fatcat:nn7gbfngvrff5egt4jzfgurptm
Program-driven fine-grained power management for the reconfigurable mesh
2009
2009 International Conference on Field Programmable Logic and Applications
For one application we report a reduction in power and energy consumption by 21.09%. ...
We extend our previous reconfigurable mesh architecture and the corresponding programming language ARMLang to support program-driven power management. ...
Further, for our fine-grained power management of the reconfigurable mesh we need to support a low-power mode for the processing elements. ...
doi:10.1109/fpl.2009.5272527
dblp:conf/fpl/GiefersP09
fatcat:nk26rj33svbidhrk6mqabui6mu
Improving the design flow for parallel and heterogeneous architectures running real-time applications: The PHARAON FP7 project
2014
Microprocessors and Microsystems
reduce power consumption in a transparent manner for applications. ...
In this article, we present the work-in-progress of the EU FP7 PHARAON project, started in September 2011. ...
Acknowledgments This work is being performed in the framework of the FP7-288307 project PHARAON. ...
doi:10.1016/j.micpro.2014.05.003
fatcat:6kbn3sgvkjglhgr6yjbdmhqkhu
Adaptive granularity memory systems
2011
SIGARCH Computer Architecture News
The evaluation shows that performance is improved by 61% without ECC and 44% with ECC in memory-intensive applications, while the reduction in memory power consumption (29% without ECC and 14% with ECC ...
We propose adaptive granularity to combine the best of fine-grained and coarse-grained memory accesses. ...
Fine-Grained Cache Management While orthogonal to our research on adaptive memory access granularity, work on cache architectures that support fine-grained data management is a necessary component of our ...
doi:10.1145/2024723.2000100
fatcat:kwb7uoqcdnh4bhwaznn46udryq
Adaptive granularity memory systems
2011
Proceeding of the 38th annual international symposium on Computer architecture - ISCA '11
The evaluation shows that performance is improved by 61% without ECC and 44% with ECC in memory-intensive applications, while the reduction in memory power consumption (29% without ECC and 14% with ECC ...
We propose adaptive granularity to combine the best of fine-grained and coarse-grained memory accesses. ...
Fine-Grained Cache Management While orthogonal to our research on adaptive memory access granularity, work on cache architectures that support fine-grained data management is a necessary component of our ...
doi:10.1145/2000064.2000100
dblp:conf/isca/YoonJE11
fatcat:ty3xafhgl5dxbj3kz5v2uor3da