HSTREAM: A directive-based language extension for heterogeneous stream computing [article]

Suejb Memeti, Sabri Pllana
2018 arXiv   pre-print
Big data streaming applications require utilization of heterogeneous parallel computing systems, which may comprise multiple multi-core CPUs and many-core accelerating devices such as NVIDIA GPUs and Intel  ...  We demonstrate the usefulness of the HSTREAM language extension with various applications from the STREAM benchmark.  ...  We have evaluated the usefulness of our HSTREAM solution for stream computing in heterogeneous parallel computing systems with the STREAM benchmark.  ... 
arXiv:1809.09387v1 fatcat:iwenfyvdlzdudoitudszyvfdci

Multiple Target Task Sharing Support for the OpenMP Accelerator Model [chapter]

Guray Ozen, Sergi Mateo, Eduard Ayguadé, Jesús Labarta, James Beyer
2016 Lecture Notes in Computer Science  
In this paper we propose an extension to the OpenMP 4.5 directive-based programming model to support the specification and execution of multiple instances of task regions on different devices (i.e. accelerators  ...  Although current directive-based paradigms, such as OpenMP or OpenACC, support both accelerators and multicore-based hosts, they do not provide an effective and efficient way to concurrently use them,  ...  Clauses on a directive with no device type clause apply to all accelerator device types.  ... 
doi:10.1007/978-3-319-45550-1_19 fatcat:7jx3aneqhnhhnhloug2hgruc6m

Memory Analysis of Low Power MPEG-4 Decoder Architecture

Andreas Dahlin, Johan Ersfolk, Haitham Habli, Johan Lilius
2009 2009 International Conference on Embedded Software and Systems  
In this approach hardware accelerators are scheduled quasi-statically thus decreasing the interfacing overhead substantially.  ...  Recent research has shown that in mobile devices, energy efficiency of the total system does not scale at the same pace with the energy efficiency of the silicon.  ...  Communication between the accelerators and the rest of the system can be handled using DMA or even through direct access from the accelerators to the system bus.  ... 
doi:10.1109/icess.2009.85 dblp:conf/icess/DahlinEHL09 fatcat:tgjuequvdbgzbafxmkguy5h7ke

Low-Overhead Run-Time Scheduling for Fine-Grained Acceleration of Signal Processing Systems

Jani Boutellier, Shuvra S. Bhattacharyya, Olli Silven
2007 Signal Processing Systems Design and Implementation (siPS), IEEE Workshop on  
In this paper, we present four scheduling algorithms that provide flexible utilization of fine-grain DSP accelerators with low run-time overhead.  ...  Experimental results demonstrate the effectiveness of our scheduling approach.  ...  Thus, there were a maximum of 6 x number of streams jobs in each scheduling problem. The number of operations in the job types ranged between one and five.  ... 
doi:10.1109/sips.2007.4387591 dblp:conf/sips/BoutellierBS07 fatcat:q7ausup675g5vbdk2cjuygavh4

When FPGA-Accelerator Meets Stream Data Processing in the Edge

Song Wu, Die Hu, Shadi Ibrahim, Hai Jin, Jiang Xiao, Fei Chen, Haikun Liu
2019 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS)  
We demonstrate that through the design, implementation, and evaluation of F-Storm, an FPGA-accelerated and general-purpose distributed stream processing system on Edge servers.  ...  Unfortunately, given the limited computation power of Edge servers, current efforts may fail in practice to achieve the desired latency of stream data applications.  ...  Next, the scheduler assigns the two types of executors to multiple slots.  ... 
doi:10.1109/icdcs.2019.00180 dblp:conf/icdcs/0001HI0XCL19 fatcat:nzgqzjm3drdvhk6sydr26v5zbm

Implementation of dynamic service aggregation for interactive video delivery

Prithwish Basu, Ashok Narayanan, Rajesh Krishnan, Thomas D. Little, Kevin Jeffay, Dilip D. Kandlur, Timothy Roscoe
1997 Multimedia Computing and Networking 1998  
We present our experiences with building the system and address issues of server-directed channel switching at the client and stream merging through content acceleration.  ...  In this novel demonstration of aggregation via rate-adaptive merging, MPEG-1 system streams are used as the content format and IP multicast is used for video delivery.  ...  Kuczynski for her work on the early versions of the server.  ... 
doi:10.1117/12.298433 fatcat:aunqbzkd2vfytcwsmxvijw7noq

Dynamic Task-Scheduling and Resource Management for GPU Accelerators in Medical Imaging [chapter]

Richard Membarth, Jan-Hugo Lupp, Frank Hannig, Jürgen Teich, Mario Körner, Wieland Eckert
2012 Lecture Notes in Computer Science  
As a remedy, this paper investigates the employment of resource management and scheduling techniques for applications from the medical domain for GPU accelerators.  ...  Using GPUs as accelerators in this domain imposes new challenges since the GPU's common FIFO scheduling does not support task prioritization and preemption.  ...  In addition, using two different types of GPUs allows us to show that the scheduler minimizes the average response time of all task trees.  ... 
doi:10.1007/978-3-642-28293-5_13 fatcat:bgs5wao2ufggtdfa6okmmplsq4

pvFPGA: paravirtualising an FPGA-based hardware accelerator towards general purpose computing

Wei Wang, Miodrag Bolic, Jonathan Parri
2017 International Journal of High Performance Computing and Networking  
The data are transferred between the x86 server and the FPGA accelerator through direct memory access (DMA), and a streaming pipeline technique is adopted to improve the efficiency of data transfer.  ...  The accelerator design on the FPGA can be used for accelerating various applications, regardless of the application computation latencies.  ...  This type of pipeline satisfies our FPGA accelerator design aim, which is to make the FPGA accelerator capable of accelerating various applications for cloud clients.  ... 
doi:10.1504/ijhpcn.2017.084246 fatcat:5uzwbwl7rbcw3i3dbhura2bbz4

pvFPGA: paravirtualising an FPGA-based hardware accelerator towards general purpose computing

Miodrag Bolic, Jonathan Parri, Wei Wang
2017 International Journal of High Performance Computing and Networking  
The data are transferred between the x86 server and the FPGA accelerator through direct memory access (DMA), and a streaming pipeline technique is adopted to improve the efficiency of data transfer.  ...  The accelerator design on the FPGA can be used for accelerating various applications, regardless of the application computation latencies.  ...  This type of pipeline satisfies our FPGA accelerator design aim, which is to make the FPGA accelerator capable of accelerating various applications for cloud clients.  ... 
doi:10.1504/ijhpcn.2017.10005140 fatcat:43pqq7cbvzdxhio63hgpaay62a

Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes

Xavier Lacoste, Mathieu Faverge, George Bosilca, Pierre Ramet, Samuel Thibault
2014 2014 IEEE International Parallel & Distributed Processing Symposium Workshops  
PASTIX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures.  ...  Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the  ...  any type of accelerator.  ... 
doi:10.1109/ipdpsw.2014.9 dblp:conf/ipps/LacosteFBRT14 fatcat:b6yqhke4srhhbhkhxjifof3aae

Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes [article]

Xavier Lacoste, Mathieu Faverge (INRIA Bordeaux - Sud-Ouest, LaBRI), George Bosilca, Pierre Ramet (INRIA Bordeaux - Sud-Ouest, LaBRI), Samuel Thibault (LaBRI, INRIA Bordeaux - Sud-Ouest)
2014 arXiv   pre-print
PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures.  ...  Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the  ...  any type of accelerator.  ... 
arXiv:1405.2636v1 fatcat:u7fvjti4hbb45hpn6vcodkzoqm

Low Power and Scalable Many-Core Architecture for Big-Data Stream Computing

Karim Kanoun, Martino Ruggiero, David Atienza, Mihaela van der Schaar
2014 2014 IEEE Computer Society Annual Symposium on VLSI  
then minimum energy consumption for each type of executed classifier.  ...  In the last years the process of examining large amounts of different types of data, or Big-Data, in an effort to uncover hidden patterns or unknown correlations has become a major need in our society.  ...  However, in modern stream mining applications, the topology of the graph is determined online based on the variation of the type of the input stream.  ... 
doi:10.1109/isvlsi.2014.77 dblp:conf/isvlsi/KanounRAS14 fatcat:vu3tpx4wffa6fc5kiteuf6gnxa

Parallel Programming Models for Heterogeneous Multicore Architectures

2010 IEEE Micro  
Acknowledgments We thankfully acknowledge the support of the European Commission through the SARC IP project (contract no.  ...  , blocking) Yes Sequoia Language types Tasks Static Explicit (language in/out data types) No StarSs Directives Tasks Dynamic Explicit (in/out clauses) No Tagged Procedure Calls  ...  In data must be streamed into the SPU's local store, out data must be streamed out of local stores, and inout data must be streamed both in and out of local stores.  ... 
doi:10.1109/mm.2010.94 fatcat:z2vhi4aysjfsbc23v2hve7xfje

The SABER system for window-based hybrid stream processing with GPGPUs

Alexandros Koliousis, Matthias Weidlich, Raul Castro Fernandez, Alexander L. Wolf, Paolo Costa, Peter Pietzuch
2016 Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems - DEBS '16  
SABER executes window-based streaming SQL queries in a data-parallel fashion and employs an adaptive scheduling strategy to balance the load on the different types of processors.  ...  In this paper, we review the design principles of SABER in terms of its hybrid stream processing model and its architecture for query execution.  ...  different types of processors.  ... 
doi:10.1145/2933267.2933291 dblp:conf/debs/KoliousisWFWCP16 fatcat:mdxbmdygunay5ihstlm57m2jxm

AMC: Advanced Multi-accelerator Controller

Tassadaq Hussain, Amna Haider, Shakaib A. Gursal, Eduard Ayguadé
2015 Parallel Computing  
In this article, we propose the integration of an intelligent memory system and efficient scheduler in the HLS-based multi-accelerator environment called Advanced Multi-accelerator Controller (AMC).  ...  A generic FPGA based HLS multiaccelerator system requires a microprocessor (master core) that manages memory and schedules accelerators.  ...  The parameters Stream and Stride define the type of memory access.  ... 
doi:10.1016/j.parco.2014.10.003 fatcat:z7xne5erxjbihk54ns6kjwjpve