HSTREAM: A directive-based language extension for heterogeneous stream computing
[article]
2018
arXiv
pre-print
Big data streaming applications require utilization of heterogeneous parallel computing systems, which may comprise multiple multi-core CPUs and many-core accelerating devices such as NVIDIA GPUs and Intel ...
We demonstrate the usefulness of HSTREAM language extension with various applications from the STREAM benchmark. ...
We have evaluated the usefulness of our HSTREAM solution for stream computing in heterogeneous parallel computing systems with the STREAM benchmark. ...
arXiv:1809.09387v1
fatcat:iwenfyvdlzdudoitudszyvfdci
Multiple Target Task Sharing Support for the OpenMP Accelerator Model
[chapter]
2016
Lecture Notes in Computer Science
In this paper we propose an extension to the OpenMP 4.5 directive-based programming model to support the specification and execution of multiple instances of task regions on different devices (i.e. accelerators ...
Although current directive-based paradigms, such as OpenMP or OpenACC, support both accelerators and multicore-based hosts, they do not provide an effective and efficient way to concurrently use them, ...
Clauses on a directive with no device type clause apply to all accelerator device types. ...
doi:10.1007/978-3-319-45550-1_19
fatcat:7jx3aneqhnhhnhloug2hgruc6m
Memory Analysis of Low Power MPEG-4 Decoder Architecture
2009
2009 International Conference on Embedded Software and Systems
In this approach hardware accelerators are scheduled quasi-statically thus decreasing the interfacing overhead substantially. ...
Recent research has shown that in mobile devices, energy efficiency of the total system does not scale at the same pace with the energy efficiency of the silicon. ...
Communication between the accelerators and the rest of the system can be handled using DMA or even through direct access from the accelerators to the system bus. ...
doi:10.1109/icess.2009.85
dblp:conf/icess/DahlinEHL09
fatcat:tgjuequvdbgzbafxmkguy5h7ke
Low-Overhead Run-Time Scheduling for Fine-Grained Acceleration of Signal Processing Systems
2007
Signal Processing Systems Design and Implementation (siPS), IEEE Workshop on
In this paper, we present four scheduling algorithms that provide flexible utilization of fine-grain DSP accelerators with low run-time overhead. ...
Experimental results demonstrate the effectiveness of our scheduling approach. ...
Thus, there were a maximum of 6 × (number of streams) jobs in each scheduling problem. The number of operations in the job types ranged between one and five. ...
doi:10.1109/sips.2007.4387591
dblp:conf/sips/BoutellierBS07
fatcat:q7ausup675g5vbdk2cjuygavh4
When FPGA-Accelerator Meets Stream Data Processing in the Edge
2019
2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS)
We demonstrate that through the design, implementation, and evaluation of F-Storm, an FPGA-accelerated and general-purpose distributed stream processing system on Edge servers. ...
Unfortunately, given the limited computation power of Edge servers, current efforts may fail in practice to achieve the desired latency of stream data applications. ...
Next, the scheduler assigns the two types of executors to multiple slots. ...
doi:10.1109/icdcs.2019.00180
dblp:conf/icdcs/0001HI0XCL19
fatcat:nzgqzjm3drdvhk6sydr26v5zbm
Implementation of dynamic service aggregation for interactive video delivery
1997
Multimedia Computing and Networking 1998
We present our experiences with building the system and address issues of server-directed channel switching at the client and stream merging through content acceleration. ...
In this novel demonstration of aggregation via rate-adaptive merging, MPEG-1 system streams are used as the content format and IP multicast is used for video delivery. ...
Kuczynski for her work on the early versions of the server. ...
doi:10.1117/12.298433
fatcat:aunqbzkd2vfytcwsmxvijw7noq
Dynamic Task-Scheduling and Resource Management for GPU Accelerators in Medical Imaging
[chapter]
2012
Lecture Notes in Computer Science
As a remedy, this paper investigates the employment of resource management and scheduling techniques for applications from the medical domain for GPU accelerators. ...
Using GPUs as accelerators in this domain imposes new challenges, since the GPU's common FIFO scheduling does not support task prioritization and preemption. ...
In addition, using two different types of GPUs allows us to show that the scheduler minimizes the average response time of all task trees. ...
doi:10.1007/978-3-642-28293-5_13
fatcat:bgs5wao2ufggtdfa6okmmplsq4
pvFPGA: paravirtualising an FPGA-based hardware accelerator towards general purpose computing
2017
International Journal of High Performance Computing and Networking
Data is transferred between the x86 server and the FPGA accelerator through direct memory access (DMA), and a streaming pipeline technique is adopted to improve the efficiency of data transfer. ...
The accelerator design on the FPGA can be used for accelerating various applications, regardless of the application computation latencies. ...
This type of pipeline satisfies our FPGA accelerator design aim, which is to make the FPGA accelerator capable of accelerating various applications for cloud clients. ...
doi:10.1504/ijhpcn.2017.084246
fatcat:5uzwbwl7rbcw3i3dbhura2bbz4
pvFPGA: paravirtualising an FPGA-based hardware accelerator towards general purpose computing
2017
International Journal of High Performance Computing and Networking
Data is transferred between the x86 server and the FPGA accelerator through direct memory access (DMA), and a streaming pipeline technique is adopted to improve the efficiency of data transfer. ...
The accelerator design on the FPGA can be used for accelerating various applications, regardless of the application computation latencies. ...
This type of pipeline satisfies our FPGA accelerator design aim, which is to make the FPGA accelerator capable of accelerating various applications for cloud clients. ...
doi:10.1504/ijhpcn.2017.10005140
fatcat:43pqq7cbvzdxhio63hgpaay62a
Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes
2014
2014 IEEE International Parallel & Distributed Processing Symposium Workshops
PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures. ...
Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the ...
any type of accelerator. ...
doi:10.1109/ipdpsw.2014.9
dblp:conf/ipps/LacosteFBRT14
fatcat:b6yqhke4srhhbhkhxjifof3aae
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes
[article]
2014
arXiv
pre-print
PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures. ...
Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the ...
any type of accelerator. ...
arXiv:1405.2636v1
fatcat:u7fvjti4hbb45hpn6vcodkzoqm
Low Power and Scalable Many-Core Architecture for Big-Data Stream Computing
2014
2014 IEEE Computer Society Annual Symposium on VLSI
then minimum energy consumption for each type of executed classifier. ...
In the last years the process of examining large amounts of different types of data, or Big-Data, in an effort to uncover hidden patterns or unknown correlations has become a major need in our society. ...
However, in modern stream mining applications, the topology of the graph is determined online based on the variation of the type of the input stream. ...
doi:10.1109/isvlsi.2014.77
dblp:conf/isvlsi/KanounRAS14
fatcat:vu3tpx4wffa6fc5kiteuf6gnxa
Parallel Programming Models for Heterogeneous Multicore Architectures
2010
IEEE Micro
Acknowledgments We thankfully acknowledge the support of the European Commission through the SARC IP project (contract no. ...
... , blocking); Yes. Sequoia: Language types; Tasks; Static; Explicit (language in/out data types); No. StarSs: Directives; Tasks; Dynamic; Explicit (in/out clauses); No. Tagged Procedure Calls ...
In data must be streamed into the SPU's local store, out data must be streamed out of local stores, and inout data must be streamed both in and out of local stores. ...
doi:10.1109/mm.2010.94
fatcat:z2vhi4aysjfsbc23v2hve7xfje
The SABER system for window-based hybrid stream processing with GPGPUs
2016
Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems - DEBS '16
SABER executes window-based streaming SQL queries in a data-parallel fashion and employs an adaptive scheduling strategy to balance the load on the different types of processors. ...
In this paper, we review the design principles of SABER in terms of its hybrid stream processing model and its architecture for query execution. ...
different types of processors. ...
doi:10.1145/2933267.2933291
dblp:conf/debs/KoliousisWFWCP16
fatcat:mdxbmdygunay5ihstlm57m2jxm
AMC: Advanced Multi-accelerator Controller
2015
Parallel Computing
In this article, we propose the integration of an intelligent memory system and efficient scheduler in the HLS-based multi-accelerator environment called Advanced Multi-accelerator Controller (AMC). ...
A generic FPGA based HLS multiaccelerator system requires a microprocessor (master core) that manages memory and schedules accelerators. ...
The parameters Stream and Stride define the type of memory access. ...
doi:10.1016/j.parco.2014.10.003
fatcat:z7xne5erxjbihk54ns6kjwjpve