Parallel visual data restoration on multi-GPGPUs using stencil-reduce pattern
2015
The international journal of high performance computing applications
In this paper, a highly effective parallel filter for visual data restoration is presented. ...
The filter is designed following a skeletal approach, using a newly proposed stencil-reduce, and has been implemented by way of the FastFlow parallel programming library. ...
ACKNOWLEDGEMENT This work has been supported by the EU FP7 grant IST-2011-288570 "ParaPhrase: Parallel Patterns for Adaptive Heterogeneous Multicore Systems", Compagnia di San Paolo project id. ...
doi:10.1177/1094342014567907
fatcat:6tpsriexfvbnbda2pfhlb7wula
High-Performance Reverse Time Migration on GPU
2009
2009 International Conference of the Chilean Computer Science Society
Due to GPU characteristics, the parallelism paradigm shifts from the classical threads plus SIMD to Single Program Multiple Data (SPMD). ...
One of the most popular mathematical schemes to solve a PDE is Finite Difference (FD). In this work we map a PDE-FD algorithm called Reverse Time Migration to a GPU using CUDA. ...
RTM ON GPGPU A. ...
doi:10.1109/sccc.2009.19
dblp:conf/sccc/CabezasAGNMC09
fatcat:wkj22rbi2rdyhmc4mjirrfwolm
GPGPU-based Gaussian Filtering for Surface Metrological Data Processing
2008
2008 12th International Conference Information Visualisation
Thirdly, this thesis devised methods for carrying out result visualization directly on GPU by storing processed data in local GPU memory through making use of GPU's rendering device features to achieve ...
In addition, the category of parallel architecture pattern that the GPGPU belongs to has been specified, which formed the foundation of the GPGPU programming framework design in the thesis. ...
with substantial improvements on the data visualization. ...
doi:10.1109/iv.2008.14
dblp:conf/iv/SuXJ08
fatcat:lpagxjxstjbj5lcdumztlpgolu
MULTITHREADED RENDERING FOR CROSS-PLATFORM 3D VISUALIZATION BASED ON VULKAN API
2020
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Exploiting the parallelism and the multi-core performance of the Graphics Processing Unit (GPU), a cross-platform 3D viewer is developed based on the Vulkan API and modern C++. ...
Furthermore, push-constants are used to send uniform data to the GPU and render passes to adapt to the tile-based rendering of the mobile devices. ...
The majority orients to computing operations performed by CUDA or OpenCL with GPGPU programming while the literature on multi-threading entirely for visual processes and 3D rendering is less consistent ...
doi:10.5194/isprs-archives-xliv-4-w1-2020-57-2020
fatcat:lyohnjjadzcb3c7gaaaqrxyspu
ESSEX: Equipping Sparse Solvers For Exascale
[chapter]
2020
Lecture Notes in Computational Science and Engineering
Furthermore, ESSEX focused on hardware-efficient kernels for all relevant architectures and efficient data structures for block vector formulations of the eigensolvers. ...
The ESSEX project has investigated programming concepts, data structures, and numerical algorithms for scalable, efficient, and robust sparse eigenvalue solvers on future heterogeneous exascale systems ...
We are grateful for computer time granted on the LRZ SuperMUC and SuperMUC-NG, the CSCS Piz Daint, and the OakForest PACS systems. ...
doi:10.1007/978-3-030-47956-5_7
fatcat:srs4qavwcvezbovdhpbhf6d2ni
GPU Virtualization and Scheduling Methods
2017
ACM Computing Surveys
Heterogeneous computing with GPUs can benefit the Cloud by reducing operational costs and improving resource and energy efficiency. ...
We believe that our survey delivers a perspective on the challenges and opportunities for virtualization of heterogeneous computing environments. ...
data-parallel components. ...
doi:10.1145/3068281
fatcat:bng347au6veltazpmyyzv5ijmu
Larrabee: A Many-Core x86 Architecture for Visual Computing
2009
IEEE Micro
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures. ...
The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism ...
Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group. ...
doi:10.1109/mm.2009.9
fatcat:7bxdltx7nbd37h52cslorskrue
Larrabee: A Many-Core Intel Architecture for Visual Computing
[chapter]
2010
Lecture Notes in Computer Science
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures. ...
The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism ...
Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group. ...
doi:10.1007/978-3-642-11515-8_2
fatcat:n7fmdb6d2nbprhcxpsuiu3jyue
Larrabee: A many-Core x86 architecture for visual computing
2008
2008 IEEE Hot Chips 20 Symposium (HCS)
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures. ...
The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism ...
Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group. ...
doi:10.1109/hotchips.2008.7476560
fatcat:q36s4ogravbsbpvmyc2b6crslm
A large-scale cross-architecture evaluation of thread-coarsening
2013
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '13
OpenCL has become the de-facto data parallel programming model for parallel devices in today's high-performance supercomputers. ...
In this paper we consider a data parallel compiler transformation, thread-coarsening, and evaluate its effects across a range of devices by developing a source-to-source OpenCL compiler based on LLVM. ...
For this reason, we propose an extension to basic coarsening which restores the original coalesced access pattern. ...
doi:10.1145/2503210.2503268
dblp:conf/sc/MagniDO13
fatcat:nbnb3ytssbhw7ap7at2goffks4
Parallel Programming With Global Asynchronous Memory: Models, C++ Apis And Implementations
2017
Zenodo
- parallelism only and it relies on expensive sharing protocols. ...
The durable MPI (Message Passing Interface) standard, with send/receive communication, broadcast, gather/scatter, and reduction collectives is still used to construct parallel programs where each communication ...
Parallel visual data restoration on multi-GPGPUs using stencil-reduce pattern. International Journal of High Performance Computing Applications, 29(4):461-472, 2015 (J6) I. Merelli, F. Tordini, M. ...
doi:10.5281/zenodo.1037585
fatcat:ecjm5xj5x5exbfxe3eokl7uneu
Larrabee
2008
ACM SIGGRAPH 2008 papers on - SIGGRAPH '08
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures. ...
The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism ...
Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group. ...
doi:10.1145/1399504.1360617
fatcat:4cgj7bpvbjf7rgyzwfhv2d7bde
Larrabee
2008
ACM Transactions on Graphics
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures. ...
The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism ...
Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group. ...
doi:10.1145/1360612.1360617
fatcat:wc5hjerg35dcre5so2mvudhgly
Pico: A Domain-Specific Language For Data Analytics Pipelines
2017
Zenodo
In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. ...
For this reason, we use the Dataflow model as a starting point to build a programming environment with a simplified programming model implemented as a Domain-Specific Language, that is on top of a stack ...
Here we present two widely used instances of data parallel patterns, namely the map and the reduce patterns, that are also the most often used patterns in Big Data scenarios. ...
doi:10.5281/zenodo.579753
fatcat:aadje57qh5hk3ijmqn4j7vkhpm
Supporting multiple accelerators in high-level programming models
2015
Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores - PMAM '15
These extensions allow for distributing data and computation among a list of devices via easy-to-use annotation interfaces, including specifying the distribution of multi-dimensional arrays and declaring ...
shared data regions among accelerators. ...
multi-threading on CPUs. ...
doi:10.1145/2712386.2712405
dblp:conf/ppopp/0001LLSQ15
fatcat:3bkv56c7ojb2te3q7da32zjgde