47 Hits in 6.3 sec

Parallel visual data restoration on multi-GPGPUs using stencil-reduce pattern

Marco Aldinucci, Guilherme Peretti Pezzi, Maurizio Drocco, Concetto Spampinato, Massimo Torquati
2015 The International Journal of High Performance Computing Applications  
In this paper, a highly effective parallel filter for visual data restoration is presented.  ...  The filter is designed following a skeletal approach, using a newly proposed stencil-reduce, and has been implemented by way of the FastFlow parallel programming library.  ...  ACKNOWLEDGEMENT This work has been supported by the EU FP7 grant IST-2011-288570 "ParaPhrase: Parallel Patterns for Adaptive Heterogeneous Multicore Systems", Compagnia di San Paolo project id.  ... 
doi:10.1177/1094342014567907 fatcat:6tpsriexfvbnbda2pfhlb7wula

High-Performance Reverse Time Migration on GPU

J Cabezas, M Araya-Polo, I Gelado, N Navarro, E Morancho, J M Cela
2009 2009 International Conference of the Chilean Computer Science Society  
Due to GPU characteristics, the parallelism paradigm shifts from the classical threads plus SIMD to Single Program Multiple Data (SPMD).  ...  One of the most popular mathematical schemes to solve a PDE is Finite Difference (FD). In this work we map a PDE-FD algorithm called Reverse Time Migration to a GPU using CUDA.  ...  RTM ON GPGPU A.  ... 
doi:10.1109/sccc.2009.19 dblp:conf/sccc/CabezasAGNMC09 fatcat:wkj22rbi2rdyhmc4mjirrfwolm

GPGPU-based Gaussian Filtering for Surface Metrological Data Processing

Yang Su, Zhijie Xu, Xiangqian Jiang
2008 2008 12th International Conference Information Visualisation  
Thirdly, this thesis devised methods for carrying out result visualization directly on GPU by storing processed data in local GPU memory through making use of GPU's rendering device features to achieve  ...  In addition, the category of parallel architecture pattern that the GPGPU belongs to has been specified, which formed the foundation of the GPGPU programming framework design in the thesis.  ...  with substantial improvements on the data visualization.  ... 
doi:10.1109/iv.2008.14 dblp:conf/iv/SuXJ08 fatcat:lpagxjxstjbj5lcdumztlpgolu

MULTITHREADED RENDERING FOR CROSS-PLATFORM 3D VISUALIZATION BASED ON VULKAN API

C. Ioannidis, A.-M. Boutsi
2020 The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences  
Exploiting the parallelism and the multi-core performance of the Graphics Processing Unit (GPU), a cross-platform 3D viewer is developed based on the Vulkan API and modern C++.  ...  Furthermore, push-constants are used to send uniform data to the GPU and render passes to adapt to the tile-based rendering of the mobile devices.  ...  The majority orients to computing operations performed by CUDA or OpenCL with GPGPU programming while the literature on multi-threading entirely for visual processes and 3D rendering is less consistent  ... 
doi:10.5194/isprs-archives-xliv-4-w1-2020-57-2020 fatcat:lyohnjjadzcb3c7gaaaqrxyspu

ESSEX: Equipping Sparse Solvers For Exascale [chapter]

Christie L. Alappat, Andreas Alvermann, Achim Basermann, Holger Fehske, Yasunori Futamura, Martin Galgon, Georg Hager, Sarah Huber, Akira Imakura, Masatoshi Kawai, Moritz Kreutzer, Bruno Lang (+6 others)
2020 Lecture Notes in Computational Science and Engineering  
Furthermore, ESSEX focused on hardware-efficient kernels for all relevant architectures and efficient data structures for block vector formulations of the eigensolvers.  ...  The ESSEX project has investigated programming concepts, data structures, and numerical algorithms for scalable, efficient, and robust sparse eigenvalue solvers on future heterogeneous exascale systems  ...  We are grateful for computer time granted on the LRZ SuperMUC and SuperMUC-NG, the CSCS Piz Daint, and the OakForest PACS systems.  ... 
doi:10.1007/978-3-030-47956-5_7 fatcat:srs4qavwcvezbovdhpbhf6d2ni

GPU Virtualization and Scheduling Methods

Cheol-Ho Hong, Ivor Spence, Dimitrios S. Nikolopoulos
2017 ACM Computing Surveys  
Heterogeneous computing with GPUs can benefit the Cloud by reducing operational costs and improving resource and energy efficiency.  ...  We believe that our survey delivers a perspective on the challenges and opportunities for virtualization of heterogeneous computing environments.  ...  data-parallel components.  ... 
doi:10.1145/3068281 fatcat:bng347au6veltazpmyyzv5ijmu

Larrabee: A Many-Core x86 Architecture for Visual Computing

Larry Seiler, Doug Carmean, Eric Sprangle, Tom Forsyth, Pradeep Dubey, Stephen Junkins, Adam Lake, Robert Cavin, Roger Espasa, Ed Grochowski, Toni Juan, Michael Abrash (+2 others)
2009 IEEE Micro  
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures.  ...  The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism  ...  Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group.  ... 
doi:10.1109/mm.2009.9 fatcat:7bxdltx7nbd37h52cslorskrue

Larrabee: A Many-Core Intel Architecture for Visual Computing [chapter]

Roger Espasa
2010 Lecture Notes in Computer Science  
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures.  ...  The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism  ...  Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group.  ... 
doi:10.1007/978-3-642-11515-8_2 fatcat:n7fmdb6d2nbprhcxpsuiu3jyue

Larrabee: A many-Core x86 architecture for visual computing

Doug Carmean
2008 2008 IEEE Hot Chips 20 Symposium (HCS)  
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures.  ...  The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism  ...  Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group.  ... 
doi:10.1109/hotchips.2008.7476560 fatcat:q36s4ogravbsbpvmyc2b6crslm

A large-scale cross-architecture evaluation of thread-coarsening

Alberto Magni, Christophe Dubach, Michael F. P. O'Boyle
2013 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '13  
OpenCL has become the de-facto data parallel programming model for parallel devices in today's high-performance supercomputers.  ...  In this paper we consider a data parallel compiler transformation, thread-coarsening, and evaluate its effects across a range of devices by developing a source-to-source OpenCL compiler based on LLVM.  ...  For this reason, we propose an extension to basic coarsening which restores the original coalesced access pattern.  ... 
doi:10.1145/2503210.2503268 dblp:conf/sc/MagniDO13 fatcat:nbnb3ytssbhw7ap7at2goffks4

Parallel Programming With Global Asynchronous Memory: Models, C++ APIs And Implementations

Maurizio Drocco, Marco Aldinucci
2017 Zenodo  
- parallelism only and it relies on expensive sharing protocols.  ...  The durable MPI (Message Passing Interface) standard, with send/receive communication, broadcast, gather/scatter, and reduction collectives is still used to construct parallel programs where each communication  ...  Parallel visual data restoration on multi-GPGPUs using stencil-reduce pattern. International Journal of High Performance Computing Applications, 29(4):461-472, 2015 (J6) I. Merelli, F. Tordini, M.  ... 
doi:10.5281/zenodo.1037585 fatcat:ecjm5xj5x5exbfxe3eokl7uneu

Larrabee

Larry Seiler, Robert Cavin, Roger Espasa, Ed Grochowski, Toni Juan, Pat Hanrahan, Doug Carmean, Eric Sprangle, Tom Forsyth, Michael Abrash, Pradeep Dubey, Stephen Junkins (+2 others)
2008 ACM SIGGRAPH 2008 papers on - SIGGRAPH '08  
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures.  ...  The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism  ...  Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group.  ... 
doi:10.1145/1399504.1360617 fatcat:4cgj7bpvbjf7rgyzwfhv2d7bde

Larrabee

Larry Seiler, Robert Cavin, Roger Espasa, Ed Grochowski, Toni Juan, Pat Hanrahan, Doug Carmean, Eric Sprangle, Tom Forsyth, Michael Abrash, Pradeep Dubey, Stephen Junkins (+2 others)
2008 ACM Transactions on Graphics  
The Larrabee native programming model supports a variety of highly parallel applications that use irregular data structures.  ...  The customizable software graphics rendering pipeline for this architecture uses binning in order to reduce required memory bandwidth, minimize lock contention, and increase opportunities for parallelism  ...  Soupikov, and others from Intel's Application Research Lab, Software Systems Group, and Visual Computing Group.  ... 
doi:10.1145/1360612.1360617 fatcat:wc5hjerg35dcre5so2mvudhgly

Pico: A Domain-Specific Language For Data Analytics Pipelines

Claudia Misale, Marco Aldinucci, Guy Tremblay
2017 Zenodo  
In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters.  ...  For this reason, we use the Dataflow model as a starting point to build a programming environment with a simplified programming model implemented as a Domain-Specific Language, that is on top of a stack  ...  Here we present two widely used instances of data parallel patterns, namely the map and the reduce patterns, that are also the most often used patterns in Big Data scenarios.  ... 
doi:10.5281/zenodo.579753 fatcat:aadje57qh5hk3ijmqn4j7vkhpm

Supporting multiple accelerators in high-level programming models

Yonghong Yan, Pei-Hung Lin, Chunhua Liao, Bronis R. de Supinski, Daniel J. Quinlan
2015 Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores - PMAM '15  
These extensions allow for distributing data and computation among a list of devices via easy-to-use annotation interfaces, including specifying the distribution of multi-dimensional arrays and declaring  ...  shared data regions among accelerators.  ...  multi-threading on CPUs.  ... 
doi:10.1145/2712386.2712405 dblp:conf/ppopp/0001LLSQ15 fatcat:3bkv56c7ojb2te3q7da32zjgde