
High-Performance Reverse Time Migration on GPU

J Cabezas, M Araya-Polo, I Gelado, N Navarro, E Morancho, J M Cela
2009 2009 International Conference of the Chilean Computer Science Society  
One of the most popular mathematical schemes to solve a PDE is Finite Difference (FD). In this work we map a PDE-FD algorithm called Reverse Time Migration to a GPU using CUDA.  ...  These preliminary results confirm that GPUs are a real option for HPC, from performance to programmability.  ...  ACKNOWLEDGMENT The authors thank the Barcelona Supercomputing Center for their permission to publish the material reported in this article.  ... 
doi:10.1109/sccc.2009.19 dblp:conf/sccc/CabezasAGNMC09 fatcat:wkj22rbi2rdyhmc4mjirrfwolm

Accelerating Anisotropic Mesh Adaptivity on nVIDIA's CUDA Using Texture Interpolation [chapter]

Georgios Rokos, Gerard Gorman, Paul H. J. Kelly
2011 Lecture Notes in Computer Science  
Anisotropic mesh smoothing is used to generate optimised meshes for Computational Fluid Dynamics (CFD).  ...  The key point is that this calculation can be automatically performed by dedicated texturing hardware outside multiprocessors.  ...  In this example, our domain is the piece of rubber and we want to solve a PDE in this domain.  ... 
doi:10.1007/978-3-642-23397-5_38 fatcat:enlclrpnpzfkfivxyfw7n66npa

Analysis of photonic networks for a chip multiprocessor using scientific applications

Gilbert Hendry, Shoaib Kamil, Aleksandr Biberman, Johnnie Chan, Benjamin G. Lee, Marghoob Mohiyuddin, Ankit Jain, Keren Bergman, Luca P. Carloni, John Kubiatowicz, Leonid Oliker, John Shalf
2009 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip  
With recent advances in 3D Integration CMOS technology, the possibility for realizing hybrid photonic-electronic networks-on-chip warrants investigating real application traces on functionally comparable  ...  One of the fundamental assumptions of this work is that 3D integrated chips will play an important role as the interconnect plane for future chip multiprocessors, whether the NoC is electrical or photonic  ... 
doi:10.1109/nocs.2009.5071458 dblp:conf/nocs/HendryKBCLMJBCKOS09 fatcat:5mql6lbfajgu5ltexbr2euh7ei

Exploring Multi-Grained Parallelism in Compute-Intensive DEVS Simulations

Qi Liu, Gabriel Wainer
2010 2010 IEEE Workshop on Principles of Advanced and Distributed Simulation  
Together, the parallelization and optimization strategies produced promising experimental results, accelerating the simulation of a 3D environmental model by a factor of up to 33.06.  ...  The proposed methods can also be applied to other multicore and shared-memory architectures.  ...  INTRODUCTION Chip Multiprocessor (CMP) architectures have been used to address the limitations of microprocessor performance.  ... 
doi:10.1109/pads.2010.5471652 dblp:conf/pads/LiuW10 fatcat:azanuptxwrcgxiktzukp6ix4ya

Hybrid analog-digital solution of nonlinear partial differential equations

Yipeng Huang, Ning Guo, Mingoo Seok, Yannis Tsividis, Kyle Mandli, Simha Sethumadhavan
2017 Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-50 '17  
We use a hybrid analog-digital computer architecture to solve nonlinear PDEs that draws on the strengths of each model of computation and avoids their weaknesses.  ...  A weakness of digital methods for solving nonlinear PDEs is they may not converge unless a good initial guess is used to seed the solution.  ...  Section 5 details the programming model, architecture, and microarchitecture of a prototype analog accelerator for solving nonlinear PDEs, and provides measured results in analog solution accuracy.  ... 
doi:10.1145/3123939.3124550 dblp:conf/micro/HuangGSTMS17 fatcat:rfhoa72tbvfoxjzza2y3w65vfe

Control-theoretic adaptive cache-fair scheduling of chip multiprocessor systems

Huseyin G Arslan, Yu-Chu Tian, Fenglian Li, Chen Peng, Min-Rui Fei
2017 Transactions of the Institute of Measurement and Control  
Zhou X, Chen W and Zheng W (2009b)  ...  Figure 1. Chip multiprocessor (CMP) architecture. Figure 2. The Framework of the Adaptive Cache-Fair Multiprocessor Scheduling (ACMS).  ...  However, it is computationally expensive to solve a bilinear system model.  ... 
doi:10.1177/0142331217715064 fatcat:deydkzyqivhklfddswjd6dwh3u

Accelerating Fluid Registration Algorithm on Multi-FPGA Platforms

Jason Cong, Muhuan Huang, Yi Zou
2011 2011 21st International Conference on Field Programmable Logic and Applications  
Viscous fluid registration is a powerful PDE-based method that can register large deformations in the imaging process.  ...  We would like to thank Luminita Vese and Alex Bui for providing the original reference code, and Janice Martin-Wheeler for proof-reading of the paper.  ...  ACKNOWLEDGMENT This work is partially funded by the Center for Domain-Specific Computing (NSF Expedition in Computing Award CCF-0926127), and grants from Nvidia Corp. and Mentor Graphics Corp. under the  ... 
doi:10.1109/fpl.2011.20 dblp:conf/fpl/CongHZ11 fatcat:74ivacoo2rbc5alks4efkt6pxi

High performance noise reduction for biomedical multidimensional data

S. Tabik, E.M. Garzón, I. García, J.J. Fernández
2007 Digital signal processing (Print)  
This method is based on a partial differential equation (PDE) tightly coupled with a massive set of eigensystems.  ...  Denoising large 3D images in biomedicine and structural cellular biology by AND is extremely expensive from a computational point of view, with huge memory needs.  ...  Carrascosa (National Center for Biotechnology, Madrid, Spain) for kindly providing, respectively, the mitochondrion and Vaccinia virus datasets.  ... 
doi:10.1016/j.dsp.2006.11.004 fatcat:32u24esfunbwld4i5hgd367deq

Graphics processing unit (GPU) programming strategies and trends in GPU computing

André R. Brodtkorb, Trond R. Hagen, Martin L. Sætra
2013 Journal of Parallel and Distributed Computing  
There are several numerical methods for approximating the solution of hyperbolic PDEs like the shallow water equations, and finite volume methods constitute an important class.  ...  This work extends to architectures similar to the GPU and to other hyperbolic conservation laws.  ...  Partly accomplished at the National Center for Computational Hydroscience and Engineering, this research was also funded by the Department of Homeland Security-sponsored Southeast Region Research Initiative  ... 
doi:10.1016/j.jpdc.2012.04.003 fatcat:7s4fnkx3yrekbmxabztmto5fzq

Full-chip thermal analysis of 3D ICs with liquid cooling by GPU-accelerated GMRES method

Xue-Xin Liu, Zao Liu, Sheldon X.-D. Tan, Joseph Gordon
2012 Thirteenth International Symposium on Quality Electronic Design (ISQED)  
Active cooling techniques such as integrated inter-tier liquid cooling are promising alternatives for traditional fan-based cooling, which is insufficient for 3D-ICs.  ...  Experimental results show the proposed GPU-GMRES solver is up to 4.3× faster than parallel CPU-GMRES for DC analysis and 2.3× faster than parallel LU decomposition and one or two orders of magnitude faster  ...  Unlike existing fast thermal analysis methods, our method starts from the heat equations to model 3D-ICs with inter-tier liquid cooling microchannels, and directly solves the resulting PDE using GMRES.  ... 
doi:10.1109/isqed.2012.6187484 dblp:conf/isqed/LiuLTG12 fatcat:ioxgyuk6wjairdirzrwraegrra

Survey on Efficient Linear Solvers for Porous Media Flow Models on Recent Hardware Architectures

Ani Anciaux-Sedrakian, Peter Gottschling, Jean-Marc Gratien, Thomas Guignon
2014 Oil & Gas Science and Technology  
(and serial) architectures to solve their PDE systems.  ...  With the Kepler architecture, NVIDIA provides the SMX (Streaming Multiprocessor eXtreme) architecture.  ... 
doi:10.2516/ogst/2013184 fatcat:zdxhrlm5nvdujn7stynit4vyg4

Architectural model of the human neuroregulator system based on Multi-Agent Systems and implementation of System-on-Chip using FPGA

Francisco Maciá Pérez, Leandro Zambrano Mendez, José Vicente Berna Martínez, Roberto Sepúlveda Lima, Iren Lorenzo Fonseca
2022 Microprocessors and microsystems  
The present study enriches this theoretical model with an architectural model that makes it suitable to implement in hardware.  ...  We thus focused on the Cortical-Diencephalic (CD) centre, responsible for voluntary micturition.  ...  Now that we have detailed the experimentation scenario, we will dedicate the rest of the section to the validation phase, from the design of the experiments to the tests and the validation of the results  ... 
doi:10.1016/j.micpro.2022.104431 fatcat:ryufob6dg5curemyxhu7dydyn4

A GPU-Based Application Framework Supporting Fast Discrete-Event Simulation

Hyungwook Park, Paul A. Fishwick
2009 Simulation (San Diego, Calif.)  
A graphics processing unit (GPU) is a dedicated graphics processor that renders 3D graphics in real time. Recently, the GPU has become an increasingly attractive architecture for solving compute-intensive problems for general purpose computation  ...  We present the design and implementation of this library, which is based on the compute unified device architecture (CUDA) general purpose parallel applications programming interface for the NVIDIA class  ...  On the other hand, simulation libraries written in a general purpose language  ... 
doi:10.1177/0037549709340781 fatcat:e4sojlpscndtlhr2hsvhbdcwga

Trends in Algorithms for Nonuniform Applications on Hierarchical Distributed Architectures [chapter]

David E. Keyes
2000 Computational Aerosciences in the 21st Century  
However, disregard for the differential costs of accessing different locations in memory (the "flat memory" model) can put unnecessary amounts of synchronization and data motion on the critical path of  ...  For this purpose, pseudo-transient Newton-Krylov-Schwarz methods are briefly introduced and their parallel scalability in bulk synchronous SPMD applications is explored. We also indicate some funda-  ...  He is grateful to Manny Salas and Kyle Anderson, organizers of the "Computational Aerosciences for the 21st Century" workshop for the unnatural impetus to contemplate a computational world 10-20 years  ... 
doi:10.1007/978-94-010-0948-5_6 fatcat:7y2rsfl5frdwhemru5agrzj4ge

GiPSi: A Framework for Open Source/Open Architecture Software Development for Organ-Level Surgical Simulation

M.C. Cavusoglu, T.G. Goktekin, F. Tendick
2006 IEEE Transactions on Information Technology in Biomedicine  
This paper presents the architectural details of an evolving open source/open architecture software framework for developing organ-level surgical simulations.  ...  Index Terms-Open architecture framework, shared development, surgical simulation, virtual environments.  ...  , namely systems of differential equations, in particular PDEs. 1) Simulation of PDE-Based Models: The first step in solving a continuous PDE is to discretize the spatial domain it is defined on.  ... 
doi:10.1109/titb.2006.864479 pmid:16617620 fatcat:ncepnuqfcjeflmvyxgfzpus6xa