Belief Propagation Implementation Using CUDA on an NVIDIA GTX 280 [chapter]

Yanyan Xu, Hui Chen, Reinhard Klette, Jiaju Liu, Tobi Vaudrey
2009 Lecture Notes in Computer Science  
This paper provides implementation details, primarily concerned with the inequality constraints, involving the threads and shared memory, required for efficient programming on a GPU.  ...  Disparity map generation is a significant component of vision-based driver assistance systems.  ...  Calculate the data (Gaussian) pyramid 6. Message passing using created pyramid 7. Compute disparity map from messages and data-cost 8.  ... 
doi:10.1007/978-3-642-10439-8_19 fatcat:uf2hjoyx2rhhvpy5amjefcbsfe

Hierarchical gate-array routing on a hypercube multiprocessor

O.A. Olukotun, T.N. Mudge
1990 Journal of Parallel and Distributed Computing  
This paper presents an algorithm for routing gate-arrays that uses a hypercube-connected parallel processor to provide the necessary computation power.  ...  On the basis of the results of executing the algorithm on two gate-array benchmarks the case is made for using hypercube multiprocessors as accelerators for compute-intensive CAD operations. © 1990 Academic  ...  We have presented a parallel hierarchical routing algorithm for routing gate-arrays and have mapped it onto a hypercube multiprocessor to route two modestly sized gate-arrays.  ... 
doi:10.1016/0743-7315(90)90130-h fatcat:q2npn2i5u5aolf2loep2lenhxu

Efficient Mapping of Multiresolution Image Filtering Algorithms on Graphics Processors [chapter]

Richard Membarth, Frank Hannig, Hritam Dutta, Jürgen Teich
2009 Lecture Notes in Computer Science  
However, it is hard to efficiently map such algorithms to the graphics hardware even with detailed insight into the architecture.  ...  Graphics card architectures provide an optimal platform for parallel execution of many number crunching loop programs from fields like image processing, linear algebra, etc.  ...  The complete program has to be divided into such sub-problems that can be processed independently on one multiprocessor.  ... 
doi:10.1007/978-3-642-03138-0_31 fatcat:gly67pr4and3fodzkzbitvfgqy

Acceleration of Stereo-Matching on Multi-core CPU and GPU

Tian Xu, Paul Cockshott, Susanne Oehler
2014 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS)  
This paper presents an accelerated version of a dense stereo-correspondence algorithm for two different parallelism-enabled architectures, multi-core CPU and GPU.  ...  To analyse the origin of the speed-up and gain deeper understanding about the choice of the optimal hardware, the algorithm was broken into key sub-tasks and the performance was tested for four different  ...  In order to accelerate the processing time of the algorithm, parallel programming (Multi-core CPU and GPU programming) was tried.  ... 
doi:10.1109/hpcc.2014.22 dblp:conf/hpcc/XuCO14 fatcat:c26c4u2ovvhhzgjlwlx26yjfq4

Accelerating the Retinex Algorithm with CUDA

Hyo-Seok Seo, Oh-Young Kwon
2010 Journal of information and communication convergence engineering  
We parallelize this recursive pyramidal average calculation for all layers, map the average data onto the 2D plane, and reduce the calculation time dramatically.  ... 
doi:10.6109/jicce.2010.8.3.323 fatcat:owqjtpm3nvfe7c2x3nk2qtqsya

Research on the Application of Visual SLAM in Embedded GPU

Tianji Ma, Nanyang Bai, Wentao Shi, Xi Wu, Lutao Wang, Tao Wu, Changming Zhao, Zhili Zhou
2021 Wireless Communications and Mobile Computing  
Use CUDA, a parallel computing platform, to accelerate the visual front-end processing of the visual SLAM algorithm. Extensive experiments are done to verify the effectiveness of the method.  ...  Simultaneous localization and mapping (SLAM) technology can incrementally construct a map of the robot's moving path in an unknown environment while estimating the position of the robot in the map, providing  ...  Acknowledgments This work was supported by the Sichuan Science and Technology Program (2019ZDZX0007 and 2019YFG0399).  ... 
doi:10.1155/2021/6691262 fatcat:oyec6hwuirhehjlwnybnigd4wi

Efficient Mapping of Streaming Applications for Image Processing on Graphics Cards [chapter]

Richard Membarth, Hritam Dutta, Frank Hannig, Jürgen Teich
2019 Advances in Biochemical Engineering/Biotechnology  
However, it is hard to efficiently map such algorithms to the graphics hardware even with detailed insight into the architecture.  ...  Graphics card architectures provide an optimal platform for parallel execution of many number crunching loop programs from fields like image processing or linear algebra.  ...  Acknowledgments We are indebted to our colleagues Philipp Kutzer and Michael Glaß for providing the sample pictures.  ... 
doi:10.1007/978-3-662-58834-5_1 fatcat:tz3azu6gb5hj5mysy2whv7grui

Performance of the Hough transform on a distributed memory multiprocessor

Austin Underhill, Mohammed Atiquzzaman, John Ophel
1999 Microprocessors and microsystems  
To obtain maximum performance from parallel machines, parallel algorithms should be designed to reflect the architecture of the parallel machine.  ...  One of the disadvantages of the transform is its requirement for large amounts of computing power. Parallel machines have given programmers the potential for incredible computing power.  ...  Acknowledgements The authors would like to thank the CAP project at the Australian National University for providing access to the AP1000 machine.  ... 
doi:10.1016/s0141-9331(98)00093-3 fatcat:vgcunsfctvacfbpjgaruhyl2sm

Mixing Graphics and Compute for Real-Time Multiview Human Body Tracking [chapter]

Bogusław Rymut, Bogdan Kwolek
2014 Lecture Notes in Computer Science  
This paper presents an effective algorithm for 3D model-based human motion tracking using a GPU-accelerated particle swarm optimization.  ...  We demonstrate that thanks to GPU hardware rendering the time needed for calculation of the objective function is shorter.  ...  This work has been partially supported by the Polish Ministry of Science and Higher Education within a grant for young researchers (U-530/DS/M) and the National Science Center within the project N N516  ... 
doi:10.1007/978-3-319-11331-9_64 fatcat:lm25pp674rccxn676ifrwta4xy

GPU-accelerated feature tracking

Alexander Graves
2016 2016 IEEE National Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS)  
The motivation of this research is to prove that GPUs can provide significant speedup of long-executing image processing algorithms by way of parallelization and massive data throughput.  ...  This research explains how KLT could benefit from GPGPU programming and provides the corresponding OpenCL implementation.  ...  In a Data Parallel Programming Model, each work-item is mapped to a data element in a one-to-one ratio.  ... 
doi:10.1109/naecon.2016.7856842 fatcat:3lm3kg6ezjaglhuqnwoo66dkpa

NETRA: a hierarchical and partitionable architecture for computer vision systems

A.N. Choudhary, J.H. Patel, N. Ahuja
1993 IEEE Transactions on Parallel and Distributed Systems  
This paper presents a multiprocessor architecture, called "NETRA," for computer vision systems. NETRA is a highly flexible architecture.  ...  A typical CVS employs algorithms from a very broad spectrum such as numerical, image processing, graph algorithms, symbolic processing, and artificial intelligence.  ...  SIMD Architectures: Massively parallel SIMD multiprocessors are well suited for low-level and well structured vision algorithms that exhibit spatial parallelism at the pixel level.  ... 
doi:10.1109/71.246071 fatcat:frkufkmkkjgv7fbkefeol7fhbu

GPU implementation of motion estimation for visual saliency

Anis Rahman, Dominique Houzet, Denis Pellerin, Lionel Agud
2010 2010 Conference on Design and Architectures for Signal and Image Processing (DASIP)  
These pathways produce intermediate saliency maps that are merged together to get salient regions distinct from what surround them.  ...  The implementation involves a number of code and memory optimizations to get the performance gains, resultantly materializing real-time video analysis capability for the visual saliency model.  ...  GPU presents us with an inherently parallel architecture for image-based algorithms that are often complex and time-consuming.  ... 
doi:10.1109/dasip.2010.5706268 dblp:conf/dasip/RahmanHPA10 fatcat:ybwoaxyyyzeflnkdcijejk6b5y

A performance study of general-purpose applications on graphics processors using CUDA

Shuai Che, Michael Boyer, Jiayuan Meng, David Tarjan, Jeremy W. Sheaffer, Kevin Skadron
2008 Journal of Parallel and Distributed Computing  
The paper also discusses advantages and inefficiencies of the CUDA programming model and some desirable features that might allow for greater ease of use and also more readily support a larger body of  ...  Graphics processors (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths.  ...  We would also like to thank Michael Garland of NVIDIA Research for pointing out the pyramidal algorithm for the HotSpot problem, Karthik Sankaranarayanan for his help with the HotSpot grid solver and John  ... 
doi:10.1016/j.jpdc.2008.05.014 fatcat:xj7jenabbjgjzer5wiaeiiai4a

The REFINE multiprocessor — theoretical properties and algorithms

Suchendra M. Bhandarkar, Hamid R. Arabnia
1995 Parallel Computing  
A large class of algorithms for the Boolean n-cube, which includes the FFT and Batcher's bitonic sort, is shown to map efficiently on the REFINE topology.  ...  Primitive parallel operations on the REFINE topology are described and analyzed. These primitive operations could be used as building blocks for more complex parallel algorithms.  ... 
doi:10.1016/0167-8191(95)00032-9 fatcat:avvtr7fbxjaxnemwcyykst7tpe

The hybrid CPU/GPU implementation of the computational procedure for digital terrain models generation from satellite images

V.A. Fursov, Ye.V. Goshin, A.P. Kotov (Image Processing Systems Institute of RAS, Branch of the FSRC "Crystallography and Photonics" RAS, Samara, Russia; Samara National Research University, Samara, Russia)
2016 Computer Optics  
The procedure is based on the authors' previously developed algorithms of fast image matching for building disparity maps implemented on GPUs (Graphics Processing Units).  ...  In this paper we propose a computational procedure for constructing a DTM from the satellite stereo images.  ...  Disparity map: a) ENVI, b) proposed parallel algorithm. DEMs shown in Fig. 7a, b were generated for the above-mentioned regions of the disparity map.  ... 
doi:10.18287/2412-6179-2016-40-5-721-728 fatcat:tk7wa6cr3ve47ip3tf2w3psn3q