Performance evaluation of H.264/AVC decoding and visualization using the GPU
2007
Applications of Digital Image Processing XXX
This decoder performs MC, reconstruction, and CSC on the GPU as well. Our results compare both GPU-enabled decoders, as well as a CPU-only decoder in terms of speed, complexity, and CPU requirements. ...
Modern computers are typically equipped with powerful yet cost-effective Graphics Processing Units (GPUs) to accelerate graphics operations. ...
Acknowledgements: The research as described in this paper was funded by Ghent University, the Interdisciplinary Institute for Broadband Technology (IBBT), the Institute for the Promotion of Innovation by ...
doi:10.1117/12.733151
fatcat:hkt6kqirare3nc3hhje7k7fdd4
Parallel H.264/AVC Motion Compensation for GPUs Using OpenCL
2015
IEEE Transactions on Circuits and Systems for Video Technology (Print)
Motion compensation is one of the most compute-intensive parts in H.264/AVC video decoding. It exposes massive parallelism which can reap the benefit from Graphics Processing Units (GPUs). ...
However, when the overheads of memory copy and OpenCL runtime are included, no speedup is gained at application level. ...
This motivates the use of GPUs for accelerating video codecs. The motion compensation stage in H.264/AVC takes a significant proportion of decoding time [4]. ...
doi:10.1109/tcsvt.2014.2344512
fatcat:w4ogur3kzbg2nbf23vp2imv4v4
GPU-based Graph Traversal on Compressed Graphs
2019
Proceedings of the 2019 International Conference on Management of Data - SIGMOD '19
Graph processing on GPUs received much attention in the industry and the academia recently, as the hardware accelerator offers attractive potential for performance boost. ...
However, the high-bandwidth device memory on GPUs has limited capacity that constrains the size of the graph to be loaded on chip. ...
Massive number of cores and ultra memory bandwidth make GPUs a promising platform for accelerating graph processing. ...
doi:10.1145/3299869.3319871
dblp:conf/sigmod/ShaLT19
fatcat:uiqk5lypujbktczpcuuyiopp5q
GPU acceleration of the HEVC decoder inter prediction module
2015
2015 IEEE Global Conference on Signal and Information Processing (GlobalSIP)
To circumvent this issue, an efficient acceleration of the HEVC inter prediction decoding module is proposed, by offloading the involved workload to GPU devices. ...
The inter prediction decoding is one of the most time consuming modules in modern video decoders, which may significantly limit their real-time capabilities. ...
To provide the fully compliant HEVC real-time encoding/decoding, current research trends aim at accelerating the execution of particular modules by offloading their computations from the Central Processing ...
doi:10.1109/globalsip.2015.7418397
dblp:conf/globalsip/SouzaIRS15
fatcat:tktxntk7svfafats2w6hw7u7ru
GPU-assisted decoding of video samples represented in the YCoCg-R color space
2005
Proceedings of the 13th annual ACM international conference on Multimedia - MULTIMEDIA '05
Our results show that a significant speedup can be achieved by relying on the processing power of the GPU, relative to the CPU. ...
To be more specific, high definition video (1080p), represented in the YCoCg-R color space, could be decoded to RGB at 30 Hz on a PC with an AMD Athlon XP 2800+ CPU, an AGP bus and an NVIDIA GeForce 6800 ...
The research activities that have been described in this paper were funded by Ghent University, the Interdisciplinary Institute for Broadband Technology (IBBT), the Institute for the Promotion of ...
doi:10.1145/1101149.1101248
dblp:conf/mm/NeveRH05
fatcat:2nenvrku5rg6dlg5lhvu7gymke
Accelerate video decoding with generic GPU
2005
IEEE Transactions on Circuits and Systems for Video Technology (Print)
In this paper, we present our study on leveraging the GPU's graphics engine to accelerate video decoding. ...
By moving the whole motion compensation feedback loop of the decoder to the GPU, the CPU and GPU have been made to work in parallel in a pipelining fashion. ...
In this section, we first explore the feasibility of GPU acceleration for video decoding and the constraints of GPU. ...
doi:10.1109/tcsvt.2005.846440
fatcat:htdvbzkzfjfz3a3ey4b36aj76i
Low-Latency Software Polar Decoders
2016
Journal of Signal Processing Systems
Finally, we show that the energy efficiency of the proposed decoders is comparable to state-of-the-art software polar decoders. ...
These proposed decoders have an order of magnitude lower latency and memory footprint compared to state-of-the-art decoders, while maintaining comparable throughput. ...
Claude Thibeault is a member of ReSMiQ. Warren J. Gross is a member of ReSMiQ and SYTACom. ...
doi:10.1007/s11265-016-1157-y
fatcat:ozsx2cobevbgtjbiio5qunur3u
Accelerating wavelet-based video coding on graphics hardware using CUDA
2009
2009 Proceedings of 6th International Symposium on Image and Signal Processing and Analysis
We have integrated our DWT into the Dirac Wavelet Video Codec (DWVC), of which the overlapped block motion compensation and frame arithmetic have been accelerated using CUDA as well. ...
This transform, by means of the lifting scheme, can be performed in a memory and computation efficient way on modern, programmable GPUs, which can be regarded as massively parallel co-processors through ...
Acknowledgements: This research is part of the "VIEW: Visual Interactive Effective Worlds" program, funded by the Dutch National Science Foundation (NWO), project no. 643.100.501. ...
doi:10.1109/ispa.2009.5297658
fatcat:fb2fu2g5efcvhdfbsw35nulize
Accelerating JPEG Decompression on GPUs
[article]
2021
arXiv
pre-print
For GPU-accelerated computer vision and deep learning tasks, such as the training of image classification models, efficient JPEG decoding is essential due to limitations in memory bandwidth. ...
Furthermore, it achieves a speedup of up to 3.4 over nvJPEG accelerated with the dedicated hardware JPEG decoder on an A100. ...
[3] accelerated JPEG decoding on GPUs by parallelizing the IDCT step using CUDA. Sodsong et al. ...
arXiv:2111.09219v1
fatcat:xzn5ovus65cajgpermbb6hny4y
An Optimized Parallel IDCT on Graphics Processing Units
[chapter]
2013
Lecture Notes in Computer Science
In this paper we present an implementation of the H.264/AVC Inverse Discrete Cosine Transform (IDCT) optimized for Graphics Processing Units (GPUs) using OpenCL. ...
By exploiting the fact that most of the input data of the IDCT for real videos are zero-valued coefficients, a new compacted data representation is created that allows for several optimizations. ...
Implementation of IDCT on GPU: Our GPU implementation is based on an optimized CPU version of the H.264 decoder that, in turn, is based on FFmpeg [9]. ...
doi:10.1007/978-3-642-36949-0_18
fatcat:ypddl63i5fgrtbdkmupgfxovoa
A GPU-based Branch-and-Bound algorithm using Integer–Vector–Matrix data structure
2016
Parallel Computing
The implementation on GPU is based on the Integer-Vector-Matrix (IVM) data structure which is used instead of a conventional linked-list to store and manage the pool of subproblems. ...
Compared to a GPU-accelerated B&B based on a linked-list, the algorithm presented in this paper solves a set of standard flowshop instances on average 3.3 times faster. ...
GPU-accelerated linked-list-based B&B [1] (GPU-LL), described in Subsection 1.3. ...
doi:10.1016/j.parco.2016.01.008
fatcat:zicfpfonsffnhdbqpszkjsa22u
Interleaved entropy coders
[article]
2014
arXiv
pre-print
This allows for very efficient encoding and decoding on CPUs supporting superscalar execution or SIMD instructions, as well as GPU implementations. ...
state that the encoder was in when writing those bits---all "buffering" of information is explicitly part of the coder state and identical between encoder and decoder. ...
Acknowledgments: Thanks to my colleagues Charles Bloom and Sean Barrett for reviewing earlier drafts of this paper and making valuable suggestions. ...
arXiv:1402.3392v1
fatcat:3b3t2mok4fhg5aohb5g7zrrkx4
Fast software polar decoders
2014
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
We also show that, for a similar error-correction performance, the throughput of polar decoders both surpasses that of LDPC decoders targeting general-purpose processors and is competitive with that of ...
state-of-the-art software LDPC decoders running on graphic processing units. ...
the fastest software GPU-based LDPC decoders. ...
doi:10.1109/icassp.2014.6855069
dblp:conf/icassp/GiardSTG14
fatcat:bj2rzfrtpre7tkdb7rgaehbnem
Parallelization of Variable Rate Decompression through Metadata
2020
2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)
On a GPU, we achieve average decoding rates of up to 100 GiB/s. Our strategies allow the user to make a trade-off between decoding throughput and metadata size overhead. ...
On a CPU, we achieve a near optimal decoding speedup and an overhead size which is consistently less than 0.04% of the compressed data size. ...
This work was partially performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. ...
doi:10.1109/pdp50117.2020.00045
dblp:conf/pdp/NoordsijVBAL20
fatcat:kowe2t6aczdufjbsbazwa4fxhu
Parallel nonbinary LDPC decoding on GPU
2012
2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR)
This paper proposes a massively parallel implementation of a nonbinary LDPC decoding accelerator based on a graphics processing unit (GPU) to achieve both great flexibility and scalability. ...
We highlight the methodology to partition the decoding task to a heterogeneous platform consisting of the CPU and GPU. ...
We partition the decoding algorithm into five OpenCL kernels, which are listed in Table II. ...
doi:10.1109/acssc.2012.6489229
dblp:conf/acssc/WangSYWSC12
fatcat:chinirbyujddlkkkcswnw2cwyy