Energy efficient video decoding on multi-core devices

Damla Kiliçarslan, C. Göktuğ Gürler, Öznur Özkasap, A. Murat Tekalp
2011 Proceedings of the 2nd International Conference on Energy-Efficient Computing and Networking - e-Energy '11  
We offer and develop two approaches for the H.264 standard. The former is based on a coarse-grained frame level, and the latter is a fine-grained macroblock level parallelism.  ...  Various approaches of parallelism at data and task levels can be incorporated in video decoders, bringing efficiency in energy consumption rates and/or performance.  ...  ACKNOWLEDGMENTS This work was partially supported by the COST (European Cooperation in Science and Technology) framework, under Action IC0804, and by TUBITAK (The Scientific and Technical Research Council  ... 
doi:10.1145/2318716.2318728 dblp:conf/eenergy/KilicarslanGOT11 fatcat:tufphdoa35egxgselfwoqbh6xi

Coarse Grain Parallelization of H.264 Video Decoder and Memory Bottleneck in Multi-Core Architectures

Ahmet Gürhanlı, Charlie Chung-Ping Chen, Shih-Hao Hung
2011 Journal of clean energy technologies  
Fine grain methods for parallelization of the H.264 decoder have good latency performance and less memory usage.  ...  Index Terms: video compression, H.264 decoder, parallel processing, high-performance computing, image processing.  ...  algorithm in personal devices Coarse Grain Parallelization of H.264 Video Decoder and Memory Bottleneck in Multi-Core Architectures Ahmet Gürhanlı, Charlie Chung-Ping Chen, and Shih-Hao Hung International  ... 
doi:10.7763/ijcte.2011.v3.335 fatcat:jafipjrvv5eutd6mef3bqm4oky

An Implementation of Multiple-Standard Video Decoder on a Mixed-Grained Reconfigurable Computing Platform

Leibo LIU, Dong WANG, Yingjie CHEN, Min ZHU, Shouyi YIN, Shaojun WEI
2016 IEICE transactions on information and systems  
The proposed RPU, including 16 × 16 multi-functional processing elements (PEs), is used to accelerate compute-intensive tasks in the video decoding.  ...  standards, including MPEG-2, AVS, H.264, and HEVC.  ...  Most FPGAs provide low-level fine-grained parallelism with a high degree of flexibility but normally pay a power or area penalty.  ... 
doi:10.1587/transinf.2015edp7369 fatcat:4ixd2sywvvfv5izvwwhpzwl5xe

Multi-Grain Parallel Accelerate System for H.264 Encoder on ULTRASPARC T2

Yu Wang, Linda Wu, Jing Guo
2013 Journal of Computers  
This paper describes a multi-grain parallel accelerate system for H.264 encoder on UltraSPARC T2 processor.  ...  This system integrates pipeline parallelism, frame-level, slice-level, macroblock-level data parallelism and SIMD technology. We use x264, an H.264 video encoder to implement our parallel accelerate system  ...  Realization of the H.264 Encoding Parallel Accelerate System Fig. 4 shows the complete process of multi-grain parallel accelerate system.  ... 
doi:10.4304/jcp.8.12.3293-3297 fatcat:nn3syf7yrbdxnl52woym2pglie

Manycore processor for video mining applications

Y. Matsumoto, H. Uchida, M. Hagimoto, Y. Hibi, S. Torii, M. Izumida
2013 2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)  
What is Video Mining System Feature Extraction(Low Level) Video Decoder + α Camera(s) Video(MPEG, H.264, …) Video Stream Motion Stream Audio Stream Shot boundary Detection  ...  JPEG Encoder H.264 Decoder Ray Tracing Via Via Reg Total Memory access to Local Memory FIFO FIFO TOPS Systems Corp.  ...  Passing Mechanism (ZOMP) can efficiently increases the system performance and scalability of Manycore processors.  Block based distributed processing drastically reduces memory access bandwidth and increases  ... 
doi:10.1109/aspdac.2013.6509659 dblp:conf/aspdac/MatsumotoUHHTI13 fatcat:gbn6xhrmnrbvzlhd2s2qsxh24u

A Hardware Task Scheduler for Embedded Video Processing [chapter]

Ghiath Al-Kadi, Andrei Sergeevich Terechko
2009 Lecture Notes in Computer Science  
We found that our hardware task scheduler speeds up a Quad HD H.264 video decoding by 1.17 times compared to a chip multi-processor with a state-of-the-art hardware task queues.  ...  Moreover, our hardware task scheduler allows decreasing the number of cores needed to meet the real-time performance requirements for the H.264 decoder and, consequently, reduces the silicon area of the  ...  Acknowledgements We thank Jan Hoogerbrugge for the Task Scheduling Unit simulator model and Marc Duranton for inspiring discussions.  ... 
doi:10.1007/978-3-540-92990-1_12 fatcat:gsrsgn2g7ff55endukm2t3ogdy

KAHRISMA: A Novel Hypermorphic Reconfigurable-Instruction-Set Multi-grained-Array Architecture

Ralf Koenig, Lars Bauer, Timo Stripf, Muhammad Shafique, Waheed Ahmed, Juergen Becker, Jorg Henkel
2010 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)  
With the help of an encrypted H.264 en-/decoding case study we demonstrate that our novel KAHRISMA architecture will deliver the required flexibility to design future-proof embedded systems that are not  ...  In this paper we present our innovative processor architecture concept KAHRISMA (KArlsruhe's Hypermorphic Reconfigurable-Instruction-Set Multi-grained-Array).  ...  ISA types) and Array Modes (i.e. multi-grained CI) that may execute in parallel.  ... 
doi:10.1109/date.2010.5456939 dblp:conf/date/KoenigBSSABH10 fatcat:sx4exkhdqjfjbk7hb2bcx6biyq

Evaluation of Automatic Power Reduction with OSCAR Compiler on Intel Haswell and ARM Cortex-A9 Multicores [chapter]

Tomohiro Hirano, Hideo Yamamoto, Shuhei Iizuka, Kohei Muto, Takashi Goto, Tamami Wake, Hiroki Mikami, Moriyuki Takamura, Keiji Kimura, Hironori Kasahara
2015 Lecture Notes in Computer Science  
On the ARM cortex-A9, having three-cores with power control obtained a power reduction of 57.9% with the H.264 decoder and 67.2% with Optical Flow.  ...  Exploiting parallelism and decreasing redundant power dissipation by fine grain power control for multicore/manycore systems are promising approaches, which can ensure continuous performance improvements  ...  In addition, loop iteration level parallelism is translated into coarse-grained task parallelism by decomposing a loop into multiple loops.  ... 
doi:10.1007/978-3-319-17473-0_16 fatcat:yb4atcg33jeytiysvacxnt3rwi

Heterogeneous multi-core platform for consumer multimedia applications

P. Kollig, C. Osborne, T. Henriksson
2009 2009 Design, Automation & Test in Europe Conference & Exhibition  
The successful usage of a heterogeneous multi-core SoC platform is presented and it is shown how specific challenges such as inter-processor communication and real-time performance guarantees in physically  ...  This paper presents a multi-core SoC architecture for consumer multimedia applications.  ...  The latter two categories are better suited for multimedia because there is a high level of fine-grained data parallelism in many of the algorithms and a system typically is constructed as a pipeline of  ... 
doi:10.1109/date.2009.5090857 dblp:conf/date/KolligOH09 fatcat:owaqbujdtvh6tlgts563yfe2ru

A Look-Ahead Task Management Unit for Embedded Multi-Core Architectures

Magnus Själander, Andrei Terechko, Marc Duranton
2008 2008 11th EUROMICRO Conference on Digital System Design Architectures, Methods and Tools  
H.264 video decoding parallelized at macro-block level) these tasks have dependencies among each other.  ...  In overall, the TMU-based multi-core architecture reaches a speedup of more than 14x on 16 cores running H.264 video decoding, assuming CABAC is implemented in a dedicated coprocessor.  ...  We also thank our colleague Jan Hoogerbrugge at NXP for his help with TTISim and the parallelization of the H.264 decoder.  ... 
doi:10.1109/dsd.2008.45 dblp:conf/dsd/SjalanderTD08 fatcat:zq5xvprydfh45fde5ucafy6uyu

An Overview of H.264 Hardware Encoder Architectures Including Low-Power Features

Ngoc-Mai Nguyen, Duy-Hieu Bui, Nam-Khanh Dang, Edith Beigne, Suzanne Lesecq, Pascal Vivet, Xuan-Tu Tran
2014 REV Journal on Electronics and Communications  
We also propose the VENGME's design, a particular hardware architecture of H.264 encoder that enables applying low-power techniques and developing power-aware ability.  ...  This low power encoder is a four-stage architecture with memory access reduction, in which, each module has been optimized.  ...  Therefore, parallel processing solutions such as DSP-based, stream processor-based, multi-core systems or dedicated VLSI hardware architectures must be addressed to respond to this demand.  ... 
doi:10.21553/rev-jec.72 fatcat:us45zrwuxff3tpf32ms2wn5tse

Using OpenMP superscalar for parallelization of embedded and consumer applications

Michael Andersch, Chi Ching Chi, Ben Juurlink
2012 2012 International Conference on Embedded Computer Systems (SAMOS)  
To determine the usability of OmpSs, we show in detail how to implement complex parallelization strategies such as ones used in parallel H.264 decoding.  ...  In the past years, research and industry have introduced several parallel programming models to simplify the development of parallel applications.  ...  INTRODUCTION Since the advent of multi-core processors and systems, programmers are faced with the challenge of exploiting thread-level parallelism (TLP).  ... 
doi:10.1109/samos.2012.6404154 dblp:conf/samos/AnderschCJ12 fatcat:qb5clcj2k5e77knylk27knuruu

Resource recycling

Yongjun Park, Hyunchul Park, Scott Mahlke, Sukjin Kim
2010 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems - CASES '10  
In this paper, a compilation framework is introduced that maximizes application throughput with hybrid resource partitioning of a PPA system.  ...  Mobile computing platforms in the form of smart phones, netbooks, and personal digital assistants have become an integral part of our everyday lives.  ...  Target Applications and fine-grain parallelism To evaluate the performance, we used three application domains: audio decoding (aac), video decoding (h.264) and 3D graphics (3d).  ... 
doi:10.1145/1878921.1878925 dblp:conf/cases/ParkPMK10 fatcat:5stvi4c3sbbsxmwlz25tio7feu

Parallelizing Complex Streaming Applications on Distributed Scratchpad Memory Multicore Architecture

Shin-Kai Chen, Cheng-Yu Hung, Ching-Chih Chen, Chih-Wei Liu
2013 International journal of parallel programming  
The full-HD H.264/AVC decoder applications can achieve nearly 50 fps.  ...  To test and verify the proposed design flow, three popular multimedia applications were implemented: a full-HD motion JPEG decoder, an object detector, and a full-HD H.264/AVC decoder.  ...  Acknowledgments This work was supported in part by the National Science Council, Taiwan, under Grant NSC-102-2220-E-009-013-and Ministry of Economic Affairs, Taiwan, under Grant MOEA-101-EC-17-A-02-S1-202  ... 
doi:10.1007/s10766-013-0256-7 fatcat:f5gwz3str5a2jlnjyhqfshf4my

A Multi-core Architecture Based Parallel Framework for H.264/AVC Deblocking Filters

Sung-Wen Wang, Shu-Sian Yang, Hong-Ming Chen, Chia-Lin Yang, Ja-Ling Wu
2008 Journal of Signal Processing Systems  
Deblocking filter is one of the most time consuming modules in the H.264/AVC decoder as indicated in many studies.  ...  This paper proposes a novel parallel algorithm for H.264/AVC deblocking filter to speed the H.264/AVC decoder up.  ...  The tasks with light workload have to wait for the completeness of the threads with heavy workload, and in consequence the multi-core system could not be fully utilized.  ... 
doi:10.1007/s11265-008-0321-4 fatcat:5lj4c65wrbeyfo5n3r67ibmnaa