Efficient complex operators for irregular codes

Jack Sampson, Ganesh Venkatesh, Nathan Goulding-Hotta, Saturnino Garcia, Steven Swanson, Michael Bedford Taylor
2011 2011 IEEE 17th International Symposium on High Performance Computer Architecture  
Complex "fat operators" are important contributors to the efficiency of specialized hardware.  ...  This paper introduces two new techniques for constructing efficient fat operators featuring up to dozens of operations with arbitrary and irregular data and memory dependencies.  ...  ECOcores improve energy efficiency and performance over other systems designed to execute irregular code by leveraging two architectural techniques.  ... 
doi:10.1109/hpca.2011.5749754 dblp:conf/hpca/SampsonVGGST11 fatcat:yqjxqk44jba4tjjtwweqcwpypi

An Evaluation of Selective Depipelining for FPGA-Based Energy-Reducing Irregular Code Coprocessors

Jack Sampson, Manish Arora, Nathan Goulding-Hotta, Ganesh Venkatesh, Jonathan Babb, Vikram Bhatt, Steven Swanson, Michael Bedford Taylor
2011 2011 21st International Conference on Field Programmable Logic and Applications  
As the complexity of FPGA-based systems scales, the importance of efficiently handling irregular code increases.  ...  Recent work has proposed Irregular Code Energy Reducers (ICERs), a high-level synthesis approach for FPGAs that offers significant energy reduction for irregular code compared to a soft core processor.  ...  Recent work on ICERs [1] creates energy-efficient specialized processors for irregular applications.  ... 
doi:10.1109/fpl.2011.16 dblp:conf/fpl/SampsonAGVBBST11 fatcat:jegt624lxrehdorxjdugio2qku

Efficient vectorization of SIMD programs with non-aligned and irregular data access hardware

Hoseok Chang, Wonyong Sung
2008 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems - CASES '08  
A nonaligned or irregular data access operation incurs many overhead cycles for data alignment. Moreover, this causes difficulty in efficient code generation and hinders automatic vectorization.  ...  Automatic vectorization of programs for partitioned-ALU SIMD (Single Instruction Multiple Data) processors has been difficult because of not only data dependency issues but also non-aligned and irregular  ...  Complex data access in SIMD processors can be categorized into non-aligned and irregular access operations.  ... 
doi:10.1145/1450095.1450121 dblp:conf/cases/ChangS08 fatcat:zqc46qjigvduznafoumhlrbpva

Reducing the Energy Cost of Irregular Code Bases in Soft Processor Systems

Manish Arora, Jack Sampson, Nathan Goulding-Hotta, Jonathan Babb, Ganesh Venkatesh, Michael Bedford Taylor, Steven Swanson
2011 2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines  
This paper describes an architecture and FPGA synthesis toolchain for building specialized, energy-saving coprocessors called Irregular Code Energy Reducers (ICERs) for a wide range of unmodified C programs  ...  In contrast, because the ICER approach targets energy rather than performance, it easily scales to large, irregular applications that are poor candidates for traditional acceleration.  ...  We would also like to thank Adrian Caulfield for help with the b-tree benchmark.  ... 
doi:10.1109/fccm.2011.45 dblp:conf/fccm/AroraSGBVTS11 fatcat:xxbwbena4ze75icihoqxq6j4vm

New OpenMP Directives for Irregular Data Access Loops

J. Labarta, E. Ayguadé, J. Oliver, D.S. Henty
2001 Scientific Programming  
Whenever the code is actually executed, only these selected updates are protected. We propose a new OpenMP clause, indirect, for parallel loops that have irregular data access patterns.  ...  We describe efficient compiler implementation strategies for the new directive.  ...  This research has been supported by the Ministry of Education of Spain under contract TIC98-511, by the CEPBA (European Center for Parallelism of Barcelona), by EPCC (Edinburgh Parallel Computing Center  ... 
doi:10.1155/2001/798505 fatcat:vw25jm5gcvbxbo2l25v6narp6u

Design of Efficiently Encodable Moderate-Length High-Rate Irregular LDPC Codes

M. Yang, W.E. Ryan, Y. Li
2004 IEEE Transactions on Communications  
Index Terms: Efficient encoding, error-rate floor, irregular repeat-accumulate codes, low-density parity-check (LDPC) codes.  ...  Codes in this class admit low-complexity encoding and have lower error-rate floors than other irregular LDPC code-design approaches.  ...  Siegel of the University of California, San Diego, for help with the initial density evolution program.  ... 
doi:10.1109/tcomm.2004.826367 fatcat:2n3bzgj345d57iraj2mcgsk6y4

Performance Analysis of LDPC Decoding Techniques

Abdel Halim A. Zikry, Ashraf Y. Hassan, Wageda I. Shaban, Sahar F. Abdel-Momen
2021 International journal of recent technology and engineering  
LDPC performance and also introduce different methods for decoding LDPC.  ...  Channel coding might be considered as the finest conversant and most potent components of cellular communications systems, that was employed for transmitting errors corrections imposed by noise, fading  ...  While irregular LDPC codes are more efficient than regular codes, they have an error floor and a higher level of encoding complexity than regular codes.  ... 
doi:10.35940/ijrte.e5067.019521 fatcat:ivuvfz6cyjfgvd7bqvn7xr3lnu

Irregular Accesses Reorder Unit: Improving GPGPU Memory Coalescing for Graph-Based Workloads [article]

Albert Segura, Jose-Maria Arnau, Antonio Gonzalez
2020 arXiv pre-print
The IRU reorders data processed by the threads on irregular accesses which significantly improves memory coalescing, and allows increased performance and energy efficiency.  ...  We evaluate our proposal for state-of-the-art graph-based algorithms and a wide selection of applications.  ...  Significant programmer effort, code complexity and underlying hardware knowledge is required to create efficient GPU code for irregular applications such as graph processing algorithms.  ... 
arXiv:2007.07131v1 fatcat:udclkksulbcibek2i4l6i2xtlm

On the parallelization of irregular and dynamic programs

Oscar Plata, Rafael Asenjo, Eladio Gutiérrez, Francisco Corbera, Angeles Navarro, Emilio L. Zapata
2005 Parallel Computing  
In this paper we discuss a methodology we designed to develop efficient parallelization techniques for irregular and dynamic applications, that proceeds in three stages: recognizing the complex program  ...  Complex applications may be characterized as irregular and dynamic. Irregular applications arrange data as multidimensional arrays and memory is referenced through array indirections.  ...  For instance, we discussed an efficient solution for irregular reductions.  ... 
doi:10.1016/j.parco.2005.02.012 fatcat:rlnytbtnpndzzoai2fu6rgfupq

The Paradigm compiler for distributed-memory multicomputers

P. Banerjee, J.A. Chandy, M. Gupta, E.W. Hodges, J.G. Holm, A. Lain, D.J. Palermo, S. Ramaswamy, E. Su
1995 Computer  
A unified approach efficiently supports regular and irregular computations using data and functional parallelism.  ...  The Paradigm (Parallelizing Compiler for Distributed-Memory, General-Purpose Multicomputers) project at the University of Illinois addresses this problem by developing automatic methods for efficient parallelization  ...  We are also grateful to the National Center for Supercomputing Applications, the San Diego Supercomputing Center, and the Argonne National Laboratory for providing access to their machines.  ... 
doi:10.1109/2.467577 fatcat:ghmtervcfzehzlelvf2ealwgyu

Efficient Mapping of Irregular C++ Applications to Integrated GPUs

Rajkishore Barik, Rashid Kaleem, Deepak Majeti, Brian T. Lewis, Tatiana Shpeisman, Chunling Hu, Yang Ni, Ali-Reza Adl-Tabatabai
2014 Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization  
The nine applications are pointer-intensive and operate on irregular data structures such as trees and graphs; they include face detection, BTree, single-source shortest path, soft-body physics simulation  ...  Our results show that Concord acceleration using the GPU improves energy efficiency by up to 6.04× on the Ultrabook and 3.52× on the  ...  The OpenCL code generated by our compiler for the loop body operator() function is shown on the right.  ... 
doi:10.1145/2544137.2544165 fatcat:bjpwxwclfbeoflbz6c5d3f6fnq

A family of irregular LDPC codes with low encoding complexity

S.J. Johnson, S.R. Weller
2003 IEEE Communications Letters  
We consider in this letter irregular quasi-cyclic low-density parity-check (LDPC) codes derived from difference families.  ...  The resulting codes can be encoded with low complexity and perform well when iteratively decoded with the sum-product algorithm.  ...  One option for efficient encoding is to use algebraic code constructions and exploit the subsequent code structure.  ... 
doi:10.1109/lcomm.2002.808375 fatcat:7cltlwodzjdevi6okqo2x7tcgu

Author retrospective for PYRROS

Tao Yang, Apostolos Gerasoulis
2014 25th Anniversary International Conference on Supercomputing Anniversary Volume
Since the publication of the PYRROS project, there have been new advancements in the area of DAG scheduling algorithms, the use of DAG scheduling for irregular and large-scale computation, and software  ...  Given a program with annotated task parallelism represented as a directed acyclic graph (DAG), the PYRROS project was focused on fast DAG scheduling, code generation and runtime execution on distributed  ...  The overall algorithm complexity is O(e + vlogv) for a DAG with v nodes and e edges, and the near-linear complexity allows the system to handle a large task graph efficiently.  ... 
doi:10.1145/2591635.2591647 dblp:conf/ics/YangG14 fatcat:35dgziip7rbrnacduqv2z7z3ji

A modified Offset Min-Sum decoding algorithm for LDPC codes

Meng Xu, Jianhui Wu, Meng Zhang
2010 2010 3rd International Conference on Computer Science and Information Technology  
For example, when the BER is 10^-5, our algorithm achieves 0.1 dB and 0.2 dB decoding gain over the Offset Min-Sum algorithm for regular and irregular LDPC codes, respectively.  ...  In this paper, a modified Offset Min-Sum decoding algorithm for Low-Density Parity-Check codes is presented.  ...  Simulation results show that, for both regular and irregular LDPC codes, our modified algorithm achieves better decoding performance with only a minor increase in computational complexity.  ... 
doi:10.1109/iccsit.2010.5564884 fatcat:o3ydy2z62neopkd533gwpi4fru

An Adaptive Heterogeneous Runtime for Irregular Applications in the Case of Ray-Tracing (Extended Abstract) [chapter]

Chih-Chen Kao, Wei-Chung Hsu
2014 Lecture Notes in Computer Science  
For example, in regular code, control flow and data memory references are not data dependent. Dense matrix multiplication operations are good examples of regular code.  ...  On the other hand, in irregular code, both control flow and data memory references could be data dependent. For example, graph-based applications are  ...  Therefore, mapping irregular code efficiently onto a heterogeneous system remains difficult [7] [8] .  ... 
doi:10.1007/978-3-662-44917-2_63 fatcat:kisoz3mahvddnda5kpxv5vwjsi