Programming for parallelism and locality with hierarchically tiled arrays

Ganesh Bikshandi, Jia Guo, Daniel Hoeflinger, Gheorghe Almasi, Basilio B. Fraguela, María J. Garzarán, David Padua, Christoph von Praun
2006 Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '06  
Tiling can be used to organize computations so that communication costs in parallel programs are reduced and locality in sequential codes or sequential components of parallel programs is enhanced.  ...  To support this claim, we discuss our experiences with the implementation of HTAs for MATLAB and C++ and the rewriting of the NAS benchmarks and a few other programs into HTA-based parallel form.  ...  Acknowledgments This work was supported by the National Science Foundation (NGS program) under Grant No. 0103610.  ... 
doi:10.1145/1122971.1122981 dblp:conf/ppopp/BikshandiGHAFGPP06 fatcat:npnhajyndvfrxluz3ug4ng3ay4

Programming for Locality and Parallelism with Hierarchically Tiled Arrays [chapter]

Gheorghe Almási, Luiz De Rose, Basilio B. Fraguela, José Moreira, David Padua
2004 Lecture Notes in Computer Science  
This paper introduces a new primitive data type, hierarchically tiled arrays (HTAs), which could be incorporated into conventional languages to facilitate parallel programming and programming for locality  ...  Also, the paper shows that, with HTAs, parallel computations and the associated communication operations can be expressed as array operations within single threaded programs.  ...  In a nutshell, our idea is to distribute the outermost tiles of a hierarchically tiled array for parallelism, and use the inner tiles for locality and message aggregation.  ... 
doi:10.1007/978-3-540-24644-2_11 fatcat:55l3jqrtebbadge27x4ywf6zpu

Hierarchically tiled arrays for parallelism and locality

Jia Guo, G. Bikshandi, D. Hoeflinger, G. Almasi, B. Fraguela, M.J. Garzaran, D. Padua, C. von Praun
2006 Proceedings 20th IEEE International Parallel & Distributed Processing Symposium  
Parallel programming is facilitated by constructs which, unlike the widely used SPMD paradigm, provide programmers with a global view of the code and data structures.  ...  In this paper, we describe a class developed at Illinois and its MATLAB implementation. This class can be used to conveniently express both parallelism and locality.  ...  Introduction This paper describes a class of objects, hierarchically tiled arrays (HTAs) [4], which can be used to represent both tiled parallel computations and sequential computations tiled for locality  ... 
doi:10.1109/ipdps.2006.1639573 dblp:conf/ipps/GuoBHAFGPP06 fatcat:gll7rp2j6jdk7ob25pmoasgqe4

The Hierarchically Tiled Arrays programming approach

Basilio B. Fraguela, Jia Guo, Ganesh Bikshandi, María J. Garzarán, Gheorghe Almási, José Moreira, David Padua
2004 Proceedings of the 7th Workshop on languages, compilers, and run-time support for scalable systems - LCR '04  
In this paper, we show our initial experience with a class of objects, called Hierarchically Tiled Arrays (HTAs), that encapsulate parallelism.  ...  The tiled and recursive nature of HTAs facilitates the adaptation of the programs that use them to varying machine configurations, and eases the mapping of data and tasks to parallel computers with a hierarchical  ...  The idea is to distribute across processors the outermost tiles of a HTA for parallelism, and use the inner tiles for locality.  ... 
doi:10.1145/1066650.1066657 fatcat:ukq2jifsavfh3iz6brx4dijjqy

Implementation of Parallel Numerical Algorithms Using Hierarchically Tiled Arrays [chapter]

Ganesh Bikshandi, Basilio B. Fraguela, Jia Guo, María J. Garzarán, Gheorghe Almási, José Moreira, David Padua
2005 Lecture Notes in Computer Science  
The tiled and recursive nature of HTAs facilitates the development of algorithms with a high degree of parallelism as well as locality.  ...  In this paper we explore the possibility of extending a single-threaded object-oriented programming language with a new class, called Hierarchically Tiled Array or HTA [3], that encapsulates the parallelism  ...  A hierarchically tiled array (HTA) is a tiled array where each tile is either an unpartitioned array or an HTA.  ... 
doi:10.1007/11532378_8 fatcat:obeuord7izeutbyyhxr3s6ai6a
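
The recursive definition quoted in this entry (each tile is either an unpartitioned array or another HTA) can be illustrated with a minimal Python sketch. The names `HTA`, `Leaf`, `make_leaf`, and `make_hta` are hypothetical and chosen only for this illustration; they are not the interface of the MATLAB or C++ implementations described in these papers, and the sketch models only the nesting, not distribution or parallel execution.

```python
from dataclasses import dataclass
from typing import List, Union

# Hypothetical type: an unpartitioned 2-D array stored as a list of rows.
Leaf = List[List[float]]


@dataclass
class HTA:
    """A grid of tiles; each tile is either a Leaf or another HTA."""
    tiles: List[List[Union["HTA", Leaf]]]

    def depth(self) -> int:
        """Number of tiling levels, assuming all tiles nest equally deep."""
        first = self.tiles[0][0]
        return 1 + (first.depth() if isinstance(first, HTA) else 0)

    def count_elements(self) -> int:
        """Total number of scalar elements, found by recursing into tiles."""
        total = 0
        for row in self.tiles:
            for tile in row:
                if isinstance(tile, HTA):
                    total += tile.count_elements()
                else:
                    total += sum(len(r) for r in tile)
        return total


def make_leaf(rows: int, cols: int, value: float = 0.0) -> Leaf:
    """Build an unpartitioned rows x cols array filled with `value`."""
    return [[value] * cols for _ in range(rows)]


def make_hta(levels: int, grid: int = 2, leaf_size: int = 2) -> HTA:
    """Build an HTA with `levels` tiling levels over square leaf arrays."""
    def tile():
        if levels == 1:
            return make_leaf(leaf_size, leaf_size)
        return make_hta(levels - 1, grid, leaf_size)
    return HTA([[tile() for _ in range(grid)] for _ in range(grid)])


inner = make_hta(levels=1)  # 2x2 grid of 2x2 leaves
outer = make_hta(levels=2)  # 2x2 grid whose tiles are themselves HTAs
```

In this sketch the outermost grid of `outer` corresponds to the tiles that, according to the abstracts above, would be distributed across processors, while the nested tiles would serve locality.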

A Parallel Numerical Solver Using Hierarchically Tiled Arrays [chapter]

James C. Brodman, G. Carl Evans, Murat Manguoglu, Ahmed Sameh, María J. Garzarán, David Padua
2011 Lecture Notes in Computer Science  
The Hierarchically Tiled Array data type extends traditional data-parallel array operations with explicit tiling and allows programmers to directly manipulate tiles.  ...  Exploiting parallelism is essential for solving complex systems, and this traditionally involves writing parallel algorithms on top of a library such as MPI.  ...  Acknowledgments This material is based upon work supported by the National Science Foundation under Awards CCF 0702260 and by the Universal Parallel Computing Research Center at the University of Illinois  ... 
doi:10.1007/978-3-642-19595-2_4 fatcat:4udcahaqbfdjjarcmryszmbg2m

Hierarchical overlapped tiling

Xing Zhou, Jean-Pierre Giacalone, María Jesús Garzarán, Robert H. Kuhn, Yang Ni, David Padua
2012 Proceedings of the Tenth International Symposium on Code Generation and Optimization - CGO '12  
Hierarchical overlapped tiling performs overlapped tiling hierarchically to balance communication overhead and redundant computation, and thus has the potential to provide better performance.  ...  This paper introduces hierarchical overlapped tiling, a transformation that applies loop tiling and fusion to conventional loops.  ...  ACKNOWLEDGMENTS This material is based upon work supported by the National Science Foundation under Awards CNS 1111407 and CCF 0702260, and by the Illinois-Intel Parallelism Center at the University of  ... 
doi:10.1145/2259016.2259044 dblp:conf/cgo/ZhouGGKNP12 fatcat:dqaxqufq6zffdnajiaug5kfdzi

An Extensible System for Multilevel Automatic Data Partition and Mapping

Arturo Gonzalez-Escribano, Yuri Torres, Javier Fresno, Diego R. Llanos
2014 IEEE Transactions on Parallel and Distributed Systems  
Currently, it supports hierarchical tiling of arrays with dense and stride domains, that allows the implementation of both data and task parallelism using a SPMD model.  ...  and multicore systems, and substantially reduces programming effort.  ...  INTRODUCTION Tiling is a well-known technique used to distribute data and tasks in parallel programs [1] and to improve the locality of loop nests in parallel and sequential code [2].  ... 
doi:10.1109/tpds.2013.83 fatcat:4eb3vdbd4zghhax2ed4krcxgxe

Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime

Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack Dongarra
2014 2014 IEEE International Parallel & Distributed Processing Symposium Workshops  
To demonstrate this scalability, in this paper, we design and implement a 3D virtual systolic array to compute a tile QR decomposition of a tall-and-skinny dense matrix.  ...  machine and obtain excellent parallel performance.  ...  ACKNOWLEDGMENTS This work is supported by grant #SHF-1117062: "Parallel Unified Linear algebra with Systolic ARrays (PULSAR)" from the National Science Foundation (NSF).  ... 
doi:10.1109/ipdpsw.2014.167 dblp:conf/ipps/YamazakiKLD14 fatcat:iim272guh5gmxpysxpcadc6pvy

Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime

Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack Dongarra
2014 Parallel Processing Letters  
To demonstrate this scalability, in this paper, we design and implement a 3D virtual systolic array to compute a tile QR decomposition of a tall-and-skinny dense matrix.  ...  machine and obtain excellent parallel performance.  ...  ACKNOWLEDGMENTS This work is supported by grant #SHF-1117062: "Parallel Unified Linear algebra with Systolic ARrays (PULSAR)" from the National Science Foundation (NSF).  ... 
doi:10.1142/s0129626414420043 fatcat:tug2bdpfnnd5fcei4qmz2nqdye

Design and Use of htalib – A Library for Hierarchically Tiled Arrays [chapter]

Ganesh Bikshandi, Jia Guo, Christoph von Praun, Gabriel Tanase, Basilio B. Fraguela, María J. Garzarán, David Padua, Lawrence Rauchwerger
2007 Lecture Notes in Computer Science  
Hierarchically Tiled Arrays (HTAs) are data structures that facilitate locality and parallelism of array intensive computations with block-recursive nature.  ...  We describe the interface and design of htalib and our experience with the new programming constructs.  ...  There are three principal approaches to implement global view programming models for distributed memory systems: (i) extensions of existing languages with Acknowledgment.  ... 
doi:10.1007/978-3-540-72521-3_3 fatcat:qhevzrqxercybolfcp2f3udbgy

Blending Extensibility and Performance in Dense and Sparse Parallel Data Management

Javier Fresno, Arturo Gonzalez-Escribano, Diego R. Llanos
2014 IEEE Transactions on Parallel and Distributed Systems  
Dealing with both dense and sparse data in parallel environments usually leads to two different approaches: To rely on a monolithic, hard-to-modify parallel library, or to code all data management details  ...  Our solution integrates dense and sparse data management using a common interface, that also decouples data representation, partitioning, and layout from the algorithmic and parallel strategy decisions  ...  Mogecopp project TIN2011-25639, CAPAP-H3 network TIN2010-12011-E, CAPAP-H4 network TIN2011-15734-E); and the HPC-EUROPA2 project (project number: 228398) with the support of the European Commission - Capacities  ... 
doi:10.1109/tpds.2013.248 fatcat:ecciaa4e6razpmy24z7lefzutq

Optimization techniques for efficient HTA programs

Basilio B. Fraguela, Ganesh Bikshandi, Jia Guo, María J. Garzarán, David Padua, Christoph von Praun
2012 Parallel Computing  
This paper describes our experience with the implementation of a C++ data type called Hierarchically Tiled Array (HTA).  ...  This object includes data parallel operations and allows the manipulation of tiles to facilitate developing efficient parallel codes and codes with high degree of locality.  ...  of the European Union, under the grant TIN2010-16735, as well as by the National Science Foundation under Awards CNS 1111407, CCF 0702260, and CNS 0720594, and by the Illinois-Intel Parallelism Center  ... 
doi:10.1016/j.parco.2012.05.002 fatcat:cfddquglbjcvhevlq3j3m7ji2e

Programming with tiles

Jia Guo, Ganesh Bikshandi, Basilio B. Fraguela, Maria J. Garzaran, David Padua
2008 Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming - PPoPP '08  
This paper discusses Hierarchically Tiled Arrays (HTAs), a data type which facilitates the easy manipulation of tiles in object-oriented languages with emphasis on two new features, dynamic partitioning  ...  This gives place to bloated programs populated with numerous subscript expressions which make the code difficult to read and coding mistakes more likely.  ...  Acknowledgments We thank Christoph von Praun and James Brodman for their contributions to this research.  ... 
doi:10.1145/1345206.1345225 dblp:conf/ppopp/GuoBFGP08 fatcat:mnogojrba5fvna4enhgv42htga

Nonlinear array layouts for hierarchical memory systems

Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Lebeck, Shyam Mundhra, Mithuna Thottethodi
1999 Proceedings of the 13th international conference on Supercomputing - ICS '99  
Programming languages that provide multidimensional arrays and a flat linear model of memory must implement a mapping between these two domains to order array elements in memory.  ...  In reality, modern memory systems are architecturally hierarchical rather than flat, with substantial differences in performance among different levels of the hierarchy.  ...  [10] discuss hierarchical tiling schemes for a hierarchical shared memory model.  ... 
doi:10.1145/305138.305231 dblp:conf/ics/ChatterjeeJLMT99 fatcat:m65myxjpxnexzlypymltmu7c5m