Tile Coding Based on Hyperplane Tiles [chapter]

Daniele Loiacono, Pier Luca Lanzi
2008 Lecture Notes in Computer Science  
We compared the performance of hyperplane tile coding with the usual tile coding on three well-known benchmark problems.  ...  Our results suggest that the hyperplane tiles improve the generalization capabilities of the tile coding approximator: in the hyperplane tile coding broad generalizations over the problem space result  ...  As an example, Algorithm 1 reports the pseudo code for the implementation of Q-learning based on hyperplane tile coding.  ... 
doi:10.1007/978-3-540-89722-4_14 fatcat:n6f7h5bbrzhnzee4itpa5wuxjq

Tiling stencil computations to maximize parallelism

Vinayaka Bandishti, Irshad Pananilath, Uday Bondhugula
2012 2012 International Conference for High Performance Computing, Networking, Storage and Analysis  
We first provide necessary and sufficient conditions on tiling hyperplanes to enable concurrent start for programs with affine data accesses. We then provide an approach to find such hyperplanes.  ...  Experimental evaluation on a 12-core Intel Westmere shows that our code is able to outperform a tuned domain-specific stencil code generator by 4% to 27%, and previous compiler techniques by a factor of  ...  It itself uses the Cloog [9] library for code generation, and PIP [25] to solve for coefficients of hyperplanes. PrimeTile [16] is used to perform unroll-jam on Pluto generated code.  ... 
doi:10.1109/sc.2012.107 dblp:conf/sc/BandishtiPB12 fatcat:r5j5h4subncu5nntb5xnvfaf2e

Effective automatic parallelization of stencil computations

Sriram Krishnamoorthy, Muthu Baskaran, Uday Bondhugula, J. Ramanujam, Atanas Rountev, P Sadayappan
2007 SIGPLAN notices  
However, loop skewing is typically required in order to tile stencil codes along the time dimension, resulting in load imbalance in pipelined parallel execution of the tiles.  ...  In this paper, we develop an approach for automatic parallelization of stencil codes, that explicitly addresses the issue of load-balanced execution of tiles.  ...  We thank David Callahan for suggesting split tiling.  ... 
doi:10.1145/1273442.1250761 fatcat:xij5hgziqrh63lma6dl4izoxjy

Effective automatic parallelization of stencil computations

Sriram Krishnamoorthy, Muthu Baskaran, Uday Bondhugula, J. Ramanujam, Atanas Rountev, P Sadayappan
2007 Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation - PLDI '07  
However, loop skewing is typically required in order to tile stencil codes along the time dimension, resulting in load imbalance in pipelined parallel execution of the tiles.  ...  In this paper, we develop an approach for automatic parallelization of stencil codes, that explicitly addresses the issue of load-balanced execution of tiles.  ...  We thank David Callahan for suggesting split tiling.  ... 
doi:10.1145/1250734.1250761 dblp:conf/pldi/KrishnamoorthyBBRRS07 fatcat:m3cyagrr2rdrlljligiphr53tm

Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading [chapter]

Sunil Shrestha, Joseph Manzano, Andres Marquez, John Feo, Guang R. Gao
2015 Lecture Notes in Computer Science  
of parallel tiles with an efficient synchronization registry.  ...  The main contributions of this paper include the introduction of multi-hierarchical tiling techniques that increases intra tile parallelism; and a data-flow inspired runtime library that allows the expression  ...  This approach is very effective as we can see in classical tiling approaches where tile sizes are based on cache sizes.  ... 
doi:10.1007/978-3-319-17473-0_11 fatcat:z4mrrvucyrecvezcdvdkw4xlrq

A practical automatic polyhedral parallelizer and locality optimizer

Uday Bondhugula, Albert Hartono, J. Ramanujam, P. Sadayappan
2008 SIGPLAN notices  
The framework has been implemented into a tool to automatically generate OpenMP parallel code from C program sections.  ...  driven by an integer linear optimization framework that takes an explicit view of finding good ways of tiling for parallelism and locality using affine transformations.  ...  The code has four statements - three of them 3-d and one 2-d - and are nested imperfectly. Our transformation framework finds three tiling hyperplanes (all in one band - fully permutable).  ... 
doi:10.1145/1379022.1375595 fatcat:mx5tqjwvdzfelgf4j7rrwb7ojm

A practical automatic polyhedral parallelizer and locality optimizer

Uday Bondhugula, Albert Hartono, J. Ramanujam, P. Sadayappan
2008 Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation - PLDI '08  
The framework has been implemented into a tool to automatically generate OpenMP parallel code from C program sections.  ...  driven by an integer linear optimization framework that takes an explicit view of finding good ways of tiling for parallelism and locality using affine transformations.  ...  The code has four statements - three of them 3-d and one 2-d - and are nested imperfectly. Our transformation framework finds three tiling hyperplanes (all in one band - fully permutable).  ... 
doi:10.1145/1375581.1375595 dblp:conf/pldi/BondhugulaHRS08 fatcat:oxeykavud5fqffeswz3o7k5ote

Locality aware concurrent start for stencil applications

Sunil Shrestha, Guang R. Gao, Joseph Manzano, Andres Marquez, John Feo
2015 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)  
In this paper, we provide an efficient tiling technique that allows hierarchical concurrent start for memory hierarchy aware tile groups.  ...  Stencil computations are at the heart of many physical simulations used in scientific codes. Thus, there exists a plethora of optimization efforts for this family of computations.  ...  The domain with these supernode iterators becomes the tiled domain, where using hyperplanes (0,1) and (1,0) as tiling hyperplanes results in the same hyperplanes as the ones used to create Level 1 tiles  ... 
doi:10.1109/cgo.2015.7054196 dblp:conf/cgo/ShresthaGMMF15 fatcat:a6pswjqqjjdubkkazxrwj4k77y

The Relation Between Diamond Tiling and Hexagonal Tiling

Tobias Grosser, Sven Verdoolaege, Albert Cohen, P. Sadayappan
2014 Parallel Processing Letters  
with tile sizes for hybrid-hexagonal tiling has been exploited for effective generation of GPU code.  ...  We analyze the effects of tile size and wavefront choices on tile-level parallelism, and formulate constraints for optimal diamond tile shapes.  ...  The compiler is not based on the polyhedral model, but uses abstract interpretation for array regions, performing powerful inter-procedural analysis on the input code.  ... 
doi:10.1142/s0129626414410023 fatcat:5kz7nzjdlzgu3hvexwqgzht5fu

Effective automatic computation placement and data allocation for parallelization of regular programs

Chandan Reddy, Uday Bondhugula
2014 Proceedings of the 28th ACM international conference on Supercomputing - ICS '14  
Experimental results on a 32-core shared-memory SMP system show a mean speedup of 2.67× over code that is not data tiled.  ...  Our approach for data allocation is driven by tiling of data spaces along with a scheme to allocate and deallocate tiles on demand and reuse them.  ...  Previous work [4] provided an automatic approach to find compute tiling hyperplanes that exposed maximal coarse-grained parallelism and locality based on an Integer Linear Programming formulation.  ... 
doi:10.1145/2597652.2597673 dblp:conf/ics/ReddyB14 fatcat:56ct6skdwrdr7apbp55du5zjnu

Automatic Parallelization of Tiled Loop Nests with Enhanced Fine-Grained Parallelism on GPUs

Peng Di, Ding Ye, Yu Su, Yulei Sui, Jingling Xue
2012 2012 41st International Conference on Parallel Processing  
We find tiling hyperplanes by embedding parallelism-enhancing constraints in the polyhedral model to maximize intra-tile, i.e., intra-SM parallelism.  ...  Presently, compilers may generate code significantly slower than hand-optimized code for certain applications.  ...  Finally, we use CLooG with the extension described in [4] to generate CUDA code from the tiling hyperplanes selected. V.  ... 
doi:10.1109/icpp.2012.19 dblp:conf/icpp/DiYSSX12 fatcat:ednmeyeip5d6zdzozuv6unhhky

Parameterized Diamond Tiling for Stencil Computations with Chapel parallel iterators

Ian J. Bertolacci, Catherine Olschanowsky, Ben Harshbarger, Bradford L. Chamberlain, David G. Wonnacott, Michelle Mills Strout
2015 Proceedings of the 29th ACM on International Conference on Supercomputing - ICS '15  
Parallel scaling of stencil computations can be significantly improved on multicore processors using advanced tiling techniques that include the time dimension, such as diamond tiling.  ...  Ideally, the execution schedule or tiling code will be expressed orthogonally to the computation. This supports code reuse, easier tuning, and improved programmer productivity.  ...  Given tiling hyperplanes, it is possible to specify a tiling to a polyhedral code generator with a scattering function.  ... 
doi:10.1145/2751205.2751226 dblp:conf/ics/BertolacciOHCWS15 fatcat:r6gsrx4f3jax3egcdg3aepyqwu

Adaptive value function approximations in classifier systems

Lashon B. Booker
2005 Proceedings of the 2005 workshops on Genetic and evolutionary computation - GECCO '05  
Hyperplane coding is a closely related variation of tile coding [3] in which classifier rule conditions fill the role of tiles, and there are few restrictions on the way those "tiles" are organised.  ...  One open question remaining about hyperplane coding is how the quality of the approximation is affected by the set of classifiers in the population.  ...  Acknowledgements This work is based on research originally funded by the MITRE Sponsored Research program. That support is gratefully acknowledged.  ... 
doi:10.1145/1102256.1102276 dblp:conf/gecco/Booker05 fatcat:56yvt5dn4nejjiicntjj36lste

Automatic Storage Optimization for Arrays

Somashekaracharya G. Bhaskaracharya, Uday Bondhugula, Albert Cohen
2016 ACM Transactions on Programming Languages and Systems  
computations, high-performance computing, and the class of tiled codes in general.  ...  We formulate the problem of intra-array storage optimization as one of finding the right storage partitioning hyperplanes: each storage partition corresponds to a single storage location.  ...  Suppose that the hyperplane Γ has been found based on the conflict set CS = K 1 ∪ K 2 ∪ · · · ∪ K l .  ... 
doi:10.1145/2845078 fatcat:ol4jd5gfonhcva4qhk2adfvzca

Mapping Optimization of Affine Loop Nests for Reconfigurable Computing Architecture

Dajiang LIU, Shouyi YIN, Chongyong YIN, Leibo LIU, Shaojun WEI
2012 IEICE transactions on information and systems  
Polyhedron model is a powerful tool to give a reasonable transformation on such nested loops.  ...  Compared with DFG-based optimization approach, the execution performances of 1-d jacobi and matrix multiplication are improved by 28% and 48.47%.  ...  Finding Good Θ Hyperplane Based on the theory of polyhedron model in the previous subsection, we propose an algorithm to find two hyperplanes to split the affine loop nests and map the split tile to the  ... 
doi:10.1587/transinf.e95.d.2898 fatcat:uuhngffr3bbbnfnhodqhzwbrda