Memory-Constrained Data Locality Optimization for Tensor Contractions
[chapter]
2004
Lecture Notes in Computer Science
In this paper, we address the memory-constrained data-locality optimization problem in the context of this class of computations. ...
The accurate modeling of the electronic structure of atoms and molecules involves computationally intensive tensor contractions over large multidimensional arrays. ...
Acknowledgments We thank the National Science Foundation for its support of this research through the Information Technology Research program (CHE-0121676 and CHE-0121706), NSF grants CCR-0073800 and EIA ...
doi:10.1007/978-3-540-24644-2_7
fatcat:ixthi66h3rgr5onjrwdobvz2oe
Empirical performance model-driven data layout optimization and library call selection for tensor contraction expressions
2012
Journal of Parallel and Distributed Computing
order [44] and loop fusion [43, 42] for reducing memory requirements, space-time trade-off optimization [15], data locality optimization, which combines loop fusion and tiling for reducing disc-to-memory ...
The order of indices of the intermediate tensors is not constrained. ...
GEMM implementations can achieve such a high performance, since they use tiling for all levels in the memory hierarchy to optimize temporal locality. ...
doi:10.1016/j.jpdc.2011.09.006
fatcat:xxgixibzbzhr3o6sw34sympcpa
Data Locality Optimization for Synthesis of Efficient Out-of-Core Algorithms
[chapter]
2003
Lecture Notes in Computer Science
This paper describes an approach to synthesis of efficient out-of-core code for a class of imperfectly nested loops that represent tensor contraction computations. ...
Tensor contraction expressions arise in many accurate computational models of electronic structure. ...
We would also like to thank the Ohio Supercomputer Center (OSC) for the use of their computing facilities. ...
doi:10.1007/978-3-540-24596-4_44
fatcat:uf7sndsr6vbjnlq5hvdyeilkda
Practical Loop Transformations for Tensor Contraction Expressions on Multi-level Memory Hierarchies
[chapter]
2011
Lecture Notes in Computer Science
Optimizing applications for such architectures requires careful management of the data movement across all these levels. ...
In this paper, we focus on the problem of mapping tensor contractions to memory hierarchies with more than two levels, specifically addressing placement of memory allocation and data movement statements ...
The tile sizes for the data movement placement can be determined by solving this non-linear constrained optimization problem. ...
doi:10.1007/978-3-642-19861-8_15
fatcat:eyf4rajoercptjnsktt3adiumi
Analytical cache modeling and tilesize optimization for tensor contractions
2019
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '19)
Tiling and loop permutation are key techniques for improving data locality. ...
In this paper we provide an analytical model based approach to multi-level tile size optimization and permutation selection for tensor contractions. ...
ACKNOWLEDGMENTS We thank the reviewers for their valuable feedback. This work was supported in part by the U.S. ...
doi:10.1145/3295500.3356218
dblp:conf/sc/LiSVLRRS19
fatcat:376oo5hdqbehtfell2khswwv7i
Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver
2006
Journal of Parallel and Distributed Computing
We address the problem of efficient out-of-core code generation for a special class of imperfectly nested loops encoding tensor contractions. ...
These loops operate on arrays too large to fit in physical memory. The problem involves determining optimal tiling and placement of disk I/O statements. ...
Benjamin Wah and Yi Xin Chen of the University of Illinois for their significant help with using the Discrete Constrained Search (DCS) Solver. ...
doi:10.1016/j.jpdc.2005.06.017
fatcat:vbkqcart6jeyrals3umw2uqohe
A Quantum-Inspired Tensor Network Algorithm for Constrained Combinatorial Optimization Problems
2022
Frontiers in Physics
In this paper, we propose a quantum-inspired tensor-network-based algorithm for general locally constrained combinatorial optimization problems. ...
Our algorithm constructs a Hamiltonian for the problem of interest, effectively mapping it to a quantum problem, then encodes the constraints directly into a tensor network state and solves the optimal ...
The code used to generate the data presented in this study can be publicly accessed on GitHub at [30]. ...
doi:10.3389/fphy.2022.906590
fatcat:numnghitqbcjzl4uf5cgxbbtju
A quantum-inspired tensor network method for constrained combinatorial optimization problems
[article]
2022
arXiv
pre-print
In this paper, we propose a quantum inspired algorithm for general locally constrained combinatorial optimization problems by encoding the constraints directly into a tensor network state. ...
Combinatorial optimization is of general interest for both theoretical study and real-world applications. ...
The code used to generate the data presented in this study can be publicly accessed on GitHub at [5]. ...
arXiv:2203.15246v1
fatcat:y7iglyelvvde5hgupub4pezywy
Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends
2017
Journal of Parallel and Distributed Computing
These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. ...
We achieve up to 240× speedup compared with the optimized shared memory implementation of Libtensor. ...
We would like to thank the anonymous reviewers for providing suggestions to get a better performance of NWChem runs and improve the presentation in this manuscript. ...
doi:10.1016/j.jpdc.2017.02.010
fatcat:mcrxnl4b2vaslg7r35sodabn3e
Lifetime-based Method for Quantum Simulation on a New Sunway Supercomputer
[article]
2022
arXiv
pre-print
Faster classical simulation becomes essential for the validation of quantum computer, and tensor network contraction is a widely-applied simulation approach. ...
Due to the memory limitation, slicing is adopted to help cutting down the memory size by reducing the tensor dimension, which also leads to additional computation overhead. ...
ACKNOWLEDGMENT We would like to thank Man-Hong Yung, Pengfei Zhou, Zegang Li and Yuxuan Li for advices and discussions. ...
arXiv:2205.00393v1
fatcat:vtraiuzdxreu7mhsdgqhr4vsai
Closing the "Quantum Supremacy" Gap: Achieving Real-Time Simulation of a Random Quantum Circuit Using a New Sunway Supercomputer
[article]
2021
arXiv
pre-print
to about 42 million cores; (3) a fused permutation and multiplication design that improves the compute efficiency for a wide range of tensor contraction scenarios; and (4) a mixed-precision scheme to ...
We develop a high-performance tensor-based simulator for random quantum circuits (RQCs) on the new Sunway supercomputer. ...
Acknowledgement We would like to thank Dapeng Yu, Heng Fan, Guoping Guo, Yongjian Han, and Xiaobo Zhu for advices and discussions. ...
arXiv:2110.14502v1
fatcat:otiuvk735ncarbngangfrq777i
Analysis and tuning of libtensor framework on multicore architectures
2014
2014 21st International Conference on High Performance Computing (HiPC)
It has been optimized for symmetry and sparsity to be memory efficient. This allows it to run efficiently on the ubiquitous and cost-effective SMP architectures. ...
To that end, in this paper, we explore a number of optimization techniques including a thread-friendly and NUMA-aware memory allocator and garbage collector, tuning the tensor tiling factor, and tuning ...
Data locality The performance of BLAS routines, typically used in tensor contractions, depends on the locality of the data. ...
doi:10.1109/hipc.2014.7116881
dblp:conf/hipc/IbrahimWEK14
fatcat:fi63t7ifjne57bzfqor2rhvkua
Empirical Performance-Model Driven Data Layout Optimization
[chapter]
2005
Lecture Notes in Computer Science
The performance model with empirically determined cost components is used to perform data layout optimization in the context of the Tensor Contraction Engine, a compiler for a high-level domain-specific language for expressing computational models in quantum chemistry. ...
The approach was developed for a program synthesis system targeted at the quantum chemistry domain. ...
doi:10.1007/11532378_7
fatcat:rsnop6hjx5fltjrj2ivbfx2mdi
AutoHOOT: Automatic High-Order Optimization for Tensors
[article]
2020
arXiv
pre-print
These tensor methods are used for data analysis and simulation of quantum systems. ...
High-order optimization methods, including Newton's method and its variants as well as alternating minimization methods, dominate the optimization algorithms for tensor decompositions and tensor networks ...
Constrained Contraction Path Construction We provide a constrained contraction path selection routine, such that the contraction path is optimized under the constraint that partial inputs' contraction ...
arXiv:2005.04540v2
fatcat:4kfib322srf7xen555toxa7774
Synthesis of High-Performance Parallel Programs for a Class of ab Initio Quantum Chemistry Models
2005
Proceedings of the IEEE
These computations are expressible as a set of tensor contractions and arise in electronic structure modeling. ...
This paper provides an overview of a program synthesis system for a class of quantum chemistry computations. ...
DATA LOCALITY OPTIMIZATION Once a solution is found that fits onto disk, we optimize the data locality to reduce memory and disk access times. ...
doi:10.1109/jproc.2004.840311
fatcat:gaxrixebifhstbb7ul2gsrpc4m