
Optimal communication channel utilization for matrix transposition and related permutations on binary cubes

S. Lennart Johnsson, Ching-Tien Ho
1994 Discrete Applied Mathematics  
For schedules with optimal channel utilization, the number of block transfers for a binary d-cube is d. The maximum block size for K elements per node is ⌈K/(2d)⌉.  ...  With concurrent communication on all channels of every node in binary cube networks, the number of element transfers in sequence for K elements per node is K/2, irrespective of the number of nodes over  ...  Examples of all-to-all personalized communication are bit-reversal, vector-reversal, matrix transposition and shuffle permutations.  ... 
doi:10.1016/0166-218x(94)90189-9 fatcat:vz3qj24elfdzvmzcs2d4kgfzvi
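
The figures quoted in the snippet above can be tabulated with a few lines of Python. This is only an illustrative sketch, not code from the paper: the function name `schedule_figures` and the dictionary keys are made up here, and the block-size bound is read as the ceiling of K/(2d).

```python
def schedule_figures(K: int, d: int) -> dict:
    """Quantities quoted in the abstract for K elements per node on a binary d-cube.

    Hypothetical helper: the name and the returned keys are assumptions,
    and the block-size bound is taken to be ceil(K / (2d)).
    """
    if K < 1 or d < 1:
        raise ValueError("K and d must be positive integers")
    return {
        # One block transfer per cube dimension.
        "block_transfers": d,
        # Integer ceiling of K / (2d), avoiding floating-point rounding.
        "max_block_size": -(-K // (2 * d)),
        # Element transfers in sequence with all channels used concurrently.
        "element_transfers_in_sequence": K / 2,
    }


if __name__ == "__main__":
    print(schedule_figures(K=1024, d=10))
```

For example, with K = 1024 elements per node on a 10-cube the sketch reports 10 block transfers and a maximum block size of 52 elements.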

Author index

1994 Discrete Applied Mathematics  
König Lennart Johnsson, S. and C.-T. Ho, Optimal communication channel utilization for matrix transposition and related permutations on binary cubes Liestman, A.L., see L.  ...  Peleg, Traffic-light scheduling on the grid Labahn, R., S.T. Hedetniemi and R. Laskar, Periodic gossiping on trees Labahn, R., A minimum broadcast graph on 63 vertices Laskar, R., see R.  ... 
doi:10.1016/0166-218x(94)90195-3 fatcat:tbqkxb4ey5hmjfu5c3ndqdsewe

Page 2935 of Mathematical Reviews Vol. , Issue 95e [page]

1995 Mathematical Reviews  
Lennart (1-HRV; Cambridge, MA); Ho, Ching-Tien (1-IBM2; San Jose, CA) Optimal communication channel utilization for matrix transposition and related permutations on binary cubes.  ...  For schedules with optimal channel utilization, the number of block transfers for a binary d-cube is d. The maximum block size for K elements per node is ⌈K/(2d)⌉.”  ... 

Transposing Arrays on Multicomputers Using de Bruijn Sequences

Paul N. Swarztrauber
1998 Journal of Parallel and Distributed Computing  
Varvarigos and Bertsekas [21] use additive matrix decomposition to develop a class of optimal algorithms for isotropic communication tasks, i.e. a combination of task and architecture that is symmetric  ...  For an element-based system, the time required to transmit l elements on a single channel is τl for all l. Saad and Schultz [13] call τ the elemental transfer time.  ...  Index-digit permutations include matrix transposition, shuffles and the bit reversed orderings that are used, for example, in the FFT.  ... 
doi:10.1006/jpdc.1998.1476 fatcat:uf42yxexjnh23nscfets2pmcvi
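
To make the index-digit view mentioned in the snippet above concrete, the following Python sketch treats transposition and bit reversal as permutations of the bits of a flat array index. It is a generic illustration, not the de Bruijn sequence method of the paper; the function names and the row-major, power-of-two layout are assumptions made for the example.

```python
def transpose_address(addr: int, p: int, q: int) -> int:
    """Map the row-major address of (r, c) in a 2^p x 2^q matrix
    to the row-major address of (c, r) in its 2^q x 2^p transpose.

    The address is the bit string [r (p bits) | c (q bits)];
    transposing swaps the two bit fields.
    """
    r = addr >> q                 # high p bits: row index
    c = addr & ((1 << q) - 1)     # low q bits: column index
    return (c << p) | r


def bit_reverse(addr: int, n: int) -> int:
    """Reverse the n low-order bits of addr (the FFT reordering)."""
    out = 0
    for _ in range(n):
        out = (out << 1) | (addr & 1)
        addr >>= 1
    return out


def permute(flat: list, new_address) -> list:
    """Move flat[a] to position new_address(a) for every address a."""
    out = [None] * len(flat)
    for a, value in enumerate(flat):
        out[new_address(a)] = value
    return out


if __name__ == "__main__":
    flat = list(range(8))  # a 2 x 4 matrix stored row-major
    print(permute(flat, lambda a: transpose_address(a, p=1, q=2)))
    print(permute(flat, lambda a: bit_reverse(a, n=3)))
```

Both reorderings only rearrange address bits, which is why they fall in the same class of permutations.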

Page 641 of Mathematical Reviews Vol. , Issue 95b [page]

1995 Mathematical Reviews  
Lennart Johnsson and Ching-Tien Ho, Optimal communication channel utilization for matrix transposition and related permutations on binary cubes (251-274); M. Mahéo and J.-F.  ...  Mysliwietz, Optimal algorithms for dissemination of information in generalized communication modes (55-78); Pierre Fraigniaud and Emmanuel Lazard, Methods and problems of communication in usual networks  ... 

Optimal processor mapping for linear-complement communication on hypercubes

Yomin Hou, Chien-Min Wang, Chiu-Yu Ku, Lih-Hsing Hsu
2001 IEEE Transactions on Parallel and Distributed Systems  
An algorithm based on dynamic programming is also proposed to find an optimal reordering mapping for a set of linear-complement communications.  ...  The communication on the hypercube is an LCC, y e I x I , which is the matrix transpose shown in Example 3.  ...  ACKNOWLEDGMENTS The authors would like to thank the anonymous referees for their helpful suggestions.  ... 
doi:10.1109/71.926171 fatcat:6l2jiwn3fbbbrljzby6eh2npie

Intensive hypercube communication Prearranged communication in link-bound machines

Quentin F. Stout, Bruce Wagar
1990 Journal of Parallel and Distributed Computing  
Hypercube algorithms are developed for a variety of communication-intensive tasks such as transposing a matrix, histogramming, sending a (long) message from one node to another, broadcasting a message  ...  from one node to all others, broadcasting a message from each node to all others, and exchanging messages between nodes via a fixed permutation.  ...  closely related to MATRIX TRANSPOSITION.  ... 
doi:10.1016/0743-7315(90)90026-l fatcat:didlyx4zvffsfiqzoxk4erxx3a

A Generalization of the Allreduce Operation [article]

Dmitry Kolmakov, Xuecang Zhang
2020 arXiv   pre-print
We present a novel approach to communication description based on the permutations inspired by the mathematics of a Rubik's cube where the moves form a mathematical structure called group.  ...  The proposed algorithm provides a general solution for any number of processes with the dynamically changing amount of communication steps between log P for the latency-optimal version and 2·log P for the  ...  A composition of such elementary transpositions generates other permutations which may be more complex and describe communications involving several processes.  ... 
arXiv:2004.09362v2 fatcat:3f2dsitm7jccbbv776oskg2iji

ROMM routing on mesh and torus networks

Ted Nesson, S. Lennart Johnsson
1995 Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures - SPAA '95  
ROMM routing also offers close to best case performance for many common routing problems. In previous work, these claims were supported by extensive simulations on binary cube networks [30, 31].  ...  Here we present analytical and empirical results for ROMM routing on wormhole routed mesh and torus networks.  ...  Matrix transpositions occur often in scientific and engineering programming.  ... 
doi:10.1145/215399.215455 dblp:conf/spaa/NessonJ95 fatcat:zoycksfk7jf43al2fpdpw3zlmu

The Hyperstar Interconnection Network

Abdel-Elah Al-Ayyoub, Khaled Day
1998 Journal of Parallel and Distributed Computing  
Embeddings of hypercubes, star graphs, and meshes are discussed. An optimal one-to-all broadcasting algorithm is obtained and analysed.  ...  Some results on fault tolerance, parallel paths, Hamiltonian cycles, and VLSI layouts are obtained. Furthermore, a comparative study between the hyperstar and seven related networks is conducted.  ...  The hypercube based matrix factorization algorithms given in [3] and [5] partition the n-bit binary addresses of the n-cube into two parts.  ... 
doi:10.1006/jpdc.1997.1414 fatcat:uwici5s6jbbfrhh5m7rhbhuusq

Optimal Alphabets and Binary Labelings for BICM at Low SNR

Erik Agrell, Alex Alvarado
2011 IEEE Transactions on Information Theory  
For 8-ary pulse amplitude modulation (PAM) and for 0.75 bit/symbol, the folded binary code results in a higher capacity than the binary reflected Gray code (BRGC) and the natural binary code (NBC).  ...  Optimal binary labelings, input distributions, and input alphabets are analyzed for the so-called bit-interleaved coded modulation (BICM) capacity, paying special attention to the low signal-to-noise ratio  ...  We found that the FBC is the asymptotically optimal binary labeling for 8-PSK, unique up to trivial operations, and we conjecture it to be optimal for any M-PSK input alphabet and m ≥ 2.  ... 
doi:10.1109/tit.2011.2162179 fatcat:3lnqdtlidze3teqgpx6awhwvn4

Extended hypercube: a hierarchical interconnection network of hypercubes

J.M. Kumar, L.M. Patnaik
1992 IEEE Transactions on Parallel and Distributed Systems  
The extended hypercube retains the positive features of the k-cube at different levels of hierarchy and at the same time has some additional advantages like reduced diameter and constant degree of a node  ...  A new interconnection topology-the Extended Hypercube-consisting of an interconnection network of k-cubes is discussed.  ...  Applications such as permutation, shuffle, unshuffle, bit-reversal, odd-even merge, FFT, convolution, matrix transposition have programs consisting of sequences of algorithms in the ASCEND/DESCEND class  ... 
doi:10.1109/71.113081 fatcat:sv4p2qur6bdv7atjaluluclsci

Locally connected VLSI architectures for the Viterbi algorithm

P.G. Gulak, T. Kailath
1988 IEEE Journal on Selected Areas in Communications  
The Viterbi algorithm is a well-established technique for channel and source decoding in high performance digital communication systems.  ...  This restriction is motivated by the fact that both the cost and performance metrics of VLSI favor architectures in which on-chip interprocessor communication is localized.  ...  cases a cube-connected cycles graph.  ... 
doi:10.1109/49.1921 fatcat:gaaaxmtt4rhj3ivmu7wngfhcti

The Art of Signaling: Fifty Years of Coding Theory [chapter]

2009 Information Theory  
The emphasis is on connecting coding theories for Hamming and Euclidean space and on future challenges, specifically in data networking, wireless communication, and quantum information theory.  ...  In 1948 Shannon developed fundamental limits on the efficiency of communication over noisy channels.  ...  in Table I on soft-decision decoding, and to Walter Willinger for education on data network traffic.  ... 
doi:10.1109/9780470544907.ch20 fatcat:eeuyqk35orhmdleeiktamxgv6u

Matrix decomposition on the star graph

A.-E. Al-Ayyoub, K. Day
1997 IEEE Transactions on Parallel and Distributed Systems  
computation complexity and uses O(Nn) communication time to decompose a matrix of order N on a star graph of dimension n, where N ≥ (n - 1)!.  ...  We present and evaluate, for the first time, a parallel algorithm for solving the LU decomposition problem on the star graph. The proposed parallel algorithm is of O(N^3/n!)  ...  A simple matrix distribution on the hypercube can be done by partitioning the n-bit binary addresses of the n-cube into two equal parts.  ... 
doi:10.1109/71.605767 fatcat:poyomvnha5dt3kz4tatolf7qm4