We investigate the problem of permuting n data items on an EREW PRAM with p processors using little additional storage. ... We present a simple algorithm with run time O((n/p) log n) and an improved algorithm with run time O(n/p + log n log log(n/p)). ... This algorithm runs in time O(n/p + log p) and requires O(log n) local memory cells per processor. All these parallel algorithms take advantage of some a priori knowledge of the permutation. ...doi:10.1142/s0129626495000126 fatcat:iykcgq42jratzp6jawscekjhba
This article introduces an algorithm, MergeShuffle, which is an extremely efficient algorithm to generate random permutations (or to randomly permute an existing array). ... Finally, our preliminary simulations using OpenMP suggest it is more efficient than the Rao-Sandelius algorithm, one of the fastest existing random permutation algorithms. ... Fisher-Yates shuffle, while extremely fast, gets slowed down once permutations are very large. ...arXiv:1508.03167v1 fatcat:rsnrkwcvg5dbdnq524kwhlk7mq
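For reference, the Fisher-Yates shuffle that this entry uses as its baseline can be sketched as follows. This is a minimal sequential version, not MergeShuffle and not code from the paper; the function name `fisher_yates_shuffle` and the optional `rng` parameter are illustrative assumptions.

```python
import random


def fisher_yates_shuffle(items, rng=None):
    """Shuffle the list `items` in place and return it.

    `rng` is a hypothetical optional parameter: any object with a
    randint(a, b) method that returns an integer in [a, b].
    """
    if rng is None:
        rng = random.Random()
    # Walk from the last position down to 1, swapping each position
    # with a uniformly chosen position at or before it.
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)
        items[i], items[j] = items[j], items[i]
    return items
```

Each position is swapped once with a uniformly chosen earlier (or equal) position, so the loop does n - 1 swaps for n items. The random accesses into a large array are one plausible reason for the slowdown on very large permutations that the entry mentions.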
A disk-parallel algorithm is demonstrated that can multiply two permutations with 12.8 billion points using 16 parallel local disks of a cluster in under one hour. ... Permutation multiplication (or permutation composition) is perhaps the simplest of all algorithms in computer science. ... Section 5 presents new fast algorithms for permutation inverse and multiplication by an inverse. ...doi:10.1145/1837934.1838001 dblp:conf/issac/SlaviciDKC10 fatcat:rks55e4zt5hohl6vuq3ymogruq
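The in-memory operations this entry refers to, permutation multiplication (composition) and permutation inverse, can be sketched as below. This is only the textbook in-RAM version under the assumption that a permutation is stored as a list mapping index i to its image; the names `compose` and `inverse` are illustrative, and nothing here reflects the disk-parallel method of the paper.

```python
def compose(p, q):
    """Return r with r[i] = p[q[i]], i.e. apply q first, then p."""
    if len(p) != len(q):
        raise ValueError("permutations must have the same length")
    return [p[image] for image in q]


def inverse(p):
    """Return the permutation inv such that inv[p[i]] == i for all i."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return inv
```

For example, with `p = [2, 0, 1]`, `inverse(p)` is `[1, 2, 0]`, and composing `p` with that inverse gives the identity `[0, 1, 2]`.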
Thus, one can determine efficiently at run time whether a permutation to be performed is BMMC and then avoid the general-permutation algorithm and save parallel I/Os by using the BMMC permutation algorithm ... From the text: “First we study asymptotically fast algorithms for rectangular matrix multiplication. ...
International Journal on Perceptive and Cognitive Computing
The algorithms are based on the random selection of the permutation matrix - known as key matrix - entries. ... Although there exist several different cryptosystems, the choice of the best algorithm is the main concern of the researchers. ... The performance speed of the introduced algorithm is fast; however, there is no parallelism and the decrypting time is slow. ...doi:10.31436/ijpcc.v4i1.68 fatcat:se4ssre7hvgbtfkzrq4hxczwja
The author gives a fast Fourier transform algorithm, which has a simple construction and higher parallelism. The algorithm requires fewer operations than other algorithms. ... Summary: “In this paper, a parallel algorithm is presented for solving pentadiagonal linear systems. The algorithm shows good parallelism. ...
Over the years many fast algorithms have been proposed for the computation of the DCT. ... We address in this paper the implementation of a fast algorithm for the DCT into SIMD-vector processors. ...doi:10.5281/zenodo.39282 fatcat:yrljnn2hbjfnzgvsgelcev6mhy
(D-TRR-CS; Trier) Fast generation of random permutations via networks simulation. (English summary) Fourth European Symposium on Algorithms (Barcelona, 1996). Algorithmica 21 (1998), no. 1, 2-20. ... The common and novel feature of both our algorithms is that we first design a suitable random switching network generating a permutation and then simulate this network on the PRAM model in a fast way.” ...
A fast recursive algorithm for pruned bit-reversal permutations is proposed. ... Moreover, a parallel pruned interleaving algorithm based on computing multiple inliers in parallel is proposed. ... Moreover, we use this algorithm to parallelize a serial PBRI and reduce latency by a desired parallelism factor. ...doi:10.1109/icassp.2012.6288208 dblp:conf/icassp/Mansour12 fatcat:7khp2nknqbazlkit6saxqxrt2y
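As background for this entry, a plain (unpruned) bit-reversal permutation can be sketched as follows. This is not the recursive pruned algorithm the paper proposes; the function names and the `bits` parameter (the number of index bits) are illustrative assumptions.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of the non-negative integer `index`."""
    result = 0
    for _ in range(bits):
        # Shift the accumulated result left and append the lowest bit of index.
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversal_permutation(bits):
    """Return the bit-reversal permutation of the indices 0 .. 2**bits - 1."""
    return [bit_reverse(i, bits) for i in range(1 << bits)]
```

With `bits = 3`, `bit_reversal_permutation(3)` gives `[0, 4, 2, 6, 1, 5, 3, 7]`. Applying the permutation twice returns the identity, since reversing the bits of an index twice restores it.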
Lecture Notes in Computer Science
A new, efficient algorithm for finding local reducts for each object in data table is described, as well as its parallelization and some optimization notes. ... A problem of working with tolerances in our algorithm is discussed. Some experimental results generated on large data tables (concerned with real applications) are presented. ... We may choose a set of p permutations σ_1, ..., σ_p and generate p coverings using this algorithm in parallel on p machines. ...doi:10.1007/3-540-69115-4_55 fatcat:mqbtu2v445fbdlntb4qkvhldhe
Both permutation levels exploit the fast on-chip memory bandwidth by transferring large amounts of data and allowing for fine-grain SIMD (Single Instruction, Multiple Data) operations. ... This paper proposes an efficient in-place N-dimensional permutation algorithm. The algorithm is based on a novel 3D transpose algorithm that was published recently. ... This is, as of today, the first known generalized parallel in-place n-dimensional permutation algorithm. As a proof of concept, the proposed algorithm is tested on volumes of dimensions up to 7D. ...doi:10.1016/j.aej.2015.03.024 fatcat:ewxoord4grb6hhzag72guihhqe
The proposed parallel Tabu Search algorithm uses multi-start with varying criteria weights in order to improve the algorithm's effectiveness. ... In this paper we propose a parallel tabu search algorithm for the bi-criteria scheduling problem implemented on CUDA platform. ... Parallel Tabu Search implementation presented in  provides new fast methods of parallel calculations for the hybrid flow shop problem. ...doi:10.1109/mmar.2015.7283900 dblp:conf/mmar/ZelaznyP15 fatcat:felsyetw4zbzliga5qpypa6fx4
The Fast Fourier Transform (FFT) plays a key role in many areas of computational science and engineering. ... When run on a DEC 2100 server with a large memory and eight parallel disks, the optimal algorithm for the PDM runs up to 144.7 times faster than in-core methods under demand paging. ... We plan to investigate true parallel out-of-core algorithms, using parallelized versions of the permutation methods described in this paper. ...doi:10.1016/s0167-8191(97)00114-2 fatcat:2qjvdsfrcncgvl7iwbdaej2dmi
Lecture Notes in Computer Science
Compared with the "dense" FFT algorithms, the input sparsity makes it easier to parallelize the sparse counterparts. ... Many applications invoke the Fast Fourier Transform (FFT) on sparse inputs, with most of their Fourier coefficients being very small or equal to zero. ... We parallelize three main sections in our algorithm: permutation to input, subsampled FFT and coefficient estimation. ...doi:10.1007/978-3-319-09967-5_15 fatcat:tmjyxko23jeojmjglippi7waha
We introduce a tensor sum which is useful for the design and analysis of digit-index permutations (DIPs) algorithms. ... Using this operation we obtain a new high-performance algorithm for the family of DIPs. ... This observation allows us to obtain a new algorithm for the general class of digit-index permutations (DIPs) whose parallel-time complexity is O(log log n). ...doi:10.1155/1996/836910 fatcat:vfnibxsweva7zbrrhxok2ehvwm