
On the improvement of the in-place merge algorithm parallelization [article]

Berenger Bramas
2020 arXiv   pre-print
In this paper, we present several improvements in the parallelization of the in-place merge algorithm, which merges two contiguous sorted arrays into one with an O(T) space complexity (where T is the number  ...  Finally, we provide the so-called linear shifting algorithm that swaps two partitions in-place with contiguous data access.  ...  Acknowledgments Experiments presented in this paper were carried out using the Cobra cluster from the Max Planck Computing and Data Facility (MPCDF).  ... 
arXiv:2005.12648v1 fatcat:j45l5lu6inf3jc27nhxelrbpja

Comparison of parallel sorting algorithms [article]

Darko Bozidar, Tomaz Dobravec
2015 arXiv   pre-print
We chose these algorithms because to the best of our knowledge their sequential and parallel implementations were not yet compared all together in the same execution environment.  ...  In this report we give a short description of seven sorting algorithms and all the results obtained by our tests.  ...  We improved the performance of a parallel algorithm by using Harris et al.'s binary scan [9].  ... 
arXiv:1511.03404v2 fatcat:kn7gm4hodfcopevxs6zuixudiq

High Performance Parallel Sort for Shared and Distributed Memory MIMD [article]

Thoria Alghamdi, Gita Alaghband
2020 arXiv   pre-print
a one-step MSD-Radix to distribute data in ten packets (MPI) while parallel cores of each node use Quicksort to sort their data partitions sequentially then merge and sort them in parallel employing the  ...  Merge sort, known for its stability, is used to design several of our algorithms. We improve its parallel performance by combining it with Quicksort.  ...  place in the original array.  ... 
arXiv:2003.01216v1 fatcat:gi7xroqcnrdhzicqkfs2ilabfm

Sorting and Join Algorithms for Multiprocessor Database Machines [chapter]

Jai Menon
1986 Database Machines  
In particular, we propose a new algorithm called the modified block bitonic sort. We then present the results of analyzing the performance of these different parallel external sorting algorithms.  ...  This paper presents and analyzes algorithms for parallel execution of sort operations in a general multiprocessor architecture. We consider both internal and external sorting algorithms.  ...  The smallest record in P1 is compared with the largest record in P2, the smaller of the two is placed in P1's memory, the larger of the two is placed in P2's memory.  ... 
doi:10.1007/978-3-642-82937-6_13 fatcat:kkmwlurdwjhexjqx53pkdaqsq4

The PSRS Algorithm based on Synchronous Barrier

Yuqiang Sun, Huanhuan Cai, Xian Chang, Xin Gao, Yuwan Gu
2013 Research Journal of Applied Sciences Engineering and Technology  
be introduced to merge those elements, but the algorithm designed in the LogP model is heavily dependent on the accuracy of parameters such as l, o, g, p.  ...  The biggest characteristic of the LogGP model, which is based on the LogP model, is sending long messages; if all the elements to be sent are seen as a long message and sent in a single processor, a sorting algorithm should  ...  ACKNOWLEDGMENT Supported by The project of general office of Broadcasting and Television (GD10101) and Natural Science Fund in JiangSu (BK2009535) and Natural Science Fund in ZheJiang (Y1100314)  ... 
doi:10.19026/rjaset.5.4303 fatcat:fbzvrodrrbeebd5vzt642hjliu

Designing a Synchronization-reducing Clustering Method on Manycores

Weijian Zheng, Fengguang Song, Lan Lin
2017 Proceedings of the Machine Learning on HPC Environments - MLHPC'17  
The k-means clustering method is one of the most widely used techniques in big data analytics.  ...  In this paper, we explore the ideas of software blocking, asynchronous local optimizations, and heuristics of simulated annealing to improve the performance of k-means clustering.  ...  We conducted experiments with four real-world datasets (MNIST, CIFAR-10, CIFAR-100, and PLACES-2) for both the sequential algorithm and the parallel algorithm on manycore systems.  ... 
doi:10.1145/3146347.3146357 dblp:conf/sc/ZhengSL17 fatcat:tu67wsckenddlkxu6tjxpzddeq

Collection-intersect join algorithms for parallel object-oriented database systems [chapter]

David Taniar, J. Wenny Rahayu
1998 Lecture Notes in Computer Science  
The parallel sort-merge algorithm can only make use of the divide and partial broadcast data partitioning, whereas the parallel hash algorithm may have a choice which of the two data partitioning to use  ...  One form of collection join queries in OODB is collection-intersect join queries, where the joins are based on collection attributes and the queries check for whether there is an intersection between the  ...  Once a data partitioning method is applied, local join is carried out by either a sort-merge operator (in the case of parallel sort-merge algorithm) or a hash function (in the case of parallel hash).  ... 
doi:10.1007/bfb0057894 fatcat:7nq5i4l4jbhgdaazmdsgkfpndm


X. Guan
2012 The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences  
With the help of the development of multicore technology and computer component cost reduction in recent years, high performance clusters become the only economically viable solution for this requirement  ...  The Split-and-Merge paradigm efficiently exploits data parallelism for massive data processing.  ...  ACKNOWLEDGEMENTS This work is supported by the Natural Science Foundation of China (Grant: 40971211 and 40721001).  ... 
doi:10.5194/isprsarchives-xxxix-b4-213-2012 fatcat:ksdx4fh3ybhpnc37qygcddrcuu

GPU merge path

Oded Green, Robert McColl, David A. Bader
2012 Proceedings of the 26th ACM international conference on Supercomputing - ICS '12  
An efficient parallel merging algorithm partitions the sorted input arrays into sets of non-overlapping sub-arrays that can be independently merged on multiple cores.  ...  Following this, we show how each SM performs a parallel merge and how to divide the work so that all the GPU's Streaming Processors (SP) are utilized. All stages in this algorithm are parallel.  ...  In section 3, we show empirical results of the new algorithm on two different GPU architectures and improved performance over existing algorithms on GPU and x86.  ... 
doi:10.1145/2304576.2304621 dblp:conf/ics/GreenMB12 fatcat:hfcvue7qfnhllfrd2t4g7zotpu
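
The partitioning idea in the snippet above (splitting two sorted inputs into non-overlapping sub-array pairs that can be merged independently) can be illustrated with a small sequential sketch. This is not the authors' GPU implementation: the function names `merge_path_split` and `partitioned_merge`, the `parts` parameter, and the use of `heapq.merge` for each chunk are assumptions made for illustration. Each chunk in the loop is independent of the others, so in a real implementation each could be handed to a separate core.

```python
import heapq


def merge_path_split(a, b, diag):
    """Return (i, j) with i + j == diag such that a[:i] and b[:j]
    together hold the first `diag` elements of the merged output.
    Hypothetical helper: binary search along one cross-diagonal."""
    lo = max(0, diag - len(b))
    hi = min(diag, len(a))
    while lo < hi:
        i = (lo + hi) // 2
        j = diag - i
        # If a[i] sorts no later than b[j-1], it belongs in the prefix too,
        # so the split must take more elements from a (ties favour a).
        if a[i] <= b[j - 1]:
            lo = i + 1
        else:
            hi = i
    return lo, diag - lo


def partitioned_merge(a, b, parts):
    """Merge sorted lists a and b by cutting the work into `parts`
    (parts >= 1) roughly equal, independent chunks."""
    n = len(a) + len(b)
    cuts = [merge_path_split(a, b, k * n // parts) for k in range(parts + 1)]
    out = []
    for (i0, j0), (i1, j1) in zip(cuts, cuts[1:]):
        # Each (a[i0:i1], b[j0:j1]) pair could be merged by its own worker.
        out.extend(heapq.merge(a[i0:i1], b[j0:j1]))
    return out
```

For example, `partitioned_merge([1, 3, 5, 7], [2, 4, 6, 8, 10], 3)` cuts the nine output positions into three chunks of three elements each and merges them separately.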

Parallel Merging and Sorting on Linked List

Yijie Han, Sreevalli Tata
2021 International Journal of Computer and Information Technology (2279-0764)  
We study linked list sorting and merging on the PRAM model.  ...  We also show that two sorted linked lists of n integers in {0, 1, ..., m} can be merged into one sorted linked list in $O(\log^{(c)} n \, (\log\log m)^{1/2})$ time using $n/(\log^{(c)} n \, (\log\log m)^{1/2})$ processors, where c is an  ...  Let $T_1$ be the time complexity of the best serial algorithm for the same problem. Then $pT_p \ge T_1$. When $pT_p = T_1$, this parallel algorithm is an optimal parallel algorithm.  ... 
doi:10.24203/ijcit.v10i2.85 fatcat:ciydjz474nfjhordxvp2qximli

Data sorting using graphics processing units

Marko J. Misic, Milo V. Tomasevic
2011 2011 19th Telecommunications Forum (TELFOR) Proceedings of Papers  
This paper represents an effort to analyze and evaluate the implementations of the representative sorting algorithms on the graphics processing units.  ...  Three sorting algorithms (Quicksort, Merge sort, and Radix sort) were evaluated on the Compute Unified Device Architecture (CUDA) platform that is used to execute applications on NVIDIA graphics processing  ...  PARALLEL IMPLEMENTATIONS OF SORTING ALGORITHMS Although numerous papers reported various improvements in the domain of sorting algorithms (as reported in Section II), only several implementations are publicly  ... 
doi:10.1109/telfor.2011.6143828 fatcat:6rrf2lpwbvg4feioey6jp6wjk4

Database cracking

Holger Pirk, Eleni Petraki, Stratos Idreos, Stefan Manegold, Martin Kersten
2014 Proceedings of the Tenth International Workshop on Data Management on New Hardware - DaMoN '14  
In this paper, we conduct an in-depth study of the reasons for the low CPU efficiency of pivoted partitioning.  ...  The core of database cracking is, thus, pivoted partitioning.  ...  PARALLELIZATION In this section we present two Cracking algorithms that exploit thread-level parallelism, i.e., first a simple partition & merge parallel algorithm, and then a refined variant of the simple  ... 
doi:10.1145/2619228.2619232 dblp:conf/damon/PirkPIMK14 fatcat:lzebgfgmkbad7na5cfmqu4j4gu

A Partition-Merge Based Cache-Conscious Parallel Sorting Algorithm for CMP with Shared Cache

Song Hao, Zhihui Du, David Bader, Yin Ye
2009 2009 International Conference on Parallel Processing  
The PMCC algorithm consists of two steps: the partition-based in-cache sorting and merge-based k-way merge sorting.  ...  Then for a specific application, parallel sorting, a cache-conscious parallel algorithm, PMCC (Partition-Merge based Cache-Conscious) is designed based on the PSC model.  ...  Then the cores sort them with SIMD instructions in parallel and merge all the subsets with an improved merge sort algorithm.  ... 
doi:10.1109/icpp.2009.26 dblp:conf/icpp/HaoDBY09 fatcat:5gn5prr6l5cbrkny7glguifyl4

Memory footprint matters

Spyros Blanas, Jignesh M. Patel
2013 Proceedings of the 4th annual Symposium on Cloud Computing - SOCC '13  
A critical contribution of this work is in pointing out that in addition to query response time, one must also consider the memory footprint of each join algorithm, as it impacts the number of concurrent  ...  The focus of this paper is on studying hash-based and sort-based equi-join algorithms when the data sets being joined fully reside in main memory.  ...  We also thank Venkatraman Govindaraju for providing the SIMD-optimized implementation of the bitonic merge algorithm for the experiment in Section 4.4.  ... 
doi:10.1145/2523616.2523626 dblp:conf/cloud/BlanasP13 fatcat:mqe7dwnwcrf6rj52626slb5yne

A Parallel Genetic Algorithm Framework for Transportation Planning and Logistics Management

Dmitri I. Arkhipov, Di Wu, Tao Wu, Amelia C. Regan
2020 IEEE Access  
In this paper we extend the standard meta-description for genetic algorithms (GA) with a simple non-trivial parallel implementation.  ...  The framework presented at its parallel base is a modification of the primitive parallelization concept, but if implemented as described it may be gradually extended to fit the qualities of any underlying  ...  ACKNOWLEDGMENT The authors would like to thank the anonymous reviewers for their insightful comments.  ... 
doi:10.1109/access.2020.2997812 fatcat:54egzv4emja73jki6vznqcmbk4