On Sorting, Heaps, and Minimum Spanning Trees

Gonzalo Navarro, Rodrigo Paredes
2010 · Algorithmica (Springer Nature)
Let A be a set of size m. Obtaining the first k ≤ m elements of A in ascending order can be done in optimal O(m + k log k) time. We present Incremental Quicksort (IQS), an algorithm (online on k) which incrementally gives the next smallest element of the set, so that the first k elements are obtained in optimal expected time for any k. Based on IQS, we present the Quickheap (QH), a simple and efficient priority queue for main and secondary memory. Quickheaps are comparable with classical binary heaps in simplicity, yet are more cache-friendly. This makes them an excellent alternative for a secondary-memory implementation. We show that the expected amortized CPU cost per operation over a Quickheap of m elements is O(log m), and this translates into O((1/B) log(m/M)) I/O cost with main memory size M and block size B, in a cache-oblivious fashion. As a direct application, we use our techniques to implement classical Minimum Spanning Tree (MST) algorithms. We use IQS to implement Kruskal's MST algorithm and QHs to implement Prim's. Experimental results show that IQS, QHs, external QHs, and our Kruskal's and Prim's MST variants are competitive, and in many cases better in practice than current state-of-the-art (and much more sophisticated) alternative implementations.

The offline problem can be solved by first finding the k-th smallest element of A with the O(m)-time Select algorithm [5], and then collecting and sorting the elements smaller than it. The resulting complexity, O(m + k log k), is optimal under the comparison model, as every cell must be inspected and there are Π_{0≤j<k}(m−j) possible answers, which yields a lower bound of Ω(m + k log k) comparisons. The problem can also be solved online by inserting the m elements into a priority queue and extracting the minimum k times, for O(m + k log m) worst-case time; this is optimal as long as k = O(m^{1−ε}) for some constant ε > 0, in which case m dominates k log m. However, according to experiments this scheme is much slower than the offline practical algorithm [26] if a classical heap is used. P. Sanders [32] proposes sequence heaps, a cache-aware priority queue, to solve the online problem. Sequence heaps are optimized to insert and extract all the elements in the priority queue at a small amortized cost. Even though the total CPU time this algorithm uses over the whole process of inserting and extracting all m elements is quite close to that of Quicksort, the scheme is not as efficient when we want to sort just a small fraction of the set. This raises the quest for a practical online algorithm for partial sorting. In this paper we present Incremental Quicksort (IQS), a practical and efficient algorithm that solves the online problem within O(m + k log k) expected time.
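The incremental idea can be illustrated in a few lines: keep the positions of previously used pivots on a stack, and to emit the next smallest element, partition only the leftmost unresolved segment until a pivot lands exactly on the target position. The following Python sketch follows this scheme with random Lomuto partitioning; the function name and interface are our own illustration, a simplification of the paper's algorithm rather than its definitive implementation.

```python
import random

def iqs_partial_sort(a, k):
    """Return the k smallest elements of a in ascending order,
    reusing partitioning work via a stack of pivot positions."""
    a = list(a)
    stack = [len(a)]                 # sentinel "pivot" past the end
    out = []
    for i in range(min(k, len(a))):
        while stack[-1] != i:
            # partition a[i .. stack[-1]-1] around a random pivot
            lo, hi = i, stack[-1] - 1
            p = random.randint(lo, hi)
            a[p], a[hi] = a[hi], a[p]
            pivot, store = a[hi], lo
            for j in range(lo, hi):
                if a[j] < pivot:
                    a[store], a[j] = a[j], a[store]
                    store += 1
            a[store], a[hi] = a[hi], a[store]
            stack.append(store)      # pivot now in its final position
        stack.pop()                  # a[i] is the i-th smallest
        out.append(a[i])
    return out
```

Each extraction only refines the segment to the left of the smallest stacked pivot, so the work already done for earlier extractions is never repeated; summing over k extractions gives the expected O(m + k log k) bound claimed above.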
Based on IQS, we present the Quickheap (QH), a simple and efficient data structure for implementing priority queues in main and secondary memory. Quickheaps are comparable with classical binary heaps in simplicity, yet are more cache-friendly, which makes them an excellent alternative for a secondary-memory implementation. QHs achieve O(log m) expected amortized time per operation when they fit in main memory, and O((1/B) log(m/M)) I/O cost when there are M bytes of main memory and the block size is B in secondary memory, working in a cache-oblivious fashion. IQS and QHs can be used to improve upon the current state of the art in many algorithmic scenarios. In fact, we plug them into the classic Minimum Spanning Tree (MST) techniques: we use Incremental Quicksort to boost Kruskal's MST algorithm [24], and a Quickheap to boost Prim's MST algorithm [31]. Given a graph G(V,E), we compute its MST in O(|E| + |V| log² |V|) average time. Experimental results show that IQS, QHs, external QHs and our Kruskal's and Prim's MST variants are extremely competitive, and in many cases better in practice than current state-of-the-art (and much more sophisticated) alternative implementations. IQS is approximately four times faster than the classic alternative for solving the online problem. QHs are competitive with pairing heaps [16] and up to four times faster than binary heaps [42] (according to [27], these are the fastest priority-queue implementations in practice). Using the same amount of memory, our external QHs perform up to 3 times fewer I/O accesses than R-Heaps [1] and up to 5 times fewer than Array-Heaps [8], the best alternatives tested in the survey by Brengel et al. [6]. External-memory Sequence Heaps [32], however, are faster than QHs, but they are much more sophisticated and not cache-oblivious. Finally, our Kruskal's version is much faster than any other Kruskal's implementation we could program or find, for any graph density.
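The Kruskal boost rests on a simple observation: the algorithm rarely needs all edges sorted, only the lightest ones until |V|−1 are accepted, so sorting incrementally avoids most of the work. The sketch below is a minimal Python illustration of that structure, using a binary heap as a stand-in for IQS; the function name and edge format are our own, not the paper's.

```python
import heapq

def kruskal_incremental(n, edges):
    """Kruskal's MST over n vertices, drawing edges in ascending weight
    order one at a time and stopping once the tree is complete --
    usually long before the edge list is fully sorted."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    heap = [(w, u, v) for u, v, w in edges]
    heapq.heapify(heap)                     # O(|E|); no full sort
    mst, total = [], 0
    while heap and len(mst) < n - 1:
        w, u, v = heapq.heappop(heap)       # next lightest edge, on demand
        ru, rv = find(u), find(v)
        if ru != rv:                        # accept if it joins two components
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return total, mst
```

Replacing the heap with IQS over the edge array, as the paper does, keeps the same early-termination behavior while extracting edges in optimal incremental time.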
In fact, our Kruskal's variant is faster than Prim's algorithm [31], even as optimized by B. Moret and H. Shapiro [27].
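For reference, the Prim side of the comparison has the familiar lazy priority-queue shape, which is where the Quickheap is plugged in. A minimal Python sketch, with the standard-library binary heap standing in for a QH; the name `prim_mst` and the adjacency-list format are illustrative assumptions, not the paper's interface.

```python
import heapq

def prim_mst(n, adj):
    """Lazy Prim's MST weight over a connected graph with n vertices.
    adj[u] is a list of (neighbor, weight) pairs."""
    seen = [False] * n
    heap = [(0, 0)]                  # (edge weight, vertex); start at vertex 0
    total, picked = 0, 0
    while heap and picked < n:
        w, u = heapq.heappop(heap)
        if seen[u]:
            continue                 # stale entry; a cheaper edge got there first
        seen[u] = True
        total += w
        picked += 1
        for v, wv in adj[u]:
            if not seen[v]:
                heapq.heappush(heap, (wv, v))
    return total
```

Every queue operation here becomes a Quickheap operation in the paper's variant, so the O(log m) expected amortized cost per operation carries over directly.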
doi:10.1007/s00453-010-9400-6
Fulltext PDF (Web Archive): https://web.archive.org/web/20170811122124/https://users.dcc.uchile.cl/~raparede/publ/09algorIQS.pdf