Parallel Weighted Random Sampling [in press]

Lorenz Hübschle-Schneider, Peter Sanders
Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps for both shared-memory and distributed-memory machines. We give efficient, fast, and practicable algorithms for sampling single items, sampling k items with or without replacement, permutation, subset sampling, and reservoir sampling. Our output-sensitive algorithm for sampling with replacement also improves the state of the art for sequential algorithms.
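For readers unfamiliar with the underlying problem, the following is a minimal sequential sketch of sampling a single item with probability proportional to its weight, using prefix sums and binary search. It is a textbook baseline, not one of the algorithms from the paper, and the names `build_prefix_sums` and `sample_index` are hypothetical.

```python
import bisect
import itertools
import random


def build_prefix_sums(weights):
    """Precompute running totals of the weights (O(n) time and space)."""
    if not weights:
        raise ValueError("weights must be non-empty")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    prefix = list(itertools.accumulate(weights))
    if prefix[-1] <= 0:
        raise ValueError("total weight must be positive")
    return prefix


def sample_index(prefix, rng):
    """Draw one index i with probability weights[i] / total (O(log n))."""
    # Pick a point uniformly in [0, total) and find the item whose
    # interval of the cumulative weight line contains it.
    u = rng.random() * prefix[-1]
    i = bisect.bisect_right(prefix, u)
    # Guard against a floating-point edge case at the upper end.
    return min(i, len(prefix) - 1)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    prefix = build_prefix_sums([0, 1, 0, 2])
    draws = [sample_index(prefix, rng) for _ in range(10)]
    print(draws)  # only indices 1 and 3 can appear; 3 is twice as likely
```

Calling `sample_index` repeatedly on the same prefix array gives sampling with replacement at O(log n) cost per draw. This sketch is purely sequential and says nothing about how the paper parallelizes the problem.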
doi:10.5445/ir/1000097067 fatcat:hdf4z6x5u5d2bby2iedabonzqi