The file type is application/pdf.
Sketching and Sublinear Data Structures in Genomics
2019
Annual Review of Biomedical Data Science
Large-scale genomics demands computational methods that scale sublinearly with the growth of data. We review several data structures and sketching techniques that have been used in genomic analysis methods. Specifically, we focus on four key ideas that take different approaches to achieve sublinear space usage and processing time: compressed full text indices, approximate membership query data structures, locality-sensitive hashing, and minimizers schemes. We describe these techniques at a high
doi:10.1146/annurev-biodatasci-072018-021156
fatcat:zlqdv6ke4vdmvgaaqwvvd53iae