Peer Review #2 of "Massively parallel read mapping on GPUs with the q-group index and PEANUT (v0.2)" [peer_review]

B Lam
2014 unpublished
We present the q-group index, a novel data structure for read mapping tailored towards graphics processing units (GPUs) with a small memory footprint and efficient parallel algorithms for querying and building. On top of the q-group index we introduce PEANUT, a highly parallel GPU-based read mapper. PEANUT can output either the best hits or all hits of a read. Our benchmarks show that PEANUT outperforms other state-of-the-art read mappers in terms of speed while maintaining or slightly increasing precision, recall and sensitivity. The software is available at http://peanut.readthedocs.org.
Recently, exploiting the parallelization capabilities of graphics processing units (GPUs) for read mapping has become popular, and GPU-based BWT read mappers have appeared, e.g. SOAP3 (Liu et al., 2012), SOAP3-dp (Luo et al., 2013) and CUSHAW2-GPU (Liu and Schmidt, 2014). Using a q-gram index on a GPU is not a common choice because of its large size. Therefore, to the best of our knowledge, q-gram index based mappers so far only use the GPU for calculating the alignments and keep the index on the CPU, e.g. Saruman (Blom et al., 2011) and NextGenMap (Sedlazeck et al., 2013). Here, we present the q-group index, a novel data structure for read mapping which is a variant of the classical q-gram index with a particularly small memory footprint. The q-group index comes with efficient parallel algorithms for building and querying, targeted towards modern GPUs. To the best of our knowledge, the q-group index is the first feasible implementation of q-gram index functionality on the GPU. On top of the q-group index we present PEANUT (ParallEl AligNment UTility), a GPU-based massively parallel read mapper. PEANUT provides both an all-mapping and a best-mapping mode and is the first GPU-based all-mapper.
With both a recent and a four-year-old NVIDIA GeForce GPU, we show that PEANUT outperforms other state-of-the-art best-mappers and all-mappers. For all-mapping, PEANUT is 4 to 10 times faster. At default parameters, PEANUT shows slightly higher precision and recall than other best-mappers and improved sensitivity compared to other all-mappers.
This article is structured as follows. We first discuss the GPU architecture and its implications for designing the q-group index to maximize parallel GPU usage (Section 2.1). Then, we describe the q-group index data structure (Section 2.2) and present the PEANUT approach of read mapping with the q-group index (Section 2.3). Section 3 shows benchmark results on speed, precision, recall and sensitivity of PEANUT. A brief discussion concludes the paper.
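For readers unfamiliar with the classical q-gram index that the q-group index is described as a variant of, the following is a minimal CPU-side sketch in Python. It is not the q-group index and not PEANUT's implementation, since this record does not describe the layout of either. The function names `build_qgram_index` and `candidate_diagonals`, the toy sequences and the choice q = 4 are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def build_qgram_index(reference: str, q: int) -> Dict[str, List[int]]:
    """Map every length-q substring of the reference to its start positions.

    Hypothetical helper: a plain hash-based q-gram index, with none of the
    memory-saving tricks the q-group index is said to provide.
    """
    index = defaultdict(list)
    for pos in range(len(reference) - q + 1):
        index[reference[pos:pos + q]].append(pos)
    return dict(index)


def candidate_diagonals(index: Dict[str, List[int]], read: str, q: int) -> Dict[int, int]:
    """Count q-gram hits per diagonal (reference position minus read position).

    A diagonal with many hits is a candidate mapping location that a read
    mapper would then verify with a full alignment.
    """
    counts = defaultdict(int)
    for i in range(len(read) - q + 1):
        for pos in index.get(read[i:i + q], []):
            counts[pos - i] += 1
    return dict(counts)


# Toy data, assumed for illustration only.
reference = "ACGTACGTTAGC"
index = build_qgram_index(reference, 4)
hits = candidate_diagonals(index, "CGTTAG", 4)
```

Here all three 4-grams of the read land on the same diagonal, so the read has a single candidate location starting at reference position 5. The size problem mentioned above comes from storing one position entry per reference q-gram, which for a genome-scale reference is what makes keeping such an index in GPU memory difficult.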
doi:10.7287/peerj.606v0.2/reviews/2 fatcat:i63pctgekrdulhy5e5ey4mhkkm