Nanopore sequencing data analysis: state of the art, applications and challenges

Alberto Magi, Roberto Semeraro, Alessandra Mingrino, Betti Giusti, Romina D'Aurizio
2017 Briefings in Bioinformatics  
The nanopore sequencing process is based on the transit of a DNA molecule through a nanoscopic pore, and since the 1990s it has been considered one of the most promising approaches for detecting polymeric molecules. In 2014, Oxford Nanopore Technologies (ONT) launched a beta-testing program that supplied the scientific community with the first prototype of a nanopore sequencer: the MinION. Thanks to this program, several research groups had the opportunity to evaluate the performance of this novel ... and develop novel computational approaches for analyzing this new generation of data. Despite the short time since the release of the MinION, a large number of algorithms and tools have been developed for base calling, data handling, read mapping, de novo assembly and variant discovery. Here, we address the main computational challenges related to the analysis of nanopore data, and we carry out a comprehensive and up-to-date survey of the algorithmic solutions adopted by the bioinformatics community, comparing their performance and reporting the limits and advantages of using this new generation of sequences for genomic analyses. Our analyses demonstrate that the use of nanopore data dramatically improves the de novo assembly of genomes and allows for the exploration of structural variants with unprecedented accuracy and resolution. However, despite the impressive improvements achieved by ONT in the past 2 years, the use of these data for small-variant calling is still challenging, and at present, it needs to be coupled with complementary short sequences to mitigate the intrinsic biases of nanopore sequencing technology.
Alberto Magi, PhD, is an assistant professor at the University of Florence, Italy. His research interests focus on the development of computational methods for the identification of genomic variants.
Roberto Semeraro, PhD, is a postdoctoral researcher at the University of Florence, Italy. His research is focused on computational methods for the analysis of NGS data.
Alessandra Mingrino, PhD, is a postdoctoral researcher at the University of Florence, Italy. Her research is focused on the development of novel experimental strategies with third-generation sequencing data.
Betti Giusti, PhD, is an associate professor of Clinical Pathology at the University of Florence. Her research activity has focused on the genetics of cardiovascular diseases and extracellular matrix disorders.
Romina D'Aurizio, PhD, is a postdoctoral researcher at the Italian National Research Council of Pisa. Her research is focused on computational methods for the analysis of high-throughput sequencing data.
doi:10.1093/bib/bbx062 pmid:28637243 fatcat:lxiyn4exdfaibf4helkij43p2q