Flexible and efficient handling of nanopore sequencing signal data with slow5tools [article]

Hiruna Samarakoon, James M Ferguson, Sasha P. Jenner, Timothy G. Amos, Sri Parameswaran, Hasindu Gamaarachchi, Ira W Deveson
2022 bioRxiv pre-print
Nanopore sequencing is an emerging technology that is being rapidly adopted in research and clinical genomics. We recently developed SLOW5, a new file format for storage and analysis of raw data from nanopore sequencing experiments. SLOW5 is a community-centric, open source format that offers considerable performance benefits over the existing nanopore data format, known as FAST5. Here we introduce slow5tools, a simple, intuitive toolkit for handling nanopore raw signal data in SLOW5 format.
Results: Slow5tools enables lossless FAST5-to-SLOW5 and SLOW5-to-FAST5 data conversion, and provides a range of tools for structuring, indexing, viewing and querying SLOW5 files. Slow5tools uses multi-threading, multi-processing and other engineering strategies to achieve fast data conversion and manipulation, including live FAST5-to-SLOW5 conversion during sequencing. We outline a series of examples and benchmarking experiments to illustrate slow5tools usage, and describe the engineering principles underpinning its high performance. Conclusion: Slow5tools is an essential toolkit for handling nanopore signal data, which was developed to support adoption of SLOW5 by the nanopore community. Slow5tools is written in C/C++ with minimal dependencies and is freely available as an open-source program under an MIT licence: https://github.com/hasindu2008/slow5tools.
doi:10.1101/2022.06.19.496732 fatcat:5l4dzaettzetbawmjbifbspyhm