PathoLive - Real time pathogen identification from metagenomic Illumina datasets
[article]
Simon H. Tausch, Tobias P. Loka, Jakob M. Schulze, Andreas Andrusch, Jeanette Klenner, Piotr Wojciech Dabrowski, Martin S. Lindner, Andreas Nitsche, Bernhard Y. Renard
2018
bioRxiv
pre-print
Over the past years, NGS has been applied in time-critical applications such as pathogen diagnostics with promising results. Yet long turnaround times have to be accepted to generate sufficient data, as the analysis can only be performed sequentially after the sequencing has finished. Additionally, the interpretation of results can be further complicated by various types of contamination, clinically irrelevant sequences, and the sheer amount and complexity of the data. We designed and implemented PathoLive, a real-time diagnostics pipeline that allows the detection of pathogens from clinical samples up to several days before the sequencing procedure is even finished and currently available tools can start to run. We adapted the core algorithm of HiLive, a real-time read mapper, and enhanced its accuracy for our use case. Furthermore, common contaminants, low-entropy regions, and sequences of widespread, non-pathogenic organisms are automatically marked beforehand using NGS datasets from healthy humans as a baseline. The results are visualized in an interactive taxonomic tree that provides an intuitive overview and detailed measures regarding the relevance of each identified potential pathogen. We applied the pipeline to a human plasma sample that was spiked in vitro with vaccinia virus, yellow fever virus, mumps virus, Rift Valley fever virus, adenovirus, and mammalian orthoreovirus. The sample was then sequenced on an Illumina HiSeq. All spiked agents were detected after the completion of only 12% of the sequencing procedure and were ranked more accurately throughout the run than by any of the tested tools on the complete data. We also found a large number of other sequences, which were correctly marked as clinically irrelevant in the resulting visualization. This tagging allows the user to assess the situation correctly at first glance. PathoLive is available at https://gitlab.com/rki_bioinformatics/PathoLive.
doi:10.1101/402370
fatcat:yjgnsnyhkbfidd5ad7m52kixpe