A host subtraction database for virus discovery in human cell line sequencing data

Jason R. Miller, Kari A. Dilley, Derek M. Harkins, Timothy B. Stockwell, Reed S. Shabman, Granger G. Sutton
2019 F1000Research  
The human cell lines HepG2, HuH-7, and Jurkat are commonly used for amplification of the RNA viruses present in environmental samples. To assist with assays by RNAseq, we sequenced these cell lines and developed a subtraction database that contains sequences expected in sequence data from uninfected cells. RNAseq data from cell lines infected with Sendai virus were analyzed to test host subtraction. The process of mapping RNAseq reads to our subtraction database vastly reduced the number of non-viral reads in the dataset to allow for efficient secondary analyses.
doi:10.12688/f1000research.13580.3 fatcat:bdjmnnw6kjdoxcozezjdgwcsgy