HumGut: A comprehensive Human Gut prokaryotic genomes collection filtered by metagenome data [article]

Pranvera Hiseni, Knut Rudi, Robert C. Wilson, Finn Terje Hegge, Lars Snipen
2020 bioRxiv   pre-print
Abstract: A major challenge in human gut microbiome studies is the lack of a publicly accessible human gut genome collection that is verifiably complete. We aimed to create HumGut, a comprehensive collection of healthy human gut prokaryotic genomes, to be used as a reference for worldwide human gut microbiome studies. We screened >2,300 healthy human gut metagenomes for the containment of >486,000 publicly available prokaryotic genomes. The contained genomes were then scored, ranked, and clustered based on their sequence identity, keeping only representative genomes per cluster, which resulted in HumGut. A benchmark advantage of HumGut is its superior performance in the taxonomic assignment of metagenomic reads, classifying 97% of reads on average. Re-analyses of healthy gut samples using HumGut revealed that >90% contained a core set of 129 bacterial species and that, on average, the guts of healthy people contain around 1,000 bacterial species. The HumGut collection will be updated continuously as the list of publicly available genomes and metagenomes expands. Our approach can also be extended to disease-associated genomes and metagenomes, in addition to other species. The comprehensive yet slim HumGut database streamlines analyses while significantly improving taxonomic assignments in a field in dire need of method standardization and effectiveness.
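The abstract's rank-then-cluster step can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the scoring scheme, the identity measure, or the clustering threshold, so the function `pick_representatives`, the 0.95 threshold, and the toy genome names and values below are all hypothetical. The sketch assumes a simple greedy scheme in which genomes are visited from highest to lowest score and each one either joins the first existing representative it is sufficiently similar to or becomes a new representative.

```python
from typing import Callable, Dict, List


def pick_representatives(
    scores: Dict[str, float],
    identity: Callable[[str, str], float],
    threshold: float = 0.95,  # hypothetical cutoff, not taken from the paper
) -> Dict[str, List[str]]:
    """Greedy clustering: map each representative to its cluster members."""
    clusters: Dict[str, List[str]] = {}
    # Rank genomes by score, best first; ties are broken by name.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    for genome in ranked:
        for rep in clusters:
            # Join the first representative that is similar enough.
            if identity(genome, rep) >= threshold:
                clusters[rep].append(genome)
                break
        else:
            # No representative is similar enough: start a new cluster.
            clusters[genome] = [genome]
    return clusters


# Toy inputs (entirely made up for illustration).
toy_scores = {"gA": 0.9, "gB": 0.7, "gC": 0.8, "gD": 0.2}
toy_identity = {
    frozenset(("gA", "gB")): 0.98,
    frozenset(("gA", "gC")): 0.80,
    frozenset(("gA", "gD")): 0.70,
    frozenset(("gB", "gC")): 0.82,
    frozenset(("gB", "gD")): 0.71,
    frozenset(("gC", "gD")): 0.96,
}


def lookup_identity(a: str, b: str) -> float:
    """Symmetric lookup of the toy pairwise identities."""
    return 1.0 if a == b else toy_identity[frozenset((a, b))]


result = pick_representatives(toy_scores, lookup_identity)
print(result)
```

With these toy values, gA and gC end up as representatives, with gB grouped under gA and gD under gC. Visiting genomes in score order means that the best-scoring member of each cluster is the one retained, which is the property the abstract describes.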
doi:10.1101/2020.03.25.007666 fatcat:jl52bdeppbdh3lhm55yoliae64