Discovery of an expansive bacteriophage family that includes the most abundant viruses from the human gut

Natalya Yutin, Kira S. Makarova, Ayal B. Gussow, Mart Krupovic, Anca Segall, Robert A. Edwards, Eugene V. Koonin
2017 Nature Microbiology  
Metagenomic sequence analysis is rapidly becoming the primary source of virus discovery 1-3. A substantial majority of the currently available virus genomes comes from metagenomics, and some of these represent extremely abundant viruses even if never grown in the laboratory. A particularly striking case of a virus discovered via metagenomics is crAssphage, which is by far the most abundant human-associated virus known, comprising up to 90% of the sequences in the gut virome 4. Over 80% of the predicted proteins encoded in the approximately 100 kilobase crAssphage genome showed no significant similarity to available protein sequences, precluding classification of this virus and hampering further study. Here we combine a comprehensive search of genomic and metagenomic databases with sensitive methods for protein sequence analysis to identify an expansive, diverse group of bacteriophages related to crAssphage and predict the functions of the majority of phage proteins, in particular, those that comprise the structural, replication and expression modules. Most if not all of the crAss-like phages appear to be associated with diverse bacteria from the phylum Bacteroidetes, which includes some of the most abundant bacteria in the human gut microbiome and whose members are also common in various other habitats. These findings provide for experimental characterization of the most abundant but poorly understood members of the human-associated virome.

Viruses are the most abundant biological entities on earth: in most environments, from ocean water to the content of animal guts, the number of detected virus particles exceeds that of ...
doi:10.1038/s41564-017-0053-y pmid:29133882 pmcid:PMC5736458 fatcat:32i3y5sidzchha2irgibdhinma