A large scale prediction of bacteriocin gene blocks suggests a wide functional spectrum for bacteriocins

James T. Morton, Stefan D. Freed, Shaun W. Lee, Iddo Friedberg
2015 BMC Bioinformatics  
Bacteriocins are peptide-derived molecules produced by bacteria, whose recently-discovered functions include virulence factors and signaling molecules as well as their better known roles as antibiotics. To date, close to five hundred bacteriocins have been identified and classified. Recent discoveries have shown that bacteriocins are highly diverse and widely distributed among bacterial species. Given the heterogeneity of bacteriocin compounds, many tools struggle with identifying novel
more » ... cins due to their vast sequence and structural diversity. Many bacteriocins undergo post-translational processing or modifications necessary for the biosynthesis of the final mature form. Enzymatic modification of bacteriocins as well as their export is achieved by proteins whose genes are often located in a discrete gene cluster proximal to the bacteriocin precursor gene, referred to as context genes in this study. Although bacteriocins themselves are structurally diverse, context genes have been shown to be largely conserved across unrelated species. Methods: Using this knowledge, we set out to identify new candidates for context genes which may clarify how bacteriocins are synthesized, and identify new candidates for bacteriocins that bear no sequence similarity to known toxins. To achieve these goals, we have developed a software tool, Bacteriocin Operon and gene block Associator (BOA) that can identify homologous bacteriocin associated gene blocks and predict novel ones. BOA generates profile Hidden Markov Models from the clusters of bacteriocin context genes, and uses them to identify novel bacteriocin gene blocks and operons. Results and conclusions: We provide a novel dataset of predicted bacteriocins and context genes. We also discover that several phyla have a strong preference for bacteriocin genes, suggesting distinct functions for this group of molecules. Software Availability: https://github.com/idoerg/BOA
doi:10.1186/s12859-015-0792-9 pmid:26558535 pmcid:PMC4642626 fatcat:fi5cmh354ze23ojo6xqpvupipe