SeMPI: a genome-based secondary metabolite prediction and identification web server

Paul F. Zierep, Natàlia Padilla, Dimitar G. Yonchev, Kiran K. Telukunta, Dennis Klementz, Stefan Günther
2017 Nucleic Acids Research  
The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data provides the opportunity for efficient identification of gene clusters by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not yet been identified. Even though genome mining tools have become significantly more efficient at identifying biosynthetic gene clusters, structural elucidation of the actual secondary metabolite is still challenging, especially due to as yet unpredictable post-modifications. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by modular type I polyketide synthases (PKS). In order to limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The web server of the pipeline (SeMPI) is freely available at: http://www.pharmaceuticalbioinformatics.de/sempi.
doi:10.1093/nar/gkx289 pmid:28453782 pmcid:PMC5570227 fatcat:kr5tzd2rdjawvax6jjtrmpureq