SIAMCAT: user-friendly and versatile machine learning workflows for statistically rigorous microbiome analyses [article]

Jakob Wirbel, Konrad Zych, Morgan Essex, Nicolai Karcher, Ece Kartal, Guillem Salazar, Peer Bork, Shinichi Sunagawa, Georg Zeller
2020 bioRxiv pre-print
The human microbiome is increasingly mined for diagnostic and therapeutic biomarkers. However, computational tools tailored to such analyses are still scarce. Here, we present the SIAMCAT R package, a versatile and user-friendly toolbox for comparative metagenome analyses using machine learning (ML), statistical tests, and visualization. Based on a large meta-analysis of gut microbiome studies, we optimized the choice of ML algorithms and preprocessing routines for default workflow settings. Furthermore, we illustrate common pitfalls leading to overfitting and show how SIAMCAT safeguards against this to make statistically rigorous ML workflows broadly accessible. SIAMCAT is available from and Bioconductor.
doi:10.1101/2020.02.06.931808 fatcat:jraubeuycrbt5chykvybdlzzuy