Integration of Mechanistic Immunological Knowledge into a Machine Learning Pipeline Increases Predictive Power [article]

Anthony Culos, Amy S. Tsai, Natalie Stanley, Martin Becker, Mohammad S. Ghaemi, David R. Mcilwain, Ramin Fallahzadeh, Athena Tanada, Huda Nassar, Edward Ganio, Laura Peterson, Xiaoyuan Han (+17 others)
2020 bioRxiv   pre-print
The dense network of interconnected cellular signaling responses quantifiable in peripheral immune cells provides a wealth of actionable immunological insights. While high-throughput single-cell profiling techniques, including polychromatic flow and mass cytometry, have matured to a point that enables detailed immune profiling of patients in numerous clinical settings, limited cohort size together with the high dimensionality of the data increases the possibility of false positive discoveries and model overfitting. We introduce a machine learning platform, the immunological Elastic-Net (iEN), which incorporates immunological knowledge directly into the predictive models. Importantly, the algorithm maintains the exploratory nature of the high-dimensional dataset, allowing for the inclusion of immune features with strong predictive power even if they are not consistent with prior knowledge. In three independent studies, our method demonstrates improved predictive power for clinically relevant outcomes from mass cytometry data generated from whole blood, as well as on a large simulated dataset.
doi:10.1101/2020.02.26.967232 fatcat:nlgxcvgk7jbjpk6sf55gremfsa
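
The abstract describes folding prior knowledge into an elastic-net model while still letting strongly predictive features enter when they disagree with that knowledge. One common way to get this behavior is to rescale each feature by a prior weight before fitting, so that low-prior features are penalized more heavily but never excluded. The sketch below illustrates that general idea only; it is not the authors' iEN implementation. The function names, the `prior` vector with values in (0, 1], and the toy data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def elastic_net_cd(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Plain elastic net fitted by cyclic coordinate descent.

    Minimizes (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1
              + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    """
    n, p = X.shape
    w = np.zeros(p)
    residual = y - X @ w
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant (all-zero) column carries no signal
            residual += X[:, j] * w[j]          # remove feature j's contribution
            rho = X[:, j] @ residual / n
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1.0 - l1_ratio)
            )
            residual -= X[:, j] * w[j]          # add the updated contribution back
    return w


def prior_weighted_elastic_net(X, y, prior, **kwargs):
    """Hypothetical prior-weighted variant (an assumption, not the paper's code).

    Each centered feature is multiplied by its prior weight in (0, 1].
    A weight below 1 shrinks the column, which raises the effective penalty
    on that feature without forbidding it from entering the model.
    """
    prior = np.asarray(prior, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w_scaled = elastic_net_cd(Xc * prior, yc, **kwargs)
    return w_scaled * prior  # coefficients on the original feature scale


# Toy data: feature 0 agrees with the prior, feature 3 does not,
# yet both truly drive the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = 2.0 * X[:, 0] + 1.5 * X[:, 3] + rng.normal(scale=0.1, size=40)
prior = np.array([1.0, 1.0, 1.0, 0.5, 0.5, 0.5])  # illustrative weights only

coef = prior_weighted_elastic_net(X, y, prior, alpha=0.1, l1_ratio=0.5)
```

In this toy setup, feature 3 carries a low prior weight but a strong signal, so it should still receive a nonzero coefficient, which mirrors the exploratory property the abstract emphasizes. How the prior weights are derived, and how their influence is tuned, is left open here because the abstract does not specify it.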