Robust Machine Learning for Colorectal Cancer Risk Prediction and Stratification

Bradley J. Nartowt, Gregory R. Hart, Wazir Muhammad, Ying Liang, Gigi F. Stark, Jun Deng
<span title="2020-03-10">2020</span> <i title="Frontiers Media SA"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/zsdtezgyevhsxncisld4z5oxfe" style="color: black;">Frontiers in Big Data</a> </i> &nbsp;
While colorectal cancer (CRC) is third in prevalence and mortality among cancers in the United States, there is no effective method to screen the general public for CRC risk. In this study, to identify an effective mass screening method for CRC risk, we evaluated seven supervised machine learning algorithms: linear discriminant analysis, support vector machine, naive Bayes, decision tree, random forest, logistic regression, and artificial neural network. Models were trained and cross-tested
more &raquo; ... the National Health Interview Survey (NHIS) and the Prostate, Lung, Colorectal, Ovarian Cancer Screening (PLCO) datasets. Six imputation methods were used to handle missing data: mean, Gaussian, Lorentzian, one-hot encoding, Gaussian expectation-maximization, and listwise deletion. Among all of the model configurations and imputation method combinations, the artificial neural network with expectation-maximization imputation emerged as the best, having a concordance of 0.70 ± 0.02, sensitivity of 0.63 ± 0.06, and specificity of 0.82 ± 0.04. In stratifying CRC risk in the NHIS and PLCO datasets, only 2% of negative cases were misclassified as high risk and 6% of positive cases were misclassified as low risk. In modeling the CRC-free probability with Kaplan-Meier estimators, low-, medium-, and high CRC-risk groups have statistically-significant separation. Our results indicated that the trained artificial neural network can be used as an effective screening tool for early intervention and prevention of CRC in large populations.
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.3389/fdata.2020.00006">doi:10.3389/fdata.2020.00006</a> <a target="_blank" rel="external noopener" href="https://www.ncbi.nlm.nih.gov/pubmed/33693381">pmid:33693381</a> <a target="_blank" rel="external noopener" href="https://pubmed.ncbi.nlm.nih.gov/PMC7931964/">pmcid:PMC7931964</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/gmunloykhzbkdnzyjoc37kz5ee">fatcat:gmunloykhzbkdnzyjoc37kz5ee</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220121230549/http://europepmc.org/backend/ptpmcrender.fcgi?accid=PMC7931964&amp;blobtype=pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/c1/66/c1664a85b3a6e8f66317d99205a0a20ff1f5d66b.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.3389/fdata.2020.00006"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> frontiersin.org </button> </a> <a target="_blank" rel="external noopener" href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931964" title="pubmed link"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> pubmed.gov </button> </a>