Exploratory ensemble interpretable model for predicting local failure in head and neck cancer: the additive benefit of CT and intra-treatment cone-beam computed tomography features

Howard E. Morgan, Kai Wang, Michael Dohopolski, Xiao Liang, Michael R. Folkert, David J. Sher, Jing Wang
2021 Quantitative Imaging in Medicine and Surgery  
Local failure (LF) following chemoradiation (CRT) for head and neck cancer is associated with poor overall survival. If machine learning techniques could stratify patients at risk of treatment failure based on baseline and intra-treatment imaging, such a model could facilitate response-adapted approaches to escalate, de-escalate, or switch therapy. A 1:2 retrospective case-control cohort of patients treated at a single institution with definitive radiotherapy for head and neck cancer who failed locally, in-field at a primary or nodal structure, was included. Radiomic features were extracted with PyRadiomics from baseline CT and from CBCT scans at fractions 1 and 21 (delta) of radiotherapy, and were selected on three criteria: reproducibility (intra-class correlation coefficients ≥0.95), redundancy [maximum relevance and minimum redundancy (mRMR)], and informativeness [recursive feature elimination (RFE)]. Separate models predicting LF of primaries or nodes were created using the explainable boosting machine (EBM) classifier with 5-fold cross-validation for (I) clinical only, (II) radiomic only (CT1 and delta features), and (III) fused models (clinical + radiomic). Twenty-five iterations were performed, and predicted scores were averaged with a parallel ensemble design. Receiver operating characteristic curves were compared between models with paired-samples t-tests. The fused ensemble model for primaries (using clinical, CT1, and delta features) achieved an AUC of 0.871 with a sensitivity of 78.3% and specificity of 90.9% at the maximum Youden J statistic. The fused ensemble model trended towards improvement when compared to the clinical only ensemble model (AUC = 0.788, P = 0.134) and reached significance when compared to the radiomic ensemble model (AUC = 0.770, P = 0.017). The fused ensemble model for nodes achieved an AUC of 0.910 with a sensitivity of 100.0% and specificity of 68.0%, which also trended towards improvement when compared to the clinical model (AUC = 0.865, P = 0.080). The fused ensemble EBM model achieved high discriminatory ability at predicting LF for head and neck cancer in independent primary and nodal structures. Although an additive benefit of delta radiomics over clinical factors could not be proven, the results trended towards improvement with the fused ensemble model; these findings are promising and worthy of prospective investigation in a larger cohort.
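The abstract describes two steps that are easy to illustrate: averaging predicted scores across repeated iterations (the parallel ensemble) and reading sensitivity and specificity at the maximum Youden J statistic, J = sensitivity + specificity − 1. The following is a minimal sketch of those two steps only, not the authors' code. The function names `ensemble_average` and `youden_operating_point` are hypothetical, the scores are invented toy values, and two runs stand in for the study's twenty-five iterations.

```python
from statistics import mean


def ensemble_average(score_runs):
    """Average each patient's predicted score across repeated runs.

    score_runs: list of runs, each a list of scores in the same patient order.
    """
    return [mean(patient_scores) for patient_scores in zip(*score_runs)]


def youden_operating_point(labels, scores):
    """Return (threshold, sensitivity, specificity, J) at the maximum Youden J.

    A patient is called positive when score >= threshold. On ties in J the
    lowest threshold is kept (an arbitrary choice made for this sketch).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("labels must contain both classes")

    best = None
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        true_neg = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:
            best = (threshold, sensitivity, specificity, j)
    return best


# Toy data (invented for illustration): 1 = local failure, 0 = control.
labels = [1, 1, 1, 0, 0, 0, 0]
run_1 = [0.875, 0.5, 0.25, 0.5, 0.125, 0.25, 0.0]
run_2 = [0.625, 0.5, 0.25, 0.25, 0.125, 0.0, 0.25]

averaged = ensemble_average([run_1, run_2])
threshold, sensitivity, specificity, j = youden_operating_point(labels, averaged)
print(averaged)
print(threshold, sensitivity, specificity, j)
```

In this toy example the maximum J falls at a threshold that captures every failure at the cost of some false positives, the same kind of trade-off as the nodal model's reported operating point (100.0% sensitivity, 68.0% specificity).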
doi:10.21037/qims-21-274 pmid:34888189 pmcid:PMC8611459 fatcat:rhtqge4vbfhv5neer73chyy4vi