Utility of deep neural networks in predicting gross-total resection after transsphenoidal surgery for pituitary adenoma: a pilot study

Victor E. Staartjes, Carlo Serra, Giovanni Muscas, Nicolai Maldaner, Kevin Akeret, Christiaan H. B. van Niftrik, Jorn Fierstra, David Holzmann, Luca Regli
2018 Neurosurgical Focus  
OBJECTIVEGross-total resection (GTR) is often the primary surgical goal in transsphenoidal surgery for pituitary adenoma. Existing classifications are effective at predicting GTR but are often hampered by limited discriminatory ability in moderate cases and by poor interrater agreement. Deep learning, a subset of machine learning, has recently established itself as highly effective in forecasting medical outcomes. In this pilot study, the authors aimed to evaluate the utility of using deep
more » ... ing to predict GTR after transsphenoidal surgery for pituitary adenoma.METHODSData from a prospective registry were used. The authors trained a deep neural network to predict GTR from 16 preoperatively available radiological and procedural variables. Class imbalance adjustment, cross-validation, and random dropout were applied to prevent overfitting and ensure robustness of the predictive model. The authors subsequently compared the deep learning model to a conventional logistic regression model and to the Knosp classification as a gold standard.RESULTSOverall, 140 patients who underwent endoscopic transsphenoidal surgery were included. GTR was achieved in 95 patients (68%), with a mean extent of resection of 96.8% ± 10.6%. Intraoperative high-field MRI was used in 116 (83%) procedures. The deep learning model achieved excellent area under the curve (AUC; 0.96), accuracy (91%), sensitivity (94%), and specificity (89%). This represents an improvement in comparison with the Knosp classification (AUC: 0.87, accuracy: 81%, sensitivity: 92%, specificity: 70%) and a statistically significant improvement in comparison with logistic regression (AUC: 0.86, accuracy: 82%, sensitivity: 81%, specificity: 83%) (all p < 0.001).CONCLUSIONSIn this pilot study, the authors demonstrated the utility of applying deep learning to preoperatively predict the likelihood of GTR with excellent performance. Further training and validation in a prospective multicentric cohort will enable the development of an easy-to-use interface for use in clinical practice.
doi:10.3171/2018.8.focus18243 fatcat:wwxowr3dnbh2lk3ijkepmbekwi