An evaluation of image descriptors combined with clinical data for breast cancer diagnosis
International Journal of Computer Assisted Radiology and Surgery
Purpose Breast cancer computer-aided diagnosis (CADx) may utilize image descriptors, demographics, clinical observations, or a combination. CADx performance was compared for several image features, clinical descriptors (e.g. age and radiologist's observations), and combinations of both kinds of data. A novel descriptor invariant to rotation, histograms of gradient divergence (HGD), was developed to deal with round-shaped objects, such as masses. HGD was compared with conventional CADx features.
Method HGD and 11 conventional image descriptors were evaluated using cases from two publicly available mammography data sets, the Digital Database for Screening Mammography (DDSM) and the Breast Cancer Digital Repository (BCDR), with 1,762 and 362 instances, respectively. Three experiments were run for each data set according to the type of lesion (i.e., all lesions, masses, and calcifications), resulting in six scenarios. For each scenario, 100 training and test sets were generated via resampling without replacement, and five machine learning classifiers were used to assess the diagnostic performance of the descriptors.
Results Clinical descriptors outperformed image descriptors in the DDSM sample (three out of six scenarios), and combining the two kinds of descriptors was advantageous in five out of six scenarios. HGD was the best descriptor (or comparable to the best) in 8 out of 12 scenarios, demonstrating promising capabilities to describe masses.
Conclusions The combination of clinical data and image descriptors was advantageous in most mammography CADx scenarios. A new descriptor based on the divergence of the gradient (HGD) was shown to be a feasible predictor of the diagnosis of breast masses.
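The resampling protocol named under Method (100 training and test sets per scenario, drawn without replacement) can be illustrated with a minimal sketch. The abstract does not state the train/test proportion, the random seed, or the software used, so `train_fraction=0.8`, `seed=0`, and the function name `resample_splits` are assumptions made purely for illustration, not the authors' implementation.

```python
import random


def resample_splits(n_instances, n_splits=100, train_fraction=0.8, seed=0):
    """Return a list of (train_idx, test_idx) pairs.

    Each pair partitions the instance indices at random, so no instance
    appears twice within a split (resampling without replacement).
    train_fraction and seed are assumed values, not taken from the paper.
    """
    rng = random.Random(seed)
    n_train = int(round(n_instances * train_fraction))
    indices = list(range(n_instances))
    splits = []
    for _ in range(n_splits):
        # rng.sample with k == len(indices) yields a random permutation.
        shuffled = rng.sample(indices, k=n_instances)
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits


# 362 is the BCDR instance count reported in the text.
splits = resample_splits(362)
```

Each of the resulting index pairs would then be used to train and test every classifier, and the per-split scores aggregated to compare descriptors.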