Predicting Visual Semantic Descriptive Terms From Radiological Image Data: Preliminary Results With Liver Lesions in CT

Adrien Depeursinge, Camille Kurtz, Christopher Beaulieu, Sandy Napel, Daniel Rubin
2014 IEEE Transactions on Medical Imaging  
We describe a framework to model visual semantics of liver lesions in CT images in order to predict the visual semantic terms (VST) reported by radiologists in describing these lesions. Computational models of VST are learned from image data using high-order steerable Riesz wavelets and support vector machines (SVM). The organization of scales and directions specific to each VST is modeled as linear combinations of directional Riesz wavelets. The models obtained are steerable, which means that any orientation of the model can be synthesized from linear combinations of the basis filters. The latter property is leveraged to model VST independently from their local orientation. In a first step, these models are used to predict the presence of each semantic term that describes liver lesions. In a second step, the distances between all VST models are calculated to establish a non-hierarchical computationally-derived ontology of VST containing inter-term synonymy and complementarity. A preliminary evaluation of the proposed framework was carried out using 74 liver lesions annotated with a set of 18 VSTs from the RadLex ontology. A leave-one-patient-out cross-validation resulted in an average area under the ROC curve of 0.853 for predicting the presence of each VST when using SVMs in a feature space combining the magnitudes of the steered models with CT intensities. Likelihood maps are created for each VST, which enables high transparency of the information modeled. The computationally-derived ontology obtained from the VST models was found to be consistent with the underlying semantics of the visual terms. It was found to be complementary to the RadLex ontology, and constitutes a potential method to link the image content to visual semantics. The proposed framework is expected to foster human-computer synergies for the interpretation of radiological images while using rotation-covariant computational models of VSTs to (1) quantify their local likelihood and (2) explicitly link them with pixel-based image content in the context of a given imaging domain.
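The steerability property mentioned in the abstract can be illustrated with a much simpler filter family than the one used in the paper. The sketch below is an assumption-laden stand-in: it uses first-order Gaussian derivative filters rather than high-order Riesz wavelets, and the function names (`gaussian_derivative_basis`, `steer`, `directly_oriented`, `max_abs_diff`) and parameters (`size`, `sigma`) are hypothetical, chosen only for illustration. It shows that a filter at an arbitrary orientation can be synthesized as a linear combination of two fixed basis filters, and that the result matches a filter sampled directly at that orientation.

```python
import math


def gaussian_derivative_basis(size=9, sigma=1.5):
    """Return (gx, gy): x- and y-derivatives of a 2D Gaussian on a size x size grid.

    Illustrative stand-in for a steerable filter bank; NOT the Riesz wavelets
    used in the paper.
    """
    half = size // 2
    gx, gy = [], []
    for y in range(-half, half + 1):
        row_x, row_y = [], []
        for x in range(-half, half + 1):
            g = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row_x.append(-x / sigma ** 2 * g)
            row_y.append(-y / sigma ** 2 * g)
        gx.append(row_x)
        gy.append(row_y)
    return gx, gy


def steer(gx, gy, theta):
    """Synthesize the filter oriented at angle theta from the two basis filters."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * a + s * b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


def directly_oriented(size=9, sigma=1.5, theta=0.0):
    """Reference: sample the derivative along direction theta without using the basis."""
    half = size // 2
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = c * x + s * y  # coordinate along the direction theta
            g = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row.append(-u / sigma ** 2 * g)
        out.append(row)
    return out


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two 2D lists."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


if __name__ == "__main__":
    gx, gy = gaussian_derivative_basis()
    theta = math.radians(30)
    diff = max_abs_diff(steer(gx, gy, theta), directly_oriented(theta=theta))
    print("max difference between steered and directly sampled filter:", diff)
```

Because the oriented filter is obtained from fixed basis responses, an image needs to be filtered only once per basis filter; responses at any orientation then follow from the same linear combination, which is what makes orientation-independent modeling practical.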
doi:10.1109/tmi.2014.2321347 pmid:24808406 pmcid:PMC4129229 fatcat:y32sv3lkdndjpox6phsi6t6q24