Comparison of Objective Image Quality Metrics to Expert Radiologists' Scoring of Diagnostic Quality of MR Images
IEEE Transactions on Medical Imaging
Image quality metrics (IQMs) such as root mean square error (RMSE) and structural similarity index (SSIM) are commonly used in the evaluation and optimization of accelerated magnetic resonance imaging (MRI) acquisition and reconstruction strategies. However, it is unknown how well these indices relate to a radiologist's perception of diagnostic image quality. In this study, we compare the image quality scores of five radiologists with the RMSE, SSIM, and other potentially useful IQMs: peak signal to noise ratio (PSNR), multi-scale SSIM (MSSSIM), information-weighted SSIM (IWSSIM), gradient magnitude similarity deviation (GMSD), feature similarity index (FSIM), high dynamic range visible difference predictor (HDRVDP), noise quality metric (NQM), and visual information fidelity (VIF). The comparison uses a database of MR images of the brain and abdomen that have been retrospectively degraded by noise, blurring, undersampling, motion, and wavelet compression, for a total of 414 degraded images. A total of 1017 subjective scores were assigned by five radiologists. IQM performance was measured via the Spearman rank order correlation coefficient (SROCC), and the residuals between the IQM scores and the radiologists' scores were tested for statistically significant differences. When SROCC was calculated by combining scores from all radiologists across all image types, RMSE and SSIM had lower SROCC than six of the other IQMs included in the study (VIF, FSIM, NQM, GMSD, IWSSIM, and HDRVDP). In no case did SSIM have a higher SROCC or significantly smaller residuals than RMSE. These results should be considered when choosing an IQM in future imaging studies.
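As a rough illustration of the evaluation described above, the following Python sketch computes RMSE for a few degraded images against a reference and then measures how well that metric tracks subjective scores using SROCC. It is not the study's implementation: the pixel values, the scores, and the helper names (`rmse`, `average_ranks`, `srocc`) are hypothetical, images are simplified to flat lists of intensities, and the 1 to 5 scoring scale is an assumption made only for this example.

```python
import math


def rmse(reference, degraded):
    """Root mean square error between two equally sized pixel sequences."""
    if len(reference) != len(degraded):
        raise ValueError("images must have the same number of pixels")
    squared = [(r - d) ** 2 for r, d in zip(reference, degraded)]
    return math.sqrt(sum(squared) / len(squared))


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def srocc(x, y):
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: one reference image and four degraded versions.
reference = [10.0, 20.0, 30.0, 40.0]
degraded_images = [
    [10.0, 20.0, 30.0, 40.0],
    [11.0, 19.0, 31.0, 39.0],
    [14.0, 16.0, 34.0, 36.0],
    [20.0, 10.0, 40.0, 30.0],
]
# Hypothetical subjective scores (assumed scale: higher means better quality).
radiologist_scores = [5, 4, 4, 1]

rmse_values = [rmse(reference, img) for img in degraded_images]
rho = srocc(rmse_values, radiologist_scores)
print("RMSE per image:", rmse_values)
print("SROCC between RMSE and scores:", round(rho, 3))
```

Because RMSE increases as quality decreases, a well-behaved error metric yields a negative SROCC against quality scores, so comparisons between IQMs are usually made on the magnitude of the coefficient. The same `srocc` helper could be applied to any other IQM once its per-image values are available.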