Quantitative PET/CT Scanner Performance Characterization Based Upon the Society of Nuclear Medicine and Molecular Imaging Clinical Trials Network Oncology Clinical Simulator Phantom
Journal of Nuclear Medicine
The Clinical Trials Network (CTN) of the Society of Nuclear Medicine and Molecular Imaging (SNMMI) operates a PET/CT phantom imaging program using the CTN's oncology clinical simulator phantom, designed to validate scanners at sites that wish to participate in oncology clinical trials. Since its inception in 2008, the CTN has collected 406 well-characterized phantom datasets from 237 scanners at 170 imaging sites covering the spectrum of commercially available PET/CT systems. The combined and collated phantom data describe a global profile of quantitative performance and variability of PET/CT data used in both clinical practice and clinical trials.
Methods: Individual sites filled and imaged the CTN oncology PET phantom according to detailed instructions. Standard clinical reconstructions were requested and submitted. The phantom itself contains uniform regions suitable for scanner calibration assessment, lung fields, and 6 hot spheric lesions with diameters ranging from 7 to 20 mm at a 4:1 contrast ratio with primary background. The CTN Phantom Imaging Core evaluated the quality of the phantom fill and imaging and measured background standardized uptake values to assess scanner calibration and maximum standardized uptake values of all 6 lesions to review quantitative performance. Scanner make-and-model-specific measurements were pooled and then subdivided by reconstruction to create scanner-specific quantitative profiles.
Results: Different makes and models of scanners predictably demonstrated different quantitative performance profiles including, in some cases, small calibration bias. Differences in site-specific reconstruction parameters increased the quantitative variability among similar scanners, with postreconstruction smoothing filters being the most influential parameter. Quantitative assessment of this intrascanner variability over this large collection of phantom data gives, for the first time, estimates of reconstruction variance introduced into trials from allowing trial sites to use their preferred reconstruction methodologies. Predictably, time-of-flight-enabled scanners exhibited less size-based partial-volume bias than non-time-of-flight scanners.
Conclusion: The CTN scanner validation experience over the past 5 y has generated a rich, well-curated phantom dataset from which PET/CT make-and-model and reconstruction-dependent quantitative behaviors were characterized for the purposes of understanding and estimating scanner-based variances in clinical trials. These results should make it possible to identify and recommend make-and-model-specific reconstruction strategies to minimize measurement variability in cancer clinical trials.
Multicenter oncology clinical trials are increasingly using PET/CT imaging as primary and secondary endpoints to define success or failure of treatment regimens, with considerable effort expended in understanding reproducibility and variability (1-11). PET, as an inherently quantitative imaging technique, is arguably the most powerful imaging modality available to researchers to assess response to therapy in the multicenter clinical trial setting. However, the accurate and reproducible quantitation methodology necessary to successfully complete a trial involving quantitative PET imaging has been complicated by vendors of commercial PET/CT scanner systems, who understandably strive to generate higher-quality diagnostic images to achieve market differentiation. Although these efforts advance the field, they also paradoxically add variability to multicenter trials that include PET/CT equipment whose inherent hardware and software technologies can differ by more than a decade. The introduction of time-of-flight (TOF)-capable scanners and reconstruction advancements, including iterative approaches that account for the position-sensitive point-response function, has further increased both quantitative and qualitative differences between older- and newer-generation scanners.
The divergent image quality and varying quantitation make comparison of quantitative data associated with different makes and models of scanners of different vintages problematic within the context of multicenter clinical trials seeking to use metrics such as standardized uptake values (SUVs) and total lesion glycolysis (1,12). Several professional societies have initiated programs and are devising and promoting standardization practices designed to reduce variability within the context of image quantitation in clinical trials.
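For readers less familiar with the metrics named above, the following is a minimal sketch of the conventional body-weight SUV and total lesion glycolysis definitions, together with a simple ratio of measured to expected lesion SUV for a phantom filled at a known contrast. The function names, parameter names, and example values are illustrative assumptions and are not taken from the CTN analysis; the injected dose is assumed to be already decay-corrected to scan time, and 1 g of tissue is assumed to occupy 1 mL.

```python
def suv_body_weight(activity_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight SUV: tissue concentration / (injected dose / body weight).

    Assumes the dose is decay-corrected to scan time and 1 g == 1 mL.
    """
    dose_kbq = injected_dose_mbq * 1000.0   # MBq -> kBq
    weight_g = body_weight_kg * 1000.0      # kg -> g
    return activity_kbq_per_ml / (dose_kbq / weight_g)


def total_lesion_glycolysis(suv_mean, lesion_volume_ml):
    """Total lesion glycolysis: mean SUV times metabolic lesion volume."""
    return suv_mean * lesion_volume_ml


def lesion_recovery(measured_suv_max, background_suv, contrast_ratio=4.0):
    """Ratio of measured lesion SUVmax to the value expected from the fill.

    Hypothetical helper: with a 4:1 lesion-to-background fill, the expected
    lesion SUV is 4 times the background SUV. Values below 1 indicate
    underestimation, for example from partial-volume effects in small spheres.
    """
    return measured_suv_max / (background_suv * contrast_ratio)


# Illustrative numbers only (not measured data).
background = suv_body_weight(5.0, 350.0, 70.0)
small_sphere_ratio = lesion_recovery(3.0, background)
tlg = total_lesion_glycolysis(4.0, 10.0)
```

A well-calibrated scanner would be expected to report a background SUV near 1.0 in the uniform region, which is why the background measurement serves as a calibration check while the per-sphere ratios reflect size-dependent bias.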