Are Surrogate Endpoints Unbiased Metrics in Clinical Benefit Scores of the ASCO Value Framework?
The Journal of the National Comprehensive Cancer Network
Background: Clinical benefit scores (CBS) are key elements of the ASCO Value Framework (ASCO-VF) and are weighted based on a hierarchy of efficacy endpoints: hazard ratio for death (HR OS), median overall survival (mOS), HR for disease progression (HR PFS), median progression-free survival (mPFS), and response rate (RR). When HR OS is unavailable, the other endpoints serve as "surrogates" to calculate CBS. CBS are computed from PFS or RR in 39.6% of randomized controlled trials. This study examined whether surrogate-derived CBS offer unbiased scoring compared with HR OS–derived CBS.
Methods: Using the ASCO-VF, CBS for advanced disease settings were computed for randomized controlled trials supporting oncology drug approvals by the FDA, European Medicines Agency, and Health Canada from January 2006 through December 2017. The mean difference (surrogate-derived CBS minus HR OS–derived CBS) assessed the tendency of surrogate-derived CBS to overestimate or underestimate clinical benefit. Spearman's correlation evaluated the association between surrogate- and HR OS–derived CBS. Mean absolute error assessed the average magnitude of the difference between surrogate-derived and HR OS–derived CBS.
Results: CBS derived from mOS, HR PFS, mPFS, and RR overestimated HR OS–derived CBS in 58%, 68%, 77%, and 55% of pairs and overall by an average of 5.62 (n=90), 6.86 (n=110), 29.81 (n=101), and 3.58 (n=108), respectively. Correlation coefficients were 0.80 (95% CI, 0.70–0.86), 0.38 (0.20–0.53), 0.20 (0.00–0.38), and 0.01 (–0.18 to 0.19) for mOS-, HR PFS–, mPFS-, and RR-derived CBS, respectively, and mean absolute errors were 11.32, 12.34, 40.40, and 18.63, respectively.
Conclusions: Based on the ASCO-VF algorithm, HR PFS–, mPFS-, and RR-derived CBS are suboptimal surrogates because they were shown to be biased and poorly correlated with HR OS–derived CBS. Despite lower weighting than OS in the ASCO-VF algorithm, PFS still overestimated CBS. Simple rescaling of surrogate endpoints may not improve their validity within the ASCO-VF given their poor correlations with HR OS–derived CBS.