Underestimation of uncertainties in health utilities derived from mapping algorithms involving health-related quality of life measures: statistical explanations and potential remedies

K. Chan, A. Willan, M. Gupta, E. Pullenayegum
2013 Value in Health  
sparse data structures. Separation often leads to failure in convergence of maximum likelihood models or unrealistic parameter estimates with wide confidence intervals. Therefore, the study objective is to compare the empirical performance of alternative methods for modeling sparse data in the context of small sample sizes: Firth bias-corrected logistic regression, exact logistic regression, a penalized logistic regression macro implemented in STATA, removal of the variable
... ng separation, and a Bayesian logistic model with a weakly informative prior (WIP). METHODS: HIPAA-compliant diabetes patient records were used to determine factors associated with high-frequency exposure to Medication Therapy Management (MTM) services. Potential predictors of MTM visit frequency included age, gender, medication regimen complexities, and presence of diabetes-related complications. This dataset had a small sample size (n=121) and exhibited a separation problem: all patients in the high visit frequency group had diabetes with complexity. We compared the results of the Bayesian model with a WIP (coefficients are assigned an N(0,1.38) prior) with the results of deleting the problematic variable, exact logistic regression, and two different algorithms for penalized log-likelihood functions (Firth's bias correction in SAS and a STATA macro-based routine). RESULTS: The Bayesian model with a WIP produced an odds ratio estimate of high-frequency group membership based on diabetes complexity that was within the expected range of treatment effects, with a plausible confidence interval: OR=4.64 (CI: 0.98, 24.58). Among the other models, only the Firth bias-corrected model converged, but its parameter estimates and confidence intervals were unrealistically large: OR=210.9 (CI: 1.83, >999.99). Removal of the problematic variable (diabetes complexity) from the model prevented assessment of its effect on the probability of high visit frequency membership. CONCLUSIONS: Bayesian models with a WIP represent a useful tool for modeling sparse health outcomes data with small sample sizes.

OBJECTIVES: To explore the impact of inferences from very small samples on the outcome of management decisions. In many cases management has some prior belief about the states of nature. We explore the potential advantage of incorporating Bayesian inference to improve the confidence in managerial decisions based on small samples.
Traditionally, survey researchers reconcile differences between survey results and prior beliefs by citing the uncertainty reflected in the sampling error or looking for other explanatory factors (such as possible survey measurement error). The Bayesian approach integrates the different sources of information (i.e., prior belief and observed survey results) to arrive at the most probable estimate. In full realization, a Bayesian approach considers not just the probability that "truth" lies outside some range of values but seeks to estimate the probability of each of many possible hypotheses, given the data that were obtained. METHODS: Using responses to a choice-based conjoint exercise that was embedded in an online survey of approximately 700 individuals, we created a series of samples of different sizes using different restrictions to reflect the ways in which both probability and convenience samples might be generated. We drew multiples of ten random samples of 25, 50, 75, 100, 150, 225 and 450 from our "population" of 897 respondents, resulting in 70 individual samples. We estimated HB models for each sample (using Sawtooth Software's CBC-HB program). RESULTS: Simulated choice probabilities, a key output of discrete choice models, stabilize across samples starting with n=75. For smaller samples, decision confidence can be increased using Bayesian inference and bootstrapping methods. CONCLUSIONS: Meaningful inferences, and hence decisions, can be made with smaller sample sizes by utilizing Bayesian inference and methods such as bootstrapping to better estimate the degree of uncertainty in the data.

OBJECTIVES: Health utilities (HUs) are required to conduct cost-utility analyses (CUAs). Often, health-related quality of life (HRQOL) data, instead of HUs, are collected in clinical trials. Increasingly, mapping algorithms have been developed to derive HUs from HRQOL data.
However, the variance of the derived HUs based on mapping is observed to be smaller than that of the actual HUs. METHODS: Two reasons are proposed: (1) the presence of important unmeasured predictors leading to a high degree of unexplained variance of derived HUs, and (2) ignoring that the regression coefficients are random variables themselves. We derive three variance estimators of HUs to account for these reasons: (1) an R²-adjusted estimator, (2) a parametric estimator and (3) a nonparametric estimator. We tested these estimators using a simulated dataset and a real dataset involving the EQ-5D and the University of Washington Quality of Life questionnaire for patients with head and neck cancers. RESULTS: The R²-adjusted estimator can be used in ordinary least squares (OLS) based mapping algorithms and requires only the R² from the derivation study. The parametric estimator can be used in OLS-based mapping algorithms and requires the mean square error (MSE) and the design matrix from the derivation study. The nonparametric estimator can be used in any mapping algorithm and requires the leave-one-out cross-validation MSE from the derivation study. In the simulated dataset, all three estimators are within 1% of the variance of the actual HUs. In the real dataset, the unadjusted variance was 44% less than the actual variance, while all three estimators are within 10% of the actual variance. CONCLUSIONS: When conducting CUA based on mapping algorithms, the variance of derived HUs should be properly adjusted using one of the proposed methods so that the results of the CUA will have the appropriate degree of uncertainty.

OBJECTIVES: In survey questions where the variable of interest is quantitative, responses often involve selecting one of several mutually exclusive intervals within which the variable lies (denoted interval data). This precludes one from using many of the popularly reported measures of center (mean, median, mode, etc.).
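Returning to the mapping abstract above (health utilities derived from HRQOL data): the variance shrinkage it describes can be illustrated with a minimal sketch. The abstract does not give the estimator formulas, so the R² adjustment here (dividing the variance of the mapped values by R²) is an assumption based on the standard OLS identity that the in-sample variance of fitted values equals R² times the variance of the outcome. All data, coefficients, and names (`hrqol`, `utility`, `ols_fit`) are hypothetical.

```python
import random
import statistics


def ols_fit(x, y):
    """Simple one-predictor OLS; returns (intercept, slope)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical derivation data: an HRQOL score and an observed utility.
rng = random.Random(0)
hrqol = [rng.uniform(0, 100) for _ in range(500)]
utility = [0.2 + 0.006 * h + rng.gauss(0, 0.15) for h in hrqol]

intercept, slope = ols_fit(hrqol, utility)
mapped = [intercept + slope * h for h in hrqol]  # mapped (predicted) utilities

# R^2 of the mapping regression.
mean_u = statistics.fmean(utility)
ss_res = sum((u - m) ** 2 for u, m in zip(utility, mapped))
ss_tot = sum((u - mean_u) ** 2 for u in utility)
r2 = 1 - ss_res / ss_tot

var_actual = statistics.variance(utility)
var_mapped = statistics.variance(mapped)  # unadjusted: too small
var_adjusted = var_mapped / r2            # assumed R^2 adjustment

print(f"R^2 = {r2:.3f}")
print(f"actual variance   = {var_actual:.5f}")
print(f"mapped variance   = {var_mapped:.5f}")
print(f"adjusted variance = {var_adjusted:.5f}")
```

Within the derivation sample the adjustment recovers the actual variance exactly. When a mapping is applied to a new dataset, the R² would come from the derivation study and the match would only be approximate.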
A simple estimator is therefore proposed to estimate the population mean, μ, when the data are intervaled, and its properties are studied. METHODS: For estimation of μ given intervaled data, we propose the Weighted Interval Midpoint Estimator (WIME). Expressions for its expected value and variance are derived. These are then calculated for normal distributions and a χ² distribution with 1 degree of freedom using various interval configurations. Bootstrapping methods are then proposed to obtain estimates of the sampling distribution of the WIME as well as the sample mean given the interval counts. RESULTS: In general, the WIME is a biased estimator of μ; this bias is the same for all sample sizes. Simple bounds for the bias can be derived. Both the bias and variance of the estimator depend on the choice of intervals. In the case of the normal distribution, equal-length intervals produce estimates with seemingly no bias and variance slightly above that of the sample mean, as opposed to a non-equal-length configuration, even if the intervals are not symmetric about μ. For the χ² distribution with 1 degree of freedom, using equal-length intervals produces estimates with less bias and variance than using non-equal-length intervals. CONCLUSIONS: While the WIME is a quick and easy method to estimate μ, its performance depends on the intervals chosen. Thus care must be taken when selecting them. In the event no prior information exists to guide the process, equal-length intervals seem to be a safe fallback.

OBJECTIVES: 1) To examine the practice of calculating a sample mean cost for each of two or more cohorts, reporting the difference(s) as the incremental cost(s), and reconciling this practice against the common assumption that the underlying expenditures follow a two-parameter gamma distribution, and 2) To revisit the interpretation of incremental costs based on the difference in sample means as the properties of the assumed underlying gamma distribution vary.
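The concern behind these objectives can be probed with a small Monte Carlo sketch: for a strongly skewed gamma distribution, the sample mean can sit far from the "center" of the data. This is an illustration in Python, not the authors' code. The shape and scale values, sample size, seed, and the function name `mean_vs_p75` are all hypothetical choices.

```python
import random
import statistics


def mean_vs_p75(shape, scale, n, seed):
    """Draw n gamma-distributed costs; return (sample mean, 75th percentile)."""
    rng = random.Random(seed)
    costs = [rng.gammavariate(shape, scale) for _ in range(n)]
    sample_mean = statistics.fmean(costs)
    # quantiles(..., n=4) returns the three quartile cut points.
    p75 = statistics.quantiles(costs, n=4)[2]
    return sample_mean, p75


# Compare the mean with the 75th percentile as the shape parameter varies.
for shape in (0.1, 0.5, 2.0, 10.0):
    m, q = mean_vs_p75(shape, scale=1000.0, n=10_000, seed=1)
    flag = "mean > P75" if m > q else "mean <= P75"
    print(f"shape={shape:>4}: mean={m:10.1f}  P75={q:10.1f}  ({flag})")
```

With a very small shape parameter (0.1 here) the mean lies above the 75th percentile, whereas with a large shape parameter (10 here) the distribution is closer to symmetric and the mean lies below it.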
METHODS: Monte Carlo simulation of gamma distributions in SAS version 9.3, varying the shape and scale parameters and displaying the results in graphical and tabular format. RESULTS: It is possible to create examples of simulated data sets where the sample mean exceeds the estimated 75th percentile. CONCLUSIONS: An analyst should be cautious in reporting incremental costs, as the lay consumer of these quantities may interpret the difference in means as they would for two or more roughly symmetrical distributions, where the mean can represent the center. However, this interpretation might be misleading depending on the magnitude of the shape and scale parameters that characterize the underlying distribution's behavior.

OBJECTIVES: When evaluating multiple drugs for equivalence (or non-inferiority) within the context of a Bayesian MTC, most studies base their interpretation solely on the point estimates and respective credible intervals. The following novel methodology advances interpretation by incorporating a pre-specified minimal clinically important difference (MCID), presenting a direct probability of equivalence (or non-inferiority), and graphically depicting how the probability varies by MCID. METHODS: As an illustrative example, we applied MTC to compare 12-week HbA1c reduction with vildagliptin 50 mg bid vs. sitagliptin 100 mg qd as monotherapies in patients with type 2 diabetes. Equivalence was assessed with a predefined equivalence margin equal to the MCID. A Bayesian approach has the advantage of being able to provide probability statements for equivalence, that is, to make a direct inferential statement that the treatment effect between the two comparators lies between the specified lower and upper MCID (HbA1c ±0.7). The posterior probability of equivalence is calculated as the area under the curve between the lower and upper MCID of the posterior distribution of the difference in mean HbA1c change between the two comparators.
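The area-under-the-curve calculation just described can be sketched as follows. This is a minimal illustration that assumes the posterior of the treatment difference is approximately normal; an actual MTC would use the MCMC draws directly. The mean and interval are the values reported in this abstract's results (Δ = 0.16; 95% CrI -0.20 to 0.52), the standard deviation is backed out from that interval under the normal assumption, and the function name `prob_equivalence` and the MCID grid are hypothetical.

```python
from statistics import NormalDist


def prob_equivalence(post_mean, post_sd, mcid):
    """P(-mcid < difference < mcid) under a normal posterior approximation."""
    posterior = NormalDist(post_mean, post_sd)
    return posterior.cdf(mcid) - posterior.cdf(-mcid)


# Assumption: back out the posterior SD from the reported 95% credible interval.
post_mean = 0.16
post_sd = (0.52 - (-0.20)) / (2 * 1.96)

# Sensitivity analysis: how the probability changes with the chosen MCID.
for mcid in (0.3, 0.4, 0.5, 0.6, 0.7):
    p = prob_equivalence(post_mean, post_sd, mcid)
    print(f"MCID = ±{mcid:.1f}: P(equivalence) = {p:.3f}")
```

Plotting these probabilities against the MCID gives the kind of graphical depiction the objectives mention. The numbers printed here depend on the normal approximation and should not be read as the study's reported result.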
Sensitivity analysis was conducted by varying MCID values. RESULTS: The results of the MTC showed no significant difference between the two interventions in the reduction of HbA1c at 12 weeks (Δ = 0.16; 95% CrI -0.20 to 0.52). However, this evidence of "no significant difference" does not prove equivalence. Applying the new method, at 12 weeks follow-up, the probability
doi:10.1016/j.jval.2013.03.275 fatcat:42dv6vt5xvbrvmuvizs3dng74y