Validity of self-assessment in a quality improvement collaborative in Ecuador

J. Hermida, E. I. Broughton, L. Miller Franco
2011 International Journal for Quality in Health Care  
Objectives. Health care quality improvement (QI) efforts commonly use self-assessment to measure compliance with quality standards. This study investigates the validity of self-assessment of quality indicators. Design. Cross-sectional. Setting. A maternal and newborn care improvement collaborative intervention conducted in health facilities in Ecuador in 2005. Participants. Four external evaluators were trained in abstracting medical records to calculate six indicators reflecting compliance
... treatment standards. Interventions. About 30 medical records per month were examined at 12 participating health facilities for a total of 1875 records. The same records had already been reviewed by QI teams at these facilities (self-assessment). Main Outcome Measures. Overall compliance, agreement (using the Kappa statistic), sensitivity and specificity were analyzed. We also examined patterns of disagreement and the effect of facility characteristics on levels of agreement. Results. External evaluators reported compliance of 69-90%, while self-assessors reported 71-92%, with raw agreement of 71-95% and Kappa statistics ranging from fair to almost perfect agreement. Considering external evaluators as the gold standard, sensitivity of self-assessment ranged from 90 to 99% and specificity from 48 to 86%. Simpler indicators had fewer disagreements. When disagreements occurred between self-assessment and external evaluators, the former tended to report more positive findings in five of six indicators, but this tendency was not of a magnitude to change program actions. Team leadership, understanding of the tools and facility size had no overall impact on the level of agreement. Conclusions. When compared with external evaluation (gold standard), self-assessment was found to be sufficiently valid for tracking QI team performance. Sensitivity was generally higher than specificity. Simplifying indicators may improve validity.
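The outcome measures named above (raw agreement, Cohen's Kappa, sensitivity and specificity) can all be derived from a 2x2 table of paired compliant/non-compliant ratings. The sketch below is illustrative only and is not the authors' analysis code: the function name `agreement_stats` and the ten example ratings are hypothetical, and it assumes each record receives one binary rating from the external evaluator (treated as the gold standard) and one from the self-assessing QI team.

```python
from typing import Dict, Sequence


def agreement_stats(external: Sequence[bool],
                    self_assessed: Sequence[bool]) -> Dict[str, float]:
    """Compare self-assessment with external evaluation (gold standard).

    Each element is True if the record was rated compliant for one indicator.
    Hypothetical helper for illustration; not taken from the study.
    """
    if len(external) != len(self_assessed) or len(external) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")

    pairs = list(zip(external, self_assessed))
    tp = sum(1 for e, s in pairs if e and s)          # both say compliant
    tn = sum(1 for e, s in pairs if not e and not s)  # both say non-compliant
    fp = sum(1 for e, s in pairs if not e and s)      # self-assessment more positive
    fn = sum(1 for e, s in pairs if e and not s)      # self-assessment more negative
    n = len(pairs)

    raw = (tp + tn) / n
    p_ext = (tp + fn) / n    # compliance reported by external evaluators
    p_self = (tp + fp) / n   # compliance reported by self-assessors
    # Agreement expected by chance, given each rater's marginal rates.
    expected = p_ext * p_self + (1 - p_ext) * (1 - p_self)

    nan = float("nan")
    return {
        "raw_agreement": raw,
        "kappa": (raw - expected) / (1 - expected) if expected != 1 else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Made-up ratings for ten records (not data from the study).
external = [True] * 7 + [False] * 3
self_assessed = [True] * 8 + [False] * 2
stats = agreement_stats(external, self_assessed)
```

In this made-up example the single disagreement is a record the self-assessors rated compliant and the external evaluator did not, which lowers specificity while leaving sensitivity untouched.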
doi:10.1093/intqhc/mzr057 pmid:21840942 fatcat:in7pme7adjc2jmwrylt6j2hbra