IRT Models for Expert-Coded Panel Data
Kyle L. Marquardt, Daniel Pemstein
2017
Social Science Research Network
Varieties of Democracy (V-Dem) is a new approach to conceptualization and measurement of democracy. It is co-hosted by the University of Gothenburg and University of Notre Dame. With a V-Dem Institute at University of Gothenburg with almost ten staff, and a project team across the world with four Principal Investigators, fifteen Project Managers (PMs), 30+ Regional Managers, 170 Country Coordinators, Research Assistants, and 2,500 Country Experts, the V-Dem project is one of the largest ever
... al science research-oriented data collection programs. Please address comments and/or queries for information to: V-Dem Institute
Abstract
Data sets quantifying phenomena of social-scientific interest often use multiple experts to code latent concepts. While it remains standard practice to report the average score across experts, experts likely vary in both their expertise and their interpretation of question scales. As a result, the mean may be an inaccurate statistic. Item-response theory (IRT) models provide an intuitive method for taking these forms of expert disagreement into account when aggregating ordinal ratings produced by experts, but they have rarely been applied to cross-national expert-coded panel data. In this article, we investigate the utility of IRT models for aggregating expert-coded data by comparing the performance of various IRT models to the standard practice of reporting average expert codes, using both real and simulated data. Specifically, we use expert-coded cross-national panel data from the V-Dem data set to both conduct real-data comparisons and inform ecologically-motivated simulation studies. We find that IRT approaches outperform simple averages when experts vary in reliability and exhibit differential item functioning (DIF). IRT models are also generally robust even in the absence of simulated DIF or varying expert reliability. Our findings suggest that producers of cross-national data sets should adopt IRT techniques to aggregate expert-coded data of latent concepts.
Expert surveys are a powerful tool for measuring latent political concepts, ranging from the ideological positions of political parties (see e.g.
doi:10.2139/ssrn.2897442
fatcat:u3fxoavuzrfz5kuzzic4ltgscm