Traits and types of health data repositories

Ted D Wade
2014 Health Information Science and Systems  
We review traits of reusable clinical data and offer a typology of clinical repositories with a range of known examples. Sources of clinical data suitable for research can be classified into types reflecting the data's institutional origin, original purpose, level of integration and governance. Primary data nearly always come from research studies and electronic medical records. Registries collect data on focused populations primarily to track outcomes, often using observational research [...]. Warehouses are institutional information utilities repackaging clinical care data. Collections organize data from more organizations than a data warehouse, and more original data sources than a registry. Therefore, even if they are heavily curated, their level of internal integration, and thus ease of use, can be lower than that of other types. Federations are like collections except that physical control over data is distributed among donor organizations. Federations sometimes federate, giving a second level of organization. While the size, in number of patients, varies widely within each type of data source, populations over 10,000 are relatively numerous, and much larger populations can be seen in warehouses and federations. One imagined ideal structure for research progress has been called an "Information Commons". It would have longitudinal, multi-leveled (environmental through molecular) data on a large population of identified, consenting individuals. These are qualities whose achievement would require long-term commitment on the part of many data donors, including a willingness to make their data public.
doi:10.1186/2047-2501-2-4 pmid:25825668 pmcid:PMC4340801 fatcat:4f4dcxbdqfg5vkc3yzhumdxdsy