Identification of Relationships Between Patients Through Elements in a Data Warehouse Using the Familial, Associational, and Incidental Relationship (FAIR) Initiative: A Pilot Study
JMIR Medical Informatics
Background: Over the last several years, medical data warehouses have been developed widely. Current data warehouses focus on individual cases but lack the ability to identify family members who could be included in dyadic or familial research. At present, the family history recorded in a patient's medical record is the only documentation available for understanding the health status and social habits of that patient's family members. Identifying familial linkages in a phenotypic data warehouse can be valuable in
... rt identification and in beginning to understand the interactions of diseases among families.

Objective: The goal of the Familial, Associational, and Incidental Relationships (FAIR) initiative is to identify an index set of patients' relationships through elements in a data warehouse.

Methods: Using a test set of 500 children, we measured the sensitivity and specificity of available linkage algorithm identifiers (eg, insurance identification numbers and phone numbers) and validated this tool/algorithm through a manual chart audit.

Results: Of all the children, 52.4% (262/500) were male, and the mean age of the cohort was 8 years (SD 5). Of the children, 51.6% (258/500) were identified as white in race. The identifiers used for FAIR were available for the majority of patients: insurance number (483/500, 96.6%), phone number (500/500, 100%), and address (497/500, 99.4%). When utilizing the FAIR tool and various combinations of identifiers, sensitivity ranged from 15.5% (62/401) to 83.8% (336/401), and specificity from 72% (71/99) to 100% (99/99). The preferred method was matching patients using insurance or phone number, which had a sensitivity of 72.1% (289/401) and a specificity of 94% (93/99). Using the Informatics for Integrating Biology and the Bedside (i2b2) warehouse infrastructure, we have now developed a Web app that facilitates FAIR for any index population.

Conclusions: FAIR is a valuable research and clinical resource that extends the capabilities of existing data warehouses and lays the groundwork for family-based research. FAIR will expedite studies that would otherwise require registry or manual chart abstraction data sources.

(JMIR Med Inform 2015;3(1):e9) doi:10.2196/medinform.3738

KEYWORDS: Informatics for Integrating Biology and the Bedside (i2b2); data warehouse; familial relationship

JMIR Med Inform 2015 | vol. 3 | iss. 1 | e9 | http://medinform.jmir.org/2015/1/e9/ | English et al

Address

Address had a sensitivity of 45.4% (182/401) for identifying relationships, a specificity of 75% (74/99), a positive predictive value (PPV) of 87.9% (182/207), and a negative predictive value (NPV) of 25.3% (74/293). Variation in how patients' addresses were entered into the system was a major issue. In particular, it was common to find addresses for an apartment building that lacked unit numbers. These incomplete address data resulted in a large number of false positives.
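
The four figures reported for the address identifier all follow from a single 2x2 confusion table, and a short calculation makes the relationship explicit. The sketch below is illustrative only: the `ConfusionCounts` class and its field names are assumptions made for this example and are not part of the FAIR tool. The counts are derived from the fractions stated above (182 of 401 true relationships detected, 74 of 99 non-relationships correctly excluded).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for a 2x2 confusion table (not from the paper)."""
    tp: int  # true relationships flagged by the identifier
    fn: int  # true relationships missed by the identifier
    tn: int  # non-relationships correctly left unflagged
    fp: int  # non-relationships wrongly flagged

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# Counts derived from the reported fractions for the address identifier:
#   sensitivity 182/401 -> TP = 182, FN = 401 - 182 = 219
#   specificity  74/99  -> TN = 74,  FP = 99 - 74  = 25
address = ConfusionCounts(tp=182, fn=219, tn=74, fp=25)

for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(f"{name}: {getattr(address, name) * 100:.1f}%")
```

The denominators match the reported ones: TP + FP = 207 for the PPV and TN + FN = 293 for the NPV. The computed specificity is 74.7%, which the text reports rounded to 75%.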