Trajectory analysis of cardiovascular phenotypes from biobank data uncovers novel genetic associations
ABSTRACT
Approximately 6 million adults in the US have heart failure (HF). HF progression is variable due in part to differences in sex, age, and genetic ancestry. Previous population-based genetic studies have largely focused on cross-sectional data related to HF, a disease known to change over time. Using trajectory probabilities derived from longitudinal data as a continuous trait may increase the likelihood of finding significant, biologically relevant associations in a genome-wide association (GWA) analysis. We analyzed data from the electronic health record in a medical biobank from a single metropolitan US center to gather clinically pertinent data for analyses. We evaluated whole genome sequencing of 896 unrelated biobank participants, including 494 with at least 1 electrocardiogram and 324 who had more than 1 echocardiogram (∼3 observations per person). A censored normal distribution multivariable mixture model was used to cluster phenotype measures for genome-wide analyses. GWA analysis on the trajectory probability of the corrected QT measurement (QTc) taken from electrocardiograms identified significant associations with variants in regulatory regions proximal to the WLS gene, which encodes the Wnt ligand secretion mediator, Wntless. WLS was previously associated with QT length in an analysis of approximately 16,000 participants, supporting the utility of this method to uncover significant genetic associations in small datasets. GWA analysis on the trajectory probability of left ventricular diameter as taken from echocardiograms identified novel significant associations with variants in regulatory regions near MYO10, which encodes the unconventional Myosin-10. We found that trajectory probabilities improved the ability to discover significant and relevant genetic associations. This novel approach increased yield from smaller, well-phenotyped cohorts with longitudinal data from a medical biobank.
AUTHOR SUMMARY
Approximately 6 million adults in the US have heart failure, a disease known to change over time. In a hospital-based electronic health record, electrocardiograms and echocardiograms, which are used to evaluate heart failure, can be tracked over time. We utilized these data to create a novel trait that can be applied to genetic analyses. We analyzed the genome sequences of 896 biobank participants from diverse racial/ethnic backgrounds. Genome-wide association (GWA) analyses were performed on a subset of these individuals for heart failure outcomes.
A statistical model that incorporates cardiac data tracked over time was used to cluster these data using a probabilistic approach. The resulting probabilities were used in GWA analyses of the corrected QT measurement (QTc) and left ventricular diameter (LVID). The QTc analysis identified significant correlations with variants in regulatory regions near the WLS gene, which encodes the Wnt ligand secretion mediator, Wntless. Analysis of LVID identified significant associations with variants in regulatory regions near the MYO10 gene, which encodes the unconventional Myosin-10. Through these analyses, we found that using the trajectory probabilities can facilitate the discovery of novel, significant, biologically relevant associations. This method reduces the need for larger cohorts and increases yield from smaller, well-phenotyped cohorts.
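As a minimal sketch of the idea summarized above, and not the authors' pipeline, the following Python code shows how a posterior trajectory probability might be computed under a censored normal mixture and then used as a continuous trait in a single-variant linear regression. All group parameters (mixing proportions, polynomial coefficients, sigma, censoring bounds), function names, and data values are hypothetical. In practice the mixture parameters would be estimated by maximum likelihood, and the association test would adjust for covariates such as age, sex, and ancestry.

```python
import math


def norm_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def norm_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def censored_normal_loglik(y, mu, sigma, lower, upper):
    """Log-likelihood of one observation under a normal censored at [lower, upper]."""
    if y <= lower:
        p = norm_cdf(lower, mu, sigma)        # mass at or below the floor
    elif y >= upper:
        p = 1.0 - norm_cdf(upper, mu, sigma)  # mass at or above the ceiling
    else:
        p = norm_pdf(y, mu, sigma)            # uncensored density
    return math.log(max(p, 1e-300))


def trajectory_mean(coefs, t):
    """Polynomial trajectory: coefs[0] + coefs[1]*t + coefs[2]*t**2 + ..."""
    return sum(c * t ** k for k, c in enumerate(coefs))


def posterior_probabilities(times, values, groups, sigma, lower, upper):
    """Posterior probability that one person's series belongs to each group.

    groups is a list of (mixing_proportion, polynomial_coefs) pairs.
    """
    log_w = []
    for pi_k, coefs in groups:
        ll = math.log(pi_k)
        for t, y in zip(times, values):
            mu = trajectory_mean(coefs, t)
            ll += censored_normal_loglik(y, mu, sigma, lower, upper)
        log_w.append(ll)
    m = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def linear_association(dosages, trait):
    """Ordinary least squares slope and standard error of trait on dosage."""
    n = len(trait)
    mx = sum(dosages) / n
    my = sum(trait) / n
    sxx = sum((x - mx) ** 2 for x in dosages)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, trait))
    beta = sxy / sxx
    rss = sum((y - my - beta * (x - mx)) ** 2 for x, y in zip(dosages, trait))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se


# Hypothetical two-group QTc model (ms): a stable group and a rising group.
groups = [(0.6, [420.0, 0.0]), (0.4, [440.0, 10.0])]
person_a = posterior_probabilities([0, 1, 2], [418, 425, 421], groups, 15.0, 300.0, 600.0)
person_b = posterior_probabilities([0, 1, 2], [445, 452, 463], groups, 15.0, 300.0, 600.0)

# Hypothetical allele dosages and "rising group" probabilities for six people.
beta, se = linear_association([0, 0, 1, 1, 2, 2], [0.05, 0.10, 0.40, 0.55, 0.90, 0.85])
```

Here the probability of membership in the rising group, rather than a hard group label or a single cross-sectional measurement, serves as the continuous phenotype, which is the property the summary above credits for the gain in yield from small cohorts.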