Novel meta-analysis pipeline of heterogeneous high-throughput gene expression datasets reveals dysregulated interactions and pathways in asthma
Asthma is a complex and chronic inflammatory disorder with varying degrees of airway inflammation. It affects ~235 million people worldwide, and about 8% of the United States population. Unlike single-gene disorders, asthma phenotypes are guided by a highly variable combination of genotypes, making it a complex disease to study computationally. Recently, several independent high-throughput gene expression studies in bioinformatics have identified and proposed numerous molecular drivers involved
in asthma initiation and progression. However, there is poor consensus on the molecular factors involved in the mechanism of this disease, owing to its inherent genetic heterogeneity. Such uncertainty in bioinformatics studies has led to a "reproducibility crisis" in the field, where similar analyses can often yield greatly varying results. In this study, we seek to harness heterogeneity in asthma by applying a meta-analysis that explores varying tissue environments. Methods: We use three publicly available microarray gene expression datasets, each from a different tissue in asthma patients, from NCBI's Gene Expression Omnibus (GEO). As a meta-analysis, we apply a mixed-model effect size test to determine differentially expressed (DE) genes across all three studies. The datasets are then pre-processed and subjected to Weighted Gene Co-expression Network Analysis (WGCNA) to identify functional modules. Using module preservation, we determine modules in asthma that are not preserved in the healthy condition, then combine the results from the three datasets with a Fisher's test to obtain a set of asthma-unique modules. These modules are explored using functional analysis (i.e. GO term analysis). Using the DE genes as well as known transcription factors, we reconstruct Gene Regulatory Networks (GRNs) for each of our shortlisted modules. We then study the topology of these GRNs using hive plots to reveal underlying dysregulations, paving the way for future analyses. Results: Our analysis reveals a novel perspective on a key interaction in asthma inflammatory regulation, the CHD4-CCL26 transcription relation. Our hive plot analysis is able to explore this gene interaction beyond the "over-expression, under-expression" results of typical bioinformatics studies. 
We reveal that CCL26, an important regulator of asthma, appears to increase in expression and topological degree in asthma but loses its connection to CHD4, a loss that seems to be characteristic of the disease. Such analysis suggests that the topology of gene networks, beyond expression values alone, may be key to understanding the nuanced interactions between fundamental biomarkers and drug targets in complex diseases like asthma.
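The two statistical combination steps named in the Methods can be illustrated with a minimal sketch. It is not the pipeline's actual code, and it rests on assumptions the abstract does not confirm: the "mixed-model effect size test" is approximated here by the DerSimonian-Laird random-effects estimator applied to one gene's per-study effect sizes, and the "Fisher's test" is taken to be Fisher's method for combining independent p-values. The function names `random_effects_meta` and `fisher_combined_p` and all input numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def random_effects_meta(effects: Sequence[float],
                        variances: Sequence[float]) -> Tuple[float, float, float, float]:
    """Pool per-study effect sizes for one gene (DerSimonian-Laird, assumed estimator).

    Returns (pooled_effect, standard_error, tau_squared, two_sided_p).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching effects and variances from at least two studies")
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures between-study heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_ws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_ws
    se = math.sqrt(1.0 / sum_ws)
    # Two-sided p-value from the normal approximation.
    p = math.erfc(abs(pooled / se) / math.sqrt(2.0))
    return pooled, se, tau2, p


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's method.

    The statistic X^2 = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null; for even degrees of freedom
    the survival function has the closed form used below.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X^2 / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical effect sizes and variances for one gene in three tissues.
    print(random_effects_meta([0.9, 0.4, 1.1], [0.05, 0.08, 0.06]))
    # Hypothetical per-dataset p-values for one module.
    print(fisher_combined_p([0.04, 0.20, 0.01]))
```

Fisher's method assumes the combined p-values are independent, which is plausible for three separately collected tissue datasets but would need checking if any samples overlapped. In practice, established packages would likely be preferable to hand-rolled functions.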