Mouse obesity network reconstruction with a variational Bayes algorithm to employ aggressive false positive control
We propose a novel variational Bayes network reconstruction algorithm to extract the most relevant disease factors from high-throughput genomic data-sets. Our algorithm is the only scalable method for regularized network recovery that employs Bayesian model averaging and that can internally estimate an appropriate level of sparsity to ensure few false positives enter the model without the need for cross-validation or a model selection criterion. We use our algorithm to characterize the effect
... genetic markers and liver gene expression traits on mouse obesity-related phenotypes, including weight, cholesterol, glucose, and free fatty acid levels, in an experiment previously used for discovery and validation of network connections: an F2 intercross between the C57BL/6J and C3H/HeJ mouse strains on an apolipoprotein E null background. Results: We identified eleven genes, Gch1, Zfp69, Dlgap1, Gna14, Yy1, Gabarapl1, Folr2, Fdft1, Cnr2, Slc24a3, and Ccl19, and a quantitative trait locus directly connected to weight, glucose, cholesterol, or free fatty acid levels in our network. None of these genes were identified by other network analyses of this mouse intercross data-set, but all have been previously associated with obesity or related pathologies in independent studies. In addition, through both simulations and data analysis, we demonstrate that our algorithm achieves better power and type I error control than other network recovery algorithms that use the lasso and have bounds on type I error control. Conclusions: Our final network contains 118 genes, both previously associated and novel, that affect weight, cholesterol, glucose, and free fatty acid levels and are excellent obesity risk candidates.