An Inferential Framework for Network Hypothesis Tests: With Applications to Biological Networks
The analysis of weighted co-expression gene sets is gaining momentum in systems biology. In addition to substantial research directed toward inferring co-expression networks on the basis of microarray/high-throughput sequencing data, inferential methods are being developed to compare gene networks across one or more phenotypes. Common gene set hypothesis testing procedures are mostly confined to comparing average gene/node transcription levels between one or more groups and make limited use of additional network features, e.g., edges induced by significant partial correlations. Ignoring the gene set architecture disregards relevant network topological comparisons and can result in familiar n ≪ p over-parameterized test issues. In this dissertation we propose a method for performing one- and two-sample hypothesis tests for (weighted) networks. We build on a measure of separation defined via a local neighborhood metric. This node-centered additive metric exploits the network properties of nearby neighbors. The use of local neighborhoods seeks to lessen the effect of a large number of (potentially) estimable parameters; prior biological knowledge or algorithms are commonly used to further reduce the prospect of spurious biological associations. Where possible, we avoid specifying dubious network probability models. In order to draw statistical inferences we use a resampling approach. Our method allows for both an overall network test and a post hoc examination of individual gene/node effects. We evaluate our approach using both simulated data and microarray data obtained from diabetes and ovarian cancer studies.
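To make the general shape of such a procedure concrete, the following is a minimal sketch of a two-sample resampling test built on a node-centered additive statistic. It is an illustration under stated assumptions, not the method developed in this dissertation: edge weights are taken to be absolute Pearson correlations (a stand-in for partial correlations), the local neighborhood of a node is assumed to be the union of its k strongest neighbors in either network, the resampling scheme is assumed to be a permutation of group labels, and all function names and parameters (`weighted_network`, `node_distances`, `permutation_network_test`, `k`, `n_perm`) are hypothetical.

```python
import numpy as np


def weighted_network(expr):
    """Weighted network from a samples-by-genes matrix.

    Assumption: edge weight = absolute Pearson correlation, diagonal zeroed.
    """
    w = np.abs(np.corrcoef(expr, rowvar=False))
    np.fill_diagonal(w, 0.0)
    return w


def node_distances(wa, wb, k):
    """Per-node sum of absolute edge-weight differences over a neighborhood.

    Assumption: the neighborhood of node i is the union of its k strongest
    neighbors in either network.
    """
    p = wa.shape[0]
    d = np.zeros(p)
    for i in range(p):
        nbrs = np.union1d(np.argsort(wa[i])[-k:], np.argsort(wb[i])[-k:])
        d[i] = np.abs(wa[i, nbrs] - wb[i, nbrs]).sum()
    return d


def permutation_network_test(x, y, k=3, n_perm=500, seed=0):
    """Overall statistic, permutation p-value, and per-node contributions."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n_x = x.shape[0]

    # Observed statistic: node-level terms added up over all nodes.
    obs_nodes = node_distances(weighted_network(x), weighted_network(y), k)
    obs = obs_nodes.sum()

    # Null distribution: reassign samples to groups at random.
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        xa, ya = pooled[idx[:n_x]], pooled[idx[n_x:]]
        stat = node_distances(weighted_network(xa), weighted_network(ya), k).sum()
        exceed += int(stat >= obs)

    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + exceed) / (1 + n_perm)
    return obs, p_value, obs_nodes
```

Because the overall statistic is a sum of node-level terms, the returned per-node vector gives a natural starting point for a post hoc look at which genes/nodes contribute most to an overall difference; any formal node-level inference would need its own resampling and multiplicity adjustment, which this sketch does not attempt.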