Discovering tightly regulated and differentially expressed gene sets in whole genome expression data

C. Ye, E. Eskin
2007 Bioinformatics  
Motivation: Recently, a new type of expression data has begun to be collected that aims to measure the effect of genetic variation on gene expression in pathways. In these datasets, expression profiles are constructed for multiple strains of the same model organism under the same condition. The goal of analyzing these data is to find differences in regulatory patterns due to genetic variation between strains, often without a phenotype of interest in mind. We present a new method based on notions of tight regulation and differential expression to look for sets of genes which appear to be significantly affected by genetic variation.
Results: When we use categorical phenotype information, as in the Alzheimer's and diabetes datasets, our method finds many of the same gene sets as gene set enrichment analysis. In addition, our notion of correlated gene sets allows us to focus our efforts on biological processes subject to tight regulation. In murine hematopoietic stem cells, we are able to discover significant gene sets independent of a phenotype of interest. Some of these gene sets are associated with several blood-related phenotypes.
doi:10.1093/bioinformatics/btl315 pmid:17237110 fatcat:kngz5ccvcffhdcdjg5b3zexrge