Exploring the complexity of soybean (Glycine max) transcriptional regulation using global gene co-expression networks
Soybean (Glycine max (L.) Merr.) is one of the most important crops worldwide, constituting a major source of protein and edible oil. Gene co-expression networks (GCNs) have been extensively used to study transcriptional regulation and the evolution of genes and genomes. Here, we report a soybean GCN built from 1,284 publicly available samples from 15 distinct tissues. We found modules that are differentially regulated in specific tissues, comprising processes such as photosynthesis, gluconeogenesis, lignin metabolism, and response to biotic stress. We identified transcription factors among intramodular hubs, which probably integrate different pathways and shape the transcriptional landscape in different conditions. The top hubs for each module tend to encode proteins with critical roles, such as succinate dehydrogenase and RNA polymerase subunits. Importantly, gene essentiality was strongly correlated with degree centrality, and essential hubs were enriched in genes involved in nucleic acid metabolism and regulation of cell replication. Using a guilt-by-association approach, we predicted functions for 93 of 106 hubs without functional description in soybean. Most of the duplicated genes had different transcriptional profiles, supporting their functional divergence, although paralogs originating from whole-genome duplications (WGD) are more often preserved in the same module than those from other mechanisms. Together, our results highlight the importance of GCN analysis in unraveling key functional aspects of the soybean genome, in particular those associated with hub genes and WGD events.
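To make the notions of a co-expression network, degree centrality, and hubs concrete, the following is a minimal sketch, not the pipeline used in this study. It assumes an unsigned network built by hard-thresholding Pearson correlations; the function name `coexpression_hubs`, the threshold of 0.8, and the toy expression matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def coexpression_hubs(expr, threshold=0.8, top_n=2):
    """Build a toy co-expression network and rank genes by degree.

    expr: genes x samples expression matrix (illustrative input).
    threshold: assumed cutoff on |Pearson r| for drawing an edge.
    Returns (adjacency matrix, degree per gene, indices of top_n hubs).
    """
    corr = np.corrcoef(expr)            # gene-by-gene Pearson correlation
    adj = np.abs(corr) >= threshold     # unsigned, hard-thresholded edges
    np.fill_diagonal(adj, False)        # drop self-connections
    degree = adj.sum(axis=1)            # degree centrality = number of edges
    # Stable sort so that ties keep the original gene order.
    hubs = np.argsort(-degree, kind="stable")[:top_n]
    return adj, degree, hubs


# Toy data: 5 hypothetical genes measured in 6 samples.
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
expr = np.array([
    base,                               # gene 0
    2 * base + 1,                       # gene 1: perfectly correlated with gene 0
    -base + 10,                         # gene 2: perfectly anti-correlated
    [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],     # gene 3: weakly related profile
    [2.0, 7.0, 1.0, 8.0, 2.0, 8.0],     # gene 4: weakly related profile
])

adj, degree, hubs = coexpression_hubs(expr)
print("degree per gene:", degree.tolist())
print("top hubs (gene indices):", hubs.tolist())
```

In this toy example, genes 0 to 2 form a fully connected group and the other two genes stay isolated, so the most connected genes come from that group. Real analyses typically use soft-thresholded, module-aware approaches and define hubs within each module, which this sketch does not attempt.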