Identifying Pathway Proteins in Networks using Convergence
Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics - BCB'13
One of the key goals of systems biology is the analysis of publicly available experimental biological data. New technologies are rapidly being developed to observe and report whole-scale biological phenomena; however, few methods can produce specific, testable hypotheses from this noisy 'big' data. In this work, we propose an approach that combines data-driven network theory with knowledge-based ontology to tackle this problem. Network models are especially powerful due to their ability to display elements of interest and their relationships as internetwork structures. Additionally, ontological data supplements the confidence of relationships within the model without clouding critical structure identification. As such, we postulate that, given a (gene/protein) marker set of interest, we can systematically identify the core of their interactions (if they are indeed working together toward a biological function) by eliminating original markers and adding further necessary markers. This concept, which we refer to as "convergence," harnesses the ideas of "guilt-by-association" and recursion to identify whether a core of relationships exists between markers. In this study, we test graph-theoretic concepts (such as shortest-path, k-nearest-neighbor, and clustering) to identify cores iteratively in data- and knowledge-based networks in the canonical yeast Pheromone Mating Response pathway. Additionally, we provide results for the application of convergence to virus infection, hearing loss, and Parkinson's disease. Our results indicate that if a marker set has a common discrete function, this approach is able to identify that function, its interacting markers, and any new elements necessary to complete the structural core of that function. The results below show that the shortest-path function is the best of the approaches tested, finding small target sets that contain a majority or all of the markers in the gold standard pathway. The power of this approach lies in its ability to be used in investigative studies to inform decisions concerning target selection.
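To make the idea of shortest-path convergence concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes an unweighted, undirected interaction network stored as an adjacency dictionary; the function names (`build_graph`, `shortest_path`, `converge`), the `max_path_len` cutoff, and the toy node labels are all hypothetical. On each pass, every pair of current markers is connected by a breadth-first shortest path, intermediate nodes on those paths are added, markers that cannot be linked to any other marker are dropped, and the process repeats until the set stops changing.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency dict from (u, v) edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def shortest_path(graph, source, target):
    """Return one shortest path as a list of nodes (BFS), or None if unreachable."""
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        # sorted() keeps tie-breaking deterministic between runs
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in parent:
                parent[neighbor] = node
                if neighbor == target:
                    path = [target]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None


def converge(graph, markers, max_path_len=3, max_iter=20):
    """Iterate until the marker set is stable (assumed stopping rule).

    max_path_len is a hypothetical cutoff: pairs farther apart than this
    are treated as unrelated, so isolated markers drop out of the core.
    """
    current = {m for m in markers if m in graph}
    for _ in range(max_iter):
        core = set()
        for a, b in combinations(sorted(current), 2):
            path = shortest_path(graph, a, b)
            if path is not None and len(path) - 1 <= max_path_len:
                core.update(path)  # keeps a and b, adds intermediates
        if core == current:
            break
        current = core
    return current


# Toy network (hypothetical labels): a chain A-B-C-D plus an unrelated edge X-Y.
graph = build_graph([("A", "B"), ("B", "C"), ("C", "D"), ("X", "Y")])
core = converge(graph, {"A", "D", "X"})
print(sorted(core))  # ['A', 'B', 'C', 'D']
```

In this toy run, the marker `X` is eliminated because it cannot be linked to the other markers, while `B` and `C` are added because they lie on the path joining `A` and `D`. A real analysis would likely use edge weights or confidence scores (for example, derived from ontology annotations), which this sketch omits.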