Graph embedding-based novel protein interaction prediction via higher-order graph convolutional network
Protein-protein interactions (PPIs) are essential for most biological processes. However, current PPI networks are noisy, sparse, and incomplete, which limits our ability to understand the cell at the system level from the PPI network. Predicting novel (missing) links in noisy PPI networks is an important computational approach for automatically expanding the human interactome and for identifying biologically legitimate but undetected interactions for experimental
... experimental determination of PPIs, which is both expensive and time-consuming. Recently, graph convolutional networks (GCNs), which employ a 1-hop neighborhood aggregation procedure, have proven effective at modeling graph-structured data and have emerged as a powerful architecture for learning node and graph representations. In this paper, we propose a novel node (protein) embedding method that combines GCN with PageRank. PageRank can significantly improve the GCN aggregation scheme, which on its own has difficulty extending to, and exploring topological information across, the higher-order neighborhoods of each node. Building on this node embedding model, we develop a higher-order GCN variational auto-encoder (HO-VGAE) architecture, which learns a joint node representation of higher-order local and global PPI network topology for novel protein interaction prediction. Notably, our method is based exclusively on network topology; no protein attributes or extra biological features are used. Extensive computational validation on the PPI prediction task demonstrates that our method, without leveraging any additional biological information, is competitive: it outperforms all existing graph embedding-based link prediction methods in both accuracy and robustness.
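To make the contrast between 1-hop GCN aggregation and PageRank-style higher-order propagation concrete, the following is a minimal, dependency-free sketch. It is not the HO-VGAE model itself: it assumes a personalized-PageRank power iteration of the form Z ← (1 − α) Â Z + α H over the symmetrically normalized adjacency Â (with self-loops), uses one-hot node features because no protein attributes are available, and omits all learned weights and the variational encoder. The function names, the toy path graph, the values of `alpha` and `steps`, and the inner-product link scorer are illustrative assumptions, not details taken from the paper.

```python
import math

def normalized_adjacency(edges, n):
    """Dense D^{-1/2} (A + I) D^{-1/2} for an undirected graph on n nodes."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0  # self-loops, as in standard GCN normalization
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain dense matrix product on lists of lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

def ppr_propagate(a_hat, h, alpha=0.1, steps=10):
    """Assumed PageRank-style propagation: Z <- (1 - alpha) * A_hat Z + alpha * H."""
    z = h
    for _ in range(steps):
        az = matmul(a_hat, z)
        z = [[(1 - alpha) * az_ij + alpha * h_ij
              for az_ij, h_ij in zip(az_row, h_row)]
             for az_row, h_row in zip(az, h)]
    return z

def score_link(z, u, v):
    """Hypothetical decoder: sigmoid of the inner product of two embeddings."""
    s = sum(p * q for p, q in zip(z[u], z[v]))
    return 1.0 / (1.0 + math.exp(-s))

# Toy path graph 0 - 1 - 2 - 3 (an illustrative stand-in for a PPI network).
n = 4
edges = [(0, 1), (1, 2), (2, 3)]
a_hat = normalized_adjacency(edges, n)

# One-hot features: topology only, no protein attributes.
h = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# A single aggregation step with alpha = 0 reduces to 1-hop GCN smoothing.
one_hop = ppr_propagate(a_hat, h, alpha=0.0, steps=1)

# Repeated propagation with teleport mixes in higher-order neighborhoods.
multi_hop = ppr_propagate(a_hat, h, alpha=0.1, steps=10)

print("1-hop weight of node 3 in node 0's embedding:", one_hop[0][3])
print("multi-hop weight of node 3 in node 0's embedding:", multi_hop[0][3])
print("score for candidate link (0, 3):", score_link(multi_hop, 0, 3))
```

In this toy setting, a single aggregation step leaves node 0 with no information about node 3, which lies three hops away, whereas the iterated propagation assigns it a positive weight. The teleport term `alpha * h` keeps each node anchored to its own features, which is the usual motivation for preferring a PageRank-style scheme over simply stacking many aggregation layers.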