Graph-wise Common Latent Factor Extraction for Unsupervised Graph Representation Learning
Unsupervised graph-level representation learning plays a crucial role in a variety of tasks such as molecular property prediction and community analysis, especially when data annotation is expensive. Currently, most of the best-performing graph embedding methods are based on Infomax principle. The performance of these methods highly depends on the selection of negative samples and hurt the performance, if the samples were not carefully selected. Inter-graph similarity-based methods also suffer
... f the selected set of graphs for similarity matching is low in quality. To address this, we focus only on utilizing the current input graph for embedding learning. We are motivated by an observation from real-world graph generation processes where the graphs are formed based on one or more global factors which are common to all elements of the graph (e.g., topic of a discussion thread, solubility level of a molecule). We hypothesize extracting these common factors could be highly beneficial. Hence, this work proposes a new principle for unsupervised graph representation learning: Graph-wise Common latent Factor EXtraction (GCFX). We further propose a deep model for GCFX, deepGCFX, based on the idea of reversing the above-mentioned graph generation process which could explicitly extract common latent factors from an input graph and achieve improved results on downstream tasks to the current state-of-the-art. Through extensive experiments and analysis, we demonstrate that, while extracting common latent factors is beneficial for graph-level tasks to alleviate distractions caused by local variations of individual nodes or local neighbourhoods, it also benefits node-level tasks by enabling long-range node dependencies, especially for disassortative graphs.