INFOGRAPH: UNSUPERVISED AND SEMI-SUPERVISED GRAPH-LEVEL REPRESENTATION LEARNING VIA MUTUAL INFORMATION MAXIMIZATION
Abstract: This paper studies learning the representations of whole graphs in both unsupervised
and semi-supervised scenarios. Graph-level representations are critical in a
variety of real-world applications such as predicting the properties of molecules and
community analysis in social networks. Traditional graph-kernel-based methods are
simple yet effective for obtaining fixed-length representations of graphs, but they
suffer from poor generalization due to their hand-crafted designs. There are also some
recent methods based on language models (e.g., graph2vec), but they tend to
consider only certain substructures (e.g., subtrees) as graph representatives. Inspired by
recent progress in unsupervised representation learning, in this paper we propose
a novel method called InfoGraph for learning graph-level representations. We
maximize the mutual information between the graph-level representation and the
representations of substructures of different scales (e.g., nodes, edges, triangles).
By doing so, the graph-level representations encode aspects of the data that are
shared across different scales of substructures. Furthermore, we propose
InfoGraph*, an extension of InfoGraph for semi-supervised scenarios. InfoGraph*
maximizes the mutual information between unsupervised graph representations
learned by InfoGraph and the representations learned by existing supervised methods.
As a result, the supervised encoder learns from unlabeled data while preserving
the latent semantic space favored by the current supervised task. Experimental
results on the tasks of graph classification and molecular property prediction show
that InfoGraph is superior to state-of-the-art baselines and InfoGraph* can achieve
performance competitive with state-of-the-art semi-supervised models.
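
To make the core objective concrete, the following is a minimal sketch of one way to score mutual information between graph-level representations and substructure representations. The abstract does not specify the estimator, the critic, or the readout, so everything here is an assumption made for illustration: a Jensen-Shannon-style estimator, a plain dot-product critic, nodes as the only substructure scale, and a sum readout. The function names `sum_readout` and `jsd_mi_estimate` are hypothetical and do not come from the paper.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)


def sum_readout(node_repr, node_to_graph, num_graphs):
    # Assumed readout: a graph embedding is the sum of its node embeddings.
    out = np.zeros((num_graphs, node_repr.shape[1]))
    np.add.at(out, node_to_graph, node_repr)
    return out


def jsd_mi_estimate(node_repr, graph_repr, node_to_graph):
    """Jensen-Shannon-style MI estimate (assumed estimator, not from the abstract).

    node_repr:     (num_nodes, d) substructure (here: node) embeddings
    graph_repr:    (num_graphs, d) graph-level embeddings
    node_to_graph: (num_nodes,) integer graph index of each node
    Needs at least two graphs so that negative pairs exist.
    """
    # Dot-product critic: one score per (node, graph) pair.
    scores = node_repr @ graph_repr.T
    # A pair is positive when the node belongs to that graph.
    pos_mask = np.zeros_like(scores, dtype=bool)
    pos_mask[np.arange(len(node_to_graph)), node_to_graph] = True
    e_pos = (-softplus(-scores[pos_mask])).mean()
    e_neg = softplus(scores[~pos_mask]).mean()
    return e_pos - e_neg


# Toy batch: two graphs with two nodes each, embeddings chosen by hand.
node_repr = np.array([[3.0, 0.0], [3.0, 0.0], [0.0, 3.0], [0.0, 3.0]])
node_to_graph = np.array([0, 0, 1, 1])
graph_repr = sum_readout(node_repr, node_to_graph, num_graphs=2)

# Correct pairing versus a deliberately wrong node-to-graph pairing.
aligned = jsd_mi_estimate(node_repr, graph_repr, node_to_graph)
shuffled = jsd_mi_estimate(node_repr, graph_repr, np.array([1, 1, 0, 0]))
```

In this toy batch the estimate is higher when each node is paired with its own graph than when the pairing is swapped, which is the signal an encoder would be trained to increase. A trained model would replace the hand-picked embeddings with the outputs of a graph encoder and would typically use a learned critic.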