Gaussian Mutual Information Maximization for Efficient Graph Self-Supervised Learning: Bridging Contrastive-based to Decorrelation-based

Published: 20 Jul 2024 · Last Modified: 21 Jul 2024 · MM2024 Poster · CC BY 4.0
Abstract: Inspired by the InfoMax principle, Graph Contrastive Learning (GCL) has achieved remarkable performance in processing large amounts of unlabeled graph data. Because mutual information (MI) cannot be computed exactly in practice, conventional contrastive methods approximate its lower bound with parametric neural estimators, which inevitably introduces additional parameters and increases computational complexity. Building on a common Gaussian assumption on the distribution of node representations, a computationally tractable surrogate for the original MI can be rigorously derived, termed Gaussian Mutual Information (GMI). Leveraging the multi-view priors of GCL, we induce an efficient GMI-based contrastive objective with performance guarantees, eliminating the reliance on parameterized estimators and negative samples. In parallel with contrastive approaches, a second, decorrelation-based branch of self-supervised learning has emerged. By positioning the proposed GMI-based objective as a pivot, we bridge the gap between these two research areas from two aspects: approximate form and consistent solution, which contributes to a unified theoretical framework for self-supervised learning. Extensive comparison experiments and visual analysis provide compelling evidence for the effectiveness and efficiency of our method while supporting our theoretical results.
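The tractability the abstract refers to comes from the fact that MI between jointly Gaussian vectors has a closed form in terms of covariance log-determinants, so no neural estimator is needed. The sketch below is an illustration of that closed form estimated from sample covariances, not the paper's actual objective; the function name `gaussian_mi`, the two-view toy data, and the `eps` regularizer are all assumptions for this example.

```python
import numpy as np

def gaussian_mi(x, y, eps=1e-6):
    """Closed-form MI (in nats) between jointly Gaussian vectors,
    estimated from sample covariances:
    I(X;Y) = 0.5 * (logdet(Sx) + logdet(Sy) - logdet(S_joint)).
    eps regularizes the covariance for numerical stability (an assumption)."""
    d = x.shape[1]
    joint = np.hstack([x, y])
    S = np.cov(joint, rowvar=False) + eps * np.eye(2 * d)
    Sx, Sy = S[:d, :d], S[d:, d:]
    _, ld_joint = np.linalg.slogdet(S)
    _, ld_x = np.linalg.slogdet(Sx)
    _, ld_y = np.linalg.slogdet(Sy)
    return 0.5 * (ld_x + ld_y - ld_joint)

rng = np.random.default_rng(0)
z = rng.normal(size=(5000, 8))                # shared latent signal
view1 = z + 0.1 * rng.normal(size=z.shape)    # two correlated "views"
view2 = z + 0.1 * rng.normal(size=z.shape)
noise = rng.normal(size=z.shape)              # independent control

mi_views = gaussian_mi(view1, view2)
mi_noise = gaussian_mi(view1, noise)
print(mi_views > mi_noise)  # correlated views share far more information
```

Under a multi-view setup like GCL's, maximizing such a covariance-based quantity over augmented views avoids both the parametric estimator and the negative samples that the abstract mentions.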
Relevance To Conference: Our paper presents an efficient method for self-supervised learning, with a primary focus on graph data. However, our approach extends naturally to multimodal data, where the different modalities serve as the multiple views in self-supervised learning. Compared to GMIM-IC, GMIM introduces no prior information about the model architecture, making it particularly suitable for self-supervised multimodal information mining. Relative to current self-supervised learning methods based on neural estimators, our approach is markedly more efficient, which is crucial for handling large-scale multimodal data.
Supplementary Material: zip
Primary Subject Area: [Content] Multimodal Fusion
Submission Number: 1489