$f$-Mutual Information Contrastive Learning

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 (ICLR 2022 Submission)
Keywords: contrastive learning, f-divergence, mutual information
Abstract: Self-supervised contrastive learning is an emerging field due to its power in providing good data representations. This learning paradigm widely adopts the InfoNCE loss, which is closely connected with maximizing the mutual information. In this work, we propose the $f$-Mutual Information Contrastive Learning framework ($f$-MICL), which directly maximizes the $f$-divergence-based generalization of mutual information. We theoretically prove that, under mild assumptions, $f$-MICL naturally attains alignment for positive pairs and uniformity for data representations, the two main factors behind the success of contrastive learning. We further provide theoretical guidance on designing the similarity function and choosing effective $f$-divergences for $f$-MICL. On several benchmark tasks from both vision and natural language, we empirically verify that our method outperforms or performs on par with state-of-the-art strategies.
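As background for the abstract, a minimal NumPy sketch of the standard InfoNCE loss that the paper generalizes (this illustrates the baseline objective, not the paper's $f$-MICL loss; the temperature value and batch layout are assumptions):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE loss for a batch of positive pairs (z1[i], z2[i]).

    z1, z2: (n, d) arrays of L2-normalized embeddings, where row i of
    z2 is the positive counterpart of row i of z1 and all other rows
    serve as in-batch negatives.
    """
    # Cosine similarities between every anchor in z1 and candidate in z2.
    sim = z1 @ z2.T / temperature  # shape (n, n)
    # Numerically stable log-softmax over candidates per anchor.
    logits = sim - sim.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Diagonal entries are the positives; average their negative log-likelihood.
    return -np.mean(np.diag(log_prob))
```

Maximizing agreement under this loss is what the abstract ties to mutual-information maximization; $f$-MICL replaces the underlying KL-based mutual information with its $f$-divergence generalization.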
One-sentence Summary: We provide a new framework for self-supervised contrastive learning by directly maximizing the $f$-divergence-based generalization of mutual information.