The Benefits of Balance: From Information Projections to Variance Reduction

Published: 25 Sept 2024, Last Modified: 16 Jan 2025NeurIPS 2024 posterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: regularized optimal transport, self-supervised learning, variance reduction, alternating projection
TL;DR: We present a data balancing approach to distribution estimation that provides theoretical interpretations of the various self-supervised training schemes.
Abstract: Data balancing across multiple modalities and sources appears in various forms in foundation models in machine learning and AI, e.g., in CLIP and DINO. We show that data balancing across modalities and sources actually offers an unsuspected benefit: variance reduction. We present a non-asymptotic statistical bound that quantifies this variance reduction effect and relates it to the eigenvalue decay of Markov operators. Furthermore, we describe how various forms of data balancing in contrastive multimodal learning and self-supervised clustering can be better understood, and even improved upon, owing to our variance reduction viewpoint.
Supplementary Material: zip
Primary Area: Learning theory
Submission Number: 8269
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview