Asymptotic and Finite-Time Guarantees for Langevin-Based Temperature Annealing in InfoNCE

Published: 22 Sept 2025, Last Modified: 01 Dec 2025 · NeurIPS 2025 Workshop · CC BY 4.0
Keywords: Contrastive Learning, Simulated Annealing, Langevin Dynamics, Temperature Schedules, Stochastic Optimization, Representation Learning, Machine Learning Theory
TL;DR: We establish a formal link between temperature annealing in InfoNCE and classical simulated annealing, proving that a slow logarithmic temperature schedule guarantees convergence to globally optimal representations.
Abstract: The InfoNCE loss in contrastive learning depends critically on a temperature parameter, yet its dynamics under fixed versus annealed schedules remain poorly understood. We provide a theoretical analysis by modeling embedding evolution under Langevin dynamics on a compact Riemannian manifold. Under mild smoothness and energy-barrier assumptions, we show that classical simulated annealing guarantees extend to this setting: slow logarithmic inverse-temperature schedules ensure convergence in probability to a set of globally optimal representations, while faster schedules risk becoming trapped in suboptimal minima. Our results establish a link between contrastive learning and simulated annealing, providing a principled basis for understanding and tuning temperature schedules.
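The abstract's central objects — the temperature-scaled InfoNCE loss and a slow logarithmic inverse-temperature schedule — can be sketched in code. This is a minimal illustration, not the paper's method: the cosine-similarity in-batch formulation of InfoNCE and the specific schedule `tau(t) = c / log(t + 2)` (so that the inverse temperature grows logarithmically, as in classical simulated annealing) are assumptions for the example.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature):
    """InfoNCE loss with in-batch negatives and cosine similarity."""
    # L2-normalize embeddings so the dot product is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature  # (N, N); row i's positive is column i
    # Numerically stable log-softmax over each row
    m = logits.max(axis=1, keepdims=True)
    log_probs = logits - (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True)))
    # Loss is the average negative log-probability of the matching pair
    return -np.mean(np.diag(log_probs))

def log_temperature_schedule(step, c=1.0):
    """Illustrative annealing schedule: tau(t) = c / log(t + 2), so the
    inverse temperature beta(t) = log(t + 2) / c grows logarithmically,
    mirroring the slow schedules in classical simulated-annealing results."""
    return c / np.log(step + 2)
```

Under such a schedule the temperature decays very slowly (e.g. it is still above `c / log(1002)` after a thousand steps), which reflects the paper's claim that only sufficiently slow cooling guarantees escape from suboptimal minima, while faster schedules risk trapping.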
Submission Number: 15