Contrastive Learning Through Time

29 Sept 2021 (modified: 13 Feb 2023), ICLR 2022 Conference Withdrawn Submission, Readers: Everyone
Keywords: contrastive learning, object recognition, virtual environment, temporal coherence
Abstract: Contrastive learning has emerged as a powerful form of unsupervised representation learning for images. The utility of the learned representations for downstream tasks depends strongly on the chosen augmentation operations. Taking inspiration from biology, we study contrastive learning through time (CLTT), which works entirely without augmentation operations. Instead, positive pairs of images are generated from temporally close video frames recorded during extended naturalistic interaction with objects. To this end, we develop a new data set using a near-photorealistic training environment based on ThreeDWorld (TDW). We propose a family of CLTT algorithms based on state-of-the-art contrastive learning methods and demonstrate that CLTT achieves linear classification performance approaching that of the fully supervised setting. We also consider temporal correlations that arise when one object is systematically seen before or after another object. We show that such correlations increase the representational similarity between these objects, matching classic biological findings. We argue that this "close in time, will align" effect is generically useful for learning abstract representations. The data sets, code and pre-trained models for this paper can be downloaded at: (link will be added in the final version).
One-sentence Summary: We study how object representations form under contrastive learning through time, where temporally close video frames are mapped onto nearby latent representations.
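Illustrative sketch: The idea of drawing positive pairs from temporally close frames, rather than from augmentations, can be illustrated with a minimal sketch. This is not the authors' code. The function names (`sample_temporal_pairs`, `info_nce_loss`), the `max_offset` and `temperature` parameters, and the choice of an InfoNCE-style loss are all assumptions made for illustration; the abstract only states that CLTT builds on existing contrastive learning methods. Plain Python is used so the example has no dependencies.

```python
import math
import random


def sample_temporal_pairs(num_frames, max_offset, batch_size, rng):
    """Sample (anchor, positive) frame indices from one video.

    The positive is a frame at most `max_offset` steps after the anchor,
    so no image augmentation is needed. (Hypothetical helper.)
    """
    pairs = []
    for _ in range(batch_size):
        anchor = rng.randrange(0, num_frames - max_offset)
        positive = anchor + rng.randint(1, max_offset)  # inclusive bounds
        pairs.append((anchor, positive))
    return pairs


def _normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss over a batch of embedding pairs.

    Each anchor should be most similar to its own positive; the other
    positives in the batch act as negatives. (Assumed loss, for illustration.)
    """
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, a_i in enumerate(a):
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_temporal_pairs(num_frames=100, max_offset=3, batch_size=4, rng=rng))
```

In a training loop, the sampled index pairs would select frames that are passed through an encoder, and the resulting embeddings would be fed to the loss. Minimizing such a loss pulls temporally close frames together in latent space, which is the "close in time, will align" effect described in the abstract.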