Language Guided Representation Learning

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: representation learning, generalization, natural language, shortcut learning, continual learning, language guidance
TL;DR: Exploring natural language guidance in vision models to improve representations, generalization, robustness and continual learning
Abstract:

Deep neural networks have achieved notable success; however, they still encounter significant challenges compared to humans, particularly in areas such as shortcut learning, texture bias, susceptibility to noise, and catastrophic forgetting, all of which hinder their ability to generalize and adapt. Humans excel in learning high-level abstractions, attributed to various mechanisms in the brain, including reasoning, explanation, and the ability to share concepts verbally—largely facilitated by natural language as a tool for abstraction and systematic generalization. Inspired by this, we investigate how language can be leveraged to guide representation learning. To this end, we explore two approaches to language guidance: Explicit Language Guidance, which introduces direct and verbalizable insights into the model, and Implicit Language Guidance, which provides more intuitive and indirect cues. Our extensive empirical analysis shows that, despite being trained exclusively on text, these methods provide supervision to vision encoders, resulting in improvements in generalization, robustness, and task adaptability in continual learning. These findings underscore the potential of language-guided learning to develop AI systems that can benefit from abstract, high-level concepts, similar to human cognitive abilities.
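The abstract states that text-trained guidance supervises a vision encoder, but gives no implementation details. As a purely illustrative sketch (function and variable names are hypothetical, not from the paper), one common form of such explicit language guidance is an alignment loss that pulls each image's features toward a text embedding of its class description:

```python
import numpy as np

def language_guidance_loss(img_feats, txt_feats, labels):
    """Hypothetical alignment loss: penalize cosine distance between each
    image feature and the text embedding of its class description.
    img_feats: (N, D) vision-encoder outputs
    txt_feats: (C, D) frozen text-encoder embeddings of class descriptions
    labels:    (N,)   class index per image
    """
    # L2-normalize both modalities so the dot product is cosine similarity
    img = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    txt = txt_feats / np.linalg.norm(txt_feats, axis=1, keepdims=True)
    # similarity between each image and its own class's text embedding
    sims = np.sum(img * txt[labels], axis=1)
    # loss in [0, 2]: 0 when perfectly aligned, 2 when opposite
    return float(np.mean(1.0 - sims))

rng = np.random.default_rng(0)
img_feats = rng.normal(size=(4, 8))   # 4 images, 8-dim features
txt_feats = rng.normal(size=(3, 8))   # 3 class-description embeddings
labels = np.array([0, 1, 2, 0])
loss = language_guidance_loss(img_feats, txt_feats, labels)
```

This loss would typically be added to a standard task objective during training; implicit guidance, by contrast, would shape representations without such a direct per-class alignment term. Both readings are assumptions about the paper's method, not confirmed details.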

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11793