Track: Short Paper
Abstract: In this work, we investigate whether self-supervised learning (SSL) can abstract visual information into structured symbolic representations without direct supervision. Drawing inspiration from information theory, we explore the use of message encoding to capture complex visual data in symbolic sequences. Our goal is to generate meaningful image descriptions purely from these representations, relying on inherent data patterns rather than explicit labels or guidance. To achieve this, we employ SSL techniques, specifically DINO, hypothesizing that the encoded symbolic sequences will closely reflect the underlying structure of the visual content. This approach aims to determine whether visual data can be effectively distilled into symbolic forms that preserve essential meaning and structure.
Submission Number: 7