Keywords: fMRI, Hyperbolic Space, Lorentz, cross-modal
Abstract: Understanding the intricate mappings between visual stimuli and their corresponding neural responses is a fundamental challenge in cognitive neuroscience and artificial intelligence. Current vision-brain representation learning approaches predominantly align paired images and functional magnetic resonance imaging (fMRI) responses within a shared Euclidean embedding space. However, Euclidean geometry struggles to accommodate the exponential complexity of visual and neural hierarchies, yielding embeddings with poor semantic discrimination. To overcome this, we propose HypBrain, a novel framework that leverages hyperbolic geometry to learn a shared, cross-subject vision-brain representation. Our framework maps both visual information and multi-subject fMRI responses into a shared Lorentz model, a geometry uniquely suited to embedding hierarchical data. We introduce a mapping scheme in which abstract visual concepts are embedded near the hyperbolic origin, while more specific fMRI responses are situated in the exponentially expanding periphery, naturally capturing the “entailment” relationship between visual and neural data. Notably, we train a hyperbolic encoder on multi-subject fMRI data to integrate both common and subject-specific characteristics of brain responses. Experimental results demonstrate that HypBrain not only quantifies semantic alignment accurately but also achieves substantial gains in capturing cross-modal semantic relationships solely by optimizing the geometric properties of the embedding space. Our method confirms the advantage of hyperbolic geometry for aligning cross-modal semantic representations and modeling hierarchical associations, thereby offering a new perspective on vision-brain representation learning.
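To make the geometric idea concrete, below is a minimal sketch (not the paper's implementation) of how Euclidean features from hypothetical image and fMRI encoders could be lifted onto the Lorentz model via the exponential map at the origin and compared with the Lorentzian geodesic distance. The curvature value, feature dimensions, and encoder outputs are illustrative assumptions.

```python
import torch

def lorentz_inner(x, y):
    # Lorentzian inner product: <x, y>_L = -x_0 * y_0 + sum_i x_i * y_i
    return -x[..., 0] * y[..., 0] + (x[..., 1:] * y[..., 1:]).sum(dim=-1)

def expmap_origin(v, c=1.0):
    # Lift a Euclidean feature vector v (treated as a tangent vector at the
    # hyperboloid origin) onto the Lorentz model of curvature -c.
    sqrt_c = c ** 0.5
    v_norm = v.norm(dim=-1, keepdim=True).clamp_min(1e-8)
    x_time = torch.cosh(sqrt_c * v_norm) / sqrt_c                   # time-like coordinate
    x_space = torch.sinh(sqrt_c * v_norm) * v / (sqrt_c * v_norm)   # space-like coordinates
    return torch.cat([x_time, x_space], dim=-1)

def lorentz_distance(x, y, c=1.0):
    # Geodesic distance on the hyperboloid: d(x, y) = arccosh(-c <x, y>_L) / sqrt(c)
    sqrt_c = c ** 0.5
    inner = (-c * lorentz_inner(x, y)).clamp_min(1.0 + 1e-7)
    return torch.acosh(inner) / sqrt_c

# Toy usage: align a hypothetical image embedding with a hypothetical fMRI embedding.
img_feat = torch.randn(4, 64)    # stand-in for an image encoder output
fmri_feat = torch.randn(4, 64)   # stand-in for a multi-subject fMRI encoder output
img_hyp = expmap_origin(img_feat)
fmri_hyp = expmap_origin(fmri_feat)
print(lorentz_distance(img_hyp, fmri_hyp))  # per-pair hyperbolic distances
```

In this sketch, points with small spatial norm stay near the hyperbolic origin (where abstract visual concepts would sit under the proposed mapping), while larger-norm features land in the expanding periphery, which is the region the abstract assigns to specific fMRI responses.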
Primary Area: applications to neuroscience & cognitive science
Submission Number: 7315