Thermodynamic Binding: Freezing Chimeric States in Multi-Modal Associative Memories

Published: 03 Mar 2026, Last Modified: 06 Mar 2026 · NFAM 2026 Poster · CC BY 4.0
Keywords: associative memory, multi-modal learning, binding problem, energy-based models, transformer attention
TL;DR: mTAM resolves the multi-modal binding problem by enforcing global representational consistency through a shared consensus bottleneck derived from a global energy functional.
Abstract: We introduce the Multi-modal Transformer Associative Memory (mTAM), an energy-based architecture extending Exponential Associative Memories to multi-modal retrieval. Standard multi-modal Transformers suffer from the "binding problem": decoupled attention admits chimeric states in which different modalities lock onto inconsistent concepts, which is catastrophic for unified representations. mTAM resolves this by construction via Consensus Split-Bank Attention (CSA), which couples retrieval across all modalities through a single shared probability distribution derived from a Difference-of-Convex (DC) energy functional. We prove monotonic convergence to stationary points and establish, via a planted Random Energy Model analysis, that storage capacity depends on a topology-dependent effective variance $\Sigma_{\mathrm{eff}}^2$ controlled by the harmonic in-degree sum. Empirically, mTAM exhibits sharp thermodynamic phase transitions, resolves multi-way chimeric conflicts in one consensus step where baselines fail permanently, and confirms that graph topology quantitatively controls the capacity frontier.
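The consensus idea in the abstract — every modality retrieving through one shared attention distribution rather than its own — can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the function names, array shapes, score function (a plain dot product), and the inverse temperature `beta` are all assumptions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def consensus_retrieve(banks, queries, beta=4.0):
    """Toy consensus retrieval over K stored concepts.

    banks:   dict modality -> (K, d_m) array; row k is concept k's
             pattern in that modality.
    queries: dict modality -> (d_m,) query vector.

    Decoupled attention would softmax each modality's scores
    separately, allowing a "chimeric" read-out (modality A attends
    to concept i while modality B attends to concept j). Here the
    per-modality similarities are summed into ONE score per concept,
    so a single shared distribution drives every modality.
    """
    K = next(iter(banks.values())).shape[0]
    scores = np.zeros(K)
    for m, bank in banks.items():
        scores += bank @ queries[m]          # aggregate evidence per concept
    p = softmax(beta * scores)               # single shared distribution
    # Every modality reads out with the SAME weights p.
    return {m: p @ bank for m, bank in banks.items()}

# Two concepts, two modalities, with deliberately conflicting cues:
# the "img" query favors concept 0, the "txt" query favors concept 1.
banks = {
    "img": np.array([[1.0, 0.0], [0.0, 1.0]]),
    "txt": np.array([[0.0, 1.0], [1.0, 0.0]]),
}
queries = {"img": np.array([2.0, 0.0]), "txt": np.array([1.0, 0.0])}
out = consensus_retrieve(banks, queries)
# The stronger img evidence wins globally, so BOTH modalities
# converge on concept 0's patterns instead of splitting.
```

Note the design point this toy highlights: with per-modality softmaxes the conflicting queries above would retrieve concept 0 in `img` and concept 1 in `txt`; the shared distribution forces a single global winner.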
Submission Number: 36