Thermodynamic Binding: Freezing Chimeric States in Multi-Modal Associative Memories
Keywords: associative memory, multi-modal learning, binding problem, energy-based models, transformer attention
TL;DR: mTAM resolves the multi-modal binding problem by enforcing global representational consistency through a shared consensus bottleneck derived from a global energy functional.
Abstract: We introduce the Multi-modal Transformer Associative Memory (mTAM), an energy-based architecture extending Exponential Associative Memories to multi-modal retrieval.
Standard multi-modal Transformers suffer from the "binding problem": decoupled attention admits chimeric states in which different modalities lock onto inconsistent concepts, a failure mode catastrophic for unified representations.
mTAM resolves this by construction via Consensus Split-Bank Attention (CSA), coupling retrieval across all modalities through a single shared probability distribution derived from a Difference-of-Convex (DC) energy functional.
We prove monotonic convergence to stationary points and establish, via planted Random Energy Model analysis, that storage capacity depends on a topology-dependent effective variance $\Sigma_{\mathrm{eff}}^2$ controlled by the harmonic in-degree sum.
Empirically, mTAM exhibits sharp thermodynamic phase transitions, resolves multi-way chimeric conflicts in one consensus step where baselines fail permanently, and confirms that graph topology quantitatively controls the capacity frontier.
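A minimal sketch of the consensus mechanism described above, under the assumption that Consensus Split-Bank Attention pools per-modality similarity scores into one shared softmax over stored concepts before read-out (the exact energy functional and update rule are defined in the paper, not here); all names and shapes are illustrative:

```python
import numpy as np

def consensus_readout(queries, banks, beta=4.0):
    """Hypothetical consensus read-out: one shared probability
    distribution over stored concepts couples retrieval across all
    modality banks, ruling out chimeric (mixed-concept) retrievals.

    queries: dict modality -> (d_m,) query vector
    banks:   dict modality -> (K, d_m) array of K stored patterns;
             row k of every bank corresponds to the same concept k.
    beta:    inverse temperature controlling retrieval sharpness.
    """
    # Per-modality similarity of each query to every stored concept.
    scores = {m: banks[m] @ queries[m] for m in banks}   # each (K,)

    # Consensus bottleneck (assumed form): sum scores over modalities
    # BEFORE the softmax, so every modality shares the same distribution.
    total = sum(scores.values())                          # (K,)
    p = np.exp(beta * (total - total.max()))
    p /= p.sum()                                          # shared softmax

    # Each modality reads out with the same weights p, so all outputs
    # commit to the same concept rather than inconsistent ones.
    return {m: p @ banks[m] for m in banks}, p
```

In contrast, a decoupled baseline would apply a separate softmax to each modality's scores, which is exactly what permits the chimeric states the abstract describes.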
Submission Number: 36