Quantum entanglement for attention models

TMLR Paper 5690 Authors

21 Aug 2025 (modified: 02 Sept 2025) · Under review for TMLR · CC BY 4.0
Abstract: Attention mechanisms in deep learning establish relationships between different positions within a sequence, enabling models such as Transformers to generate effective outputs by focusing on relevant input segments and their relations. The performance of Transformers depends strongly on the chosen attention mechanism, with different approaches trading off computational cost, memory efficiency, and generalization ability depending on the task. Quantum machine learning models have the potential to outperform their classical counterparts in specialized settings, which makes exploring the benefits of quantum resources within classical machine learning models a promising research direction. However, the role of entanglement in quantum machine learning, whether in fully quantum models or as a subroutine in classical-quantum hybrid models, remains poorly understood. In this work, we investigate whether entanglement can be used to model nuanced correlations in classical data, analogous to its role in many-body systems, and whether it can serve as a resource to improve the performance of the attention layer in Transformers. We introduce an entanglement entropy-based attention layer within a classical Transformer architecture and evaluate it numerically across various datasets. Our experiments on standard classification tasks in both vision and NLP domains show that the entanglement-based attention layer outperforms existing quantum attention frameworks and the widely used quantum kernel attention models, particularly in the presence of noise. Our work contributes toward understanding the power of quantum resources as subroutines in the classical-quantum hybrid setting to further enhance classical models.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Christopher_Mutschler1
Submission Number: 5690
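The abstract does not spell out how entanglement entropy enters the attention computation, so the following is a minimal, hypothetical sketch rather than the authors' construction: each query/key pair is mapped to rotation angles on two qubits, an entangling CNOT is applied, and the von Neumann entropy of one qubit's reduced density matrix is used as the raw attention score before a softmax over keys. The helper names two_qubit_state, entanglement_entropy, and entanglement_attention are illustrative assumptions, not names from the paper.

```python
# Hypothetical sketch: attention scores from the entanglement entropy of a
# two-qubit state encoding a query/key pair (illustration only, not the
# authors' exact circuit or layer).
import numpy as np


def two_qubit_state(theta_q: float, theta_k: float) -> np.ndarray:
    """Encode a query angle and a key angle as RY rotations on |0>, then entangle with a CNOT."""
    q = np.array([np.cos(theta_q / 2), np.sin(theta_q / 2)])
    k = np.array([np.cos(theta_k / 2), np.sin(theta_k / 2)])
    state = np.kron(q, k)  # product state |q> (x) |k>
    cnot = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=float)
    return cnot @ state  # entangling gate


def entanglement_entropy(state: np.ndarray) -> float:
    """Von Neumann entropy of the reduced density matrix of the first qubit."""
    psi = state.reshape(2, 2)          # rows: qubit A, columns: qubit B
    rho_a = psi @ psi.conj().T         # partial trace over qubit B
    eigvals = np.linalg.eigvalsh(rho_a)
    eigvals = eigvals[eigvals > 1e-12]
    return float(-np.sum(eigvals * np.log2(eigvals)))


def entanglement_attention(queries: np.ndarray, keys: np.ndarray) -> np.ndarray:
    """Attention weights where the raw score of pair (i, j) is the entanglement
    entropy of the two-qubit state built from query i and key j."""
    # Map scalar features to rotation angles in [0, pi]
    angles_q = np.pi * (queries - queries.min()) / (np.ptp(queries) + 1e-9)
    angles_k = np.pi * (keys - keys.min()) / (np.ptp(keys) + 1e-9)
    scores = np.array([[entanglement_entropy(two_qubit_state(tq, tk))
                        for tk in angles_k] for tq in angles_q])
    # Softmax over keys, as in standard attention
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


if __name__ == "__main__":
    q = np.array([0.1, 0.7, 0.3])  # toy one-dimensional query features
    k = np.array([0.2, 0.9, 0.5])  # toy one-dimensional key features
    print(entanglement_attention(q, k))
```

In this toy version the quantum circuit is simulated exactly with NumPy; a hybrid implementation would instead estimate the reduced-state entropy from circuit measurements, which is where the paper's reported robustness to noise would become relevant.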