Understanding and Improving Attention Mechanisms with ROPE in Computer Vision Applications

30 Oct 2024 (modified: 05 Nov 2024), THU 2024 Fall AML Submission, CC BY 4.0
Keywords: ROPE, Attention, Transformers, Machine Learning
Abstract: This research proposal presents a comprehensive investigation into attention mechanisms and Rotary Position Embeddings (ROPE) in the context of computer vision applications. Building upon recent advances [Heo et al., 2024], we address two fundamental challenges: the interpretability of attention-based models in safety-critical applications and the optimization of attention mechanisms through ROPE for vision tasks. Our work contributes to the field by proposing novel frameworks for attention visualization, developing enhanced ROPE variants for vision applications, and establishing quantitative metrics for attention map analysis. The proposed research has significant implications for improving the reliability and interpretability of vision transformers in critical applications.
Submission Number: 41