Keywords: Transformers, Vision Transformers, Human Visual System, Foveation
Abstract: The human visual system (HVS) employs foveated sampling and eye movements to achieve efficient perception, conserving both metabolic energy and computational resources. Drawing inspiration from this efficiency, we introduce the $\textit{Foveated Dynamic Vision Transformer (FDT)}$, a novel architecture that integrates these mechanisms into a vision transformer framework. Unlike existing models, the FDT operates in a single pass, using fixation and foveation modules to improve both computational efficiency and accuracy. The fixation module identifies fixation points to filter out irrelevant information, while the foveation module generates foveated embeddings that carry multi-scale information. Our findings show that the FDT achieves superior accuracy and computational efficiency, with a 34\% reduction in multiply-accumulate operations. Additionally, the FDT is robust to various types of noise and adversarial attacks without being specifically trained for them. These attributes make the FDT a significant step toward artificial neural networks that mirror the efficiency, robustness, and adaptability of the HVS.
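The abstract does not specify how the foveated embeddings are computed. As a rough illustration of multi-scale foveated sampling in general, the sketch below crops windows of increasing size around a fixation point and average-pools each one to the same small grid, so the center is seen at full resolution and the periphery at coarser resolution. The function name `foveated_crops` and the parameters `base` and `scales` are hypothetical and are not taken from the paper; this is not the FDT's foveation module.

```python
def foveated_crops(image, fixation, base=4, scales=(1, 2, 4)):
    """Illustrative multi-scale sampling around a fixation point.

    image:    2D list of floats (H rows x W columns).
    fixation: (row, col) of the assumed fixation point.
    base:     side length of every output grid (hypothetical parameter).
    scales:   pooling factors; scale s covers a (base*s) x (base*s) window.

    Returns one base x base grid per scale. Windows are shifted as needed
    so that they stay inside the image.
    """
    height, width = len(image), len(image[0])
    grids = []
    for s in scales:
        side = base * s
        if side > height or side > width:
            raise ValueError("window for scale %d does not fit the image" % s)
        # Center the window on the fixation point, clamped to the image.
        top = min(max(fixation[0] - side // 2, 0), height - side)
        left = min(max(fixation[1] - side // 2, 0), width - side)
        grid = []
        for i in range(base):
            row = []
            for j in range(base):
                # Average-pool one s x s block into a single value.
                total = 0.0
                for di in range(s):
                    for dj in range(s):
                        total += image[top + i * s + di][left + j * s + dj]
                row.append(total / (s * s))
            grid.append(row)
        grids.append(grid)
    return grids


# Usage: a 16 x 16 ramp image, fixating near the center.
image = [[float(r * 16 + c) for c in range(16)] for r in range(16)]
crops = foveated_crops(image, fixation=(8, 8))
```

With these settings the three grids hold 48 values in total, against 256 pixels in the full image. This ratio only illustrates why foveated sampling can reduce the amount of input processed; it is unrelated to the 34\% figure reported in the abstract.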
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11817