ReViT: Enhancing vision transformers feature diversity with attention residual connections

Published: 01 Jan 2024 | Last Modified: 05 Mar 2025 | Pattern Recognit. 2024 | CC BY-SA 4.0
Abstract (Highlights):
• Vision transformers suffer from feature collapse in deeper layers.
• Residual attention counteracts feature collapse.
• Vision transformers with residual attention learn better representations.
• Residual attention improves ViT performance on visual recognition tasks.
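The highlights describe adding residual connections over the attention maps themselves, so deeper layers retain access to the more diverse attention patterns computed earlier. The sketch below is a minimal PyTorch illustration of that general idea, not the authors' exact ReViT formulation; the mixing weight `alpha` and the choice to blend post-softmax attention maps are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ResidualAttention(nn.Module):
    """Self-attention block that mixes the current attention map with the
    previous layer's map (a generic sketch of attention residual connections)."""

    def __init__(self, dim, num_heads=8, alpha=0.5):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.alpha = alpha  # hypothetical mixing weight between current and previous attention
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, prev_attn=None):
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)           # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale  # scaled dot-product scores
        attn = attn.softmax(dim=-1)
        if prev_attn is not None:
            # Residual connection on the attention maps: blend with the previous layer's map.
            attn = self.alpha * attn + (1.0 - self.alpha) * prev_attn
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out), attn  # return attn so the next block can reuse it


# Usage sketch: thread the attention map through a stack of blocks.
blocks = nn.ModuleList([ResidualAttention(dim=192) for _ in range(4)])
x, attn = torch.randn(2, 197, 192), None
for blk in blocks:
    x, attn = blk(x, attn)
```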