DiagSWin: A multi-scale vision transformer with diagonal-shaped windows for object detection and segmentation

Published: 01 Jan 2024, Last Modified: 26 Jun 2025. Venue: Neural Networks 2024. License: CC BY-SA 4.0
Abstract: Highlights
• Diagonal-shaped window attention has lower computational cost and fewer parameters.
• Multi-scale feature extraction is combined within a single self-attention layer.
• The proposed method can easily capture multi-scale objects in high-resolution images.