Dynamic Token Normalization improves Vision Transformers

2022 (modified: 18 Nov 2022) | ICLR 2022 | Readers: Everyone
Abstract: Vision Transformer (ViT) and its variants (e.g., Swin, PVT) have achieved great success in various computer vision tasks, owing to their capability to learn long-range contextual information. Layer...