ShareFormer: Share Attention for Efficient Image Restoration

17 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Efficient Image Restoration, Image Super Resolution, Image Denoising, Self Attention, Transformer
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Transformer-based networks are gaining popularity due to their superior ability to capture long-range information. However, they come with significant drawbacks, such as long inference times and challenging training, and these limitations become even more pronounced in high-resolution image restoration tasks. We observe a trade-off between a model's latency and its trainability: adding a convolutional module improves trainability but does not reduce latency, while sparsification notably reduces latency but makes the network harder to optimize. To address these issues, we propose ShareFormer, a novel Transformer for image restoration that delivers strong performance with lower latency and better trainability than other Transformer-based methods. It achieves this by sharing attention maps among neighboring blocks in the network, which considerably improves inference speed. To preserve the model's information flow, residual connections are added to the "Value" branch of self-attention. Ablation studies indicate that these "Value" residuals aggregate the shallow Transformer blocks that share attention, introducing a local inductive bias and making the network easier to optimize without additional convolution. Extensive experimental results support the effectiveness, efficiency, and trainability of ShareFormer. Our code and pre-trained models will be open-sourced upon publication.
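Since the code is not yet released, the following is a minimal, hypothetical PyTorch sketch of the two mechanisms the abstract describes: reusing an attention map computed by a neighboring block, and a residual connection on the "Value" branch. The module name SharedAttention, the shapes, and the pairing of blocks are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from typing import Optional


class SharedAttention(nn.Module):
    """Hypothetical sketch: a self-attention block that can reuse an
    attention map from a neighboring block, with a residual on "Value"."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        # Q/K projections are only exercised when this block computes its
        # own attention map; a "sharing" block skips them entirely.
        self.qk = nn.Linear(dim, dim * 2, bias=False)
        self.v = nn.Linear(dim, dim, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor,
                shared_attn: Optional[torch.Tensor] = None):
        B, N, C = x.shape  # (batch, tokens, channels)
        v = self.v(x).reshape(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        if shared_attn is None:
            q, k = (self.qk(x)
                    .reshape(B, N, 2, self.num_heads, self.head_dim)
                    .permute(2, 0, 3, 1, 4))
            attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        else:
            # Reuse the neighbor's attention map: no Q/K projection, no softmax.
            attn = shared_attn
        out = attn @ v + v  # residual on "Value" preserves information flow
        out = out.transpose(1, 2).reshape(B, N, C)
        return self.proj(out), attn


# Usage: the second block skips the Q/K/softmax cost by reusing the map.
x = torch.randn(2, 64, 128)
block_a, block_b = SharedAttention(128), SharedAttention(128)
y, attn = block_a(x)                 # computes and exposes its attention map
z, _ = block_b(y, shared_attn=attn)  # neighboring block reuses the map
```

Under these assumptions, the latency saving comes from the sharing block avoiding the quadratic Q/K matmul and softmax, while the added `+ v` term is what the abstract credits with keeping the stacked shallow blocks easy to optimize.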
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 977