Vision Transformer with Irregular Attention

24 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Deep Learning, DNN, Transformer, ViT, DeiT, Tensor Decomposition, Tensor Network, BTD, BTD-LL1, CPD, DNN Compression, DNN Acceleration
TL;DR: A novel method for reducing the computational complexity of the multi-head self-attention mechanism in Vision Transformers, based on the BTD-LL1 decomposition.
Abstract: Compressing Transformers is a natural demand that has arisen in the computer vision community. Apart from quantization, which relies heavily on hardware support, sparsification is another way to remove redundant parts, usually based on mask training or sparsity regularization. We propose a novel compressed structure for the multi-head self-attention (MHSA) mechanism called Irregular Attention (IAtt). IAtt is built on the BTD-LL1 tensor decomposition and is aimed at sparsifying a pre-trained Vision Transformer by pruning the query-key (QK) contraction dimension in the MHSA block. We derive a rank-selection algorithm for BTD-LL1 based on the structure of the fusion layer obtained from the CP decomposition of the original MHSA kernels. To improve the compression ratio with the least possible quality loss, we introduce fine-tuning schemes that assign each head its own sub-optimal QK rank in IAtt. We validated the proposed scheme on DeiT architectures using the ILSVRC-2012 dataset. Our results show that IAtt outperforms the original MHSA compressed by SVD, indicating that attention heads have non-uniform importance and require different QK contraction dimensions.
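To make the idea of per-head QK rank pruning concrete, below is a minimal PyTorch sketch of multi-head self-attention in which each head h contracts queries and keys over its own reduced dimension r_h instead of a shared head dimension. This is an illustrative plain low-rank QK factorization with hypothetical names and ranks, not the paper's BTD-LL1 construction or rank-selection procedure; the DeiT-like sizes (embedding dim 384, 6 heads) are assumptions for the example only.

```python
import torch
import torch.nn as nn


class LowRankQKAttention(nn.Module):
    """Illustrative MHSA where head h uses its own QK contraction rank r_h.

    A sketch of the general idea (per-head low-rank QK contraction),
    not the BTD-LL1-based IAtt block described in the abstract.
    """

    def __init__(self, dim, head_ranks, head_dim_v=64):
        super().__init__()
        self.num_heads = len(head_ranks)
        self.head_ranks = head_ranks
        self.head_dim_v = head_dim_v
        # Separate query/key projections per head, each with its own rank r_h.
        self.q_projs = nn.ModuleList([nn.Linear(dim, r, bias=False) for r in head_ranks])
        self.k_projs = nn.ModuleList([nn.Linear(dim, r, bias=False) for r in head_ranks])
        # Values and output projection keep the usual full-width layout.
        self.v_proj = nn.Linear(dim, self.num_heads * head_dim_v, bias=False)
        self.out_proj = nn.Linear(self.num_heads * head_dim_v, dim, bias=False)

    def forward(self, x):  # x: (batch, tokens, dim)
        b, n, _ = x.shape
        v = self.v_proj(x).view(b, n, self.num_heads, self.head_dim_v)
        outs = []
        for h in range(self.num_heads):
            q = self.q_projs[h](x)  # (b, n, r_h)
            k = self.k_projs[h](x)  # (b, n, r_h)
            # Attention scores contract over the reduced dimension r_h.
            attn = (q @ k.transpose(-2, -1)) * self.head_ranks[h] ** -0.5
            attn = attn.softmax(dim=-1)
            outs.append(attn @ v[:, :, h])  # (b, n, head_dim_v)
        return self.out_proj(torch.cat(outs, dim=-1))


# Hypothetical usage: heads receive non-uniform QK ranks.
attn = LowRankQKAttention(dim=384, head_ranks=[48, 32, 16, 16, 8, 8])
y = attn(torch.randn(2, 197, 384))  # 197 tokens as in a 224x224 ViT/DeiT input
```

In this sketch the per-head ranks are fixed by hand; the abstract's point is that such ranks should be chosen per head (here via BTD-LL1 and a CP-derived rank-selection rule), precisely because heads contribute unequally.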
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9363