Connectivity-based Token Condensation for Efficient Vision Transformer

22 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Vision Transformer, Token Condensation, Connectivity
Abstract: The high computational cost of vision transformers hinders their deployment on resource-limited devices such as mobile phones. Reducing the number of tokens can significantly accelerate inference and save computational resources. Most existing token pruning methods evaluate each token's importance and directly discard the unimportant ones, which incurs significant information loss. A few methods focus on merging instead, but they partition tokens into two parts by random or odd/even partition and do not carefully consider how tokens should be selected. In this paper, we propose a new token condensation method based on the connectivity between tokens. Unlike previous methods, we gradually condense the large number of tokens through selection and fusion: the most representative tokens are selected, and the remaining tokens are separately fused into them. Extensive experiments are conducted on benchmark datasets. Compared with existing methods, our method achieves higher accuracy at lower computational cost. For example, it reduces the FLOPs of DeiT-S by 50\% without accuracy degradation on the ImageNet dataset.
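To make the selection-and-fusion idea concrete, here is a minimal sketch of one condensation step. This is a hypothetical PyTorch illustration, not the authors' code: it assumes connectivity is measured by cosine similarity between token embeddings, picks the tokens with the highest total connectivity as representatives, and fuses every other token into its most-connected representative by averaging. The function name `condense_tokens` and the single-step formulation (the paper condenses gradually across layers) are illustrative assumptions.

```python
# A minimal sketch of connectivity-based token condensation.
# Assumption: connectivity = pairwise cosine similarity between tokens.
import torch
import torch.nn.functional as F

def condense_tokens(x: torch.Tensor, keep: int) -> torch.Tensor:
    """Condense (B, N, D) tokens down to (B, keep, D).

    1. Measure pairwise connectivity between tokens.
    2. Select the `keep` most connected tokens as representatives.
    3. Fuse every token into its most-connected representative by
       averaging, so no token is discarded outright.
    """
    B, N, D = x.shape
    normed = F.normalize(x, dim=-1)
    conn = normed @ normed.transpose(1, 2)             # (B, N, N) connectivity

    # Representatives: tokens with the highest total connectivity.
    scores = conn.sum(dim=-1)                          # (B, N)
    keep_idx = scores.topk(keep, dim=-1).indices       # (B, keep)

    # Connectivity from every token to each representative.
    conn_to_reps = torch.gather(
        conn, 2, keep_idx.unsqueeze(1).expand(-1, N, -1))       # (B, N, keep)
    assign = conn_to_reps.argmax(dim=-1)                        # (B, N)

    # Fuse: average all tokens assigned to each representative.
    one_hot = F.one_hot(assign, num_classes=keep).to(x.dtype)   # (B, N, keep)
    fused = one_hot.transpose(1, 2) @ x                         # (B, keep, D)
    counts = one_hot.sum(dim=1).clamp(min=1).unsqueeze(-1)      # (B, keep, 1)
    return fused / counts
```

As a usage example under the same assumptions, `condense_tokens(torch.randn(2, 197, 384), keep=98)` would roughly halve the token count of a DeiT-S-sized sequence in a single step; applying such a step at several layers would approximate the gradual condensation the abstract describes.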
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4973