ToMA: Token Merging with Attention For Diffusion Models

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Diffusion, Token Merge, Attention
TL;DR: We propose an improved token merging algorithm to speed up diffusion.
Abstract: Diffusion models have emerged as the leading models for image generation. Plug-and-play token merging techniques have recently been introduced to mitigate the heavy computational cost of transformer blocks in diffusion models. However, existing methods overlook two key factors: (1) they fail to incorporate modern efficient implementations of attention, so the merging overhead cancels out the algorithmic savings; (2) the selection of tokens to merge ignores the relations among tokens, limiting image quality. In this paper, we propose Token Merging with Attention (ToMA), with three major improvements. First, we use a submodular-based token selection method to identify diverse tokens as merge destinations that are representative of the entire token set. Second, we propose an attention-based merge that leverages efficient attention implementations to perform the merge with negligible overhead; we also formulate (un-)merging as (inverse-)linear transformations, which allows the transformations to be shared across layers and iterations. Finally, we exploit image locality to further accelerate computation by performing all operations on tokens within local tiles.
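The sketch below is not the authors' implementation; it only illustrates the abstract's framing of merging and unmerging as (inverse-)linear transformations. The destination selection here is a simple norm-based placeholder standing in for the paper's submodular selection, and all shapes and helper names are assumptions for illustration.

```python
# Minimal sketch (assumed, not ToMA's code): merging as a linear map M and
# unmerging as an approximate inverse of that map.
import torch

def build_merge_matrix(x: torch.Tensor, num_dst: int) -> torch.Tensor:
    """Return a (num_dst, N) row-normalized matrix M so that merged = M @ x."""
    N = x.shape[0]
    # Placeholder destination choice: top-norm tokens stand in for the
    # diverse, submodular-selected destinations described in the paper.
    dst = torch.topk(x.norm(dim=-1), num_dst).indices
    # Assign every token to its most similar destination (cosine similarity).
    sim = torch.nn.functional.normalize(x, dim=-1) @ \
          torch.nn.functional.normalize(x[dst], dim=-1).T      # (N, num_dst)
    assign = sim.argmax(dim=-1)                                 # (N,)
    M = torch.zeros(num_dst, N)
    M[assign, torch.arange(N)] = 1.0
    # Average the tokens falling into each destination (avoid divide-by-zero).
    M = M / M.sum(dim=-1, keepdim=True).clamp(min=1)
    return M

x = torch.randn(64, 32)                     # 64 tokens of dimension 32
M = build_merge_matrix(x, num_dst=16)
merged = M @ x                              # (16, 32): reduced token set
unmerged = torch.linalg.pinv(M) @ merged    # approximate unmerge back to (64, 32)
```

Because the merge is just a matrix `M`, it can in principle be reused across layers or iterations (as the abstract suggests) instead of being recomputed each time.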
Supplementary Material: zip
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10923