Keywords: Transformer, Representation Learning, Subspace Clustering
Abstract: This paper proposes TransFusion, a novel framework for training attention-based neural networks to extract useful features for downstream classification tasks. TransFusion leverages the fusion-like behavior of the self-attention mechanism, in which samples from the same cluster receive higher attention scores and gradually converge toward one another. To derive meaningful features, TransFusion is trained with affinity matrices, which capture the resemblance among samples within the same class. Because the behavior of attention layers in classification-related tasks is not well understood, we also provide theoretical insights into what each layer actually does. Our main result shows that TransFusion effectively fuses data points within the same cluster while keeping the noise level carefully controlled. Furthermore, experimental results indicate that TransFusion extracts features that isolate clusters in complex real-world data, leading to improved classification accuracy in downstream tasks.
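The abstract describes training a self-attention module so that its pairwise interactions align with a label-derived affinity matrix, causing same-class samples to fuse. The sketch below is a minimal, hypothetical illustration of that idea; the class `TransFusionBlock`, the `affinity_loss` objective, and all hyperparameters are assumptions for illustration only, not the authors' actual implementation.

```python
# Hypothetical sketch: a self-attention block applied across the samples of a
# batch, trained so that its attention matrix matches a row-normalized
# same-class affinity matrix (an assumed stand-in for the paper's objective).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TransFusionBlock(nn.Module):
    """One self-attention layer applied across the samples in a batch."""

    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x):                                  # x: (n_samples, dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        attn = torch.softmax(q @ k.t() / x.shape[-1] ** 0.5, dim=-1)
        return x + attn @ v, attn                          # residual "fusion" step


def affinity_loss(attn, labels):
    """Push attention mass toward samples of the same class."""
    target = (labels[:, None] == labels[None, :]).float()
    target = target / target.sum(dim=-1, keepdim=True)     # row-normalized affinity
    return F.mse_loss(attn, target)


# Toy usage: two well-separated 2-D Gaussian clusters.
torch.manual_seed(0)
x = torch.cat([torch.randn(32, 2) + 3.0, torch.randn(32, 2) - 3.0])
y = torch.cat([torch.zeros(32, dtype=torch.long), torch.ones(32, dtype=torch.long)])

block = TransFusionBlock(dim=2)
opt = torch.optim.Adam(block.parameters(), lr=1e-2)
for step in range(200):
    feats, attn = block(x)
    loss = affinity_loss(attn, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training under these assumptions, the attention rows concentrate on same-cluster samples, so the residual update pulls each point toward its own cluster, which is the fusion behavior the abstract refers to.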
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1375