Keywords: Similarity learning, kernel methods, constrained clustering, transformer analysis, spectral clustering, supervised learning, deep learning
Abstract: In machine learning, no data point stands alone. We believe that context is an underappreciated concept in many machine learning methods. We propose Attention-Based Clustering (ABC), a neural architecture based on the attention mechanism, designed to learn latent representations that adapt to context within an input set, and inherently agnostic to the input size and the number of clusters. By learning a similarity kernel, our method combines directly with any out-of-the-box kernel-based clustering approach. We present competitive results for clustering Omniglot characters and include analytical evidence of the effectiveness of an attention-based approach for clustering.
One-sentence Summary: We propose an attention-based architecture that utilizes contextual information to learn a kernel, and combine it with an off-the-shelf clustering method to obtain state-of-the-art results on the Omniglot dataset.
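
The pipeline the abstract describes, an attention network that produces a pairwise similarity kernel over an input set, handed to an off-the-shelf kernel-based clustering method such as spectral clustering, can be sketched in a few lines. The sketch below is illustrative only and is not the authors' architecture: the `AttentionKernel` module, the layer sizes, and the sigmoid similarity head are all assumptions.

```python
import torch
import torch.nn as nn
from sklearn.cluster import SpectralClustering

class AttentionKernel(nn.Module):
    """Hypothetical module: self-attention over a set, then a similarity head."""
    def __init__(self, d_in: int, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(d_in, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, set_size, d_in). Self-attention lets each point's
        # representation adapt to the rest of the set, i.e. to its context.
        h = self.embed(x)
        h, _ = self.attn(h, h, h)
        h = nn.functional.normalize(h, dim=-1)
        # Pairwise similarities in (0, 1); symmetric since it is sigmoid(H H^T).
        return torch.sigmoid(h @ h.transpose(1, 2))

# Illustration only: cluster one input set with an *untrained* network.
x = torch.randn(1, 30, 8)                       # a set of 30 points in R^8
K = AttentionKernel(d_in=8)(x)[0].detach().numpy()
K = 0.5 * (K + K.T)                             # guard against float asymmetry
labels = SpectralClustering(n_clusters=3, affinity="precomputed").fit_predict(K)
print(labels)
```

In the paper's setting the attention weights would be trained so that the learned kernel reflects ground-truth co-cluster structure; the untrained network above only shows how a learned similarity kernel plugs into an off-the-shelf kernel-based clustering routine, here spectral clustering with a precomputed affinity matrix.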
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Supplementary Material: zip
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2010.01040/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=1Db0V57eR