Abstract: We propose a Query-Specific Siamese Similarity Metric (QS3M) for query-specific clustering of text documents. It uses fine-tuned BERT embeddings and trains a non-linear projection into a query-specific similarity space. We build on the idea of Siamese networks but include a third component, a representation of the query. The empirical evaluation for clustering employs two TREC datasets with two different clustering benchmarks each. When used to obtain query-relevant clusters, QS3M achieves a 12% performance improvement over a recently published BERT-based reference method and significantly outperforms other unsupervised baselines.
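To make the three-input idea concrete, the following is a minimal sketch of a query-conditioned Siamese similarity, not the authors' implementation. It assumes that each document embedding is concatenated with the query embedding, passed through a single shared tanh layer, and compared by cosine similarity. The abstract does not specify the architecture, so the layer shape, the concatenation step, the cosine comparison, and all function names (`make_projection`, `project`, `query_specific_similarity`) are hypothetical. Random weights stand in for trained parameters, and short hand-written vectors stand in for fine-tuned BERT embeddings.

```python
import math
import random


def make_projection(in_dim, out_dim, seed=0):
    """Create weights for one shared dense layer.

    Hypothetical stand-in: random values replace trained parameters.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def project(query_vec, doc_vec, params):
    """Map a (query, document) pair into the similarity space.

    Assumption: the query enters by concatenation with the document vector.
    """
    weights, bias = params
    x = list(query_vec) + list(doc_vec)
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def query_specific_similarity(query_vec, doc_a, doc_b, params):
    """Score two documents with the same projection (the Siamese part),
    both conditioned on the same query (the third input)."""
    z_a = project(query_vec, doc_a, params)
    z_b = project(query_vec, doc_b, params)
    return cosine(z_a, z_b)


# Toy 3-dimensional vectors in place of BERT embeddings.
query = [0.1, 0.2, 0.3]
doc_a = [1.0, 0.0, 0.5]
doc_b = [0.0, 1.0, -0.5]
params = make_projection(in_dim=6, out_dim=4)
score = query_specific_similarity(query, doc_a, doc_b, params)
```

Because both documents pass through the same parameters, the score is symmetric in the two documents, and the resulting pairwise scores could be handed to any standard clustering algorithm. Changing `query` changes the projection of every document, which is what makes the similarity query-specific in this sketch.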