Beyond Cosine Similarity: Introducing the Unified Semantic Similarity Metric Benchmark (USMB) for Text Similarity Measurement
Keywords: Deep Learning or Neural Networks, Similarity and Distance Learning, (Application) Natural Language and Text Processing, (Cognitive/Neuroscience) Language
TL;DR: We introduce the Unified Semantic Similarity Metric Benchmark (USMB), a leaderboard of text similarity metrics that evaluates each metric on tasks such as preference alignment, robustness, sensitivity, clustering, and retrieval.
Abstract: Text embedding models are increasingly used in production across applications ranging from Information Retrieval (IR) to document parsing, yet relatively little research has focused on how best to use these embeddings for downstream tasks. Cosine similarity is the most widely used measure of embedding and text similarity, but it may not be the strongest choice for every task. In this work, we introduce the Unified Semantic Similarity Metric Benchmark (USMB), a novel leaderboard for text similarity metrics composed of 5 unique tasks and 30+ datasets, with the goal of providing a standardized means of measuring a text similarity metric's effectiveness on a suite of challenging tasks that capture the nuances of semantic understanding. Additionally, we show that while cosine similarity achieves the highest score on our benchmark among pre-existing metrics, a task-specific ensembled model built from our metrics yields a 40.3% improvement in benchmark performance relative to cosine similarity. We hope that this work draws greater attention to the performance gains available through metric selection and thereby advances the field's ability to measure semantic similarity.
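For context, the baseline that the benchmark compares against is cosine similarity between embedding vectors. A minimal sketch of that computation, using NumPy and made-up embedding values (not the benchmark's own code):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product of the vectors divided by the product of their norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings of two sentences from an off-the-shelf encoder.
emb_1 = np.array([0.12, -0.48, 0.31, 0.77])
emb_2 = np.array([0.10, -0.52, 0.25, 0.81])

print(f"cosine similarity: {cosine_similarity(emb_1, emb_2):.3f}")
```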
Supplementary Material: zip
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12725