Triple-S: A Sticker Semantic Similarity Benchmark with General Sticker Encoder

ICLR 2026 Conference Submission25572 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: dataset, benchmark, sticker, sticker semantic, general sticker encoder
Abstract: Stickers have become a popular form of visual communication, yet understanding their semantic relationships remains challenging because their content is highly diverse and symbolic. In this work, we formally define the Sticker Semantic Similarity task and introduce Triple-S, the first benchmark for this task, consisting of 905 human-annotated positive and negative sticker pairs. Through extensive evaluation, we show that existing pretrained vision and multimodal models struggle to capture nuanced sticker semantics. To address this, we propose the General Sticker Encoder (GSE), a lightweight and versatile model that learns robust sticker embeddings using both Triple-S and additional datasets. GSE achieves superior performance on unseen stickers and demonstrates strong results on downstream tasks such as emotion classification and sticker-to-sticker retrieval. By releasing both Triple-S and GSE, we provide standardized evaluation tools and robust embeddings, enabling future research in sticker understanding, retrieval, and multimodal content generation. The Triple-S benchmark and GSE are publicly available at https://anonymous.4open.science/r/triple-s-6E65/.
Primary Area: datasets and benchmarks
Submission Number: 25572