Triple-S: A Sticker Semantic Similarity Benchmark with General Sticker Encoder

Chee Heng Er Metilda; Jiayin Wang; Zhiqiang Guo; Weizhi Ma; Min Zhang

20 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: dataset, benchmark, sticker, sticker semantic, general sticker encoder

Abstract: Stickers have become a popular form of visual communication, yet understanding their semantic relationships remains challenging because their content is highly diverse and symbolic. In this work, we formally define the Sticker Semantic Similarity task and introduce Triple-S, the first benchmark for this task, consisting of 905 human-annotated positive and negative sticker pairs. Through extensive evaluation, we show that existing pretrained vision and multimodal models struggle to capture nuanced sticker semantics. To address this, we propose the General Sticker Encoder (GSE), a lightweight and versatile model that learns robust sticker embeddings using both Triple-S and additional datasets. GSE achieves superior performance on unseen stickers and demonstrates strong results on downstream tasks such as emotion classification and sticker-to-sticker retrieval. By releasing both Triple-S and GSE, we provide standardized evaluation tools and robust embeddings, enabling future research in sticker understanding, retrieval, and multimodal content generation. The Triple-S benchmark and GSE are publicly available at https://anonymous.4open.science/r/triple-s-6E65/.

Primary Area: datasets and benchmarks

Submission Number: 25572
