Abstract: For visual-semantic embedding, existing methods normally treat the relevance between queries and candidates in a bipolar way – relevant or irrelevant – and all "irrelevant" candidates are uniformly pushed away from the query by an equal margin in the embedding space, regardless of their varying proximity to the query. This practice disregards relatively discriminative information and could lead to suboptimal ranking in the retrieval results and a poorer user experience, especially in the long-tail query scenario where a matching candidate may not necessarily exist. In this paper, we introduce a continuous variable to model the relevance degree between queries and multiple candidates, and propose to learn a coherent embedding space, where candidates with higher relevance degrees are mapped closer to the query than those with lower relevance degrees. In particular, the new ladder loss is proposed by extending the triplet-loss inequality to a more general inequality chain, which implements variable push-away margins according to respective relevance degrees. In addition, a proper Coherent Score metric is proposed to better measure the ranking results, including those "irrelevant" candidates. Extensive experiments on multiple datasets validate the efficacy of our proposed method, which achieves significant improvement over existing state-of-the-art methods.
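To make the "inequality chain with variable margins" concrete, the sketch below shows one plausible reading of a ladder-style loss: candidates are bucketed into discrete relevance levels, and for each level, every less-relevant candidate is penalized unless it is at least a level-specific margin farther from the query (in cosine similarity) than the more-relevant candidates. This is an illustrative NumPy reconstruction under stated assumptions, not the authors' implementation; the function name, level encoding, and margin scheme are all hypothetical.

```python
import numpy as np

def ladder_loss(q, cands, levels, margins):
    """Illustrative ladder-style hinge loss (not the paper's exact code).

    q:       (d,)  query embedding
    cands:   (n, d) candidate embeddings
    levels:  (n,)  integer relevance degrees, 0 = most relevant
    margins: list of per-level margins; margins[l] separates level l
             from all strictly less relevant levels (> l)
    """
    # Cosine similarity between the query and each candidate.
    sims = cands @ q / (np.linalg.norm(cands, axis=1) * np.linalg.norm(q) + 1e-12)
    loss = 0.0
    n_levels = int(levels.max()) + 1
    for l in range(n_levels - 1):
        pos = sims[levels == l]   # candidates at this relevance level
        neg = sims[levels > l]    # all strictly less relevant candidates
        # Hinge over every (pos, neg) pair: a less relevant candidate must
        # score at least margins[l] below a more relevant one.
        loss += np.maximum(0.0, margins[l] + neg[None, :] - pos[:, None]).sum()
    return float(loss)
```

With a single margin and only two levels (relevant vs. irrelevant) this degenerates to the usual triplet loss; adding more levels yields the chain of inequalities with progressively larger push-away margins described above.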