Abstract: For visual-semantic embedding, existing methods normally treat the relevance between queries and candidates in a bipolar way – relevant or irrelevant – and all "irrelevant" candidates are uniformly pushed away from the query by an equal margin in the embedding space, regardless of their varying proximity to the query. This practice disregards relatively discriminative information and could lead to suboptimal ranking in the retrieval results and a poorer user experience, especially in the long-tail query scenario where a matching candidate may not necessarily exist. In this paper, we introduce a continuous variable to model the relevance degree between queries and multiple candidates, and propose to learn a coherent embedding space, where candidates with higher relevance degrees are mapped closer to the query than those with lower relevance degrees. In particular, the new ladder loss is proposed by extending the triplet-loss inequality to a more general inequality chain, which implements variable push-away margins according to respective relevance degrees. In addition, a proper Coherent Score metric is proposed to better measure the ranking results, including those "irrelevant" candidates. Extensive experiments on multiple datasets validate the efficacy of our proposed method, which achieves significant improvement over existing state-of-the-art methods.
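To make the "inequality chain with variable margins" concrete, the sketch below shows one plausible reading of a ladder-style loss: candidates are bucketed into discrete relevance levels, and for each level, every less-relevant candidate is penalized unless it is at least a level-specific margin farther from the query (in cosine similarity) than the more-relevant candidates. This is an illustrative NumPy reconstruction under stated assumptions, not the authors' implementation; the function name, level encoding, and margin scheme are all hypothetical.

```python
import numpy as np

def ladder_loss(q, cands, levels, margins):
    """Illustrative ladder-style hinge loss (not the paper's exact code).

    q:       (d,)  query embedding
    cands:   (n, d) candidate embeddings
    levels:  (n,)  integer relevance degrees, 0 = most relevant
    margins: list of per-level margins; margins[l] separates level l
             from all strictly less relevant levels (> l)
    """
    # Cosine similarity between the query and each candidate.
    sims = cands @ q / (np.linalg.norm(cands, axis=1) * np.linalg.norm(q) + 1e-12)
    loss = 0.0
    n_levels = int(levels.max()) + 1
    for l in range(n_levels - 1):
        pos = sims[levels == l]   # candidates at this relevance level
        neg = sims[levels > l]    # all strictly less relevant candidates
        # Hinge over every (pos, neg) pair: a less relevant candidate must
        # score at least margins[l] below a more relevant one.
        loss += np.maximum(0.0, margins[l] + neg[None, :] - pos[:, None]).sum()
    return float(loss)
```

With a single margin and only two levels (relevant vs. irrelevant) this degenerates to the usual triplet loss; adding more levels yields the chain of inequalities with progressively larger push-away margins described above.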