Abstract: Owing to its efficiency in storage and computation, deep cross-modal hashing has gained much attention in large-scale multimedia retrieval. Current deep hashing methods use the probability output of a likelihood function, i.e., Sigmoid or Cauchy, to quantify the semantic similarity between samples in a common Hamming space. However, the inherent weakness of the Sigmoid and Cauchy likelihood functions in gradient optimization prevents hashing models from precisely describing the Hamming ball, which indicates the absolute semantic boundary among classes, and thus leads to high neighborhood ambiguity. In this paper, by analyzing the likelihood function from the perspective of similarity metric learning, we propose a novel Deep Discriminative Boundary Hashing (DDBH) framework that learns a discriminative embedding space in which neighbors and non-neighbors are well separated. Specifically, by introducing a remapping strategy and adaptive base-point selection, we propose a boundary-preserving loss based on an adjustable likelihood function that projects data points with small gradients to regions with large gradients and assigns larger gradients to hard samples, facilitating better separation among classes. Meanwhile, to learn class-dependent binary codes, a class-wise quantization loss is designed to heuristically transfer class-wise prior knowledge to the binary quantization, significantly improving the discriminative capability of the compact discrete codes. Comprehensive experiments on three benchmark datasets show that the proposed DDBH framework outperforms other representative deep cross-modal hashing methods. The corresponding code is available at https://github.com/QinLab-WFU/DDBH
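To make the gradient argument concrete, the following minimal sketch compares how a Sigmoid-style and a Cauchy-style likelihood respond to the Hamming distance between two codes. It is an illustration under stated assumptions, not the DDBH loss: the function names, the 1/2 scaling of the inner product, and the choice of gamma are hypothetical and are not taken from the paper or its repository. It relies only on the identity d = (K - <h_i, h_j>) / 2 for K-bit codes in {-1, +1}^K.

```python
import math


def sigmoid_likelihood(dist, n_bits):
    """Sigmoid-style similarity probability as a function of Hamming distance.

    Assumption: the likelihood is sigma(<h_i, h_j> / 2), with the inner
    product recovered from the distance via <h_i, h_j> = K - 2 * d.
    """
    inner = n_bits - 2.0 * dist
    return 1.0 / (1.0 + math.exp(-inner / 2.0))


def sigmoid_grad(dist, n_bits):
    """Derivative of the Sigmoid-style likelihood with respect to distance."""
    p = sigmoid_likelihood(dist, n_bits)
    # u = K/2 - d, so du/dd = -1 and dp/dd = -p * (1 - p).
    return -p * (1.0 - p)


def cauchy_likelihood(dist, gamma):
    """Cauchy-style similarity probability gamma / (gamma + d)."""
    return gamma / (gamma + dist)


def cauchy_grad(dist, gamma):
    """Derivative of the Cauchy-style likelihood with respect to distance."""
    return -gamma / (gamma + dist) ** 2


if __name__ == "__main__":
    n_bits, gamma = 64, 8.0  # hypothetical code length and scale parameter
    for dist in (0, 2, 8, 32, 48):
        print(
            f"d={dist:2d}  "
            f"sigmoid grad={sigmoid_grad(dist, n_bits):+.3e}  "
            f"cauchy grad={cauchy_grad(dist, gamma):+.3e}"
        )
```

Under these assumptions, the Sigmoid-style gradient is largest near d = K/2 and nearly vanishes for pairs that are already close, while the Cauchy-style gradient is largest at small distances and decays for distant pairs. Either way, some pairs fall in regions where the gradient is small, which is the kind of weakness the abstract attributes to both likelihood functions.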
External IDs:dblp:journals/tcsv/QinHZHN25