Deep Scaling Factor Quantization Network for Large-scale Image Retrieval

Published: 01 Jan 2024, Last Modified: 15 Apr 2025, ICMR 2024, CC BY-SA 4.0
Abstract: Hash learning aims to map multimedia data into Hamming space, where each data point is represented by a low-dimensional binary code and similarity relationships are preserved. Although existing hash learning methods have been effectively applied to data retrieval tasks thanks to their low memory cost and high computational efficiency, two major technical challenges remain. First, owing to the discrete constraints on hash codes, traditional hashing methods typically adopt a relaxation strategy: they learn real-valued features and then quantize them into binary codes with a sign function, which introduces significant quantization error. Second, hash codes are usually low-dimensional and therefore inadequate for preserving either the information of each data point or the relationship between any two points. These two challenges greatly limit the retrieval performance of the learned hash codes. To address them, we introduce a novel quantization method, called scaling factor quantization, to enhance hash learning. Unlike traditional hashing methods, we propose to map the data into two parts, i.e., hash codes and scaling factors, and use both to learn representative codes for retrieval. Specifically, we design a multi-output branch network, the Deep Scaling factor Quantization Network (DSQN), together with an iterative training strategy for learning the two parts of the mapping. Comprehensive experiments on three benchmark datasets demonstrate that the hash codes and scaling factors learned by DSQN significantly improve retrieval accuracy over existing hash learning methods.
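
The intuition behind scaling factor quantization can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: DSQN learns its scaling factors through a dedicated branch network and iterative training, whereas the closed-form scalar below (the alpha minimizing the reconstruction error for a fixed sign code) is only an illustrative stand-in, and the feature vector is random dummy data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a real-valued feature from a deep backbone
# (hypothetical data; in DSQN this would come from the network).
z = rng.normal(size=64)

# Traditional relaxation-based hashing: quantize with the sign function.
b = np.sign(z)
b[b == 0] = 1.0  # keep codes strictly binary (+1/-1)
err_sign = np.sum((z - b) ** 2)

# Scaling factor quantization, illustrative closed form: for a fixed
# code b = sign(z), the scalar alpha minimizing ||z - alpha * b||^2 is
# alpha = (b . z) / d = mean(|z|). The paper instead *learns* scaling
# factors with a branch network; this closed form is only a stand-in.
alpha = np.mean(np.abs(z))
err_scaled = np.sum((z - alpha * b) ** 2)

print(f"sign-only quantization error: {err_sign:.3f}")
# Never larger than err_sign, since alpha = 1 recovers the sign-only case.
print(f"with scaling factor:          {err_scaled:.3f}")

# Retrieval still uses fast Hamming-space comparison on the binary part.
def hamming(b1: np.ndarray, b2: np.ndarray) -> int:
    """Number of positions where two +/-1 codes disagree."""
    return int(np.sum(b1 != b2))

query = np.sign(rng.normal(size=64))
print(f"Hamming distance to a random query: {hamming(b, query)}")
```

In this sketch the scaling factor adds only one extra value per code while never increasing the quantization error of the relaxed features, and the binary part still supports efficient Hamming-space search.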