Deep Hashing Based on Feature Fusion Enhancing Hierarchical Transformer with Distance Separated Centers Guided Polarization Loss

Published: 2025 · Last Modified: 22 Jan 2026 · ICIC (3) 2025 · CC BY-SA 4.0
Abstract: Deep hashing for images transforms high-dimensional features into compact low-dimensional hash codes, making it widely applicable both to traditional information retrieval and to retrieval-augmented generation for large models, and it has therefore attracted broad attention. Existing deep hashing frameworks focus primarily on two goals: improving retrieval effectiveness and improving the efficiency of model training and hash code generation. In practice, however, it is difficult to achieve both at once in deep image hashing: improving effectiveness usually requires a deeper backbone with more parameters and more complex mixed loss constraints, which makes the model inefficient. To strike a better balance between effectiveness and efficiency, we propose a novel framework that integrates a hierarchical transformer architecture with multilevel feature fusion and a polarization loss guided by distance-separated hash centers. By using a more compact network structure and a single loss function, the parameter complexity and the hash code generation process remain manageable while retrieval performance improves. Comparative experiments on multiple datasets show that the proposed method achieves state-of-the-art results, especially for low-dimensional hash codes.
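The abstract does not spell out the loss formulation, but a center-guided polarization loss of the kind it names is often written as a per-bit hinge that pushes each real-valued hash output past a margin in the direction of its class center's bit. The following PyTorch-style sketch is only an illustration under that assumption; the function name `polarization_loss`, the margin value, and the random stand-in centers are hypothetical and not taken from the paper, which would instead use pre-computed distance-separated centers.

```python
import torch
import torch.nn.functional as F

def polarization_loss(hash_logits, centers, labels, margin=1.0):
    """Hinge-style polarization loss toward per-class hash centers (illustrative).

    hash_logits: (B, K) real-valued network outputs before binarization
    centers:     (C, K) hash centers with entries in {-1, +1}, assumed to be
                 mutually separated in Hamming distance
    labels:      (B,)   class index of each sample
    margin:      minimum signed agreement each bit should reach with its center bit
    """
    target = centers[labels]                       # (B, K) target bit pattern per sample
    per_bit = F.relu(margin - hash_logits * target)  # penalize bits below the margin
    return per_bit.mean()

# Toy usage: 16-bit codes, 4 classes, batch of 8 (all values illustrative).
B, K, C = 8, 16, 4
centers = torch.sign(torch.randn(C, K))            # stand-in for real separated centers
logits = torch.randn(B, K, requires_grad=True)
labels = torch.randint(0, C, (B,))
loss = polarization_loss(logits, centers, labels)
loss.backward()
```

A single objective of this shape is consistent with the abstract's claim that one loss function suffices: binarization at inference reduces to taking the sign of the network outputs, with no extra quantization or pairwise terms to balance.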