Multi-scale fusion transformer based weakly supervised hashing learning for instance retrieval

Published: 01 Jan 2023, Last Modified: 06 Feb 2025, Int. J. Mach. Learn. Cybern. 2023, CC BY-SA 4.0
Abstract: Instance retrieval is concerned with obtaining representations of instances (objects) in images and using them for similarity comparisons between instances. However, most methods require instance-level category labels to train the model, which increases the annotation burden. Building on advances in convolutional neural networks (CNNs) and transformers in computer vision, we propose a hierarchical transformer with a spatial pyramid structure for weakly supervised multi-instance hash learning. It combines the local, multi-scale perception of CNNs with the global field of view of transformers. Further, it leverages the principle of multi-instance learning, which allows the proposed model to learn an instance-level hash mapping in a weakly supervised manner. Experimental results on three public datasets show improvements over typical methods, validating the effectiveness of the proposed method.
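The multi-instance learning idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the random projection, the 8-bit code length, the max-pooling rule, and all function names below are assumptions chosen only to show how image-level (bag) supervision can sit on top of instance-level hash codes.

```python
import numpy as np

def hash_codes(features, projection):
    """Map real-valued instance features to binary codes in {-1, +1}."""
    return np.where(features @ projection >= 0, 1, -1)

def hamming_distance(code_a, code_b):
    """Count the bits in which two {-1, +1} codes differ."""
    return int(np.sum(code_a != code_b))

def bag_score(instance_scores):
    """MIL max pooling (assumed rule): an image is as positive as its best instance."""
    return float(np.max(instance_scores))

rng = np.random.default_rng(0)
# Hypothetical data: 5 candidate instances in one image, 16-dim features each.
instance_features = rng.normal(size=(5, 16))
# Hypothetical 8-bit hash projection; a trained model would learn this mapping.
projection = rng.normal(size=(16, 8))

codes = hash_codes(instance_features, projection)

# Hypothetical per-instance class scores; only the image-level label is supervised.
instance_scores = np.array([0.1, 0.7, 0.2, 0.05, 0.3])
image_score = bag_score(instance_scores)
```

In such a setup, a loss on `image_score` needs only image-level labels, while retrieval compares the per-instance `codes` with `hamming_distance`.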