Locality Preserving Markovian Transition for Instance Retrieval

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
Abstract: Diffusion-based re-ranking methods are effective in modeling the data manifolds through similarity propagation in affinity graphs. However, positive signals tend to diminish over several steps away from the source, reducing discriminative power beyond local regions. To address this issue, we introduce the Locality Preserving Markovian Transition (LPMT) framework, which employs a long-term thermodynamic transition process with multiple states for accurate manifold distance measurement. The proposed LPMT first integrates diffusion processes across separate graphs using Bidirectional Collaborative Diffusion (BCD) to establish strong similarity relationships. Afterwards, Locality State Embedding (LSE) encodes each instance into a distribution for enhanced local consistency. These distributions are interconnected via the Thermodynamic Markovian Transition (TMT) process, enabling efficient global retrieval while maintaining local effectiveness. Experimental results across diverse tasks confirm the effectiveness of LPMT for instance retrieval.
Lay Summary: In image retrieval systems, relying exclusively on basic distance metrics between raw image features, such as the commonly used Euclidean distance and cosine similarity, often results in suboptimal retrieval performance, as such features may fail to adequately capture the high-level semantics inherent in images. Nevertheless, we observe that feature representations of semantically similar images, even when initially distant in the feature space, often reside on a low-dimensional manifold, forming a smooth and continuous trajectory rather than appearing as isolated points. We leverage the manifold assumption to improve image retrieval through a novel diffusion-based manifold ranking method that uncovers the intrinsic structure of the feature space. To address the issue of information loss, where positive signals tend to vanish after several steps of diffusion from the source, our method first represents each image as a probability distribution shaped by its local neighborhood structure. The distance between images is then measured as the minimal transition cost required to transform one distribution into another along a multi-step path, with each transition restricted to a local region. This design enables robust information propagation across the manifold while preserving local semantic consistency. Our research proposes a versatile solution that can be seamlessly integrated as a post-processing module to enhance a wide range of image retrieval systems. Furthermore, by uncovering deeper, manifold-based similarities, our approach can also benefit other machine learning algorithms that require a nuanced understanding of complex data structures.
Primary Area: General Machine Learning
Keywords: Retrieval, Manifold ranking, Re-ranking, Diffusion
Submission Number: 1075
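
The abstract's motivating observation, that positive signals diminish several steps away from the source, can be illustrated with a generic diffusion re-ranking sketch. This is not the authors' LPMT, BCD, LSE, or TMT; it is the classic manifold-ranking iteration f <- alpha * S f + (1 - alpha) * y on a toy chain graph. The function names `diffuse` and `chain_graph`, the value `alpha=0.9`, and the step count are all assumptions chosen for illustration.

```python
def chain_graph(n):
    """Toy affinity matrix (hypothetical): node i is linked only to i-1 and i+1."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        W[i][i + 1] = W[i + 1][i] = 1.0
    return W


def diffuse(affinity, source, alpha=0.9, steps=200):
    """Generic diffusion on a symmetric affinity matrix.

    Iterates f <- alpha * S f + (1 - alpha) * y, where
    S = D^{-1/2} W D^{-1/2} and y is one-hot at the source (query) node.
    """
    n = len(affinity)
    deg = [sum(row) for row in affinity]
    # Symmetric normalisation; zero-affinity entries stay zero.
    S = [
        [
            affinity[i][j] / (deg[i] * deg[j]) ** 0.5 if affinity[i][j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    y = [1.0 if i == source else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(steps):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Scores for a query placed at one end of an 8-node chain.
scores = diffuse(chain_graph(8), source=0)
for hop, score in enumerate(scores):
    print(f"hop {hop}: {score:.4f}")
```

On this toy graph the score shrinks with every hop from the query, so nodes far along the manifold receive only a weak signal. This is the loss of discriminative power beyond local regions that the abstract sets out to address.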