Enhancing Representation Learning for Content-Based Information Retrieval: A Knowledge-Enhanced Geometric Approach
Abstract: Representation learning has become a critical component of modern information retrieval systems, driven by advances in self-supervised learning. This research extends previous work on negative sampling strategies for contrastive learning by using geometric principles to generate more informative negative examples. The study explores three novel approaches (median centroids, geometric midpoints, and medoids) and introduces a knowledge-enhanced method for capturing different characteristics of the data distribution. A comprehensive evaluation across multiple datasets, including STL10 and a novel Food3 adaptation, shows significant performance improvements: the proposed geometric approaches consistently outperform existing methods in both Content-Based Image Retrieval and classification tasks. The methods are also effective in semi-supervised scenarios with limited label availability (10% and 5%), reaching up to 84.50% accuracy on the SVHN dataset with only 10% of the labels. The median centroid approach is the most robust, providing the most consistent performance improvements across diverse conditions. Beyond advancing the understanding of representation learning, the work offers a promising strategy for extracting meaningful features in low-resource learning environments, with potential applications ranging from digital archives to specialized image classification tasks.
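The three geometric summaries named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's definitions, so the ones used here are assumptions: the median centroid is taken as the coordinate-wise median, the geometric midpoint as the per-coordinate midpoint between the minimum and maximum, and the medoid as the cluster member with the smallest total Euclidean distance to the others. The function names and the toy data are hypothetical and not taken from the paper.

```python
import math
from statistics import median


def median_centroid(points):
    """Coordinate-wise median of equal-length vectors (assumed definition)."""
    return [median(coord) for coord in zip(*points)]


def geometric_midpoint(points):
    """Per-coordinate midpoint between min and max (assumed definition)."""
    return [(min(coord) + max(coord)) / 2 for coord in zip(*points)]


def medoid(points):
    """Actual member with the smallest total Euclidean distance to all members."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


# Toy 2-D "embeddings" for one cluster, including a single outlier.
cluster = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [10.0, 10.0]]

print(median_centroid(cluster))     # [1.0, 1.0]
print(geometric_midpoint(cluster))  # [5.0, 5.0]
print(medoid(cluster))              # [1.0, 1.0]
```

On this toy cluster the median centroid and the medoid stay near the dense group, while the midpoint is pulled toward the outlier. This is one way the three summaries can reflect different characteristics of a distribution; how the paper turns such points into negative examples is not specified in the abstract.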
External IDs: dblp:conf/keir/GoyoFMS25