Multi-modal Few-shot Image Recognition with enhanced semantic and visual integration

Chunru Dong, Lizhen Wang, Feng Zhang, Qiang Hua

Published: 01 May 2025, Last Modified: 06 Nov 2025, Venue: Image and Vision Computing, License: CC BY-SA 4.0

Abstract: Highlights
• A multi-modal few-shot image recognition approach with superior performance.
• Introduced a novel multi-scale interaction module for semantic–visual features.
• Proposed a similarity measurement module combining diverse methods for few-shot learning (FSL).
• Achieved superior performance on four benchmarks in 1-shot and 5-shot settings.

External IDs: doi:10.1016/j.imavis.2025.105490