Abstract: It is well known that fusing multiple sources of information can enhance the retrieval performance of multimedia systems. However, what to fuse and how to fuse it remain open issues in multimodal correlation learning. In this paper, we address the problem of combining multiple resources to improve multimodal correlation learning. We propose two fusion strategies: multi-feature fusion and multi-similarity fusion. For multi-feature fusion, feature concatenation is used to integrate various features. For multi-similarity fusion, three fusion rules are investigated: MIN, MAX, and weighted AVG. The effectiveness of the fusion strategies is evaluated on several state-of-the-art multimodal correlation learning models for cross-modal retrieval tasks. Results suggest that, with a properly selected fusion strategy, multimodal retrieval performance can be significantly enhanced.
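The two fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `concat_features` and `fuse_similarities`, the layout of the score lists, and the normalization of the weighted AVG by the sum of the weights are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Optional, Sequence


def concat_features(features: Sequence[Sequence[float]]) -> List[float]:
    """Multi-feature fusion: join several feature vectors into one vector.

    Hypothetical helper; the abstract only states that concatenation is used.
    """
    fused: List[float] = []
    for vec in features:
        fused.extend(vec)
    return fused


def fuse_similarities(
    scores: Sequence[Sequence[float]],
    rule: str = "avg",
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Multi-similarity fusion with the MIN, MAX, or weighted AVG rule.

    scores[k][j] is assumed to be the similarity of candidate j under the
    k-th similarity measure. Returns one fused score per candidate.
    """
    if not scores:
        raise ValueError("at least one score list is required")
    n = len(scores[0])
    if any(len(s) != n for s in scores):
        raise ValueError("all score lists must have the same length")

    # Regroup so that each tuple holds all scores for a single candidate.
    per_candidate = list(zip(*scores))

    if rule == "min":
        return [min(c) for c in per_candidate]
    if rule == "max":
        return [max(c) for c in per_candidate]
    if rule == "avg":
        if weights is None:
            weights = [1.0] * len(scores)  # assumption: uniform by default
        if len(weights) != len(scores):
            raise ValueError("need exactly one weight per score list")
        total = sum(weights)
        if total == 0:
            raise ValueError("weights must not sum to zero")
        # Assumption: weights are normalized by their sum.
        return [
            sum(w * s for w, s in zip(weights, c)) / total
            for c in per_candidate
        ]
    raise ValueError(f"unknown fusion rule: {rule!r}")
```

For example, `fuse_similarities([[0.2, 0.8], [0.6, 0.4]], rule="min")` keeps the lower of the two scores for each candidate, whereas `rule="max"` keeps the higher one. Candidates would then be ranked by the fused score.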
External IDs: dblp:conf/chinasip/SongWT15