Fast Cross-Modality Knowledge Transfer via a Contextual Autoencoder Transformation

Published: 2024 · Last Modified: 21 Jan 2026 · ICASSP 2024 · CC BY-SA 4.0
Abstract: Cross-modality knowledge transfer aims to apply knowledge learned in a source modality to a target modality. It is more challenging than general knowledge transfer because introducing heterogeneous data aggravates the modality shift problem. This paper proposes a novel fast cross-modality knowledge transfer method based on a contextual autoencoder transformation. In particular, the encoder projects the contextual representations of the source modality into the target modality. Then, to bridge the semantics shared between the source and target modalities, the decoder imposes an additional constraint that reconstructs the original source modality. We show that this constraint mitigates the modality shift problem and improves generalization across heterogeneous modalities. Remarkably, the autoencoder is linear and symmetric, which makes it scalable to large-scale datasets. Experimental results on two widely used benchmarks demonstrate that the proposed method surpasses several state-of-the-art baselines, validating its effectiveness and efficiency.
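To make the described architecture concrete, the sketch below illustrates one way a linear, symmetric (tied-weight) autoencoder with a reconstruction constraint could look. It is a minimal illustration based only on the abstract, not the authors' implementation: the dimension values, the mean-squared-error losses, and the loss weight `alpha` are assumptions introduced here for demonstration.

```python
import torch
import torch.nn as nn

class LinearSymmetricAutoencoder(nn.Module):
    """Minimal sketch of a linear, symmetric (tied-weight) autoencoder.

    The encoder projects source-modality contextual representations into the
    target-modality space; the decoder reuses the transposed encoder weights
    to reconstruct the original source representations, serving as the
    additional reconstruction constraint described in the abstract.
    Dimensions and the weight `alpha` are illustrative assumptions.
    """

    def __init__(self, src_dim: int, tgt_dim: int):
        super().__init__()
        # Single linear projection; its transpose acts as the decoder.
        self.W = nn.Parameter(torch.empty(tgt_dim, src_dim))
        nn.init.xavier_uniform_(self.W)

    def encode(self, x_src: torch.Tensor) -> torch.Tensor:
        # Project source contextual features into the target-modality space.
        return x_src @ self.W.t()

    def decode(self, z_tgt: torch.Tensor) -> torch.Tensor:
        # Symmetric decoder: reuse the same weights, transposed.
        return z_tgt @ self.W

    def forward(self, x_src: torch.Tensor, x_tgt: torch.Tensor, alpha: float = 1.0):
        z = self.encode(x_src)        # source projected into the target space
        x_rec = self.decode(z)        # reconstruction of the source input
        align_loss = nn.functional.mse_loss(z, x_tgt)    # cross-modal alignment
        rec_loss = nn.functional.mse_loss(x_rec, x_src)  # reconstruction constraint
        return align_loss + alpha * rec_loss


# Toy usage with random "paired" source/target features (assumed setup).
src = torch.randn(32, 768)   # e.g. source-modality contextual representations
tgt = torch.randn(32, 512)   # e.g. target-modality representations
model = LinearSymmetricAutoencoder(src_dim=768, tgt_dim=512)
loss = model(src, tgt)
loss.backward()
```

Because the mapping is a single linear layer with tied encoder/decoder weights, both projection and reconstruction are plain matrix multiplications, which is consistent with the scalability claim for large-scale datasets.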