Understanding Dimensional Collapse in Cross-Modal Feature Distillation

25 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: knowledge distillation, feature distillation, cross-modal learning, dimensional collapse
TL;DR: We investigate how distributional shifts between modalities lead to dimensional collapse in cross-modal knowledge distillation, and propose a method to address it.
Abstract: To cope with limited computing resources and complex sensor configurations when deploying multi-modal neural networks in real-world applications, cross-modal knowledge distillation (CMKD) transfers valuable information from a pretrained teacher model to a deployable student model operating on the target modality. Despite the successful application of CMKD in various fields, our understanding of knowledge transfer across modalities remains insufficient to fully explain the efficacy of feature distillation. In this work, we investigate how the distributional shift between modalities, referred to as the modality gap, affects the effectiveness of CMKD, focusing in particular on cross-modal feature distillation. We first hypothesize, and empirically validate, that the modality gap between the teacher and student causes dimensional collapse in the student's feature space. To prevent this inefficiency, we propose a Cross-modal Information Bottleneck Approximation (CIBA) scheme that extracts and transfers modality-general features from the teacher model. Finally, we experimentally demonstrate that our distillation strategy effectively reduces dimensional collapse in the student model, thereby improving performance on various real-world multi-modal datasets.
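
For readers unfamiliar with the phenomenon, the sketch below illustrates the standard diagnostic for dimensional collapse: inspecting the singular-value spectrum of the centered feature covariance. This is a generic illustration, not the paper's metric or the CIBA implementation; the `effective_rank` summary and the toy feature matrices are assumptions chosen only to make the contrast visible.

```python
# Minimal sketch (assumed, not the authors' code): diagnose dimensional
# collapse by checking how many directions of the feature covariance
# carry non-negligible variance.
import torch

def covariance_spectrum(features: torch.Tensor) -> torch.Tensor:
    """Singular values of the centered sample covariance of (N, D) features."""
    z = features - features.mean(dim=0, keepdim=True)   # center each dimension
    cov = (z.T @ z) / (z.shape[0] - 1)                  # (D, D) sample covariance
    return torch.linalg.svdvals(cov)                    # sorted descending

def effective_rank(spectrum: torch.Tensor) -> float:
    """Entropy-based effective rank: exp(H(p)), p = normalized spectrum."""
    p = spectrum / spectrum.sum()
    return torch.exp(-(p * torch.log(p + 1e-12)).sum()).item()

# Toy comparison: full-rank Gaussian features vs. features confined to a
# 32-dimensional subspace (a stand-in for a collapsed student embedding).
N, D = 2048, 256
healthy = torch.randn(N, D)
collapsed = torch.randn(N, 32) @ torch.randn(32, D)     # rank <= 32
for name, feats in [("healthy", healthy), ("collapsed", collapsed)]:
    print(name, "effective rank ~", round(effective_rank(covariance_spectrum(feats)), 1))
```

On this toy data the healthy features report an effective rank near D, while the rank-deficient features report one near 32, which is the kind of gap the paper attributes to the modality gap between teacher and student.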
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5189