Meta-Learning Approach for Joint Multimodal Signals with Multimodal Iterative Adaptation

TMLR Paper 2612 Authors

02 May 2024 (modified: 27 Jun 2024) · Under review for TMLR · CC BY-SA 4.0
Abstract: In the pursuit of effectively modeling real-world joint multimodal signals, learning to learn multiple Implicit Neural Representations (INRs) jointly has gained attention as a way to overcome data scarcity and enhance fitting speed. However, predominant methods based on multimodal encoders often underperform due to their reliance on direct data-to-parameter mapping functions, bypassing the optimization steps necessary for capturing the complexities of real-world signals. To address this gap, we propose Multimodal Iterative Adaptation (MIA), a novel framework that combines the strengths of multimodal fusion with optimization-based meta-learning. The key idea is to enhance the learning of INRs by facilitating the exchange of cross-modal knowledge among learners during the iterative optimization process, improving generalization and enabling a more nuanced adaptation to complex signals. To achieve this, we introduce State Fusion Transformers (SFTs), an attention-based meta-learner designed to operate in the backward pass of the learners, aggregating learning states, capturing cross-modal relationships, and predicting enhanced parameter updates for the learners. Our extensive evaluation on various real-world multimodal signal regression setups shows that MIA outperforms existing baselines in both generalization and memorization performance.
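The following is a minimal, hypothetical sketch (not the authors' implementation) of the idea described in the abstract, assuming PyTorch and a toy setting in which each modality-specific INR is reduced to a flat parameter vector. An attention-based module stands in for the State Fusion Transformer: it fuses each learner's state (parameters and gradients) and predicts the parameter updates applied in the inner loop. All names, dimensions, and the `StateFusionTransformer`/`mia_inner_loop` interfaces are illustrative assumptions.

```python
# Hypothetical sketch of an MIA-style inner loop: an attention-based
# meta-learner fuses per-modality learning states and predicts the
# parameter updates for each INR learner. Dimensions are illustrative.
import torch
import torch.nn as nn

P = 32  # illustrative flattened parameter size per modality-specific INR

class StateFusionTransformer(nn.Module):
    """Toy stand-in for the paper's SFT: self-attention over one state
    token per modality, followed by a head that emits one update per
    modality."""
    def __init__(self, num_heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=2 * P, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(2 * P, P)

    def forward(self, state_tokens):                   # (1, M, 2P)
        return self.head(self.encoder(state_tokens))   # (1, M, P)

def mia_inner_loop(params, loss_fns, sft, steps=3, lr=1e-2):
    """params: list of M flat parameter tensors (one INR per modality).
    loss_fns: list of callables mapping a parameter tensor to a scalar loss."""
    for _ in range(steps):
        tokens = []
        for theta, loss_fn in zip(params, loss_fns):
            # Learning state of each learner: current parameters + gradients.
            grad = torch.autograd.grad(loss_fn(theta), theta,
                                       create_graph=True)[0]
            tokens.append(torch.cat([theta, grad]))
        # Fuse states across modalities and predict enhanced updates.
        updates = sft(torch.stack(tokens).unsqueeze(0)).squeeze(0)
        # Replace a raw gradient step with the SFT-predicted update.
        params = [theta - lr * u for theta, u in zip(params, updates)]
    return params

# Example usage (illustrative): two modalities with simple quadratic losses.
# sft = StateFusionTransformer()
# params = [torch.randn(P, requires_grad=True) for _ in range(2)]
# losses = [lambda th: (th ** 2).sum(), lambda th: ((th - 1.0) ** 2).sum()]
# adapted = mia_inner_loop(params, losses, sft)
```

In this sketch, `create_graph=True` and the functional parameter update keep the inner loop differentiable, so an outer meta-objective could in principle backpropagate through the SFT, in line with standard optimization-based meta-learning practice; how the actual SFTs encode states and parameterize updates is specified in the paper, not here.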
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Jianbo_Jiao2
Submission Number: 2612