Abstract: In the pursuit of effectively modeling real-world joint multimodal signals, learning to learn multiple Implicit Neural Representations (INRs) jointly has gained attention as a way to overcome data scarcity and enhance fitting speed. However, predominant methods based on multimodal encoders often underperform due to their reliance on direct data-to-parameter mapping functions, which bypass the optimization steps necessary for capturing the complexities of real-world signals. To address this gap, we propose Multimodal Iterative Adaptation (MIA), a novel framework that combines the strengths of multimodal fusion with optimization-based meta-learning. The key idea is to enhance the learning of INRs by facilitating the exchange of cross-modal knowledge among learners during the iterative optimization process, improving generalization and enabling a more nuanced adaptation to complex signals. To achieve this, we introduce State Fusion Transformers (SFTs), an attention-based meta-learner designed to operate in the backward pass of the learners, aggregating learning states, capturing cross-modal relationships, and predicting enhanced parameter updates for the learners. Our extensive evaluation on various real-world multimodal signal regression setups shows that MIA outperforms existing baselines in both generalization and memorization performance.
Our code is available at https://github.com/yhytoto12/MIA.
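For readers who want a concrete picture of the mechanism the abstract describes, below is a minimal PyTorch sketch of one multimodal adaptation episode. It is not the released implementation (see the repository linked above): all names here (`TinyINR`, `StateFusionTransformer`, `adapt`) are hypothetical, the learning state is simplified to flattened gradients, identical INR architectures are assumed across modalities, and the second-order meta-gradients needed to meta-train the SFT are omitted for brevity.

```python
# Minimal sketch (not the authors' implementation) of the idea in the
# abstract: per-modality INR learners take optimization steps, while an
# attention-based meta-learner fuses their learning states across
# modalities and predicts enhanced parameter updates.
import torch
import torch.nn as nn


class TinyINR(nn.Module):
    """A small coordinate MLP standing in for one modality's INR learner."""
    def __init__(self, in_dim=2, hidden=64, out_dim=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, coords):
        return self.net(coords)


class StateFusionTransformer(nn.Module):
    """Attends over per-modality learning states (here: flattened gradients
    projected to tokens) and predicts an update per learner."""
    def __init__(self, num_params, token_dim=128):
        super().__init__()
        self.encode = nn.Linear(num_params, token_dim)   # state -> token
        layer = nn.TransformerEncoderLayer(token_dim, nhead=4, batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, num_layers=2)
        self.decode = nn.Linear(token_dim, num_params)   # token -> update

    def forward(self, grads):                            # (M, num_params)
        tokens = self.encode(grads).unsqueeze(0)         # (1, M, D)
        fused = self.fuse(tokens).squeeze(0)             # (M, D)
        return self.decode(fused)                        # (M, num_params)


def adapt(inrs, sft, batches, steps=3, lr=0.01):
    """One adaptation episode: collect each learner's gradient state,
    fuse the states across modalities, apply the predicted updates.
    (Meta-training the SFT would backpropagate through these steps;
    that second-order machinery is omitted here for clarity.)"""
    mse = nn.MSELoss()
    for _ in range(steps):
        states = []
        for inr, (coords, target) in zip(inrs, batches):
            loss = mse(inr(coords), target)
            grads = torch.autograd.grad(loss, list(inr.parameters()))
            states.append(torch.cat([g.reshape(-1) for g in grads]))
        updates = sft(torch.stack(states))               # cross-modal fusion
        with torch.no_grad():
            for inr, upd in zip(inrs, updates):
                offset = 0
                for p in inr.parameters():
                    n = p.numel()
                    p -= lr * upd[offset:offset + n].view_as(p)
                    offset += n


# Example: two modality learners with the same toy architecture,
# adapted jointly on random data purely for illustration.
inrs = [TinyINR(), TinyINR()]
num_params = sum(p.numel() for p in inrs[0].parameters())
sft = StateFusionTransformer(num_params)
batches = [(torch.rand(256, 2), torch.rand(256, 3)) for _ in inrs]
adapt(inrs, sft, batches)
```

The point this sketch illustrates is the contrast drawn in the abstract: unlike encoder-based methods that map data directly to parameters, each learner here still performs iterative optimization, with the update direction informed by the other modalities' learning states via attention.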
Submission Length: Regular submission (no more than 12 pages of main content)
Supplementary Material: zip
Changes Since Last Submission: We updated the main draft to incorporate the reviewers' suggestions and our responses from the rebuttal period. Specifically, we included the following: (1) examples of applications that could benefit from advances in INRs, (2) clarifications of the training/evaluation settings, (3) an algorithm box for MIA, (4) further analysis and discussion of the rationale for MSFTs in SFTs, and (5) minor typo fixes.
EiC Edit: At the request of the authors and the AE, we have updated the PDF to include an acknowledgements section.
Code: https://github.com/yhytoto12/MIA
Assigned Action Editor: ~Jianbo_Jiao2
Submission Number: 2612