Encoding Domain Insights into Multi-modal Fusion: Improved Performance at the Cost of Robustness

Published: 10 Jun 2025 · Last Modified: 15 Jul 2025 · MOSS@ICML2025 · CC BY 4.0
Keywords: Multi-modal Fusion, Robustness, Inductive Priors, Synthetic Tasks, Small-Scale Experiments, Interpretability
TL;DR: We show that custom-made fusion methods that leverage domain knowledge reduce training time and improve accuracy with very limited data, but harm robustness.
Abstract: Using small-scale experiments with real and synthetic tasks, we compare multi-modal fusion methods, including a proposed "Product Fusion", to demonstrate how encoding task-specific priors affects performance. Our results highlight a crucial trade-off: aligning fusion design with priors boosts clean-data accuracy with limited data but significantly diminishes robustness to noisy inputs.
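The listing does not spell out how the proposed Product Fusion works; as a rough illustration only, the sketch below shows what a product-style fusion layer might look like for two modalities, in contrast to the usual concatenation baseline. All class names, the shared-projection step, and the dimensions are assumptions made for this example, not the authors' released code (that is in the attached zip).

```python
import torch
import torch.nn as nn

class ProductFusion(nn.Module):
    """Illustrative sketch (not the authors' implementation): fuse two modality
    embeddings by an element-wise product in a shared space, encoding a
    multiplicative interaction prior instead of the usual concatenation."""

    def __init__(self, dim_a: int, dim_b: int, hidden: int, num_classes: int):
        super().__init__()
        # Project each modality into a common space so the product is well-defined.
        self.proj_a = nn.Linear(dim_a, hidden)
        self.proj_b = nn.Linear(dim_b, hidden)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor) -> torch.Tensor:
        # Element-wise product: a concatenation baseline would instead use
        # torch.cat([...], dim=-1) followed by a wider classifier head.
        z = self.proj_a(x_a) * self.proj_b(x_b)
        return self.head(torch.relu(z))

# Usage with made-up sizes: a batch of 8 samples, 32-dim and 64-dim modality features.
model = ProductFusion(dim_a=32, dim_b=64, hidden=16, num_classes=2)
logits = model(torch.randn(8, 32), torch.randn(8, 64))
```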
Code: zip
Submission Number: 29