Track: regular paper (up to 6 pages)
Keywords: Spurious Correlation, Multimodal LLM
Abstract: While multimodal large language models (MLLMs) exhibit remarkable capabilities in visual and textual understanding, they remain highly susceptible to spurious correlations. We propose SpurLens, a novel pipeline that leverages LLMs and open-set object detectors to identify spurious cues and measure their effect on MLLMs in an object detection scenario. We also test several prompting strategies to mitigate this issue, but none proves effective. These findings highlight the urgent need for robust solutions to address spurious correlations in MLLMs.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Mazda_Moayeri1
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other complementary reasons.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 43