Challenges of Multi-Modal Coreset Selection for Depth Prediction

Viktor Moskvoretskii; Narek Alvandian

Published: 05 Mar 2025, Last Modified: 19 Mar 2025

Venue: ICLR 2025 Workshop ICBINB

License: CC BY 4.0

Track: tiny / short paper (up to 2 pages)

Keywords: Multimodal, Coreset selection, Depth prediction

TL;DR: Coreset selection algorithms underperform for multi-modal depth prediction

Abstract: Coreset selection methods are effective at accelerating training and reducing memory requirements, but they remain largely unexplored in applied multimodal settings. We adapt a state-of-the-art (SoTA) coreset selection technique to multimodal data, focusing on the depth prediction task. Our experiments with embedding aggregation and dimensionality reduction approaches reveal the challenges of extending unimodal algorithms to multimodal scenarios and highlight the need for specialized methods that better capture inter-modal relationships.
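To make the pipeline named in the abstract concrete, the following is a minimal sketch of embedding aggregation, dimensionality reduction, and coreset selection on two modalities. It is not the authors' method: the abstract does not name the SoTA technique that was adapted, so k-center greedy is swapped in here as a generic, well-known stand-in. The function names (`aggregate_embeddings`, `pca_reduce`, `k_center_greedy`), the embedding sizes, and the random inputs are all hypothetical and chosen only for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so neither modality dominates by magnitude."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def aggregate_embeddings(image_emb, depth_emb, mode="concat"):
    """Fuse per-sample embeddings from two modalities (hypothetical helper)."""
    a, b = l2_normalize(image_emb), l2_normalize(depth_emb)
    if mode == "concat":
        return np.concatenate([a, b], axis=1)
    if mode == "mean":
        if a.shape != b.shape:
            raise ValueError("mean aggregation needs equal embedding sizes")
        return (a + b) / 2.0
    raise ValueError(f"unknown aggregation mode: {mode}")


def pca_reduce(x, n_components):
    """Project onto the top principal components using an SVD of centered data."""
    centered = x - x.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def k_center_greedy(x, budget, seed=0):
    """Stand-in selector: repeatedly pick the point farthest from the chosen set."""
    if not 0 < budget <= len(x):
        raise ValueError("budget must be between 1 and the number of samples")
    rng = np.random.default_rng(seed)
    first = int(rng.integers(len(x)))
    selected = [first]
    min_dist = np.linalg.norm(x - x[first], axis=1)
    while len(selected) < budget:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(x - x[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected


# Illustrative random embeddings; real ones would come from pretrained encoders.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(200, 64))  # assumed image embedding size
depth_emb = rng.normal(size=(200, 32))  # assumed depth embedding size

fused = aggregate_embeddings(image_emb, depth_emb, mode="concat")
reduced = pca_reduce(fused, n_components=16)
coreset = k_center_greedy(reduced, budget=20)
```

Here `coreset` holds the indices of the samples that would be kept for training. Concatenation treats the modalities as independent feature blocks, which is one plausible reason a unimodal selector may miss inter-modal relationships of the kind the abstract highlights.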

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.

Submission Number: 10
