Audio-Visual Open-Vocabulary Egocentric Spatio-Temporal Action Localization with NeRFs

Published: 22 Sept 2025, Last Modified: 03 Jan 2026
Venue: WiML @ NeurIPS 2025
License: CC BY 4.0
Keywords: egocentric vision, neural radiance fields, multimodal learning, action localization, 3D scene understanding
Submission Number: 78