Keywords: neural acoustic fields, spatial audio, audio scenes, implicit representations, applications
TL;DR: We propose WavNAF, a neural acoustic synthesis framework that leverages physically-informed wave propagation priors to explicitly capture complex acoustic interactions.
Abstract: Room acoustics modeling requires capturing intricate wave phenomena such as reflections, refractions, and diffractions beyond direct sound propagation. Recent neural acoustic synthesis methods have improved acoustic realism but typically focus only on direct sound paths and coarse reverberation, missing detailed interactions such as diffraction and multi-order reflections. We propose WavNAF, a neural framework that leverages physically-informed wave propagation priors to explicitly capture complex acoustic interactions. We generate these priors by numerically solving the wave equation with the Finite-Difference Time-Domain (FDTD) method, which directly simulates wave-based acoustic behavior that geometric methods cannot capture. Specifically, we extract essential acoustic parameters for FDTD, such as wave speed and density, from visual scene geometry encoded by Neural Radiance Fields (NeRF). We then generate physically-informed pressure maps and encode them via a feature extractor to learn wave propagation priors that capture intricate acoustic phenomena. To address the inherent computational cost of FDTD, we introduce a novel Neural Acoustic Scaling Module, inspired by traditional acoustic scale models. This module adaptively recalibrates encoded pressure map features from temporally compressed simulations to efficiently estimate accurate full-scale Room Impulse Responses. Experimental results demonstrate that WavNAF achieves substantial improvements in acoustic quality across various evaluation metrics compared to existing state-of-the-art methods.
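For readers unfamiliar with FDTD, the sketch below illustrates the kind of wave-equation simulation the abstract refers to: a minimal 2D leapfrog update that produces a sequence of pressure maps over time. This is not the authors' implementation; the grid size, spatial step, and Gaussian source are illustrative assumptions.

```python
# Minimal 2D FDTD sketch of the acoustic wave equation (illustrative only).
import numpy as np

nx, ny = 128, 128            # grid cells (assumed)
c = 343.0                    # speed of sound in air, m/s
dx = 0.05                    # spatial step, m (assumed)
dt = dx / (c * np.sqrt(2))   # CFL-stable time step for a 2D grid

p_prev = np.zeros((nx, ny))  # pressure at t - dt
p = np.zeros((nx, ny))       # pressure at t
src = (nx // 2, ny // 2)     # impulsive source at the grid center (assumed)

pressure_maps = []
for step in range(400):
    # Discrete Laplacian on the interior of the grid
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
           - 4.0 * p[1:-1, 1:-1]) / dx**2
    # Leapfrog update of the second-order wave equation
    p_next = np.zeros_like(p)
    p_next[1:-1, 1:-1] = (2.0 * p[1:-1, 1:-1] - p_prev[1:-1, 1:-1]
                          + (c * dt) ** 2 * lap)
    # Inject a short Gaussian pulse as the excitation
    p_next[src] += np.exp(-0.5 * ((step - 20) / 5.0) ** 2)
    p_prev, p = p, p_next
    pressure_maps.append(p.copy())  # one "pressure map" per time step
```

In WavNAF, such pressure maps are encoded by a feature extractor to form the wave propagation priors; the paper additionally derives the per-cell wave speed and density from NeRF scene geometry rather than using constants as above.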
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 16111