Abstract: We present a novel binaural audio generation method that uses data augmentation from 360° videos. Visually informed binaural audio generation requires ground-truth pairs of video and binaural audio. However, collecting diverse ground truth requires considerable effort, and low data diversity reduces the generalization performance of the model. To address this lack of diversity, our method generates additional training data from 360° videos. Experimental results show that our method improves the generalization performance of the binaural audio generation model and that 360° video is effective for generating pairs of video and pseudo-binaural audio.