Keywords: bandwidth extension, inpainting, Schrödinger bridges, audio restoration, music restoration
Abstract: Real-world audio is often degraded by numerous factors. This work presents an audio restoration model tailored for high-resolution (44.1 kHz) music. Our model, Audio-to-Audio Schrödinger Bridges (A2SB), performs both bandwidth extension (predicting high-frequency components) and inpainting (re-generating missing segments). Critically, it is end-to-end: it requires no vocoder to predict waveform outputs, can restore hour-long audio inputs, and is trained on permissively licensed music data. A2SB achieves state-of-the-art bandwidth extension and inpainting quality on several out-of-distribution music test sets. Code and model: https://github.com/NVIDIA/diffusion-audio-restoration.
Track: Paper Track
Confirmation: Paper Track: I confirm that I have followed the formatting guideline and anonymized my submission.
(Optional) Short Video Recording File: mp4
Submission Number: 15