Keywords: Audio editing, diffusion probabilistic model
Abstract: Despite recent advancements in diffusion-based audio generation, precisely editing content in a specific area of a recording remains challenging. In this paper, we introduce AudioMorphix, a training-free audio editor that manipulates a target area of a recording using another recording as a reference. Specifically, we conceptualize audio editing as part of a morphing cycle, in which different sounds can be combined into a cohesive audio mixture through morphing, while the mixture can be disentangled into individual components via demorphing. Leveraging the concept of the audio morphing cycle, we optimize the noised latent conditioned on the raw input together with the reference audio, and devise a series of energy functions to refine the guided diffusion process. Additionally, we manipulate the features within self-attention layers to preserve detailed characteristics from the original recordings. To accommodate a broad range of audio editing techniques, we collect a new evaluation dataset that provides editing instructions, reference audio and captions, and the duration of the edited area as guidance. Extensive experiments demonstrate that AudioMorphix yields promising performance on various audio editing tasks, including addition, removal, and style transfer. Demo and code are available at this URL.
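The general idea of refining a latent with an energy function restricted to a target area can be sketched as follows. This is an illustrative toy, not the AudioMorphix implementation: the abstract does not specify the energy functions, so the masked squared-distance energy, the function names (`masked_energy`, `energy_gradient`, `guided_step`), and the `step_size` parameter are all assumptions made for this example, and the latent is a plain list of floats rather than a diffusion latent inside a denoising loop.

```python
# Toy sketch of energy-guided latent refinement (all names hypothetical).
# Assumption: the energy is a squared distance to a reference, counted only
# inside the edit region marked by mask == 1.


def masked_energy(latent, reference, mask):
    """Sum of squared differences between latent and reference inside the mask."""
    return sum(m * (z - r) ** 2 for z, r, m in zip(latent, reference, mask))


def energy_gradient(latent, reference, mask):
    """Analytic gradient of masked_energy with respect to the latent."""
    return [2.0 * m * (z - r) for z, r, m in zip(latent, reference, mask)]


def guided_step(latent, reference, mask, step_size=0.1):
    """One gradient-descent step that lowers the energy inside the mask only."""
    grad = energy_gradient(latent, reference, mask)
    return [z - step_size * g for z, g in zip(latent, grad)]


latent = [1.0, 2.0, 3.0, 4.0]
reference = [0.0, 0.0, 0.0, 0.0]
mask = [0, 1, 1, 0]  # only the two middle positions form the edit region

updated = guided_step(latent, reference, mask)
print(masked_energy(latent, reference, mask))
print(masked_energy(updated, reference, mask))
```

Because the gradient is zero wherever the mask is zero, positions outside the edit region are left untouched, which mirrors the goal of changing only a target area of a recording.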
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8244