Contrastive Lyrics Alignment with a Timestamp-Informed Loss

Timon Kick; Florian Grötschla; Luca A Lanzendörfer; Roger Wattenhofer

Published: 10 Oct 2024, Last Modified: 29 Oct 2024. Venue: Audio Imagination: NeurIPS 2024 Workshop. License: CC BY 4.0

Keywords: lyrics alignment, audio signal processing, open-source code, open-source dataset

Abstract: Recent multimodal methods for lyrics alignment have relied on large datasets. Our approach introduces a box loss that directly incorporates timestamp information into the loss function, enabling precise alignment and competitive results even with limited training data. We also address the noise present in the public DALI dataset, conducting a thorough cleaning process to improve the quality of training data. Finally, we propose JamendoLyrics++, a substantial extension of the common JamendoLyrics evaluation dataset, offering improved genre diversity for better evaluation of lyrics alignment systems.

Submission Number: 30
