Lyrics Matter: Exploiting the Power of Learnt Representations for Music Popularity Prediction

ACL ARR 2025 July Submission452 Authors

28 Jul 2025 (modified: 26 Aug 2025), ACL ARR 2025 July Submission, CC BY 4.0

Abstract: Accurately predicting music popularity is a critical challenge in the music industry, offering benefits to artists, producers, and streaming platforms. Prior research has largely focused on audio features, social metadata, or model architectures. This work addresses the under-explored role of lyrics in predicting popularity. We present an automated pipeline that uses LLMs to extract mathematical representations from lyrics, capturing semantic, syntactic, and sequential information. These features are integrated into HitMusicLyricNet, a multimodal architecture that combines audio, lyrics, and social metadata to predict a popularity score in the range 0-100. Our method outperforms existing baselines on the SpotGenTrack dataset, which contains over 100,000 tracks, achieving 9% and 20% improvements in MAE and MSE, respectively. Ablation confirms that the gains arise from our LLM-driven lyrics feature pipeline (LyricsAENet), underscoring the value of dense lyric representations.
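For readers less familiar with the two reported metrics, the following is a minimal sketch of how MAE, MSE, and a relative improvement over a baseline could be computed for popularity scores on a 0-100 scale. The function names and the toy values are assumptions made for illustration; they are not taken from the paper or its code, and the numbers do not reproduce the reported results.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between target and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between target and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Percentage reduction in error relative to a baseline error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy popularity scores in [0, 100]; hypothetical values, not from the paper.
y_true = [50.0, 20.0, 80.0, 10.0]
baseline_pred = [60.0, 10.0, 70.0, 20.0]  # hypothetical baseline predictions
model_pred = [55.0, 15.0, 70.0, 20.0]     # hypothetical improved predictions

mae_gain = relative_improvement(mae(y_true, baseline_pred), mae(y_true, model_pred))
mse_gain = relative_improvement(mse(y_true, baseline_pred), mse(y_true, model_pred))
print(f"MAE improvement: {mae_gain:.1f}%")
print(f"MSE improvement: {mse_gain:.1f}%")
```

Because MSE squares each error, it weights large misses more heavily than MAE does, so the same change in predictions can yield different percentage improvements under the two metrics.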

Paper Type: Short

Research Area: NLP Applications

Research Area Keywords: MultiModal Application

Languages Studied: English

Previous URL: https://openreview.net/forum?id=xT3pXQjgct

Explanation Of Revisions PDF: pdf

Reassignment Request Area Chair: No, I want the same area chair from our previous submission (subject to their availability).

Reassignment Request Reviewers: Yes, I want a different set of reviewers

Justification For Not Keeping Action Editor Or Reviewers: We request a change of reviewers due to significant variance and inconsistencies in the February 2025 ARR reviews. Although we fully addressed the December cycle's feedback in both the revised manuscript and the rebuttal, the February reviews mostly overlooked these revisions. One reviewer assigned an unexpectedly low score (1.5) with no concrete reasoning and did not engage further during the rebuttal phase. This lack of response and acknowledgement, combined with high score variability (Overall: 1.5, 2, 3.5), is the reason for our request.

A1 Limitations Section: This paper has a limitations section.

A2 Potential Risks: N/A

B Use Or Create Scientific Artifacts: Yes

B1 Cite Creators Of Artifacts: Yes

B1 Elaboration: Section 3, 4, 5 and Appendix Section A

B2 Discuss The License For Artifacts: Yes

B2 Elaboration: Section 3 and 5

B3 Artifact Use Consistent With Intended Use: Yes

B3 Elaboration: Section 5 and Appendix A

B4 Data Contains Personally Identifying Info Or Offensive Content: N/A

B5 Documentation Of Artifacts: Yes

B5 Elaboration: Section 3 and Appendix A

B6 Statistics For Data: Yes

B6 Elaboration: Section 3

C Computational Experiments: Yes

C1 Model Size And Budget: Yes

C1 Elaboration: Section 5

C2 Experimental Setup And Hyperparameters: Yes

C2 Elaboration: Section 4 and 5

C3 Descriptive Statistics: Yes

C3 Elaboration: Section 5 and Appendix B, C

C4 Parameters For Packages: Yes

C4 Elaboration: Section 5 and Appendix A and B

D Human Subjects Including Annotators: No

D1 Instructions Given To Participants: N/A

D2 Recruitment And Payment: N/A

D3 Data Consent: N/A

D4 Ethics Review Board Approval: N/A

D5 Characteristics Of Annotators: N/A

E Ai Assistants In Research Or Writing: No

E1 Information About Use Of Ai Assistants: N/A

Author Submission Checklist: yes

Submission Number: 452
