Artificial Melodies: Investigating the Limits of AI in Replicating Human Songwriting

University of Eastern Finland DRDHum 2024 Conference Submission 17 Authors

Published: 03 Jun 2024, Last Modified: 03 Jun 2024, DRDHum 2024 BestPaper, CC BY 4.0
Keywords: Corpus Linguistics, Popular music, Multi-Dimensional Analysis, Artificial Intelligence
TL;DR: This study investigates AI's ability to mimic human songwriting, comparing 4000 human and AI-generated lyrics across genres. Using Lexical Multi-Dimensional Analysis, the findings highlight the uniqueness of human creative expression.
Abstract: The emergence of Artificial Intelligence (AI) chatbots has ignited debates about the potential replacement of human-generated texts, particularly structured content such as weather forecasts or financial reports. However, the application of AI in creative domains such as poetry and songwriting is increasingly acknowledged. This study investigates the capacity of AI to replicate creative human writing, focusing specifically on song lyric composition in English. To this end, we conducted two Lexical Multi-Dimensional Analyses (LMDA; Berber Sardinha and Fitzsimmons-Doolan, 2024) on a curated corpus of song lyrics spanning diverse musical genres (including country, pop, rap, rock, and soul), which encompassed both chart-topping and randomly selected lyrics. We also generated a comparison corpus of artificially produced lyrics using ChatGPT, Google’s Gemini, and Meta’s Llama. The full corpus consisted of 4000 lyrics, evenly split between human-authored and AI-generated texts, with each subcorpus comprising 400 lyrics per genre. The first analysis was an additive LMDA based on the dimensions of variation identified by Author (2023). These dimensions were derived from a large corpus of over 100,000 song lyrics, each tagged for semantic class using the USAS semantic tagger. The dimensions are: 1) Materialism and Superficiality, 2) Alterity and Interpersonal Dynamics, 3) Mysticism and Transcendence, and 4) Romanticism and Personal Quest. After scoring each of our lyrics on these dimensions, we ran a Discriminant Function Analysis (DFA) to classify the lyrics as either human-authored or AI-generated. The DFA correctly classified 64% of the AI-generated songs and 84% of the human-authored songs. The second LMDA used the actual vocabulary of the songs rather than semantic classes. Key lemmas were extracted for each authorship condition and then subjected to factor analysis, resulting in a five-dimensional model: 1) Social Justice versus Romance, 2) Reality versus Transcendence, 3) Rural versus Urban, 4) Individualism versus Collectivism, and 5) Extroversion and Physicality versus Introversion and Emotions. This model proved more effective at classifying lyrics, achieving a 69% success rate for AI-generated lyrics and 90% for human-authored songs. Overall, the findings indicate a significant discrepancy between AI and human songwriting capabilities. Only 30.90% of AI-generated songs resembled human-written songs closely enough to be classified as human-authored, suggesting that while AI can replicate human songwriting vocabulary, it falls short in generating discourses that align with human musical expression. Conversely, human-authored songs were correctly identified at a high rate (90.05%), underscoring the distinct and irreplicable aspects of human creativity in songwriting.
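The percentages reported for the second model can be related to one another with a small confusion-matrix calculation. The sketch below is illustrative only: the abstract does not report raw counts, so the counts are back-derived from the reported 30.90% and 90.05% under the assumption of 2000 lyrics per authorship condition, and all names (`N_PER_CLASS`, `count_from_percent`, and so on) are hypothetical.

```python
# Illustrative reconstruction, not the authors' output: counts are back-derived
# from the percentages in the abstract, assuming 2000 lyrics per condition.
N_PER_CLASS = 2000


def count_from_percent(percent_value, n=N_PER_CLASS):
    """Convert a reported percentage into a whole number of lyrics."""
    return round(n * percent_value / 100)


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100 * part / whole


# AI-generated lyrics that the model classified as human-authored (30.90%).
ai_as_human = count_from_percent(30.90)
ai_correct = N_PER_CLASS - ai_as_human

# Human-authored lyrics that the model classified correctly (90.05%).
human_correct = count_from_percent(90.05)
human_as_ai = N_PER_CLASS - human_correct

# Per-class accuracy for AI lyrics and overall accuracy across both classes.
ai_accuracy = percent(ai_correct, N_PER_CLASS)
overall_accuracy = percent(ai_correct + human_correct, 2 * N_PER_CLASS)

print(f"AI lyrics correctly classified: {ai_correct} ({ai_accuracy:.2f}%)")
print(f"Human lyrics correctly classified: {human_correct}")
print(f"Overall accuracy: {overall_accuracy:.3f}%")
```

Under these assumptions, the 30.90% figure is the complement of the roughly 69% success rate for AI-generated lyrics, and the implied overall accuracy (a value the abstract itself does not report) is the mean of the two per-class rates because the classes are balanced.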
Submission Number: 17