SAFE: Segment-Aware Filtering and Evaluation for Lyric Content Moderation

Published: 01 Jun 2026, Last Modified: 01 Jun 2026
Venue: Culture x AI 2026 Poster
License: CC BY 4.0
Keywords: Natural Language Processing, Machine Learning, Explicit Content Detection, Multi-Label Text Classification, Song Lyrics Analysis, Music Content Moderation, Cultural AI
Abstract: Explicit-content moderation in music is often reduced to a binary platform label, obscuring meaningful distinctions among sexual content, violence, substance use, and culturally specific forms of offensiveness. We introduce SAFE, a segment-aware framework for fine-grained explicit-content modeling in song lyrics. Rather than treating an entire song as a single document, SAFE extracts keyword-centered contextual windows to reduce signal dilution from long neutral lyric passages. On a balanced dataset of 4,104 Spotify tracks, SAFE improves macro F1 from 0.780 to 0.884 over a whole-lyric baseline using TF-IDF with XGBoost, with a Hamming Loss of 0.061. Beyond predictive performance, we analyze the residual Other Explicit category, which captures songs flagged as explicit by the platform but not covered by our semantic taxonomy. This mismatch reveals how platform moderation labels encode cultural and institutional judgments that are not fully captured by standard content categories. Our findings show that segmentation-aware modeling can improve fine-grained explicit-content classification while also exposing the limits of Western-centric and platform-defined moderation schemes.
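The keyword-centered contextual windows mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `keyword_windows`, the regex tokenizer, the `radius` parameter, and the merging of overlapping windows are all assumptions made for illustration, since the abstract does not specify them.

```python
import re


def keyword_windows(lyrics, keywords, radius=5):
    """Return text windows of +/- `radius` tokens around each keyword hit.

    Hypothetical helper: tokenization, window size, and overlap merging
    are illustrative assumptions, not details taken from the paper.
    """
    # Lowercase and split into simple word tokens (apostrophes kept).
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    keyset = {k.lower() for k in keywords}

    spans = []  # list of (start, end) token index ranges, end exclusive
    for i, tok in enumerate(tokens):
        if tok not in keyset:
            continue
        start = max(0, i - radius)
        end = min(len(tokens), i + radius + 1)
        if spans and start <= spans[-1][1]:
            # Overlaps or touches the previous window: merge the two.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))

    return [" ".join(tokens[s:e]) for s, e in spans]
```

Under these assumptions, a long run of neutral filler around a single hit is discarded and only the local context is passed on to the classifier, which is the signal-dilution effect the abstract describes.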
Submission Number: 87