Learning Normal Patterns in Musical Loops

Published: 05 Nov 2025 · Last Modified: 05 Nov 2025 · NLDL 2026 Spotlight · CC BY 4.0
Keywords: Autoencoder, Anomaly Detection, Audio Representation Learning, Deep Learning, Latent Space Modeling, Machine Learning, Music Pattern Detection, Music Information Retrieval (MIR), Unsupervised Learning
TL;DR: This paper introduces an unsupervised anomaly-detection framework that learns the normal audio patterns of musical loops without requiring labeled training data.
Abstract: We propose an unsupervised framework for analyzing audio patterns in musical loops using deep feature extraction and anomaly detection. Unlike prior methods limited by fixed input lengths, handcrafted features, or domain constraints, our approach combines a pre-trained Hierarchical Token-semantic Audio Transformer (HTS-AT) with a Feature Fusion Mechanism (FFM) to generate representations from variable-length audio. These embeddings are analyzed by Deep Support Vector Data Description (Deep SVDD), which models normative patterns in a compact latent space. Experiments on bass and guitar datasets show that our Deep SVDD models (especially with residual autoencoders) outperform baselines such as Isolation Forest and PCA, achieving better anomaly separation. Our work provides a flexible, unsupervised method for effective pattern discovery in diverse audio samples.
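The core scoring rule of (Deep) SVDD is simple: a sample's anomaly score is its squared distance to a hypersphere center in the latent space, where the center is typically initialized as the mean of embeddings of normal training data. The sketch below illustrates only this scoring step with NumPy; the encoder that produces the embeddings (HTS-AT with feature fusion in the paper) is abstracted away, and the function names are illustrative, not from the paper's code.

```python
import numpy as np

def svdd_center(embeddings: np.ndarray) -> np.ndarray:
    """Hypersphere center c: mean of embeddings of normal training samples."""
    return embeddings.mean(axis=0)

def anomaly_score(z: np.ndarray, c: np.ndarray) -> np.ndarray:
    """SVDD-style score: squared Euclidean distance to the center c."""
    return np.sum((z - c) ** 2, axis=-1)

# Toy data standing in for loop embeddings: "normal" samples form a
# tight cluster, so points far from the center score as anomalous.
rng = np.random.default_rng(0)
normal_embeddings = rng.normal(loc=0.0, scale=0.1, size=(100, 8))
c = svdd_center(normal_embeddings)

inlier_score = anomaly_score(normal_embeddings[0], c)
outlier_score = anomaly_score(np.full(8, 2.0), c)
assert outlier_score > inlier_score  # distant point gets the higher score
```

In full Deep SVDD the encoder's weights are also trained to minimize the mean of these distances over normal data, shrinking the hypersphere around the normative patterns; the scoring rule at test time is unchanged.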
Submission Number: 32