Learning metrics on spectrotemporal modulations reveals the perception of musical instrument timbre

Etienne Thoret, Baptiste Caramiaux, Philippe Depalle, Stephen McAdams

Published: 30 Nov 2020, Last Modified: 14 Jan 2026, Nature Human Behaviour, CC BY-SA 4.0
Abstract: Humans excel at using sounds to make judgements about their immediate environment. In particular, timbre is an auditory attribute that conveys crucial information about the identity of a sound source, especially for music. While timbre has been primarily considered to occupy a multidimensional space, unravelling the acoustic correlates of timbre remains a challenge. Here we re-analyse 17 datasets from published studies between 1977 and 2016 and observe that original results are only partially replicable. We use a data-driven computational account to reveal the acoustic correlates of timbre. Human dissimilarity ratings are simulated with metrics learned on acoustic spectrotemporal modulation models inspired by cortical processing. We observe that timbre has both generic and experiment-specific acoustic correlates. These findings provide a broad overview of former studies on musical timbre and identify its relevant acoustic substrates according to biologically inspired models.

Thoret and colleagues present a re-analysis of past research to identify the multiple acoustical facets of musical instrument timbre perception, capitalizing on spectrotemporal modulation models and metric learning.
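The idea of simulating dissimilarity ratings with a learned metric can be illustrated with a minimal sketch. The abstract does not specify the optimisation procedure, so the code below is a generic stand-in and not the authors' method: it fits a diagonal weighted Euclidean metric by least squares. The feature matrix `X`, the weights `true_w`, and the synthetic `ratings` are all hypothetical placeholders for a spectrotemporal modulation representation and human data.

```python
import numpy as np

def pairwise_sq_diffs(X):
    """Per-feature squared differences for every pair i < j, shape (n_pairs, n_features)."""
    i, j = np.triu_indices(X.shape[0], k=1)
    return (X[i] - X[j]) ** 2

def metric_distances(X, w):
    """Weighted Euclidean distance between every pair of rows of X."""
    return np.sqrt(pairwise_sq_diffs(X) @ w)

def learn_diagonal_metric(X, dissimilarities):
    """Fit non-negative feature weights so the weighted distances approximate the ratings.

    The squared distance is linear in the weights, so ordinary least squares applies;
    negative solutions are clipped to zero to keep a valid (pseudo-)metric.
    """
    sq = pairwise_sq_diffs(X)
    w, *_ = np.linalg.lstsq(sq, dissimilarities ** 2, rcond=None)
    return np.clip(w, 0.0, None)

# Hypothetical toy data: 12 sounds described by 4 acoustic features.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))

# Synthetic, noiseless "ratings" generated from hidden weights (an assumption for the demo).
true_w = np.array([2.0, 0.0, 0.5, 1.0])
ratings = metric_distances(X, true_w)

w = learn_diagonal_metric(X, ratings)
r = np.corrcoef(metric_distances(X, w), ratings)[0, 1]
print("learned weights:", np.round(w, 3))
print("correlation with ratings:", round(float(r), 3))
```

In this toy setting the learned weights indicate which features drive the simulated judgements, which is the sense in which a learned metric can point to acoustic correlates. Real ratings are noisy, so the fit would be evaluated by correlation with held-out data instead of exact recovery.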