Subliminal Prosody Learning: Auxiliary Emotion Supervision Redistributes Affective Representations Across ALM Layers
Keywords: Methods (probing, steering, causal interventions), Applications of interpretability, Interpretability for Knowledge Discovery
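TL;DR: We show subliminal learning in audio-language models: training a few LoRA-adapted layers on emotion classification makes affective information more linearly decodable in frozen layers as well and, for one model, yields prosody-sensitive generation.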
Abstract: We study how a simple emotion classification objective, applied to a few
LoRA-adapted layers of an audio-language model (ALM), redistributes affective
information across \emph{all} layers---including those whose parameters remain
frozen---through residual-stream propagation. We call this phenomenon
\emph{subliminal prosody learning} and, to our knowledge, provide the first
systematic study of representational propagation across multiple ALM
architectures: Qwen2.5-Omni-7B, Audio Flamingo~3, and MOSS-Audio-4B.
Mean probe gain in unadapted layers is +32.0\,pp (Omni), +22.0\,pp (AF3), and
+13.6\,pp (MOSS). Out-of-distribution (OOD) classification improves by up to
+23.4\,pp, and learned emotion directions recover the Russell circumplex while
transferring cross-modally.
Because linear decodability does not imply functional use, we also test whether
this representational accessibility translates into generation behavior.
Results are consistent with a threshold-like relationship---only Omni, with the
largest probe gain, achieves significant prosody-sensitive generation changes
($\Delta = +0.35$, $p < 0.001$), with an emotion-selective pattern
(neutral: +0.04, n.s.; happy: +1.03***; sad: +0.48***)
that rules out generic verbosity.
No empathy supervision was used: prosody-sensitive generation emerges solely as
a consequence of a classification-only auxiliary objective.
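As a rough illustration of the "mean probe gain in unadapted layers" quantity, the sketch below fits a linear probe per layer on activations from a base and a fine-tuned model, then averages the accuracy difference (in percentage points) over layers that were not adapted. Everything here is an assumption made for illustration: the function names `probe_accuracy` and `mean_probe_gain_pp`, the closed-form ridge probe, the train/test split, and the synthetic activations. It is not the authors' pipeline, and the numbers it produces are unrelated to the reported results.

```python
import numpy as np


def probe_accuracy(train_x, train_y, test_x, test_y, n_classes, ridge=1e-2):
    """Fit a ridge-regression linear probe on one-hot labels; return test accuracy."""
    # Append a constant column so the probe has a bias term.
    xtr = np.hstack([train_x, np.ones((len(train_x), 1))])
    xte = np.hstack([test_x, np.ones((len(test_x), 1))])
    onehot = np.eye(n_classes)[train_y]
    # Closed-form ridge solution: W = (X^T X + ridge * I)^{-1} X^T Y
    gram = xtr.T @ xtr + ridge * np.eye(xtr.shape[1])
    weights = np.linalg.solve(gram, xtr.T @ onehot)
    preds = (xte @ weights).argmax(axis=1)
    return float((preds == test_y).mean())


def mean_probe_gain_pp(base_layers, tuned_layers, labels, adapted, n_classes, split):
    """Mean (tuned - base) probe accuracy, in pp, over layers not in `adapted`."""
    gains = []
    for idx, (base, tuned) in enumerate(zip(base_layers, tuned_layers)):
        if idx in adapted:
            continue  # only layers whose parameters stayed frozen are counted
        acc_base = probe_accuracy(base[:split], labels[:split],
                                  base[split:], labels[split:], n_classes)
        acc_tuned = probe_accuracy(tuned[:split], labels[:split],
                                   tuned[split:], labels[split:], n_classes)
        gains.append(100.0 * (acc_tuned - acc_base))
    return float(np.mean(gains))


# Synthetic stand-in data (hypothetical): 3 layers, 400 clips, 16-dim activations.
rng = np.random.default_rng(0)
n_samples, dim, n_classes, n_layers = 400, 16, 4, 3
labels = rng.integers(0, n_classes, size=n_samples)
class_means = rng.normal(size=(n_classes, dim))

# Base activations carry no label signal; "tuned" activations add a class offset.
base_layers = [rng.normal(size=(n_samples, dim)) for _ in range(n_layers)]
tuned_layers = [rng.normal(size=(n_samples, dim)) + 3.0 * class_means[labels]
                for _ in range(n_layers)]

gain = mean_probe_gain_pp(base_layers, tuned_layers, labels,
                          adapted={1}, n_classes=n_classes, split=200)
print(f"mean probe gain over unadapted layers: {gain:+.1f} pp")
```

With real models, `base_layers` and `tuned_layers` would hold pooled hidden states for the same audio clips before and after LoRA training, and `adapted` would list the layer indices that received adapters.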
Submission Number: 259