Tracking Training Phases in Compositional Learning with Task-Agnostic Measures
Keywords: developmental interpretability, training dynamics, phase transitions, compositional generalization
TL;DR: We compare 53 task-agnostic measures for detecting training phases on modular addition and a new multilingual variant with controllable transition overlap, finding that once transitions overlap, recovery degrades across the suite.
Abstract: Deep neural networks often acquire their final capabilities through qualitatively distinct training phases. Characterizing these phases sheds light on how models learn and could enable steering away from unwanted outcomes. The most reliable existing methods for detecting training phases rely on a prior mechanistic understanding of how the model performs an underlying task. Task-agnostic scalar measures, i.e., quantities computed from a model's parameters, representations, or outputs, offer a more general alternative, but have largely been studied in isolation. In this paper, we systematically compare 53 such measures by fitting Gaussian hidden Markov models (HMMs) to their trajectories across two compositional settings: modular addition and a new multilingual variant we introduce, in which per-language data fractions control how much consecutive phase transitions overlap. This overlapping regime arises naturally when models acquire capabilities in close succession, yet lacks controlled benchmarks. We find that once transitions overlap, recovery quality drops across all 53 measures, even when we fit the HMM directly to validation accuracy, hinting at limitations of our HMM-based framework. Still, some measures perform relatively better, with prediction entropy, PCA effective dimension, and the Local Learning Coefficient most consistently among the top performers.
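As an illustration of what a task-agnostic scalar measure computed from a model's outputs can look like, the sketch below computes a prediction entropy value per checkpoint. It assumes prediction entropy means the mean Shannon entropy of the softmax output distribution over a batch; the abstract does not give the paper's exact definition, and the function names (`softmax`, `prediction_entropy`, `entropy_trajectory`) and the nested-list input layout are hypothetical choices made for this example only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_entropy(batch_logits):
    """Mean Shannon entropy (in nats) of the softmax outputs for one batch.

    Assumption: this is one plausible reading of "prediction entropy";
    the paper may define or normalize it differently.
    """
    entropies = []
    for logits in batch_logits:
        probs = softmax(logits)
        # Skip exact zeros so that log(0) is never evaluated.
        entropies.append(-sum(p * math.log(p) for p in probs if p > 0.0))
    return sum(entropies) / len(entropies)


def entropy_trajectory(checkpoint_logits):
    """Map a sequence of per-checkpoint logit batches to one scalar each."""
    return [prediction_entropy(batch) for batch in checkpoint_logits]
```

A trajectory produced this way is a one-dimensional time series over checkpoints, which is the kind of input a Gaussian HMM could then be fitted to; the fitting step itself is not shown here.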
Submission Number: 149