Keywords: Developmental interpretability, Probing, Causal interventions
Other Keywords: scaling laws, emergent abilities
TL;DR: Quiet feature learning precedes rapid generalization during language model training on algorithmic tasks.
Abstract: We train Transformer-based language models on ten foundational algorithmic tasks and observe pronounced phase transitions in their loss curves that deviate from established power-law scaling trends. Over large ranges of compute, the validation loss barely improves, then abruptly decreases. Probing the models’ internal representations reveals that quiet features are learned prior to any decrease in task loss. These quiet features represent intermediate algorithmic computations that do not by themselves reduce the output loss. Ablation experiments show that individual quiet features are causally necessary for task performance. Our results demonstrate that substantial representational progress can remain hidden beneath an apparently flat loss curve, challenging the prevailing use of cross-entropy as a proxy for learning and motivating richer diagnostics for monitoring model training.
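For concreteness, a minimal sketch of the kind of linear probe the abstract refers to, assuming hidden activations and labels for an intermediate computation (e.g., a carry bit in an addition task) have already been extracted from a training checkpoint; the arrays below are random placeholders, and scikit-learn's LogisticRegression stands in for whatever probe the paper actually uses:

```python
# Sketch: probe a checkpoint's hidden activations for an intermediate ("quiet") feature.
# Placeholder data only; in practice `hidden` holds per-example activations at a fixed
# layer/position, and `labels` holds the value of the intermediate computation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d_model, n_examples = 256, 4096

hidden = rng.normal(size=(n_examples, d_model))      # stand-in for checkpoint activations
labels = rng.integers(0, 2, size=n_examples)         # stand-in for intermediate-step labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden, labels, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")
```

Tracking such probe accuracies across checkpoints is one way to detect representational progress while the task loss itself remains flat.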
Submission Number: 25