Learning State-Tracking from Code: REPL Traces and Probabilistic Automata

Published: 02 Mar 2026, Last Modified: 02 Mar 2026
Venue: LIT Workshop @ ICLR 2026
License: CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: state-tracking, linear RNNs, code execution, probabilistic automata, belief propagation, DeltaNet, next-token prediction, finite-state machines
TL;DR: Linear RNNs learn state-tracking from Python REPL traces with sparse supervision where Transformers fail, but probabilistic transitions in real code cause exponential norm decay.
Abstract: In recent years, state-tracking tasks, particularly permutation composition, have become a testbed for understanding the limits of sequence-model architectures such as Transformers and RNNs (linear and non-linear). However, current experiments use a sequence-to-sequence setup, learning to map actions (permutations) to states, which does not translate to the next-token prediction setting commonly used to train language models. We address this gap by converting permutation composition into code via REPL traces that interleave state reveals (through prints) with variable transformations. We show that linear RNNs capable of state-tracking also excel in this setting, while Transformers still fail. Motivated by this representation, we investigate why tracking states in code is generally difficult: actions are not always fully observable. We frame this as tracking the state of a probabilistic finite-state automaton with deterministic state reveals, and show that adversarial sequences exist for which linear RNNs cannot guarantee stable probabilistic state-tracking.
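As an illustration of the REPL-trace idea described in the abstract, the following is a minimal sketch of how permutation composition might be rendered as an interactive Python session. The trace format, the function name `make_repl_trace`, and the parameters `n_items`, `n_steps`, and `reveal_prob` are assumptions made for illustration; they are not taken from the paper.

```python
import random
from itertools import permutations


def make_repl_trace(n_items=3, n_steps=6, reveal_prob=0.3, seed=0):
    """Build a toy REPL-style trace for permutation composition.

    Hypothetical format (not the paper's): each step rebinds the tuple `s`
    by applying a random permutation (an action); with probability
    `reveal_prob` the current state is printed (a state reveal).
    """
    rng = random.Random(seed)
    actions = list(permutations(range(n_items)))
    state = tuple(range(n_items))
    lines = [f">>> s = {state}"]
    for _ in range(n_steps):
        p = rng.choice(actions)
        # Action: position i of the new state takes the element at index p[i].
        state = tuple(state[i] for i in p)
        lines.append(f">>> s = tuple(s[i] for i in {p})")
        # Sparse supervision: only some steps reveal the state.
        if rng.random() < reveal_prob:
            lines.append(">>> print(s)")
            lines.append(str(state))
    # Always end with a reveal so the final state appears in the text.
    lines.append(">>> print(s)")
    lines.append(str(state))
    return "\n".join(lines), state


trace, final_state = make_repl_trace()
print(trace)
```

Under next-token prediction on such text, the lines that follow `print(s)` are the only positions where the state is directly supervised. Predicting them correctly requires composing every permutation applied since the previous reveal.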
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Julien_Siems1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 26