When Intermediate Supervision Doesn’t Help: Evidence from Recurrent CNNs

Published: 02 Mar 2026, Last Modified: 18 Mar 2026 · LIT Workshop @ ICLR 2026 · CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: implicit reasoning, progress supervision, CoT, intermediate supervision, algorithmic reasoning, recurrent neural networks
Abstract: Intermediate supervision has shown promise for mathematical reasoning, logical inference, and algorithmic tasks, yet recent evidence questions whether these improvements reflect genuine algorithmic reasoning. We test this hypothesis in maze-solving by teaching recurrent convolutional neural networks systematic search strategies inspired by classical computer science algorithms. Although the networks successfully replicate these search strategies at test time, intermediate supervision consistently underperforms end-to-end learning on every generalisation test, and performance collapses for all search strategies when networks are evaluated on mazes with different topologies. These findings suggest that explicit procedural guidance fails to teach networks transferable algorithmic principles. By forcing interpretable traces, such explicit thinking may even constrain network flexibility, challenging assumptions about the benefits of intermediate supervision for algorithmic reasoning.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 70