What's the plan? Metrics for implicit planning in LLMs and their application to rhyme generation

ICLR 2026 Conference Submission21127 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: implicit planning, LLM, rhyming, metrics
TL;DR: quantitative measures of implicit planning in LLMs, with a case study in rhymed poetry generation
Abstract: Prior work suggests that language models, while trained on next-token prediction, show implicit planning behavior: they may select the next token in preparation for a predicted future token, such as a likely rhyming word, as supported by a prior qualitative study of Claude 3.5 Haiku using a cross-layer transcoder. We propose much simpler techniques for assessing implicit planning in language models. Focusing on a case study of rhymed poetry generation, we demonstrate that our methodology easily scales to many models. Across models, we find that the generated rhyme can be manipulated by steering at the end of the preceding line with a vector representing, e.g., a "rhyme with -ight" feature, which also affects the generation of the intermediate tokens leading up to the rhyme. We show that implicit planning for rhyme families is a universal mechanism, present in smaller models than previously thought, starting from 1B parameters. This shows that the phenomenon of rhyming offers a widely applicable, direct way to study the implicit planning abilities of LLMs. More broadly, understanding the planning abilities of language models can inform decisions in AI safety and control.
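The abstract describes steering the residual stream at the end of the preceding line with a rhyme-feature vector. The sketch below is a minimal illustration of that kind of intervention, not the authors' code: the model ("gpt2"), the layer index, the steering strength, the example prompt, and the construction of the "rhyme with -ight" direction as a mean activation difference are all illustrative assumptions.

```python
# Minimal sketch (assumptions throughout): steer a small causal LM toward an
# "-ight" rhyme by adding a feature vector to the residual stream at the last
# token of the preceding line, then let generation continue.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper's smallest models are ~1B parameters
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

layer_idx = 6  # assumed mid-depth layer to steer

def mean_last_token_activation(words):
    """Average residual-stream activation (output of block `layer_idx`)
    at the final token of each word."""
    acts = []
    for w in words:
        ids = tok(w, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # hidden_states[layer_idx + 1] is the output of block `layer_idx`
        acts.append(out.hidden_states[layer_idx + 1][0, -1])
    return torch.stack(acts).mean(dim=0)

# Hypothetical "rhyme with -ight" direction: contrast -ight words with controls.
ight_words = [" night", " light", " sight", " bright"]
control_words = [" table", " ocean", " music", " garden"]
steer_vec = mean_last_token_activation(ight_words) - mean_last_token_activation(control_words)
steer_vec = steer_vec / steer_vec.norm()

prompt = "A rhyming couplet:\nHe saw a carrot and had to grab it,\n"
inputs = tok(prompt, return_tensors="pt")
steer_pos = inputs.input_ids.shape[1] - 1  # last token of the preceding line
alpha = 8.0  # assumed steering strength

def hook(module, args, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0] if isinstance(output, tuple) else output
    if hidden.shape[1] > steer_pos:  # apply only on the prompt's forward pass
        hidden[:, steer_pos] += alpha * steer_vec  # in-place residual-stream edit

handle = model.transformer.h[layer_idx].register_forward_hook(hook)
try:
    gen = model.generate(**inputs, max_new_tokens=20, do_sample=False)
finally:
    handle.remove()
print(tok.decode(gen[0]))
```

Comparing the continuation with and without the hook (or with steering vectors for different rhyme families) gives a simple behavioral probe of whether the intermediate tokens and the line-final rhyme shift together, in the spirit of the manipulation the abstract describes.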
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 21127