Keywords: LLM, constrained generation, structured generation, context-free grammar
TL;DR: A novel guardrail that constrains LLM outputs to conform to a user-specified LL(1) grammar while respecting a maximum token limit.
Abstract: The generation of machine-readable outputs using LLMs has attracted significant attention.
However, existing approaches cannot strictly enforce a maximum number of generated tokens: truncating at the limit can leave the output grammatically incomplete.
To address this limitation, we propose TruncProof, a novel grammar-constrained generation method that enables LLMs to produce grammatically valid outputs while adhering to a predefined token limit.
By leveraging the properties of LL(1) parsers, TruncProof efficiently estimates the minimum number of tokens required to complete a grammatically valid output at each decoding step.
Experiments on a Text-to-JSON instruction task and a code generation task demonstrate that TruncProof generates syntactically correct outputs even under strict token constraints.
Furthermore, we show that TruncProof can be effectively combined with advanced decoding strategies, resulting in outputs that are not only grammatically valid but also semantically accurate.
The source code will be made public upon acceptance.
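For intuition, here is a minimal Python sketch of the core idea as we read it from the abstract; it is an illustration, not the TruncProof implementation. With an LL(1) grammar, the minimum number of terminals needed to complete a derivation from each symbol can be precomputed by a fixpoint, and the tokens still required at any decoding step are lower-bounded by summing these minima over the parser's stack. The function names and the toy JSON-like grammar below are hypothetical.

```python
# Illustrative sketch only; this is NOT the TruncProof implementation.
# Idea: precompute, for every grammar symbol, the minimum number of
# terminals needed to derive a complete string from it; an LL(1) parser
# can then lower-bound the tokens required to finish a valid output by
# summing these minima over its current parse stack.

def min_completion_lengths(grammar, terminals):
    """grammar: dict nonterminal -> list of productions, where each
    production is a list of symbols (terminals or nonterminals)."""
    INF = float("inf")
    min_len = {t: 1 for t in terminals}           # one token per terminal
    min_len.update({nt: INF for nt in grammar})   # unknown until fixpoint
    changed = True
    while changed:                                # standard fixpoint iteration
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                cost = sum(min_len[s] for s in prod)  # empty production -> 0
                if cost < min_len[nt]:
                    min_len[nt] = cost
                    changed = True
    return min_len

def min_tokens_to_finish(stack, min_len):
    """Lower bound on the terminals needed to empty the LL(1) stack."""
    return sum(min_len[s] for s in stack)

# Hypothetical toy JSON-like grammar for illustration.
grammar = {
    "value": [["{", "pairs", "}"], ["[", "items", "]"], ["ATOM"]],
    "pairs": [[], ["ATOM", ":", "value"]],
    "items": [[], ["value"]],
}
terminals = {"{", "}", "[", "]", ":", "ATOM"}

ml = min_completion_lengths(grammar, terminals)
print(ml["value"])                               # 1 (a bare ATOM)
print(min_tokens_to_finish(["value", "}"], ml))  # 2
```

Given such a bound, a decoder could mask any candidate token whose acceptance would make the minimum completion length exceed the remaining budget; how TruncProof realizes this efficiently over subword tokenizations is presumably what the paper details.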
Supplementary Material: zip
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 15806