$\texttt{SEM-CTRL}$: Semantically Controlled Decoding

Published: 24 Mar 2026, Last Modified: 24 Mar 2026. Accepted by TMLR. License: CC BY 4.0
Abstract: Ensuring both syntactic and semantic correctness in Large Language Model (LLM) outputs remains a significant challenge, despite being critical for real-world deployment. In this paper, we introduce $\texttt{SEM-CTRL}$, a unified approach that enforces rich context-sensitive constraints, together with task- and instance-specific semantics, directly on the LLM decoder. Our approach integrates token-level Monte Carlo Tree Search (MCTS) guided by specific syntactic and semantic constraints. The constraints over desired outputs are expressed using Answer Set Grammars, a logic-based formalism that generalizes context-sensitive grammars while incorporating background knowledge to represent task-specific semantics. We show that our approach helps guarantee valid completions for any off-the-shelf LLM without the need for fine-tuning. We evaluate $\texttt{SEM-CTRL}$ on a range of tasks, including synthetic grammar synthesis, combinatorial reasoning, JSON parsing, and planning. Our experimental results demonstrate that $\texttt{SEM-CTRL}$ allows even small pre-trained LLMs to efficiently outperform larger variants and state-of-the-art reasoning models (e.g., $\text{\textit{o4-mini}}$) while simultaneously guaranteeing semantic validity.
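As a rough illustration of what enforcing a context-sensitive constraint at the decoder can look like, here is a minimal sketch. It is not the paper's method: it uses plain greedy decoding with a hand-written validity mask instead of token-level MCTS and Answer Set Grammars. The toy language $a^n b^n c^n$, the scorer `toy_scores`, and the helper `allowed_next` are all assumptions made up for this example.

```python
from __future__ import annotations

# Toy illustration only: constraint-masked greedy decoding for a^n b^n c^n (n >= 1).
# SEM-CTRL itself uses token-level MCTS with Answer Set Grammars; this sketch does not.

EOS = "<eos>"


def allowed_next(prefix: str) -> set[str]:
    """Tokens that keep a valid prefix extendable to some a^n b^n c^n."""
    n_a, n_b, n_c = prefix.count("a"), prefix.count("b"), prefix.count("c")
    if n_b == 0:
        # Still in the a-block; a "b" is only legal once at least one "a" exists.
        return {"a", "b"} if n_a > 0 else {"a"}
    if n_c == 0:
        # In the b-block; must reach n_b == n_a before any "c" is legal.
        return {"b"} if n_b < n_a else {"c"}
    # In the c-block; stop only once the counts match.
    return {"c"} if n_c < n_a else {EOS}


def toy_scores(prefix: str) -> dict[str, float]:
    """Hypothetical stand-in for an LLM's next-token scores."""
    if len(prefix) < 2:
        return {"a": 0.7, "b": 0.1, "c": 0.1, EOS: 0.1}
    # After two tokens the toy model prefers "c", which is often invalid.
    return {"a": 0.1, "b": 0.2, "c": 0.4, EOS: 0.3}


def constrained_greedy(max_len: int = 20) -> str:
    """Pick the best-scoring token among those the constraint allows."""
    prefix = ""
    for _ in range(max_len):
        scores = toy_scores(prefix)
        legal = allowed_next(prefix)
        token = max(legal, key=lambda t: scores[t])
        if token == EOS:
            return prefix
        prefix += token
    raise RuntimeError("no valid completion within max_len")


print(constrained_greedy())
```

Without the mask, the toy scorer would emit `c` right after `aa`, leaving the language. With the mask, every emitted token keeps the prefix completable, which is the general idea behind constrained decoding.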
Certifications: J2C Certification
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Camera-ready changes: de-anonymized, added Acknowledgments section, and removed red highlighting from the revision draft.
Previous revision major changes:
- Defined the acronym for Context-Sensitive Grammars (CSGs) in the Introduction (Page 1).
- Updated the Introduction (Page 2) to explicitly distinguish between validity and correctness, and clarified the concept of 'global correctness optimization' with concrete examples.
- Updated the ASG Background section (Page 3) to discuss: (1) the computational power and expressiveness of ASP and the types of logic encodable in ASGs, (2) limitations (e.g., fuzzy logic) and potential extensions for soft constraints, and (3) the theoretical and computational complexity of ASGs, with references to formal proofs.
- Extended the ASG Background section (Page 4) with guidance on the authoring process for new ASGs, including discussion of CFG accessibility, ASP expertise requirements, and LLM-assisted constraint generation to lower adoption barriers.
- Clarified in the Experimental Setup (Page 9) that our baselines include methods that perform global correctness optimization.
- Expanded Section 5.4 (Page 12) with: (1) a formal decomposition of $\texttt{SEM-CTRL}$'s computational cost, (2) justification for reporting tokens and constraint time as hardware-independent metrics, and (3) contextualization of constraint checking overhead for tasks with deep parse trees.
- Added Section 5.6 (Page 13) comparing fine-tuning with $\texttt{SEM-CTRL}$ and showing that the two are complementary.
- Extended the Broader Impact Statement (Page 14).
- Added Appendix D (Page 23, Answer Set Grammar Examples).
Video: https://youtu.be/IuObWsfw8X0
Assigned Action Editor: ~Li_Erran_Li1
Submission Number: 5748