From Natural Language to Exact Cover: A Neuro-Symbolic Approach to Zebra Puzzles

Published: 01 Apr 2026, Last Modified: 28 Apr 2026 · ICLR 2026 Workshop on LLM Reasoning (Oral) · CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: LLMs, exact cover, neuro-symbolic, logic puzzles, Zebra puzzles
TL;DR: We combine LLM-based semantic parsing with an Exact Cover solver to translate and solve Zebra-style logic puzzles from natural language, achieving deterministic, logically sound solutions and outperforming strong neural and neuro-symbolic baselines.
Abstract: Chain-of-Thought (CoT) generation has substantially improved the performance of Large Language Models (LLMs) on complex reasoning tasks, including code generation, data analysis, and exam-style question answering. Despite these advances, purely neural LLMs continue to struggle with elementary logical reasoning problems and lack the determinism, soundness, and reliability characteristic of symbolic reasoning systems. Conversely, classical symbolic methods such as SAT solving and Exact Cover guarantee correctness and completeness, but require problems to be expressed in highly specialized formal encodings, limiting their applicability to natural language inputs. In this work, we present a tightly integrated neuro-symbolic framework that bridges this gap by combining neural semantic parsing with deterministic constraint solving. Our approach leverages the relational extraction capabilities of modern LLMs to parse Zebra-style logic puzzles written in free-form text and to translate the extracted constraints into structured tool calls. These function calls assemble a formally specified Exact Cover instance, which is subsequently solved by a symbolic solver to guarantee logically consistent solutions. We conduct a comprehensive empirical evaluation across multiple parameter scales, post-training paradigms, and LLM families. The results on larger puzzles demonstrate that our hybrid approach consistently outperforms strong purely neural baselines, including CoT prompting, as well as recent neuro-symbolic methods.
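The symbolic half of the pipeline the abstract describes can be illustrated with a minimal sketch. The encoding and solver below are a generic example of reducing a Zebra-style clue to Exact Cover and solving it with Knuth's Algorithm X, not the paper's actual implementation; all names (houses, colors, pets, the sample clue) are illustrative.

```python
from itertools import product

def solve_exact_cover(columns, rows):
    """Knuth's Algorithm X, simple set-based version.
    columns: set of constraints to satisfy exactly once;
    rows: dict mapping a choice to the set of columns it covers.
    Yields lists of choices that together cover every column exactly once."""
    if not columns:
        yield []
        return
    # Branch on the column covered by the fewest rows (fail-first heuristic).
    col = min(columns, key=lambda c: sum(1 for s in rows.values() if c in s))
    for name, cov in rows.items():
        if col not in cov:
            continue
        # Selecting this row removes its columns and every row that overlaps it.
        rest = {n: s for n, s in rows.items() if not (s & cov)}
        for tail in solve_exact_cover(columns - cov, rest):
            yield [name] + tail

# Toy 2-house puzzle with one clue: "the red house has the dog".
houses, colors, pets = [1, 2], ["red", "blue"], ["dog", "cat"]
# Columns: each house must get exactly one assignment; each value used once.
columns = {f"house{h}" for h in houses} | set(colors) | set(pets)
# Rows: full (house, color, pet) assignments; the clue is enforced while
# building rows, mirroring how extracted constraints shape the instance.
rows = {
    (h, c, p): {f"house{h}", c, p}
    for h, c, p in product(houses, colors, pets)
    if (c == "red") == (p == "dog")
}
solution = next(solve_exact_cover(columns, rows))
```

Because every Exact Cover solution covers each house slot and each attribute value exactly once, any solution the solver returns is a complete, logically consistent assignment; the determinism and soundness come from the solver rather than from the LLM.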
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 183