Abstract: Despite their linguistic competence, Large Language Models (LLMs) often struggle to reason reliably and flexibly. To identify these shortcomings, we introduce the Non-Linear Reasoning (NLR) dataset, a collection of unique, hand-designed problems that target reasoning bottlenecks arising from the sequential prediction paradigm of LLMs and the inherently linear nature of natural language. NLR tasks demand iterative updates, backtracking, and reasoning across multiple parallel chains of thought, yet require only basic arithmetic to solve. To address these limitations, we propose a neurosymbolic reasoning approach that integrates Prolog, a symbolic reasoning engine, into the inference pipeline of LLMs. This division of labor shifts the LLM’s task from iterative computations to inferring all information—explicit or implied through common sense—and encoding it as logical code. Our method yields large and robust performance gains across the GSM8k and BIG-bench Navigate benchmarks and achieves near-perfect accuracy on NLR problems, maintaining robustness even as variable interdependence—the number of other variables on which the value of a single variable depends—increases.
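To make the division of labor concrete, the following is a minimal sketch of how an LLM might encode a toy word problem as Prolog facts and rules, leaving the arithmetic to the symbolic engine. The problem text, predicate names, and encoding style here are invented for illustration and are not the paper's exact prompt or output format.

```prolog
% Hedged sketch (assumed encoding, not the paper's actual output):
% "Alice has 5 apples. Bob has 3 more than twice as many as Alice.
%  How many apples do they have together?"
alice_apples(5).
bob_apples(B)    :- alice_apples(A), B is 2 * A + 3.
total_apples(T)  :- alice_apples(A), bob_apples(B), T is A + B.

% Query the symbolic engine instead of computing step by step in text:
% ?- total_apples(T).
% T = 18.
```

In this setup the LLM only has to extract and formalize the stated and commonsense relations between quantities; the Prolog interpreter then performs the iterative computation and any backtracking needed to satisfy the query.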
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Nicolas_A._Gontier1
Submission Number: 6312