FCoReBench: Can Large Language Models Solve Challenging First-Order Combinatorial Reasoning Problems?
Keywords: llms, logical-reasoning, first-order-reasoning, neuro-symbolic
TL;DR: We introduce a dataset for first-order combinatorial reasoning and propose a method that integrates LLMs with symbolic solvers through programs; we show significant performance improvements on our dataset and effectiveness on other benchmarks.
Abstract: Can large language models (LLMs) solve challenging first-order combinatorial
reasoning problems such as graph coloring, knapsack, and cryptarithmetic? By
first-order, we mean that each such problem can be instantiated into a potentially infinite
number of problem instances of varying sizes. These problems are also challenging, being
NP-hard and requiring several reasoning steps to reach a solution. While existing
work has focused on constructing hard benchmark datasets, there is
limited work that exploits the first-order structure of these problems. To
address this gap, we present FCoReBench, a dataset of 40 such challenging
problems, along with scripts to generate problem instances of varying sizes and
automatically verify and generate their solutions. We first observe that LLMs, even
when aided by symbolic solvers, perform rather poorly on our dataset, being unable
to leverage the underlying structure of these problems. We specifically observe
a drop in performance with increasing problem size. In response, we propose a
new approach, SymPro-LM, which combines LLMs with both symbolic solvers
and program interpreters, along with feedback from a few solved examples, to
achieve substantial performance gains. Our approach is robust to changes in
problem size and, unlike earlier approaches, requires no LLM calls at
inference time. As an additional experiment, we also
demonstrate SymPro-LM’s effectiveness on other logical reasoning benchmarks.
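Illustrative sketch (not the submission's actual pipeline or interface): the abstract describes SymPro-LM as having an LLM produce a program, refined with feedback from a few solved examples, that invokes a symbolic solver, so no LLM call is needed at inference time. A minimal example of the kind of size-agnostic, solver-invoking program this could yield for graph coloring, assuming Python with the Z3 solver (the function name and instance format here are hypothetical), might look as follows:

# Hypothetical sketch of an LLM-generated, solver-invoking program for graph
# coloring; at inference time only this program and the solver run, no LLM call.
from z3 import Solver, Int, And, sat

def color_graph(num_nodes, edges, num_colors):
    # One integer variable per node, constrained to a valid color index.
    colors = [Int(f"c_{i}") for i in range(num_nodes)]
    s = Solver()
    for c in colors:
        s.add(And(c >= 0, c < num_colors))
    # Adjacent nodes must receive different colors.
    for u, v in edges:
        s.add(colors[u] != colors[v])
    if s.check() == sat:
        m = s.model()
        return [m[c].as_long() for c in colors]
    return None  # no valid coloring exists

# Example instance: a 4-cycle is 2-colorable, e.g. [0, 1, 0, 1].
print(color_graph(4, [(0, 1), (1, 2), (2, 3), (3, 0)], 2))

Because the same program handles instances of any size, it reflects the first-order structure that the dataset's instance-generation and verification scripts are built around.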
Supplementary Material: zip
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 14114