Exploring the Limitations of Graph-based Logical Reasoning in Large Language Models

23 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: graph reasoning, LLMs, logical reasoning
TL;DR: In this paper, we analyse various interesting properties of LLMs through graph reasoning problems.
Abstract: Pretrained Large Language Models have demonstrated various types of reasoning capabilities through language-based prompts alone. In this paper, however, we test the depth of logical reasoning of 5 different LLMs (GPT-4, GPT-3.5, Claude-2, Llama-2 and Palm-2) through graph reasoning problems. In particular, we design 10 distinct graph traversal problems, each representing an increasing level of complexity. Further, we analyse the performance of the models across various settings, such as varying graph sizes and different forms of k-shot prompting. The models are evaluated using two distinct metrics: absolute accuracy, which evaluates model responses with a binary label (Correct/Wrong), and partial credit, which compares the sequence of predicted nodes against the actual solution step by step and awards a score for the number of nodes correctly predicted before deviating from the correct response. We find that, apart from certain powerful models, language models do not possess strong reasoning capabilities. Reasoning capability is inversely related to the average degrees of freedom of traversal per node in a graph. Further, we note that k-shot prompting has an overall negative impact on the reasoning abilities of language models. We finally conclude that even powerful models (including GPT-4, Claude-2 and GPT-3.5) possess an estimated variable-tracking depth of less than 10 nodes, making them unsuitable for complex reasoning tasks.
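The partial credit metric described in the abstract can be sketched as a prefix-match score: walk the predicted node sequence alongside the reference solution and award credit for each node matched before the first deviation. This is a minimal illustrative sketch, assuming the metric normalises by the length of the reference solution; the paper's exact scoring formula may differ.

```python
def partial_credit(predicted, gold):
    """Hypothetical sketch of the paper's partial-credit metric.

    Compares the predicted node sequence against the reference
    solution step by step and returns the fraction of reference
    nodes correctly predicted before the first mismatch.
    """
    matched = 0
    for pred_node, gold_node in zip(predicted, gold):
        if pred_node != gold_node:
            break  # stop at the first deviation from the solution
        matched += 1
    return matched / len(gold) if gold else 0.0


# Example: the model deviates at the third node of a 3-node path.
score = partial_credit(["A", "B", "D"], ["A", "B", "C"])
```

Unlike absolute accuracy's binary Correct/Wrong label, this score rewards models that follow the correct traversal for part of the path before losing track.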
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7761