Improving Commonsense Reasoning and Reliability in LLMs Through Cognitive-Inspired Prompting Frameworks
Keywords: Commonsense Reasoning, Cognitive-Inspired Prompting, Prompt Engineering, Few-Shot Learning, Zero-Shot Learning, Reliability, LLM, Robustness, Foundation Models
TL;DR: Cognitively inspired prompting strategies, especially metacognitive and narrative-based ones, significantly boost LLM commonsense reasoning on HellaSwag, surpassing standard reasoning strategies and even advanced reasoning models like GPT-o4-mini.
Abstract: Despite their impressive abilities, large language models like GPT-3.5 often falter on tasks requiring commonsense and logical reasoning, raising concerns about their reliability. In this work, we introduce a suite of cognitively inspired prompting strategies, grounded in metacognitive, strategic, narrative, and linguistic reasoning frameworks, to enhance LLMs’ reasoning capabilities. On the HellaSwag benchmark, we demonstrate that our cognitive-based prompts consistently improve accuracy over standard prompting baselines. Metacognitive and narrative-based frameworks yield the most robust accuracy gains, outperforming advanced reasoning models like GPT-o4-mini and existing reasoning strategies like Chain-of-Thought. Notably, the few-shot METAL and RNRRR frameworks, both metacognitive strategies, emerge as the most effective overall. These results underscore the importance of structured, cognitive-based prompting for building more dependable and transparent AI systems, contributing meaningfully to the advancement of reliable and responsible LLMs.
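To make the setup concrete, below is a minimal Python sketch of how a metacognitive-style prompt might be applied to a HellaSwag multiple-choice item. The step wording, the template, and the helper names (`METACOGNITIVE_TEMPLATE`, `build_prompt`, `ask`) are illustrative assumptions, not the paper's actual METAL or RNRRR prompts, which are not specified in this abstract; the model choice is likewise only an example.

```python
# Minimal sketch: evaluating one HellaSwag item with a metacognitive-style
# prompt. The plan/monitor/evaluate wording below is an ASSUMPTION for
# illustration, not the paper's METAL or RNRRR framework text.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

METACOGNITIVE_TEMPLATE = """You are answering a commonsense sentence-completion question.
Before answering, reason metacognitively:
1. Plan: restate what the context describes and what a plausible continuation needs.
2. Monitor: for each option, note whether it is physically and socially plausible.
3. Evaluate: check your reasoning for gaps, then commit to the best option.

Context: {context}
Options:
{options}

Answer with the number of the best continuation."""


def build_prompt(context: str, endings: list[str]) -> str:
    """Format a HellaSwag context and its candidate endings into one prompt."""
    options = "\n".join(f"{i + 1}. {e}" for i, e in enumerate(endings))
    return METACOGNITIVE_TEMPLATE.format(context=context, options=options)


def ask(context: str, endings: list[str], model: str = "gpt-3.5-turbo") -> str:
    """Query the model once; temperature 0 for deterministic benchmark runs."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": build_prompt(context, endings)}],
        temperature=0,
    )
    return response.choices[0].message.content
```

A few-shot variant of this sketch would simply prepend worked examples (context, options, and a reasoned answer) to the messages list before the test item; accuracy is then scored by comparing the returned option number against the gold ending.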
Submission Number: 98