Keywords: consistency, world models, LLMs
TL;DR: LLMs make the same mistakes when a prompt is rephrased or translated into another language. However, substantially different prompts that rely on the same information often yield inconsistent answers.
Abstract: Do LLMs have a consistent world model that is reflected in their responses? We study whether the behavior of gpt-4o reflects an underlying world model by measuring the consistency of its mistakes across different prompts and prompting strategies. We find that gpt-4o makes consistent mistakes regardless of the exact prompt phrasing or prompt language. However, substantially different prompts that rely on the same underlying information often yield inconsistent results, suggesting that gpt-4o's responses may not reflect a single universal world model.
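As a rough illustration of what "consistency of mistakes" could mean in practice, the sketch below compares the answers given under two prompt variants and reports how often the two variants give the same wrong answer. The function name `mistake_consistency`, the exact-match comparison, and the toy data are assumptions made for illustration; the abstract does not specify the metric actually used.

```python
from typing import Optional, Sequence


def mistake_consistency(
    answers_a: Sequence[str],
    answers_b: Sequence[str],
    gold: Sequence[str],
) -> Optional[float]:
    """Fraction of shared mistakes that are the *same* mistake.

    Hypothetical metric: among questions that both prompt variants answer
    incorrectly, count how often the two wrong answers are identical.
    Returns None when the variants share no incorrectly answered question.
    """
    if not (len(answers_a) == len(answers_b) == len(gold)):
        raise ValueError("all three sequences must have the same length")

    # Keep only the questions where both variants disagree with the gold answer.
    both_wrong = [
        (a, b) for a, b, g in zip(answers_a, answers_b, gold) if a != g and b != g
    ]
    if not both_wrong:
        return None

    same_mistake = sum(1 for a, b in both_wrong if a == b)
    return same_mistake / len(both_wrong)


# Toy data (invented for illustration, not taken from the paper).
gold = ["Paris", "4", "blue", "H2O"]
original_prompt = ["Lyon", "4", "green", "CO2"]
rephrased_prompt = ["Lyon", "5", "red", "CO2"]

score = mistake_consistency(original_prompt, rephrased_prompt, gold)
```

Under this assumed metric, a score near 1 means the two variants fail in the same way, while a low score means the wrong answers diverge even though both prompts depend on the same underlying information.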
Submission Number: 40