Aligning with Logic: Measuring, Evaluating and Improving Logical Preference Consistency in Large Language Models

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 Spotlight Poster · CC BY 4.0
TL;DR: We quantify, evaluate and improve the logical preference consistency of LLMs' judgments
Abstract: Large Language Models (LLMs) are expected to be predictable and trustworthy to support reliable decision-making systems. Yet current LLMs often show inconsistencies in their judgments. In this work, we examine *logical preference consistency* as a foundational requirement for building more dependable LLM systems, ensuring stable and coherent decision-making while minimizing erratic or contradictory outputs. To quantify logical preference consistency, we propose a universal evaluation framework based on three fundamental properties: *transitivity*, *commutativity* and *negation invariance*. Through extensive experimentation across diverse LLMs, we demonstrate that these properties serve as strong indicators of judgment robustness. Furthermore, we introduce a data refinement and augmentation technique, REPAIR, that enhances logical consistency while maintaining alignment with human preferences. Finally, we show that improving consistency leads to better performance in LLM-driven logic-based algorithms, reinforcing stability and coherence in decision-making systems.
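To make the three properties concrete, below is a minimal Python sketch of how one might probe a table of pairwise preference judgments for violations of commutativity, transitivity and negation invariance. This is an illustration only, not the paper's evaluation framework: the toy judgment tables, the function names and the simple ratio-style scores are assumptions made for this example.

```python
from itertools import combinations

# Toy pairwise judgments: judge[(a, b)] = 1 if the model prefers candidate a
# over candidate b when shown in the order (a, b), else 0. In practice these
# would come from querying an LLM judge; the values here are hand-written
# purely to illustrate the checks.
items = ["A", "B", "C"]
judge = {
    ("A", "B"): 1, ("B", "A"): 0,  # stable under swapping the presentation order
    ("B", "C"): 1, ("C", "B"): 1,  # commutativity violation: B "wins" in both orders
    ("C", "A"): 1, ("A", "C"): 0,  # together with A>B and B>C this creates a cycle
}
# Judgments for the negated question (e.g. "which response is worse?").
# Ideally every preference flips; one violation is injected for illustration.
judge_negated = {pair: 1 - pref for pair, pref in judge.items()}
judge_negated[("A", "B")] = 1

def commutativity_score(j):
    """Fraction of unordered pairs whose judgment is properly mirrored
    when the two candidates swap positions in the prompt."""
    pairs = list(combinations(items, 2))
    ok = sum(j[(a, b)] == 1 - j[(b, a)] for a, b in pairs)
    return ok / len(pairs)

def transitivity_score(j):
    """Fraction of item triples that do not form a preference cycle
    a > b > c > a (in either orientation) under the directed judgments."""
    prefers = lambda a, b: j[(a, b)] == 1
    triples = list(combinations(items, 3))
    ok = sum(
        not ((prefers(a, b) and prefers(b, c) and prefers(c, a))
             or (prefers(a, c) and prefers(c, b) and prefers(b, a)))
        for a, b, c in triples
    )
    return ok / len(triples)

def negation_invariance_score(j, j_neg):
    """Fraction of ordered pairs whose preference flips when the question
    itself is negated, as logical consistency requires."""
    ok = sum(j_neg[pair] == 1 - j[pair] for pair in j)
    return ok / len(j)

print("commutativity      :", commutativity_score(judge))                        # 2/3
print("transitivity       :", transitivity_score(judge))                         # 0/1
print("negation invariance:", negation_invariance_score(judge, judge_negated))   # 5/6
```

A model that is perfectly consistent in the sense described above would score 1.0 on all three ratios; lower values flag order sensitivity, preference cycles, or failure to flip judgments under negation.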
Lay Summary: Large language models (like ChatGPT) are used in many systems that help people make decisions, so it’s important that these models give consistent and reliable answers. However, they often show contradictions in how they make choices. This study looks at a basic idea called "logical preference consistency"—making sure the model's decisions follow simple and sensible rules, like not contradicting itself. The researchers created a way to test how consistent these models are using three simple rules of logic. They tested this on different models and found that the more consistently a model follows these rules, the more reliable its decisions are. They also developed a method called REPAIR to clean and improve the data the models learn from, which made the models even more consistent without losing their ability to agree with human values. In short, making language models more logically consistent helps them make better, more stable decisions.
Primary Area: Deep Learning->Large Language Models
Keywords: LLMs, logical consistency, order consistency, transitivity
Submission Number: 4461