Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and Reliability

Weitong Zhang; Chengqi Zang; Bernhard Kainz

13 Sept 2024 (modified: 05 Feb 2025); Submitted to ICLR 2025; CC BY 4.0

Keywords: Mechanism Design, Large Language Models (LLMs), Generative Modeling, Alignment

TL;DR: A Bayesian Decoding Game Enhances Consistency and Reliability

Abstract: Large Language Models (LLMs) often produce outputs that, though plausible, can lack consistency and reliability, particularly in ambiguous or complex scenarios. The challenge lies in ensuring that outputs align with both factual correctness and human intent. Existing approaches struggle here because they trade improved consistency for lower accuracy. To mitigate these challenges, we propose a novel game-theoretic approach to enhance consistency and reliability during the decoding stage of LLM output generation. Our method models the decoding process as a multistage Bayesian decoding game. This ensures consistency through \textit{Correctness Alignment} and enhances reliability via \textit{Ambiguity Calibration}. The model dynamically converges to a consensus on the most reliable outputs and distinguishes \{\textit{Valid}, \textit{Specious}\} outputs without human feedback or additional training. Remarkably, our game design allows smaller models to outperform much larger models through game mechanisms (\textit{e.g.} 78.1 LLaMA13B \textit{vs} 76.6 PaLM540B), and supports integrating various LLM strategies and models, demonstrating the potential of game-theoretic tools to improve the truthfulness and reliability of LLMs.

Primary Area: alignment, fairness, safety, privacy, and societal considerations

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 542
