Abstract: Consistency is a fundamental dimension of trustworthiness in Large Language Models (LLMs). For humans to be able to trust LLM-based applications, their outputs should be consistent when prompted with inputs that carry the same meaning or intent. Despite this need, there is no known mechanism to control and guide LLMs to be more consistent at inference time. In this paper, we introduce a novel alignment strategy to maximize semantic consistency in LLM outputs. Our proposal is based on \textbf{Chain of Guidance} (CoG), a multi-step prompting technique that generates highly consistent outputs from LLMs. For closed-book question-answering tasks, outputs generated using CoG are up to 2.5 times more consistent than outputs generated without it. We use synthetic datasets comprising consistent input-output pairs to finetune LLMs to produce consistent {\it and} correct outputs. Our finetuned models are more than twice as consistent as their base models and show strong generalization, producing consistent outputs on datasets not used in the finetuning process.
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=asiBW1bB9b
Changes Since Last Submission: Added additional material to address reviewer comments.
Assigned Action Editor: ~Greg_Durrett1
Submission Number: 3446