Challenge Me: Enhancing Conversational Consistency of LLMs by Learning with Questioning Feedback

24 Sept 2024 (modified: 15 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: AI Safety, LLM, Conversational Consistency
TL;DR: This work introduces a novel Conversationally Consistent Supervised Fine-Tuning (CC-SFT) method to enhance the reliability of Large Language Models in multi-turn dialogues by reducing contradictory responses across conversation turns.
Abstract: As Large Language Models (LLMs) increasingly integrate into critical decision-support systems, ensuring their conversational consistency becomes paramount for reliable and trustworthy AI-assisted services, especially in high-stakes domains such as healthcare and legal advice. In this work, we study the critical issue of conversational inconsistency in LLMs, where models provide contradictory information across multiple dialogue turns. We introduce a novel Conversationally Consistent Supervised Fine-Tuning (CC-SFT) method that explicitly accounts for two-turn conversations. Our approach combines a first-round loss, a second-round loss, and a consistency loss based on Wasserstein distance to encourage coherent responses across turns. We evaluate our method on three diverse datasets (OpenBookQA, GSM8K, and MedQA-USMLE) using three LLMs (Llama v3.1, Mistral AI, and Gemma). Experimental results demonstrate that CC-SFT significantly reduces conversational inconsistency compared to standard fine-tuning, with lower flipping rates and improved accuracy in second-round responses. We provide theoretical convergence guarantees for our method and analyze the impact of the consistency loss coefficient. Our code is publicly available at \url{https://github.com/anonymous4science/llm_conversational_consistency}.
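The abstract describes CC-SFT as a weighted combination of a first-round loss, a second-round loss, and a Wasserstein-distance consistency term. The sketch below illustrates one plausible instantiation of that objective in PyTorch; the exact formulation, the `lam` coefficient name, and the choice of a 1-D Wasserstein distance over token distributions are assumptions, since the paper's details are not reproduced here.

```python
import torch
import torch.nn.functional as F

def wasserstein_1d(p, q):
    # 1-D Wasserstein distance between two discrete distributions on the
    # same ordered support: the L1 distance between their CDFs.
    return torch.abs(torch.cumsum(p, dim=-1) - torch.cumsum(q, dim=-1)).sum(dim=-1)

def cc_sft_loss(logits_turn1, logits_turn2, labels_turn1, labels_turn2, lam=0.1):
    """Hypothetical CC-SFT objective: turn-1 loss + turn-2 loss
    + lam * consistency loss (Wasserstein distance between the two
    turns' predicted token distributions).

    logits_turn{1,2}: (batch, seq_len, vocab); labels_turn{1,2}: (batch, seq_len).
    """
    # Standard SFT cross-entropy on each round's response.
    l1 = F.cross_entropy(logits_turn1.flatten(0, 1), labels_turn1.flatten())
    l2 = F.cross_entropy(logits_turn2.flatten(0, 1), labels_turn2.flatten())
    # Consistency term: penalize divergence between the two rounds'
    # output distributions, averaged over batch and positions.
    p1 = F.softmax(logits_turn1, dim=-1)
    p2 = F.softmax(logits_turn2, dim=-1)
    consistency = wasserstein_1d(p1, p2).mean()
    return l1 + l2 + lam * consistency
```

The `lam` coefficient corresponds to the consistency-loss coefficient whose impact the paper analyzes: larger values push the two turns' answer distributions closer at the possible cost of per-turn accuracy.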
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3467
