PlatoLM: Teaching LLMs via a Socratic Questioning User Simulator

23 Sept 2023 (modified: 12 Feb 2024)ICLR 2024 Conference Withdrawn SubmissionEveryoneRevisionsBibTeX
Keywords: Large Language Model, User Simulation, Human Computer Interaction
Abstract: The unparalleled performance of closed-sourced ChatGPT has sparked efforts towards its democratization, with notable strides made by leveraging real user and ChatGPT conversations, as evidenced by Vicuna. However, due to challenges in gathering conversations involving human participation, current endeavors like Baize and UltraChat aim to automatically generate conversational data. They primarily rely on ChatGPT conducting roleplay to simulate human behaviors based on instructions rather than genuine learning from humans, resulting in limited scope, diminished diversity, and an absence of genuine multi-round conversational dynamics. To address the above issues, we target human questions extracted from genuine human-machine conversations as a learning goal and train a user simulator called Socratic to produce a high-quality human-centric synthetic conversation dataset. Subsequently, this dataset was used to train our assistant model, named PlatoLM. PlatoLM achieves the SOTA performance among 7B models (including LLaMA-2-7B-chat and Vicuna-7B) in both Vicuna-Bench and pairwise comparison in MT-Bench; the effectiveness of PlatoLM is also evidenced by manual evaluation.
Supplementary Material: zip
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7381
Loading