Coactive Learning for Large Language Models using Implicit User Feedback

Authors: ICLR 2024 Workshop DMLR Submission 68 Authors

Published: 04 Mar 2024, Last Modified: 02 May 2024, DMLR @ ICLR 2024, License: CC BY 4.0
Keywords: coactive learning, large language models, reinforcement learning from human preferences, direct preference optimization
TL;DR: Implicit feedback gathered while users edit LLM outputs can be more effective than explicitly solicited comparisons between two policy samples, even when the implicit feedback is suboptimal and noisy.
Abstract: We propose coactive learning as a model and feedback mechanism for training large language models (LLMs). The key insight is that users provide implicit feedback whenever they edit the text $y$ proposed by an LLM. While the edited text $\bar y$ is typically not a gold-standard example for supervised training, coactive learning merely requires that the edited text $\bar y$ is an improvement over the proposed text $y$. Note that such weak implicit preference feedback $\bar y \succ y$ is available in many application settings on a per-user basis, thus enabling the personalization of LLMs. In this paper, we develop the theoretical basis for coactive training of non-linear models, and we derive \algname\ as the first coactive learning algorithm for LLMs. Empirical results indicate that \algname\ is effective even for weak and noisy coactive preference feedback, making it a promising algorithm for training and personalization of LLMs from feedback that is naturally collected in many use cases.
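To make the feedback model concrete: each user edit yields an implicit preference pair $\bar y \succ y$ for the prompt $x$, which can be scored with a DPO-style objective (direct preference optimization is among the paper's keywords). The sketch below is not the paper's \algname\ algorithm; the names `EditFeedback`, `dpo_loss`, the log-probability inputs, and the `beta` value are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's \algname\ algorithm): turn an
# implicit edit y_bar > y into a preference pair and score it with a
# DPO-style loss on sequence log-likelihoods.
import math
from dataclasses import dataclass


@dataclass
class EditFeedback:
    prompt: str     # x: the user's prompt
    proposed: str   # y: text proposed by the LLM
    edited: str     # y_bar: user-edited text, assumed only to improve on y


def dpo_loss(logp_edited: float, logp_proposed: float,
             ref_logp_edited: float, ref_logp_proposed: float,
             beta: float = 0.1) -> float:
    """DPO-style loss for one implicit preference y_bar > y.

    Inputs are sequence log-likelihoods under the trained policy and a
    frozen reference policy; a lower loss means the policy already favors
    the edited text relative to the reference.
    """
    margin = beta * ((logp_edited - ref_logp_edited)
                     - (logp_proposed - ref_logp_proposed))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)


# Example with made-up log-probabilities: the policy assigns slightly
# higher relative likelihood to the edited text, so the loss is moderate.
fb = EditFeedback("Summarize the memo.",
                  "Memo says stuff.",
                  "The memo approves the Q3 budget.")
print(dpo_loss(logp_edited=-12.0, logp_proposed=-15.0,
               ref_logp_edited=-13.0, ref_logp_proposed=-14.0))
```

In practice the log-likelihoods would come from scoring `fb.proposed` and `fb.edited` under the current and reference LLMs; the key point of the coactive setting is only that the edited text is preferred, not that it is a gold-standard target.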
Primary Subject Area: Data-centric approaches to AI alignment
Paper Type: Research paper: up to 8 pages
Participation Mode: In-person
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 68