The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values

Hannah Rose Kirk; Andrew Michael Bean; Bertie Vidgen; Paul Rottger; Scott A. Hale

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Main

Submission Type: Regular Long Paper

Submission Track: Theme Track: Large Language Models and the Future of NLP

Submission Track 2: Language Modeling and Analysis of Language Models

Keywords: Large language models, survey, review, human feedback, human preference, human values, alignment

TL;DR: A survey of papers using human feedback to steer language models, including recommendations for future work.

Abstract: Human feedback is increasingly used to steer the behaviours of Large Language Models (LLMs). However, it is unclear how to collect and incorporate feedback in a way that is efficient, effective and unbiased, especially for highly subjective human preferences and values. In this paper, we survey existing approaches for learning from human feedback, drawing on 95 papers primarily from the ACL and arXiv repositories. First, we summarise the past, pre-LLM trends for integrating human feedback into language models. Second, we give an overview of present techniques and practices, as well as the motivations for using feedback; conceptual frameworks for defining values and preferences; and how feedback is collected and from whom. Finally, we encourage a better future of feedback learning in LLMs by raising five unresolved conceptual and practical challenges.

Submission Number: 4548
