Learning from Natural Language Feedback

Published: 01 Mar 2024. Last Modified: 01 Mar 2024. Accepted by TMLR.
Abstract: The potential for pre-trained large language models (LLMs) to use natural language feedback at inference time has been an exciting recent development. We build upon this observation by formalizing an algorithm for learning from natural language feedback at training time instead, which we call Imitation learning from Language Feedback (ILF). ILF requires only a small amount of human-written feedback during training and no feedback at test time, making it both user-friendly and sample-efficient. We further show that ILF can be seen as a form of minimizing the KL divergence to the target distribution, and we demonstrate proofs-of-concept on text summarization and program synthesis tasks. For code generation, ILF improves a Codegen-Mono 6.1B model's pass@1 rate by 38% relative (10% absolute) on the Mostly Basic Python Problems (MBPP) benchmark, outperforming both fine-tuning on MBPP and fine-tuning on repaired programs written by humans. For summarization, we show that ILF can be combined with learning from human preferences to improve a GPT-3 model's summarization performance to be comparable to human quality, outperforming fine-tuning on human-written summaries. Overall, our results suggest that learning from human-written natural language feedback is both more effective and more sample-efficient than training exclusively on demonstrations for improving an LLM's performance on a variety of tasks.
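
To make the abstract's KL claim concrete, here is a minimal sketch, assuming a target distribution p* that reweights the initial model toward high-quality outputs via a quality indicator f. These symbols are our illustration, not notation quoted from the paper. Since the entropy of p* does not depend on the trainable parameters, fine-tuning on samples drawn from such a target is a Monte Carlo approximation of minimizing the KL divergence to it:

```latex
% Sketch only; f, p^*, and \theta_0 are illustrative symbols, not the paper's notation.
\begin{align}
  p^*(y \mid x) &\propto p_{\theta_0}(y \mid x)\, f(x, y),
    \qquad f(x, y) = \mathbb{1}\!\left[\, y \text{ is a high-quality output for } x \,\right] \\
  \operatorname*{arg\,min}_{\theta} \, \mathrm{KL}\!\left( p^* \,\middle\|\, p_\theta \right)
    &= \operatorname*{arg\,min}_{\theta} \,
       \mathbb{E}_{y \sim p^*(\cdot \mid x)}\!\left[ -\log p_\theta(y \mid x) \right]
\end{align}
```

The right-hand side is the familiar cross-entropy fine-tuning loss, evaluated on (approximate) samples from the target, which here would be the feedback-guided refinements.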
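The training loop itself can also be sketched in a few lines. The sketch below assumes the sample → human feedback → refine → filter → fine-tune pipeline implied by the abstract; every callable is a hypothetical placeholder supplied by the caller, and none of these names come from the paper or its released code.

```python
from typing import Callable, List, Tuple

def ilf_round(
    prompts: List[str],
    sample: Callable[[str], str],                  # draw an output from the current model
    is_correct: Callable[[str, str], bool],        # e.g., run the task's unit tests (MBPP)
    get_feedback: Callable[[str, str], str],       # a human writes language feedback
    refine: Callable[[str, str, str], List[str]],  # model refines output given the feedback
    finetune: Callable[[List[Tuple[str, str]]], None],
) -> None:
    """One round of the ILF loop sketched above (hypothetical interface)."""
    pairs: List[Tuple[str, str]] = []
    for prompt in prompts:
        y = sample(prompt)
        if is_correct(prompt, y):
            continue  # only flawed outputs receive feedback
        fb = get_feedback(prompt, y)  # feedback is needed at training time only
        for r in refine(prompt, y, fb):
            if is_correct(prompt, r):  # keep refinements that actually fix the issue
                pairs.append((prompt, r))
    # Supervised fine-tuning on the filtered (prompt, refinement) pairs; at test
    # time the updated model is used directly, with no feedback in the loop.
    finetune(pairs)
```

Filtering before fine-tuning is what connects this loop to the KL sketch: the kept refinements play the role of approximate samples from the reweighted target distribution.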
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
- New experiments in Appendix A.4
- New "Broader Impacts" section in Appendix D
- Various edits to wording and mathematical details, as suggested by reviewers
- Changed which method we consider the baseline (fine-tuning on ground-truth MBPP data instead of zero-shot)
- Changed the format in which we report results (x% -> y% instead of z% increase)
Code:
- Code + data for the code generation experiments: https://github.com/nyu-mll/ILF-for-code-generation
- Data for the text summarization experiments: https://huggingface.co/datasets/JeremyAlain/SLF5K
- Code for the text summarization experiments: https://github.com/JeremyAlain/imitation_learning_from_language_feedback
Assigned Action Editor: ~Alessandro_Sordoni1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1662