Aligning Language Models with Preferences through f-divergence Minimization

Dongyoung Go, Tomasz Korbak, Germán Kruszewski, Jos Rozen, Nahyeon Ryu, Marc Dymetman

Published: 2023, Last Modified: 26 Mar 2024, ICML 2023, Readers: Everyone

Abstract: Aligning language models with preferences can be posed as approximating a target distribution representing some desired behavior. Existing approaches differ both in the functional form of the targe...
