Published: 01 Jan 2023, Last Modified: 26 Mar 2024, ICML 2023, Readers: Everyone
Abstract: Aligning language models with preferences can be posed as approximating a target distribution representing some desired behavior. Existing approaches differ both in the functional form of the targe...