GFlowNets with Human Feedback

01 Mar 2023 (modified: 31 May 2023) · Submitted to Tiny Papers @ ICLR 2023
Keywords: GFlowNets, Human Feedback
TL;DR: We propose the GFlowNets with Human Feedback framework to improve exploration when training language models.
Abstract: We propose the GFlowNets with Human Feedback (GFlowHF) framework to improve exploration when training language models. For tasks where the reward is unknown, we fit a reward function from human evaluations of different trajectories. The goal of GFlowHF is to learn a policy whose sampling probability is strictly proportional to human ratings, rather than focusing only on the highest-rated outcomes as RLHF does. Experiments show that GFlowHF achieves better exploration than RLHF, and is thus better suited to large-scale language model tasks.
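
The page does not include code; below is a minimal, self-contained sketch (in PyTorch) of the core mechanism the abstract describes: a reward model meant to be fit by regression on human ratings, and a GFlowNet policy trained with the trajectory-balance objective so that, at optimum, sampling probability is proportional to reward. The toy sequence environment, module names, and hyperparameters are all illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy sequence-building environment: length-MAX_LEN token sequences over a
# small vocabulary, generated left to right. Sizes are illustrative only.
VOCAB, MAX_LEN = 8, 6


class RewardModel(nn.Module):
    """Scalar reward head, meant to be fit by regression on human ratings
    of complete trajectories (the 'fit the reward function' step)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(MAX_LEN * VOCAB, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, seq):  # seq: (B, MAX_LEN, VOCAB) one-hot
        return F.softplus(self.net(seq.flatten(1))).squeeze(-1)  # positive reward


class Policy(nn.Module):
    """Forward policy P_F over next tokens, plus a learned log-partition log Z."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(MAX_LEN * VOCAB, 64), nn.ReLU(), nn.Linear(64, VOCAB)
        )
        self.log_z = nn.Parameter(torch.zeros(1))

    def forward(self, state):  # state: (B, MAX_LEN, VOCAB) partial one-hot
        return torch.log_softmax(self.net(state.flatten(1)), dim=-1)


def trajectory_balance_loss(policy, reward_model, batch_size=16):
    """Sample trajectories from P_F and apply the trajectory-balance loss,
    which at optimum makes sampling probability proportional to reward."""
    state = torch.zeros(batch_size, MAX_LEN, VOCAB)
    log_pf = torch.zeros(batch_size)
    for t in range(MAX_LEN):  # append one token per step
        log_probs = policy(state)
        action = torch.distributions.Categorical(logits=log_probs).sample()
        log_pf = log_pf + log_probs.gather(1, action[:, None]).squeeze(1)
        state = state.clone()  # avoid in-place edits on autograd inputs
        state[torch.arange(batch_size), t, action] = 1.0
    # The reward model is trained separately on ratings, so detach it here.
    log_r = torch.log(reward_model(state).clamp_min(1e-6)).detach()
    # TB objective: (log Z + sum_t log P_F - log R)^2; the backward policy
    # P_B is identically 1 because each sequence has a unique build order.
    return ((policy.log_z + log_pf - log_r) ** 2).mean()


policy, rm = Policy(), RewardModel()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for step in range(100):
    opt.zero_grad()
    loss = trajectory_balance_loss(policy, rm)
    loss.backward()
    opt.step()
```

In a full GFlowHF setup, the reward model would presumably be refit as new human ratings arrive, and the policy would be a language model rather than the toy MLP used here; the sketch only illustrates how a proportional-sampling objective differs from the reward-maximizing objective of RLHF.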