Keywords: Quality-Diversity, RLHF
Abstract: Diversity plays a significant role in many problems, such as ensemble learning, reinforcement learning, and combinatorial optimization. Although diversity-driven methods have many successful applications in machine learning, most of them require a properly defined behavior space, which is often difficult for humans to specify. In this paper, we formulate the problem of learning a behavior space from human feedback and introduce a general method, Diversity from Human Feedback (DivHF), to solve it. DivHF learns a behavior descriptor function consistent with human preference by querying human feedback. The learned behavior descriptor can be combined with any distance measure to define a diversity measure. We demonstrate the effectiveness of DivHF by integrating it with the Quality-Diversity optimization algorithm MAP-Elites and conducting experiments on the QDax suite. The results show that DivHF learns a behavior space that aligns better with human requirements than direct data-driven approaches do, and that it leads to more diverse solutions under human preference. Our contributions are formulating the problem, proposing the DivHF method, and demonstrating its effectiveness through experiments.
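As a rough illustration of the statement that a behavior descriptor can be combined with any distance measure to define a diversity measure, the sketch below scores a set of solutions by the mean pairwise distance between their descriptors. It is not the DivHF implementation: `toy_descriptor` is a hypothetical hand-written stand-in for a learned descriptor, and `pairwise_diversity` and the choice of mean pairwise distance are assumptions made only for this example.

```python
import math
from itertools import combinations

def pairwise_diversity(solutions, descriptor, distance):
    """Mean pairwise distance between the descriptors of the solutions.

    `descriptor` maps a solution to a point in behavior space and
    `distance` compares two such points; both are supplied by the caller.
    """
    descs = [descriptor(s) for s in solutions]
    pairs = list(combinations(descs, 2))
    if not pairs:
        return 0.0  # fewer than two solutions: no diversity to measure
    return sum(distance(a, b) for a, b in pairs) / len(pairs)

def toy_descriptor(solution):
    # Hypothetical stand-in for a learned descriptor: keep only the first
    # two coordinates, so the third is ignored by the diversity measure.
    return (solution[0], solution[1])

solutions = [
    (0.0, 0.0, 5.0),
    (3.0, 4.0, -1.0),
    (0.0, 0.0, 9.0),
]

# math.dist is the Euclidean distance; any other distance could be passed.
score = pairwise_diversity(solutions, toy_descriptor, math.dist)
print(score)
```

The first and third solutions differ only in the coordinate the toy descriptor discards, so they count as behaviorally identical here. This shows how the choice of descriptor determines which differences the diversity measure rewards.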
Submission Number: 60