Learning User Preferences using Score-Based Diffusion Model

10 Jan 2024 (modified: 23 Feb 2024), PKU 2023 Fall CoRe Submission, CC BY 4.0
Keywords: diffusion model, object arrangement, utility
Abstract: Learning user preferences is an essential capability for robots that assist people with personalized daily tasks. Utility functions are commonly used to model user preferences, but learning them is challenging because human preferences are complex, utility is context-dependent, and demonstrations provide incomplete or inconsistent information. In this work, we propose a learning framework based on a score-based diffusion model that learns utility functions from user demonstrations and generates instances according to the learned utility functions. We evaluate the model's utility learning and generalization abilities on object arrangement tasks, where it achieves significant improvement over existing methods. These results show that the score-based diffusion model can learn user preferences modeled by utility functions and can adapt to unseen environment changes.
Supplementary Material: zip
Submission Number: 234