Learning to Plan with Personalized Preferences

21 Jan 2025 (modified: 18 Jun 2025) · Submitted to ICML 2025 · CC BY 4.0
TL;DR: We develop agents that learn human preferences from a few demonstrations and adapt their planning strategies to these preferences across multiple tasks.
Abstract: Effective integration of AI agents into daily life requires them to understand and adapt to individual human preferences, particularly in collaborative roles. Although recent studies on embodied intelligence have advanced significantly, they typically adopt generalized approaches that overlook personal preferences in planning. We address this limitation by developing agents that not only learn preferences from a few demonstrations but also learn to adapt their planning strategies based on these preferences. Our research leverages the observation that preferences, though implicitly expressed through minimal demonstrations, can generalize across diverse planning scenarios. To systematically evaluate this hypothesis, we introduce PbP, an embodied benchmark featuring hundreds of diverse preferences spanning from atomic actions to complex action sequences. Our evaluation of SOTA methods reveals that while symbol-based approaches show promise in scalability, significant challenges remain in learning to generate and execute plans that satisfy personalized preferences. We further demonstrate that incorporating learned preferences as intermediate representations in planning significantly improves the agent's ability to construct personalized plans. These findings establish preferences as a valuable abstraction layer for adaptive planning, opening new directions for research in preference-guided plan generation and execution.
Primary Area: General Machine Learning->Transfer, Multitask and Meta-learning
Keywords: preference, few-shot learning, planning
Submission Number: 4645