PersonalLLM: Tailoring LLMs to Individual Preferences

Published: 10 Oct 2024, Last Modified: 15 Nov 2024 · Pluralistic-Alignment 2024 · CC BY 4.0
Keywords: Personalization, LLM, Alignment, benchmark, datasets, reinforcement learning from human feedback, language models, RLHF, preferences
TL;DR: We present a public benchmark, PersonalLLM, focusing on adapting LLMs to provide maximal benefits for a particular user.
Abstract: As LLMs become capable of complex tasks, there is growing potential for personalized interactions tailored to the subtle and idiosyncratic preferences of each user. We present a public benchmark, PersonalLLM, focusing on adapting LLMs to provide maximal benefits for a particular user. Departing from existing alignment benchmarks that implicitly assume uniform preferences, we curate open-ended prompts paired with many high-quality answers over which users would be expected to display heterogeneous latent preferences. Instead of persona-prompting LLMs based on high-level attributes (e.g., user race or response length), which yields preferences that are homogeneous relative to those of real humans, we develop a method that simulates diverse preferences from a set of pre-trained reward models. Our dataset and generated personalities offer an innovative testbed for developing personalization algorithms that grapple with continual data sparsity---little relevant feedback from any particular user---by leveraging historical data from other (similar) users. We explore basic in-context learning and meta-learning baselines to illustrate the utility of PersonalLLM and highlight the need for future methodological development.
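The abstract describes simulating diverse user preferences from a set of pre-trained reward models. Below is a minimal, hypothetical sketch of one way such a simulation could work: each synthetic "user" is a random weighting over reward models, and the user's preferred response is the one with the highest weighted score. The Dirichlet weighting, the `RewardModel` callables, and the toy scoring heuristics are illustrative assumptions, not the paper's exact construction.

```python
# Hypothetical sketch: heterogeneous synthetic users as weighted ensembles
# of pre-trained reward models (an assumed construction, for illustration only).
from typing import Callable, List
import numpy as np

RewardModel = Callable[[str, str], float]  # (prompt, response) -> scalar score


def sample_personal_weights(num_models: int, rng: np.random.Generator) -> np.ndarray:
    """Draw one synthetic 'user' as a point on the simplex over reward models."""
    return rng.dirichlet(alpha=np.ones(num_models))


def preferred_response(
    prompt: str,
    candidates: List[str],
    reward_models: List[RewardModel],
    weights: np.ndarray,
) -> int:
    """Return the index of the candidate this synthetic user prefers."""
    scores = []
    for response in candidates:
        per_model = np.array([rm(prompt, response) for rm in reward_models])
        scores.append(float(weights @ per_model))
    return int(np.argmax(scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-ins for pre-trained reward models (real reward models would be
    # neural scorers; these heuristics only illustrate heterogeneous tastes).
    toy_models: List[RewardModel] = [
        lambda p, r: float(len(r)),           # prefers longer answers
        lambda p, r: -float(len(r)),          # prefers shorter answers
        lambda p, r: float("example" in r),   # prefers answers with examples
    ]
    user = sample_personal_weights(len(toy_models), rng)
    idx = preferred_response(
        "Explain overfitting.",
        ["Overfitting is when a model memorizes noise.",
         "Overfitting, for example, occurs when a model fits training noise."],
        toy_models,
        user,
    )
    print(f"user weights={user.round(2)}, prefers candidate {idx}")
```

Sampling many such weight vectors yields a population of simulated users with genuinely different preference orderings, which is what a personalization benchmark needs in order to test methods that borrow strength from similar users under sparse per-user feedback.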
Submission Number: 34