Aligning LLMs by Predicting Preferences from User Writing Samples

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: PROSE is a method that uses LLMs to infer user preferences as natural-language descriptions from observations of user writing, then conditions and guides agent behavior on those descriptions.
Abstract: Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs acting as writing agents to infer a description of user preferences. Agent alignment then comes from conditioning on the inferred preference description. However, existing methods often produce generic preference descriptions that fail to capture the unique and individualized nature of human preferences. This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. PROSE incorporates two key elements: (1) iterative refinement of inferred preferences, and (2) verification of inferred preferences across multiple user writing samples. We evaluate PROSE with several LLMs (i.e., Qwen2.5 7B and 72B Instruct, GPT-mini, and GPT-4o) on a summarization and an email writing task. We find that PROSE more accurately infers nuanced human preferences, improving the quality of the writing agent's generations over CIPHER (a state-of-the-art method for inferring preferences) by 33%. Lastly, we demonstrate that in-context learning (ICL) and PROSE are complementary methods, and combining them provides up to a 9% improvement over ICL alone. Code: https://github.com/apple/ml-predict
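The two key elements named in the abstract, iterative refinement and cross-sample verification, can be illustrated with a minimal sketch. This is not the authors' implementation (which lives in the linked repository); the helper functions `propose_update` and `consistent` are hypothetical stand-ins for the LLM prompting steps the paper describes.

```python
def propose_update(preference, sample):
    """Stand-in for an LLM call that refines the preference
    description against one user writing sample."""
    return preference + [f"matches style of: {sample[:20]}"]

def consistent(preference_item, samples):
    """Stand-in for an LLM verification call: keep a preference item
    only if it holds across multiple user writing samples."""
    return all(True for _ in samples)  # placeholder check

def infer_preferences(samples, n_rounds=3):
    """Sketch of the PROSE loop: (1) iteratively refine a natural-
    language preference description, (2) verify each inferred item
    against all writing samples and drop unsupported ones."""
    preference = []  # preference description as a list of items
    for _ in range(n_rounds):            # (1) iterative refinement
        for sample in samples:
            preference = propose_update(preference, sample)
    # (2) cross-sample verification
    return [p for p in preference if consistent(p, samples)]

print(infer_preferences(["Dear team, quick update...", "Hi all, see below..."]))
```

In the paper's setting, the refined and verified description would then be placed in the writing agent's prompt so that its generations are conditioned on the inferred preferences.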
Lay Summary: Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. While recent methods have shown they can infer user preferences from writing samples, they often generate generic descriptions that miss individual nuances. We introduce PROSE, a method that enhances the precision of these inferred preferences through iterative refinement and cross-sample consistency verification. Tested on summarization and email writing tasks, PROSE improves generation quality by 33% over prior methods. It also complements in-context learning, yielding up to a 9% additional improvement. Code: https://github.com/apple/ml-predict
Link To Code: github.com/apple/ml-predict
Primary Area: Deep Learning->Large Language Models
Keywords: NLP, LLM, preference learning, user demonstrations, benchmark
Submission Number: 11193