CLAMP: A Chebyshev-Weighted Multi-Gradient Approach for Multi-Objective LLM Alignment

ICLR 2026 Conference Submission 15181 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: LLM Alignment, Multi-Objective Alignment, Chebyshev-Weighted Multi-Gradient Approach
Abstract: Alignment is crucial for ensuring that large language models (LLMs) behave in accordance with human preferences. To date, many existing alignment approaches, including reinforcement learning from human feedback (RLHF) and RL-free methods such as direct preference optimization (DPO), assume homogeneous human preferences. In practice, however, human preferences are inherently heterogeneous and even conflicting, rendering traditional LLM alignment techniques inadequate. To accommodate this diversity, multi-objective alignment (MOA) methods have been developed. Yet, most of them rely on simple heuristics to handle conflicting objectives, and hence struggle to efficiently explore the full Pareto front and to cope with non-convex LLM alignment objective landscapes. Although other alignment techniques attempt to address these issues, they still depend heavily on reinforcement learning (RL) or pre-trained reward models, resulting in computational inefficiency and susceptibility to reward-model-induced biases. In this work, we propose CLAMP (**C**hebyshev-weighted **L**LM **a**lignment with **m**ulti-objective **p**references), a new multi-objective alignment algorithmic framework that is both RL-free and reward-model-free. Our method integrates Chebyshev-weighted scalarization with a multi-gradient descent algorithm, efficiently finding Pareto-stationary solutions and effectively capturing diverse human preference trade-offs. We theoretically establish a finite-time convergence rate guarantee for CLAMP that is independent of the number of alignment objectives. Experimental results further validate the effectiveness of CLAMP in aligning LLMs to heterogeneous human preferences, significantly improving over previous methods.
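To make the core idea concrete, below is a minimal toy sketch (not the paper's implementation) of how Chebyshev-weighted scalarization can be combined with an MGDA-style common-descent step for two objectives. The quadratic losses standing in for per-preference alignment objectives, the preference weights `w`, the ideal point `z_star`, and the closed-form two-objective `alpha` are all illustrative assumptions.

```python
# Toy sketch: Chebyshev-weighted scalarization + two-objective
# multi-gradient descent (MGDA-style). Not the authors' code.
import torch

theta = torch.zeros(2, requires_grad=True)

# Two conflicting quadratic losses standing in for per-preference
# alignment objectives (illustrative only).
def f1(th): return ((th - torch.tensor([1.0, 0.0])) ** 2).sum()
def f2(th): return ((th - torch.tensor([0.0, 1.0])) ** 2).sum()

w = torch.tensor([0.7, 0.3])        # assumed preference weights
z_star = torch.tensor([0.0, 0.0])   # assumed ideal (utopia) point

for _ in range(200):
    # Chebyshev-weighted per-objective terms: w_i * (f_i(theta) - z_i*).
    g1 = torch.autograd.grad(w[0] * (f1(theta) - z_star[0]), theta)[0]
    g2 = torch.autograd.grad(w[1] * (f2(theta) - z_star[1]), theta)[0]

    # MGDA for two objectives admits a closed form: alpha minimizes
    # ||alpha*g1 + (1 - alpha)*g2||^2 over alpha in [0, 1].
    alpha = ((g2 - g1) @ g2 / ((g1 - g2) @ (g1 - g2) + 1e-12)).clamp(0.0, 1.0)
    d = alpha * g1 + (1 - alpha) * g2  # common-descent direction

    with torch.no_grad():
        theta -= 0.1 * d
```

For more than two objectives, the MGDA subproblem becomes a quadratic program over the probability simplex; the closed-form `alpha` above is specific to the two-objective case.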
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 15181