FedMOPA: Federated Multi-Objective Preference Alignment for Large Language Models

ACL ARR 2026 January Submission 555 Authors

23 Dec 2025 (modified: 20 Mar 2026) · CC BY 4.0
Keywords: Federated Learning, LLM Alignment
Abstract: Aligning Large Language Models (LLMs) with diverse and often conflicting human preferences is a critical challenge, particularly when preference data is distributed across privacy-sensitive silos. In this paper, we propose FedMOPA, a novel framework that integrates federated learning with multi-objective optimization to align LLMs with heterogeneous user preferences while preserving data privacy. Our approach introduces a unified, preference-conditioned model that can dynamically adapt to varying trade-offs among client preferences at inference time, eliminating the need for retraining. To address the communication overhead associated with fine-tuning LLMs in a federated setting, we propose TriLoRA, a novel conditional LoRA variant that efficiently incorporates preference information into low-rank updates. Furthermore, we design an alternating optimization strategy to mitigate aggregation errors inherent in the federated averaging of multiplicative parameters. We provide theoretical guarantees for the convergence of FedMOPA and its ability to achieve the Pareto front under certain conditions. Extensive experiments on real-world datasets, including safety and helpfulness alignment, demonstrate the effectiveness of our method. Our code is available at http://anonymous.4open.science/r/FedMOPA-555.
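The abstract does not specify how TriLoRA conditions the low-rank update on preference information, but the general idea of a preference-conditioned LoRA can be sketched as follows. This is a minimal illustration under assumed design choices (a preference vector gating the rank dimension of the update), not the authors' actual TriLoRA construction; all names (`G`, `forward`, `w_pref`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank, n_obj = 8, 8, 4, 2

# Hypothetical components: frozen base weight plus a low-rank update
# whose rank-wise gates are produced from a preference vector over
# objectives (e.g., safety vs. helpfulness).
W0 = rng.standard_normal((d_out, d_in)) * 0.1  # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.1    # LoRA down-projection
B = rng.standard_normal((d_out, rank)) * 0.1   # LoRA up-projection
G = rng.standard_normal((rank, n_obj)) * 0.1   # maps preference -> per-rank gates

def forward(x, w_pref):
    """Apply the base weight plus a preference-gated low-rank update."""
    gate = G @ w_pref                # per-rank gates conditioned on the preference
    delta = B @ (np.diag(gate) @ A)  # gated low-rank update, rank x rank diagonal
    return (W0 + delta) @ x

x = rng.standard_normal(d_in)
y_safe = forward(x, np.array([1.0, 0.0]))  # all weight on objective 1
y_help = forward(x, np.array([0.0, 1.0]))  # all weight on objective 2
```

Because the preference vector enters only through the gates, a single set of adapter weights can realize different trade-offs at inference time without retraining, which matches the abstract's claim of dynamic adaptation.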
Paper Type: Long
Research Area: Safety and Alignment in LLMs
Research Area Keywords: Language Modeling
Languages Studied: English
Submission Number: 555