OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning

Published: 04 Mar 2024 · Last Modified: 02 May 2024 · DPFM 2024 Poster · CC BY 4.0
Keywords: Large Language Models, Federated Learning, Instruction Tuning, Value Alignment
TL;DR: We introduce an integrated and concise framework for training LLMs via federated learning, provide a comprehensive empirical study, and point out future directions.
Abstract: Trained on massive publicly available data, large language models (LLMs) have demonstrated tremendous success across various fields. While more data contributes to better performance, a disconcerting reality is that high-quality public data will be exhausted in a few years. In this paper, we offer a potential next step for contemporary LLMs: collaborative and privacy-preserving LLM training on underutilized distributed private data via federated learning (FL), where multiple data owners collaboratively train a shared model without transmitting raw data. To achieve this, we build a concise, integrated, and research-friendly framework/codebase, named OpenFedLLM. It covers federated instruction tuning for enhancing instruction-following capability, federated value alignment for aligning with human values, and 7 representative FL algorithms. In addition, OpenFedLLM supports training on diverse domains, covering 8 training datasets, and provides comprehensive evaluations with 30+ metrics. Through extensive experiments, we observe that all FL algorithms outperform local training when training LLMs, yielding clear performance improvements across a variety of settings. Notably, on a financial benchmark, Llama2-7B fine-tuned with any FL algorithm outperforms GPT-4 by a significant margin, whereas the model obtained through individual training does not, giving clients a strong incentive to participate in FL. Code is available at https://github.com/rui-ye/OpenFedLLM.
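
To make the federated setup concrete, below is a minimal FedAvg-style sketch in PyTorch: each client fine-tunes a copy of the shared weights on its own private data, and the server averages the resulting parameters each round without ever seeing raw data. The tiny linear model, synthetic client datasets, and hyperparameters are placeholders for illustration only; this is not the OpenFedLLM API.

```python
# Minimal FedAvg sketch (illustrative; not the OpenFedLLM codebase).
# A small linear model stands in for an LLM; each "client" holds private data
# and the server aggregates client updates each communication round.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical private datasets: 3 clients, each with its own (x, y) pairs.
clients = [(torch.randn(32, 8), torch.randn(32, 1)) for _ in range(3)]

global_model = nn.Linear(8, 1)

def local_update(global_state, data, lr=0.01, steps=5):
    """One client's local fine-tuning starting from the shared global weights."""
    model = nn.Linear(8, 1)
    model.load_state_dict(global_state)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    x, y = data
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    # Only model parameters (not raw data) leave the client.
    return model.state_dict(), len(x)

for rnd in range(10):  # communication rounds
    global_state = copy.deepcopy(global_model.state_dict())
    updates = [local_update(global_state, data) for data in clients]

    # FedAvg: weight each client's parameters by its local sample count.
    total = sum(n for _, n in updates)
    new_state = {
        key: sum(state[key] * (n / total) for state, n in updates)
        for key in global_state
    }
    global_model.load_state_dict(new_state)
```

In practice, frameworks in this space typically replace the toy model with a pretrained LLM (often with parameter-efficient adapters so only small weight deltas are communicated) and swap in other aggregation rules for the averaging step.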
Submission Number: 62