LEGO: Language Model Building Blocks

ACL ARR 2024 June Submission 1927 Authors

15 Jun 2024 (modified: 03 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Large language models (LLMs) are essential in natural language processing (NLP), but they are costly to fine-tune and serve, and their training often relies on invasive data collection. Task-specific small language models (SLMs) offer a cheaper alternative but lack robustness and generalization. This paper proposes a novel technique for combining SLMs to construct a robust, general LLM. Using state-of-the-art LLM pruning strategies, we create task- and user-specific SLM building blocks that are efficient for fine-tuning and inference while also preserving user data privacy. Utilizing Federated Learning and a novel aggregation scheme, we compile an LLM from distributed SLMs, maintaining robustness without high costs and preserving user data privacy.
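For illustration only, the sketch below shows one plausible reading of the aggregation step summarized in the abstract: each client holds a pruned, task-specific sub-model (weights plus a binary pruning mask over a shared parameter shape), and a server averages each parameter over the clients that retained it. The paper's actual aggregation scheme is not given here, so this is a hedged sketch under that assumption; all function and variable names are hypothetical.

```python
# Illustrative sketch only -- NOT the authors' actual aggregation scheme.
# Assumption: each client contributes a pruned SLM as (weights, binary mask)
# over a shared parameter shape; the server averages each parameter over the
# clients whose pruning mask kept it, yielding one dense "compiled" tensor.
import numpy as np


def aggregate_pruned_slms(client_weights, client_masks):
    """Mask-aware federated averaging of pruned sub-models.

    client_weights: list of np.ndarray, all with the same shape
    client_masks:   list of binary np.ndarray (1 = parameter kept after pruning)
    Returns a dense parameter tensor covering the union of kept parameters.
    """
    weight_sum = np.zeros_like(client_weights[0], dtype=np.float64)
    mask_sum = np.zeros_like(client_masks[0], dtype=np.float64)
    for w, m in zip(client_weights, client_masks):
        weight_sum += w * m   # only parameters kept by this client's pruning contribute
        mask_sum += m         # count how many clients contribute to each parameter
    # Average over contributing clients; parameters pruned by every client stay zero.
    return np.divide(weight_sum, mask_sum,
                     out=np.zeros_like(weight_sum), where=mask_sum > 0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (4, 4)
    weights = [rng.normal(size=shape) for _ in range(3)]
    masks = [(rng.random(shape) > 0.5).astype(np.float64) for _ in range(3)]
    print(aggregate_pruned_slms(weights, masks))
```

In this toy version, only raw weights are sent to the server; a privacy-preserving variant as described in the abstract would additionally rely on standard Federated Learning safeguards (e.g., keeping user data on-device and exchanging only model updates).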
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: pruning, parameter-efficient-training, NLP in resource-constrained settings, Federated Learning, scaling, sparse models, privacy, robustness, fine-tuning, resource-efficient NLP, multi-task learning, transfer learning, domain adaptation
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency
Languages Studied: English
Submission Number: 1927