Balancing Size and Sustainability: The Role of Compression in Large Language Models

NeurIPS 2023 Workshop CompSust, Submission 28 Authors

04 Oct 2023 (modified: 15 Dec 2023). Submitted to NeurIPS CompSust 2023.
Keywords: LLM, Llama-2, sparsity, distillation, quantization
TL;DR: LLMs like ChatGPT and GPT-4 are revolutionizing AI, but their use has environmental drawbacks. We propose LLM compression, resource optimization, and monitoring to enable sustainable LLM deployment in enterprises.
Abstract: The rise of powerful Large Language Models like ChatGPT and GPT-4 has transformed AI across domains. However, their widespread use comes with significant environmental costs. To address this challenge, we propose a multi-pronged approach, including LLM compression, resource optimization, and active monitoring. This paper focuses on evaluating compression methods for sustainable LLM deployment in enterprise settings.
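To make the compression theme concrete, the sketch below illustrates symmetric 8-bit post-training weight quantization, one of the techniques named in the keywords. This is a minimal, hypothetical illustration in NumPy, not the paper's actual evaluation pipeline; the function names and the toy weight matrix are assumptions.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 with a single per-tensor scale.

    Illustrative sketch only; real LLM quantization schemes use per-channel
    or per-group scales and calibration data.
    """
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

# Toy example: storage drops from 4 bytes to 1 byte per weight, while the
# per-weight reconstruction error stays within half a quantization step.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))
print(q.dtype, max_err <= scale / 2 + 1e-6)
```

The 4x memory reduction shown here is the kind of trade-off the paper's evaluation of compression methods targets: smaller footprints lower serving energy at some cost in numerical fidelity.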
Submission Number: 28