Keywords: LLM, Llama-2, sparsity, distillation, quantization
TL;DR: LLMs like ChatGPT and GPT-4 are revolutionizing AI, but their use carries significant environmental costs. We propose LLM compression, resource optimization, and active monitoring to enable sustainable LLM deployment in enterprises.
Abstract: The rise of powerful Large Language Models like ChatGPT and GPT-4 has transformed AI across domains. However, their widespread use comes with significant environmental costs. To address this challenge, we propose a multi-pronged approach, including LLM compression, resource optimization, and active monitoring. This paper focuses on evaluating compression methods for sustainable LLM deployment in enterprise settings.
Submission Number: 28