Keywords: LLM, Llama-2, sparsity, distillation, quantization
TL;DR: LLMs like ChatGPT and GPT-4 are revolutionizing AI, but their use carries significant environmental costs. We propose LLM compression, resource optimization, and active monitoring to enable sustainable LLM deployment in enterprises.
Abstract: The rise of powerful Large Language Models like ChatGPT and GPT-4 has transformed AI across domains. However, their widespread use comes with significant environmental costs. To address this challenge, we propose a multi-pronged approach, including LLM compression, resource optimization, and active monitoring. This paper focuses on evaluating compression methods for sustainable LLM deployment in enterprise settings.
Submission Number: 28