Carbon- and System-Aware LoRA Scaling for On-Device LLMs via Hierarchical Multi-Objective Reinforcement Learning

Published: 24 Sept 2025, Last Modified: 24 Sept 2025, NeurIPS 2025 LLM Evaluation Workshop Poster, CC BY 4.0
Keywords: Sustainable AI; Carbon-Aware; LoRA; On-Device; LLM; Multi-Objective Reinforcement Learning
TL;DR: We introduce a hierarchical multi-objective reinforcement learning approach for dynamic Low-Rank Adaptation (LoRA) scaling that optimizes carbon and energy efficiency while maintaining acceptable performance and system budgets for on-device LMs.
Abstract: On-device deployment of large and small language models (LLMs/SLMs) faces critical challenges in balancing performance, energy consumption, and carbon footprint across diverse mobile and wearable devices. We introduce a hierarchical multi-objective reinforcement learning approach for dynamic Low-Rank Adaptation (LoRA) scaling that optimizes carbon efficiency as the primary objective while maintaining acceptable performance and energy consumption. Our method employs Proximal Policy Optimization (PPO) with a carbon-first reward function that prioritizes carbon efficiency (inferences per mg CO$_2$) over traditional energy efficiency (inferences per Joule). On smartwatches, AR glasses, VR headsets, and tablets using DistilGPT2, OPT-125M, DialoGPT-Small, and GPT-2, our approach achieves up to 35.1 inf/J and 0.412 perf/mg CO$_2$, demonstrating the effectiveness of carbon-aware optimization for edge AI systems.
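As a concrete illustration of the carbon-first reward described in the abstract, the sketch below scalarizes carbon efficiency (inferences per mg CO$_2$) as the primary term and energy efficiency (inferences per Joule) as a secondary term, with a penalty when task quality drops below an acceptable-performance floor. This is a minimal sketch under assumed weights, thresholds, and signal names (e.g. `carbon_weight`, `quality_floor`), not the paper's exact formulation.

```python
# Hypothetical carbon-first reward for PPO-driven LoRA rank scaling.
# Weights, the quality floor, and all signal names are illustrative
# assumptions, not the authors' published reward function.

def carbon_first_reward(
    inferences: float,            # inferences completed in this step
    co2_mg: float,                # operational carbon for the step, in mg CO2
    energy_j: float,              # energy drawn for the step, in Joules
    quality: float,               # normalized task quality in [0, 1]
    quality_floor: float = 0.8,   # assumed acceptable-performance threshold
    carbon_weight: float = 1.0,   # primary objective weight
    energy_weight: float = 0.3,   # secondary objective weight
) -> float:
    carbon_eff = inferences / max(co2_mg, 1e-9)    # inferences per mg CO2
    energy_eff = inferences / max(energy_j, 1e-9)  # inferences per Joule
    reward = carbon_weight * carbon_eff + energy_weight * energy_eff
    # Penalize falling below the acceptable-performance budget.
    if quality < quality_floor:
        reward -= 10.0 * (quality_floor - quality)
    return reward


# Example: one decision step on a wearable-class device (illustrative values).
print(carbon_first_reward(inferences=12, co2_mg=30.0, energy_j=0.5, quality=0.86))
```

The key design choice this reflects is the ordering of objectives: carbon efficiency carries the dominant weight, energy efficiency is secondary, and task performance enters only as a constraint-style penalty rather than a maximized term.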
Submission Number: 41