Track: Regular papers (within 8 pages excluding appendix)
Keywords: Feature Engineering, Large Language Models, Cloud Resource Demand Forecasting, Reinforcement Learning
TL;DR: To improve GPU demand forecasting for bursty foundation-model workloads in enterprise clouds, we present Eureka, an LLM-driven agentic framework that automates feature engineering via an RL feedback loop.
Abstract: The rapid growth of foundation models (LLMs, VLMs) has sharply increased enterprise cloud AI demand, where GPU resources must be dynamically allocated to meet evolving workloads. Accurate demand prediction is critical for efficient and reliable real-world deployment, yet traditional forecasting systems struggle due to sparse historical data and highly volatile workload behaviors. We present **Eureka**, an *LLM-driven agentic framework* that automates feature engineering. Our approach has three main components: a domain knowledge-driven *Expert Agent* that encodes cloud resource expertise to evaluate feature quality, an *Automated Feature Generator* that explores new feature spaces, and a *Reinforcement Learning (RL) Feedback Loop* that connects the two components and enables continuous learning. Deployed and evaluated on real-world cloud provider datasets, Eureka improves the demand fulfillment rate by 16\% and reduces computing resource migration rates by 33\%. This work introduces a novel intelligent system for cloud resource prediction and AI supply chain management, advancing the efficiency, scalability, and deployability of foundation models in production environments.
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 5