LLM-Box : An Agentic Framework for Guided Black-Box Optimization in Mapping LLMs onto Specialized Hardware Accelerators
Keywords: Large Language Models, Design Space Exploration, Black-Box Optimization, Retrieval-Augmented Generation, Transfer Learning
TL;DR: LLM-Box is an LLM-guided black-box optimization framework that efficiently explores LLM-to-accelerator mapping strategies. It achieves near-optimal Pareto fronts with 40-150x fewer simulations and strong zero-shot transfer across models and accelerators.
Abstract: Identifying efficient execution strategies for Large Language Models (LLMs) on specialized hardware accelerators requires exploring a vast design space where exhaustive search is computationally prohibitive. Traditional black-box optimization (BBO) methods offer a principled alternative, but their efficiency degrades in high-dimensional, sparse spaces with many infeasible points. We propose LLM-Box, a framework that integrates an LLM agent to guide multi-objective BBO toward the Pareto frontier while significantly reducing sampling of infeasible points. By leveraging the LLM agent to retrieve and structure prior exploration data through retrieval-augmented generation (RAG), and by warm-starting and filtering BBO suggestions, our approach steers the search toward feasible and promising regions of the design space. As a result, LLM-Box identifies Pareto-optimal configurations with a hypervolume difference of less than 3\% while using $40{-}150\times$ fewer simulations than an exhaustive search, and it achieves 2\% better accuracy than a well-known BBO tool with $20\times$ fewer trials. Moreover, the framework demonstrates zero-shot generalization, transferring knowledge from prior models and hardware to unseen targets.
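The abstract describes two mechanisms, warm-starting the optimizer from prior exploration data and filtering out infeasible suggestions before simulation, that together reduce wasted evaluations. The following is a minimal illustrative sketch of that loop; the toy objective, the feasibility predicate, and the random proposal generator are placeholders standing in for the paper's actual simulator, hardware constraints, and LLM-guided BBO suggestions.

```python
import random

def simulate(cfg):
    """Toy two-objective cost model: (latency, energy) for a mapping config.
    Stands in for the expensive accelerator simulation in the paper."""
    tile, unroll = cfg
    return (tile * 2 + unroll, 100 / (tile + unroll))

def feasible(cfg):
    """Toy hardware constraint: the mapping must fit a fixed buffer budget.
    A cheap check like this lets infeasible points be skipped pre-simulation."""
    tile, unroll = cfg
    return tile * unroll <= 64

def pareto_front(points):
    """Keep non-dominated points, minimizing both objectives."""
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p
                       for q in points)]

def optimize(prior_configs, budget, seed=0):
    """Warm-started, feasibility-filtered search over mapping configs."""
    rng = random.Random(seed)
    evaluated = []
    # Warm start: replay prior feasible configurations first.
    for cfg in prior_configs:
        if budget > 0 and feasible(cfg):
            evaluated.append(simulate(cfg))
            budget -= 1
    # Guided search: propose candidates and filter out infeasible ones
    # before spending a simulation (random here; LLM-filtered BBO in the paper).
    while budget > 0:
        cfg = (rng.randint(1, 16), rng.randint(1, 16))
        if feasible(cfg):
            evaluated.append(simulate(cfg))
            budget -= 1
    return pareto_front(evaluated)
```

Because every proposal passes the feasibility check before simulation, the full evaluation budget is spent only on valid mappings, which mirrors how the framework avoids sampling infeasible points.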
Submission Number: 50