Keywords: planning under uncertainty, object search, LLM-informed planning, prompt selection
TL;DR: We enable fast deployment-time selection of the best-performing prompts and LLMs for LLM-informed object search in partially known environments.
Abstract: We present an approach for deployment-time selection of the best-performing prompts and LLMs for LLM-informed object search in partially known environments. Building on recent progress in LLM-informed model-based planning and deployment-time behavior selection, we enable fast bandit-like selection among prompts and LLMs and demonstrate improved deployment-time performance on object search tasks. Experiments in simulated ProcTHOR household environments show that our bandit-like selection approach achieves 6.1% lower average cost and 40.6% lower average cumulative regret than baseline UCB bandit selection.
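For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of UCB1-style bandit selection adapted to cost minimization. It illustrates only the generic baseline, not the selection approach proposed in this submission. All names (`select_arm`, `update`, `run_bandit`, the exploration weight `c`) are hypothetical, and each "arm" is assumed to stand for one prompt/LLM pairing whose cost function runs a single object search trial and returns its cost.

```python
import math


def select_arm(counts, mean_costs, c=1.0):
    """Pick the next arm (assumed: one prompt/LLM pairing) to deploy.

    Lower cost is better, so the optimistic score is a lower
    confidence bound: mean cost minus an exploration bonus.
    """
    # Try every arm once before using confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        mean - c * math.sqrt(2.0 * math.log(total) / n)
        for mean, n in zip(mean_costs, counts)
    ]
    return min(range(len(scores)), key=scores.__getitem__)


def update(counts, mean_costs, arm, cost):
    """Record one observed trial cost with an incremental mean."""
    counts[arm] += 1
    mean_costs[arm] += (cost - mean_costs[arm]) / counts[arm]


def run_bandit(cost_fns, trials, c=1.0):
    """Run `trials` selections; cost_fns[i]() returns arm i's trial cost."""
    counts = [0] * len(cost_fns)
    mean_costs = [0.0] * len(cost_fns)
    for _ in range(trials):
        arm = select_arm(counts, mean_costs, c)
        update(counts, mean_costs, arm, cost_fns[arm]())
    return counts, mean_costs
```

In this sketch, costs are assumed to be scaled to a comparable range (for example, normalized to [0, 1]); otherwise the exploration weight `c` would need to be tuned to the cost scale.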
Submission Number: 5