TextBO: Bayesian Optimization in Language Space for Eval-Efficient Self-Improving AI

Published: 05 Mar 2026, Last Modified: 05 Mar 2026, ICLR 2026 Workshop RSI Poster, CC BY 4.0
Keywords: Evaluation efficiency, Bayesian optimization
TL;DR: We propose TextBO, a provably evaluation-efficient, prompt-optimization-based self-improving AI algorithm that emulates Bayesian Optimization in language space.
Abstract: Large Language Models (LLMs) have enabled self-improving AI systems that iteratively generate, evaluate, and refine their outputs. Recent studies show that prompt-optimization-based self-improvement can outperform state-of-the-art reinforcement-learning fine-tuning of LLMs, but performance is typically measured by \emph{generation} efficiency. However, in many applications the binding constraint is \emph{evaluation} efficiency: obtaining reliable feedback is far more costly than generating candidates. In this paper, we propose \textsc{TextBO}, a self-improving algorithm that achieves evaluation efficiency by provably emulating gradient-based UCB-BO in language space. We empirically validate \textsc{TextBO} on automated ad-alignment and agentic AI tasks, demonstrating superior performance per evaluation compared to \textsc{GEPA}. We also evaluate \textsc{TextBO}'s \textsc{Best-of-N} multi-step textual-gradient mechanism on agentic AI benchmarks by augmenting \textsc{GEPA} with it, and show that it significantly outperforms standard \textsc{GEPA}. For access to the full paper, refer to \href{https://arxiv.org/abs/2511.12063}{https://arxiv.org/abs/2511.12063}.
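As a hedged illustration of the acquisition rule the abstract refers to (standard GP-UCB, not necessarily the paper's exact formulation), the next candidate maximizes an upper confidence bound on a surrogate model; here $\mu_t$, $\sigma_t$, and $\beta_t$ denote the posterior mean, posterior standard deviation, and exploration coefficient, and are introduced only for illustration:
\[
x_{t+1} \in \arg\max_{x} \; \mu_t(x) + \sqrt{\beta_t}\,\sigma_t(x).
\]
In \textsc{TextBO}, candidate prompts play the role of $x$ and textual gradients stand in for the gradient-based ascent on this acquisition surface; see the linked arXiv paper for the precise algorithm and guarantees.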
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 30