Uncertainty-Guided Optimization on Large Language Model Search Trees

Published: 27 May 2024, Last Modified: 27 May 2024 · AABI 2024 · CC BY 4.0
Keywords: LLM, tree search, Bayesian optimization
TL;DR: We propose a Bayesian-Optimization-like tree search for token sequence generation with LLMs.
Abstract: Beam search is the standard tree search algorithm for finding maximum-likelihood sequences, for example, in the decoding processes of large language models. However, the algorithm is myopic---it does not take the whole path from the root to a leaf into account. Moreover, it is agnostic to prior knowledge available about the process: it does not consider that the objective being maximized is a likelihood and therefore has specific properties, such as boundedness in the unit interval. Taking a probabilistic approach, we define a Dirichlet prior over the transition probabilities and obtain a posterior distribution over the most promising paths in each iteration. These distributions let us define a non-myopic, Bayesian-optimization-like acquisition function that enables a more data-efficient exploration scheme than standard beam search. We discuss how to select the prior and demonstrate in on- and off-model experiments with large language models that the method achieves as high a likelihood as beam search with far fewer node expansions, avoiding excessive (costly) LLM forward passes.
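To make the high-level idea in the abstract concrete, the following is a minimal, hypothetical sketch of a Bayesian-optimization-flavoured node-selection step: a Dirichlet distribution over a node's transition probabilities, sampled Thompson-style to score children optimistically instead of expanding the myopically most likely token. All names, the prior parameterization, and the acquisition rule here are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def select_child(prefix_loglik, child_probs, alpha=1.0, n_draws=32, rng=None):
    """Illustrative sketch (not the paper's exact acquisition function):
    draw transition probabilities from a Dirichlet centred on the LLM's
    softmax output and pick the child with the best optimistic score."""
    rng = rng or np.random.default_rng(0)
    child_probs = np.asarray(child_probs, dtype=float)
    # Dirichlet concentration: a flat prior `alpha` plus pseudo-counts
    # proportional to the model's probabilities (a hypothetical choice).
    concentration = alpha + 10.0 * alpha * child_probs
    draws = rng.dirichlet(concentration, size=n_draws)
    # Non-myopic score: log-likelihood of the whole prefix plus the
    # sampled log transition probability of each child.
    scores = prefix_loglik + np.log(draws)
    optimistic = scores.max(axis=0)  # best sampled score per child
    return int(np.argmax(optimistic))

idx = select_child(prefix_loglik=-5.0, child_probs=[0.9, 0.05, 0.05])
```

Because the scores are sampled from a posterior-like distribution rather than taken at face value, low-probability children occasionally win a draw, giving the exploration behaviour that plain beam search lacks.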
Submission Number: 19