Bayesian Optimization over Discrete Structured Inputs by Continuous Objective Relaxation

TMLR Paper 5196 Authors

24 Jun 2025 (modified: 06 Jul 2025) · Under review for TMLR · CC BY 4.0
Abstract: Optimizing efficiently over discrete data when few target observations are available is a challenge in Bayesian optimization. We propose a continuous relaxation of the objective function and show that inference and optimization are computationally tractable. The advantages of this approach are a continuous treatment of the problem and the direct incorporation of available prior knowledge over the inputs. Motivated by the optimization of expensive biochemical properties of discrete sequences, we consider settings with few observations and strict budgets. We leverage available and learned distributions from domain models to weight the Hellinger distance, which we show to be a covariance function. Our contributions include a domain-model likelihood-weighted kernel and acquisition-function optimization with both continuous and discrete algorithms. Lastly, we compare against state-of-the-art Bayesian optimization algorithms on sequence optimization tasks: 25 small-molecule tasks and two protein objectives.
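For illustration, below is a minimal sketch of one way a weighted Hellinger-distance covariance of the kind the abstract describes could be written, assuming each sequence position is relaxed to a categorical distribution over the vocabulary. The function names, the per-position weighting scheme, and the lengthscale parameter are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def hellinger_sq(p, q):
    """Squared Hellinger distance between two categorical distributions:
    H^2(p, q) = 1 - sum_i sqrt(p_i * q_i) (one minus the Bhattacharyya
    coefficient)."""
    return 1.0 - np.sum(np.sqrt(p * q))

def weighted_hellinger_kernel(P, Q, weights=None, lengthscale=1.0):
    """Covariance between two relaxed sequences (hypothetical encoding).

    P, Q: arrays of shape (L, V), one categorical distribution over a
    vocabulary of size V per sequence position.
    weights: nonnegative per-position weights, e.g. derived from a
    domain-model likelihood (illustrative; the paper's weighting may differ).

    exp(-d^2 / lengthscale) of a nonnegatively weighted sum of squared
    Hellinger distances is positive semi-definite: each H^2 term is one
    minus a Bhattacharyya coefficient, which is itself a valid kernel.
    """
    L, _ = P.shape
    w = np.ones(L) if weights is None else np.asarray(weights)
    d2 = sum(w[i] * hellinger_sq(P[i], Q[i]) for i in range(L))
    return np.exp(-d2 / lengthscale)

# Toy usage: two length-3 "sequences" over a 4-letter vocabulary.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=3)
Q = rng.dirichlet(np.ones(4), size=3)
print(weighted_hellinger_kernel(P, Q, weights=[1.0, 0.5, 2.0]))
```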
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~John_Timothy_Halloran1
Submission Number: 5196