Dueling Convex Optimization with General Preferences

Published: 01 May 2025, Last Modified: 18 Jun 2025. ICML 2025 poster. License: CC BY 4.0
TL;DR: An efficient algorithm for minimizing a convex function from noisy single-bit dueling (comparison) feedback under a general class of transfer functions, achieving an $O(\epsilon^{-4p})$ rate for smooth convex objectives and an optimal $\widetilde O(\epsilon^{-2p})$ rate for smooth, strongly convex objectives.
Abstract: We address the problem of convex optimization with dueling feedback, where the goal is to minimize a convex function given only a weaker form of \emph{dueling} feedback. Each query consists of two points, and the dueling feedback returns a (noisy) single-bit comparison of the function values at the two queried points. The translation of the function values into the single comparison bit is governed by a \emph{transfer function}. This problem has previously been addressed only for restricted classes of transfer functions; here we consider a very general class of transfer functions that includes all functions admitting a series expansion about the origin. Our main contribution is an efficient algorithm with a convergence rate of $O(\epsilon^{-4p})$ for smooth convex functions, and an optimal rate of $\widetilde O(\epsilon^{-2p})$ when the objective is both smooth and strongly convex, where $p$ is the minimal degree (with a non-zero coefficient) in the transfer function's series expansion about the origin.
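To make the feedback model concrete, here is a minimal Python sketch (not code from the paper) of a simulated dueling oracle, assuming the comparison bit equals 1 with probability $\sigma(f(y) - f(x))$ for a transfer function $\sigma$ with $\sigma(0) = 1/2$; the logistic transfer, the `dueling_oracle` name, and the quadratic objective below are illustrative placeholder choices.

```python
import numpy as np

def dueling_oracle(f, x, y, transfer=lambda d: 1.0 / (1.0 + np.exp(-d)), rng=None):
    """Simulate one dueling-feedback query.

    Given two points x and y, return a single noisy bit that equals 1 with
    probability transfer(f(y) - f(x)), so the bit tends to indicate which
    point has the smaller function value. The logistic transfer is only an
    illustrative choice; the paper allows any transfer function admitting a
    series expansion about the origin.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = transfer(f(y) - f(x))  # probability that x is preferred over y
    return int(rng.random() < p)

# Example: compare two points under a simple quadratic objective.
f = lambda z: float(np.sum(z ** 2))
x, y = np.array([0.1, 0.0]), np.array([1.0, 1.0])
bit = dueling_oracle(f, x, y)  # likely 1, since f(x) < f(y)
```

An optimizer in this setting only ever sees such bits, never the values $f(x)$ or $f(y)$ themselves, which is what makes the feedback strictly weaker than standard zeroth-order (function-value) feedback.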
Lay Summary: We study how to solve optimization problems in situations where we can't directly measure how good each option is, but can only make noisy comparisons between pairs of options, like asking someone which of two choices they prefer without knowing why. This kind of feedback is called "dueling feedback." While earlier research only covered limited types of such comparisons, our work allows for a much broader and more realistic range of comparison styles. We design a new method that can efficiently find the best choice even when these comparisons are noisy and indirect. Our approach works especially well when the problem has a smooth and well-behaved structure; in fact, it is optimal when the underlying function is smooth and strongly convex. This work pushes the boundaries of optimization with pairwise comparisons, offering a powerful and general solution for real-world decision-making with limited, noisy relative feedback.
Primary Area: Theory->Online Learning and Bandits
Keywords: convex optimization, dueling feedback, comparison feedback, preference learning, transfer function, convergence rate
Submission Number: 8849