Faster Parameter-Free Regret Matching Algorithms

ICLR 2026 Conference Submission 16429 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Regret Matching, Parameter-Free, Nash Equilibrium
TL;DR: We propose the first parameter-free Regret Matching algorithm achieving an $O(1/T)$ convergence rate for learning Nash equilibria.
Abstract: Regret Matching (RM) and its variants are widely employed to learn a Nash equilibrium (NE) in large-scale games. However, most existing research establishes only a theoretical convergence rate of $O(1/\sqrt{T})$ for these algorithms in learning an NE. Recent studies have shown that smooth RM$^+$ variants, advanced variants of RM, can achieve an improved convergence rate of $O(1/T)$. Despite this improvement, smooth RM$^+$ variants lose the parameter-free property, i.e., having no parameters that need to be tuned, which is highly desirable in practical applications. In this paper, we propose a novel smooth RM$^+$ variant called Monotone Increasing Smooth Predictive Regret Matching$^+$ (MI-SPRM$^+$), which retains the parameter-free property while still achieving a theoretical convergence rate of $O(1/T)$. To achieve both properties, MI-SPRM$^+$ employs a technique called Adaptive Regret Domain (ARD), which adjusts the decision space at each iteration so that the lower bound on the 1-norm of the accumulated regrets increases monotonically. This design is motivated by the observation that the range of step-sizes supporting the $O(1/T)$ convergence rate in existing smooth RM$^+$ variants depends on that lower bound. Experimental results confirm that MI-SPRM$^+$ empirically attains an $O(1/T)$ convergence rate.
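
For background, the update that RM$^+$-style methods build on can be sketched as follows. This is a minimal illustration of the standard RM$^+$ step (cumulative regrets thresholded at zero, strategy proportional to positive regret), not the paper's MI-SPRM$^+$ or its ARD adjustment of the decision space; the utility vector in the usage loop is a hypothetical placeholder.

```python
import numpy as np

def rm_plus_strategy(Q):
    """Map nonnegative cumulative regrets Q to a strategy (regret matching)."""
    s = Q.sum()
    if s <= 0:
        # No positive accumulated regret yet: play uniformly.
        return np.full_like(Q, 1.0 / len(Q))
    return Q / s  # probability of each action proportional to its regret

def rm_plus_update(Q, u, x):
    """One RM+ step: accumulate instantaneous regrets, clipped at zero."""
    r = u - np.dot(x, u)           # regret of each action vs. current strategy
    return np.maximum(Q + r, 0.0)  # RM+ thresholds negative accumulated regret

# Toy usage with placeholder random utilities for one player.
rng = np.random.default_rng(0)
Q = np.zeros(3)
for t in range(1000):
    x = rm_plus_strategy(Q)
    u = rng.uniform(-1.0, 1.0, size=3)  # hypothetical per-round utilities
    Q = rm_plus_update(Q, u, x)
```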
Primary Area: learning theory
Submission Number: 16429