ENHANCE THE DYNAMIC REGRET VIA OPTIMISM

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: online convex optimization, dynamic regret upper bound, normalized exponentiated gradient, adaptive trick
Abstract: In this paper, we study how to enhance dynamic regret in online convex optimization via optimism. Existing works have shown that adaptive learning for dynamic environments (Ader) enjoys an $O\big(\sqrt{\left(1+P_T\right)T}\,\big)$ dynamic regret upper bound, where $T$ is the number of rounds and $P_T$ is the path length of the reference strategy sequence. The basic idea of Ader is to maintain a group of experts, each of which attains the best dynamic regret for a specific path length by running Mirror Descent (MD) with a specific parameter, and to track the best expert via Normalized Exponentiated Subgradient (NES). However, Ader is not environmentally adaptive. By introducing an estimated linear loss function $\widehat{x}_{t}^*$, Optimistic Mirror Descent (OMD) achieves a tighter dynamic regret bound than MD whenever the environment is not completely adversarial and $\widehat{x}_{t}^*$ is well estimated. Building on the fact that optimism can enhance dynamic regret, we develop an algorithm that replaces MD and NES in Ader with OMD and Optimistic Normalized Exponentiated Subgradient (ONES), respectively, and utilizes the adaptive trick to achieve an $O\big(\sqrt{\left(1+P_T\right)M_T}\,\big)$ dynamic regret upper bound, where $M_T\leqslant O\left(T\right)$ is a measure of estimation accuracy. In particular, if $\widehat{x}_t^*\in\partial\widehat{\varphi}_t$, where $\widehat{\varphi}_t$ denotes the estimated convex loss function and $\partial\widehat{\varphi}_t$ is Lipschitz continuous, then the dynamic regret upper bound of OMD is of the subgradient-variation type. Based on this fact, we develop a variant algorithm whose upper bound is also of the subgradient-variation type. All our algorithms are environmentally adaptive.
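
For intuition, here is a minimal Python sketch of the two-layer structure the abstract describes: a pool of Optimistic Mirror Descent experts (instantiated with the Euclidean regularizer, i.e., optimistic online gradient descent, each with its own step size) combined by an optimistic normalized exponentiated-weights meta-learner. All names (`OptimisticOGD`, `run_meta`, `lr_meta`) and concrete choices (the $\ell_2$-ball domain, linearized expert losses, the fixed meta learning rate) are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def proj_ball(x, radius=1.0):
    """Euclidean projection onto the l2 ball of the given radius."""
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

class OptimisticOGD:
    """One expert: Optimistic Mirror Descent with the Euclidean
    regularizer (i.e., optimistic online gradient descent)."""

    def __init__(self, dim, eta, radius=1.0):
        self.y = np.zeros(dim)  # auxiliary iterate
        self.eta = eta
        self.radius = radius

    def predict(self, hint):
        """Optimistic step: lean on the estimated gradient before g_t arrives."""
        return proj_ball(self.y - self.eta * hint, self.radius)

    def update(self, grad):
        """Mirror-descent step with the observed gradient."""
        self.y = proj_ball(self.y - self.eta * grad, self.radius)

def run_meta(grads, hints, etas, dim, lr_meta=0.5):
    """Combine the experts with optimistic normalized exponentiated
    weights (an ONES-style meta-update). grads/hints are sequences of
    observed subgradients g_t and optimistic estimates of them."""
    experts = [OptimisticOGD(dim, eta) for eta in etas]
    w = np.ones(len(experts)) / len(experts)  # meta weights
    plays = []
    for g, h in zip(grads, hints):
        xs = np.array([e.predict(h) for e in experts])
        # optimistic meta step: tilt the weights by the hinted expert losses
        w_opt = w * np.exp(-lr_meta * (xs @ h))
        w_opt /= w_opt.sum()
        plays.append(w_opt @ xs)  # play the weighted combination
        # true meta-update under the linearized losses <g_t, x_t^i>
        w *= np.exp(-lr_meta * (xs @ g))
        w /= w.sum()
        for e in experts:
            e.update(g)
    return plays
```

As in Ader, the step sizes `etas` would be instantiated on a geometric grid so that some expert is near-optimal for every possible path length $P_T$; the optimistic estimate $\widehat{x}_t^*$ plays the role of the `hint` argument.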