Improved Regret Bounds for Non-Convex Online-Within-Online Meta Learning

Published: 16 Jan 2024, Last Modified: 05 Mar 2024, ICLR 2024 poster
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: meta learning theory, non-convex online meta learning, piecewise-Lipschitz/non-Lipschitz functions, regret bounds, PAC-Bayes generalization bound
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: This paper provides improved regret bounds for non-convex online-within-online meta learning and a novel PAC-Bayes bound for multi-task learning via regret analysis.
Abstract: Online-Within-Online (OWO) meta learning is the online multi-task learning paradigm in which both the tasks and the data within each task arrive sequentially. In this work, we study OWO meta learning of the initialization and step size of within-task online algorithms in the non-convex setting, and provide improved regret bounds under mild assumptions on the loss functions. Previous work analyzing this scenario obtained, for bounded and piecewise Lipschitz functions, an averaged regret bound of $O((\frac{\sqrt{m}}{T^{1/4}}+\frac{(\log{m})\log{T}}{\sqrt{T}}+V)\sqrt{m})$ across $T$ tasks, with $m$ iterations per task and $V$ the task similarity. Our first contribution is to modify the existing non-convex OWO meta learning algorithm and improve the regret bound to $O((\frac{1}{T^{1/2-\alpha}}+\frac{(\log{T})^{9/2}}{T}+V)\sqrt{m})$, for any $\alpha \in (0,1/2)$. The derived bound converges faster with respect to $T$, and guarantees a vanishing task-averaged regret with respect to $m$ (for any fixed $T$). We then propose a new algorithm with regret $O((\frac{\log{T}}{T}+V)\sqrt{m})$ for non-convex OWO meta learning. This regret bound has better asymptotic behavior than previous ones, and holds for any bounded (not necessarily Lipschitz) loss functions. Beyond the improved regret bounds, we investigate how to obtain generalization bounds for statistical meta learning via regret analysis. Specifically, using online-to-batch arguments, we derive a transfer risk bound for batch meta learning in which all tasks are assumed to be drawn from a distribution. Moreover, by connecting the multi-task generalization error with the task-averaged regret, we develop a novel PAC-Bayes generalization error bound for statistical multi-task learning that involves our regret bound for OWO meta learning.
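To get a rough feel for how the three rates in the abstract compare, the expressions inside the $O(\cdot)$ can be evaluated numerically. The sketch below is only illustrative: it assumes every hidden constant equals 1 (the abstract does not state them), and the function names `prior_bound`, `modified_bound`, and `new_bound` are hypothetical labels, not names from the paper.

```python
import math


def prior_bound(T, m, V):
    """Earlier bound: (sqrt(m)/T^(1/4) + log(m)*log(T)/sqrt(T) + V) * sqrt(m).

    Hidden constants are assumed to be 1 (an assumption, not from the paper).
    """
    return (math.sqrt(m) / T ** 0.25
            + math.log(m) * math.log(T) / math.sqrt(T)
            + V) * math.sqrt(m)


def modified_bound(T, m, V, alpha):
    """Bound for the modified algorithm: (1/T^(1/2-alpha) + log(T)^(9/2)/T + V) * sqrt(m)."""
    if not 0 < alpha < 0.5:
        raise ValueError("alpha must lie in the open interval (0, 1/2)")
    return (1 / T ** (0.5 - alpha)
            + math.log(T) ** 4.5 / T
            + V) * math.sqrt(m)


def new_bound(T, m, V):
    """Bound for the new algorithm: (log(T)/T + V) * sqrt(m)."""
    return (math.log(T) / T + V) * math.sqrt(m)


if __name__ == "__main__":
    # Illustrative values only; V = 0 isolates the T-dependent terms.
    T, m, V = 10 ** 6, 100, 0.0
    print("prior   :", prior_bound(T, m, V))
    print("modified:", modified_bound(T, m, V, alpha=0.25))
    print("new     :", new_bound(T, m, V))
```

Because the true constants are unknown, these numbers only indicate how the $T$-dependent terms scale. For small $T$ the $(\log T)^{9/2}/T$ term can dominate, so the ordering seen at large $T$ should not be read as holding everywhere.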
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 1092