Towards General Function Approximation in Nonstationary Reinforcement Learning

Published: 01 Jan 2024 · Last Modified: 16 May 2025 · ISIT 2024 · CC BY-SA 4.0
Abstract: Function approximation has seen significant success in the field of reinforcement learning (RL). Despite some progress on developing theory for nonstationary RL with function approximation under structural assumptions, existing work on nonstationary RL with general function approximation remains limited. In this work, we propose a UCB-type algorithm, LSVI-Nonstationary, following the popular least-squares value iteration (LSVI) framework. LSVI-Nonstationary features a restart mechanism and a newly designed bonus term to handle nonstationarity, and performs no worse than the existing confidence-set-based algorithm SW-OPEA [1], which has been shown to outperform existing algorithms for nonstationary linear and tabular MDPs in the small-variation-budget setting.
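To make the restart-plus-bonus idea concrete, the following is a minimal sketch of least-squares value iteration with a UCB-style optimism bonus and periodic restarts, specialized to linear features for brevity. It is illustrative only: the environment interface (env.reset, env.step, env.actions), the feature map phi, and the bonus scale beta are assumptions, and the elliptical-potential bonus shown here is the standard linear-MDP choice, not the paper's exact bonus design for general function approximation.

```python
import numpy as np

def q_value(phi, w, Lambda, s, a, beta):
    """Optimistic Q estimate: linear fit plus an elliptical-potential bonus."""
    x = phi(s, a)
    bonus = beta * np.sqrt(x @ np.linalg.solve(Lambda, x))
    return float(x @ w + bonus)

def lsvi_with_restarts(env, phi, d, H, K, restart_period, beta=1.0, lam=1.0):
    """Sketch: LSVI with a UCB bonus and periodic restarts for nonstationarity.

    Assumed (hypothetical) interface: phi(s, a) returns a d-dim feature vector;
    env.actions is a finite action set; env.reset()/env.step(a) drive episodes.
    """
    history = []   # (h, s, a, r, s_next) tuples collected since the last restart
    returns = []
    for k in range(K):
        if k % restart_period == 0:
            history = []  # restart: discard data gathered under possibly stale dynamics

        # Backward pass: fit Q_h for h = H-1, ..., 0 by regularized least squares.
        weights = [np.zeros(d) for _ in range(H)]
        covs = [lam * np.eye(d) for _ in range(H)]
        for h in reversed(range(H)):
            Lambda = lam * np.eye(d)
            target_sum = np.zeros(d)
            for (hh, s, a, r, s_next) in history:
                if hh != h:
                    continue
                x = phi(s, a)
                Lambda += np.outer(x, x)
                # Regression target uses the already-fitted optimistic Q at step h+1.
                if h + 1 < H:
                    v_next = max(q_value(phi, weights[h + 1], covs[h + 1], s_next, a2, beta)
                                 for a2 in env.actions)
                else:
                    v_next = 0.0
                target_sum += x * (r + v_next)
            weights[h] = np.linalg.solve(Lambda, target_sum)
            covs[h] = Lambda

        # Forward pass: act greedily with respect to the optimistic Q estimates.
        s = env.reset()
        ep_return = 0.0
        for h in range(H):
            a = max(env.actions,
                    key=lambda a2: q_value(phi, weights[h], covs[h], s, a2, beta))
            s_next, r = env.step(a)
            history.append((h, s, a, r, s_next))
            ep_return += r
            s = s_next
        returns.append(ep_return)
    return returns
```

The restart period trades off bias from drifting dynamics against the amount of data available for each least-squares fit; in the small-variation-budget regime, longer windows between restarts are typically preferable.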