Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient


22 Sept 2022, 12:41 (modified: 18 Nov 2022, 09:19)
ICLR 2023 Conference Blind Submission
Readers: Everyone
Keywords: Reinforcement Learning Theory
Abstract: Offline reinforcement learning, which aims to optimize sequential decision-making strategies using historical data, has been applied extensively in real-life applications. State-of-the-art algorithms usually leverage powerful function approximators (e.g., neural networks) to alleviate the sample complexity hurdle and achieve better empirical performance. Despite this, a systematic understanding of the statistical complexity of function approximation is still lacking. As a step toward bridging this gap, we consider offline reinforcement learning with differentiable function class approximation (DFA). This function class naturally incorporates a wide range of models with nonlinear/nonconvex structures. Most importantly, we show that offline RL with differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning (PFQL) algorithm, and our results provide a theoretical basis for understanding a variety of practical heuristics that rely on Fitted Q-Iteration style design. In addition, we further improve our guarantee with a tighter instance-dependent characterization. We hope our work will draw interest in studying reinforcement learning with differentiable function approximation beyond the scope of current research.
Please Choose The Closest Area That Your Submission Falls Into: Theory (e.g., control theory, learning theory, algorithmic game theory)