VIPer: Iterative Value-Aware Model Learning on the Value Improvement Path

28 May 2022, 15:36 (modified: 14 Jun 2022, 18:44), DARL 2022, Readers: Everyone
Keywords: decision aware model learning, representation learning, model based reinforcement learning, mbrl, value improvement path, value equivalent model
TL;DR: A work-in-progress report that connects value-equivalent model learning with insights from the representation learning community to build robust value-aware models on meaningful sets of value functions.
Abstract: We propose a practical and generalizable Decision-Aware Model-Based Reinforcement Learning algorithm. We extend the frameworks of VAML (Farahmand et al., 2017) and IterVAML (Farahmand, 2018), which have been shown to be difficult to scale to high-dimensional and continuous environments (Lovatto et al., 2020a; Modhe et al., 2021; Voelcker et al., 2022). We propose to use the notion of the Value Improvement Path (Dabney et al., 2020) to improve the generalization of VAML-like model learning. We show theoretically for linear and tabular spaces that our proposed algorithm is sensible, justifying extension to non-linear and continuous spaces. We also present a detailed implementation proposal based on these ideas.