System Identification as a Reinforcement Learning Problem

Jose Antonio Martin H.; Oscar Fernández Vicente; Sergio Perez; Anas Belfadil; Cristina Ibanez-Llano; Freddy Perozo; Jose Javier Valle; Javier Arechalde Pelaz

Published: 01 Feb 2023, Last Modified: 13 Feb 2023

Submitted to ICLR 2023

Readers: Everyone

Keywords: System Identification, Reinforcement Learning, Offline Reinforcement Learning, Forward Models

TL;DR: System Identification as a Reinforcement Learning Problem

Abstract: System identification, also known as learning forward models, transfer functions, system dynamics, etc., has a long tradition in both science and engineering across different fields. In particular, it is a recurring theme in Reinforcement Learning research, where forward models approximate the state transition function of a Markov Decision Process by learning a mapping from the current state and action to the next state. This problem is commonly framed directly as a Supervised Learning problem. This approach faces several difficulties due to the inherent complexities of the dynamics to be learned, for example delayed effects, high non-linearity, non-stationarity, partial observability and, more importantly, error accumulation when using bootstrapped predictions (predictions based on past predictions) over long time horizons. Here we explore the use of Reinforcement Learning for this problem. We elaborate on why and how this problem fits naturally and soundly as a Reinforcement Learning problem, and present some experimental results that demonstrate that RL is a promising technique for solving this kind of problem.
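
To make the error-accumulation issue concrete, the following is a minimal sketch and not the authors' method or experimental setup, which the abstract does not describe. It assumes a hypothetical scalar system (`true_step`), fits a one-step forward model by ordinary least squares (the supervised framing), and compares one-step prediction error against the error of a bootstrapped rollout in which each prediction is fed back as the next input. All function names, the dynamics, and the model class are illustrative assumptions.

```python
import numpy as np


def true_step(s, a):
    # Hypothetical scalar dynamics, chosen only for illustration.
    return 0.9 * s + 0.5 * np.sin(a)


def collect(n, rng):
    # Roll the true system forward under uniformly random actions.
    s = np.zeros(n + 1)
    a = rng.uniform(-1.0, 1.0, size=n)
    for t in range(n):
        s[t + 1] = true_step(s[t], a[t])
    return s, a


def fit_one_step(s, a):
    # Supervised framing: least-squares fit of s[t+1] ~ w0 * s[t] + w1 * a[t].
    X = np.column_stack([s[:-1], a])
    w, *_ = np.linalg.lstsq(X, s[1:], rcond=None)
    return w


def one_step_errors(w, s, a):
    # Each prediction starts from the TRUE previous state.
    pred = w[0] * s[:-1] + w[1] * a
    return np.abs(pred - s[1:])


def rollout_errors(w, s, a):
    # Bootstrapped prediction: each step starts from the PREVIOUS PREDICTION,
    # so one-step model errors are carried forward along the horizon.
    pred = np.empty_like(s)
    pred[0] = s[0]
    for t in range(len(a)):
        pred[t + 1] = w[0] * pred[t] + w[1] * a[t]
    return np.abs(pred[1:] - s[1:])


def demo(seed=0):
    rng = np.random.default_rng(seed)
    s_train, a_train = collect(2000, rng)
    s_test, a_test = collect(500, rng)
    w = fit_one_step(s_train, a_train)
    return (one_step_errors(w, s_test, a_test).mean(),
            rollout_errors(w, s_test, a_test).mean())


if __name__ == "__main__":
    one_step, rollout = demo()
    print(f"mean one-step error: {one_step:.4f}")
    print(f"mean rollout error:  {rollout:.4f}")
```

In this sketch the linear model cannot represent the `sin` term exactly, so every one-step prediction carries a small residual. In the rollout those residuals are propagated through the model's own state estimate, which is the bootstrapping effect the abstract identifies as the main difficulty of the supervised formulation.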

Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)