Backtracking for More Efficient Large Scale Dynamic Programming

Charles Tripp, Ross D. Shachter

2012 (modified: 14 Jun 2022), ICMLA (1) 2012, Readers: Everyone

Abstract: Reinforcement learning algorithms are widely used to generate policies for complex Markov decision processes. We introduce backtracking, a modification to reinforcement learning algorithms that can significantly improve their performance, particularly for off-line policy generation. Backtracking waits to perform update calculations until the successor's value has been updated, allowing immediate reuse of update calculations. We demonstrate the effectiveness of backtracking on two benchmark processes using both Q-learning and real-time dynamic programming.
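The ordering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hypothetical deterministic chain task, a tabular Q-learning update, and one simple reading of backtracking: record an episode, then apply the updates from the last transition back to the first, so that each state is updated only after its successor's value has been updated. All names (`chain_episode`, `q_update`, `apply_updates`, `fresh_q`) and the parameter values are illustrative assumptions.

```python
from typing import List, Tuple

# (state, action, reward, next_state, done)
Transition = Tuple[int, int, float, int, bool]


def chain_episode(n_states: int) -> List[Transition]:
    """Hypothetical deterministic chain: start in state 0, always take
    action 1 ("right"), and receive reward 1.0 on reaching the last state."""
    episode = []
    for s in range(n_states - 1):
        nxt = s + 1
        done = nxt == n_states - 1
        episode.append((s, 1, 1.0 if done else 0.0, nxt, done))
    return episode


def fresh_q(n_states: int, n_actions: int = 2) -> List[List[float]]:
    """Zero-initialised tabular Q-function."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(Q, transition: Transition, alpha: float, gamma: float) -> None:
    """Standard tabular Q-learning update for a single transition."""
    s, a, r, nxt, done = transition
    target = r if done else r + gamma * max(Q[nxt])
    Q[s][a] += alpha * (target - Q[s][a])


def apply_updates(Q, episode, alpha=0.5, gamma=0.9, backtrack=False):
    """Apply updates in visit order, or in reverse order (assumed
    'backtracking' reading: successors are updated before predecessors)."""
    order = reversed(episode) if backtrack else episode
    for transition in order:
        q_update(Q, transition, alpha, gamma)
    return Q


if __name__ == "__main__":
    ep = chain_episode(5)
    forward = apply_updates(fresh_q(5), ep, backtrack=False)
    backward = apply_updates(fresh_q(5), ep, backtrack=True)
    print("forward :", [row[1] for row in forward])
    print("backward:", [row[1] for row in backward])
```

In this sketch, a single forward pass changes only the value of the state next to the goal, because every earlier update reads a successor value that is still zero. The reverse pass propagates the reward along the whole trajectory with the same number of update calculations, which is the kind of reuse the abstract describes.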
