Keywords: dynamic discrete choice models, IRL
Abstract: In dynamic discrete choice models, a commonly studied problem is estimating parameters of agent reward functions (also known as “structural” parameters) using agent behavioral data. This task is also known as inverse reinforcement learning. Maximum likelihood estimation for such models requires dynamic programming, which is limited by the curse of dimensionality [Bellman, 1957]. In this work, we present a novel algorithm that provides a data-driven method for selecting and aggregating states, which lowers the computational and sample complexity of estimation. Our method works in two stages. First, we estimate agent Qfunctions, and leverage them alongside a clustering algorithm to select a subset of states that are most pivotal for driving changes in Q-functions. Second, with these selected "aggregated" states, we conduct maximum likelihood estimation using a popular nested fixed-point algorithm [Rust, 1987]. The proposed two-stage approach mitigates the curse of dimensionality by reducing the problem dimension. Theoretically, we derive finite-sample bounds on the associated estimation error, which also characterize the trade-off of computational complexity, estimation error, and sample complexity. We demonstrate the empirical performance of the algorithm in two classic dynamic discrete choice estimation applications.
Supplementary Material: pdf
Other Supplementary Material: zip
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/a-data-driven-state-aggregation-approach-for/code)
0 Replies
Loading