Error Propagation in Dynamic Programming: From Stochastic Control to Option Pricing

ICLR 2026 Conference Submission 13220 Authors

18 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: dynamic programming, stochastic optimal control, kernel methods, statistical learning theory, American option pricing
TL;DR: We analyze how errors propagate in kernel-based dynamic programming for stochastic control, with American option pricing as a case study
Abstract: This paper investigates the theoretical and methodological foundations of stochastic optimal control (SOC) in discrete time. We begin by formulating the control problem in a general dynamic programming framework, introducing the mathematical structure needed for a detailed convergence analysis. The associated value function is estimated through a sequence of approximations that combine nonparametric regression with Monte Carlo subsampling. The regression step is performed within reproducing kernel Hilbert spaces (RKHSs) using the classical kernel ridge regression (KRR) algorithm, while Monte Carlo sampling methods are introduced to estimate the continuation value. To assess the accuracy of our value function estimator, we propose a natural error decomposition and rigorously control the resulting error terms at each time step. We then analyze how this error propagates backward in time, from maturity to the initial stage, a relatively underexplored aspect of the SOC literature. Finally, we illustrate how our analysis naturally applies to a key financial application: the pricing of American options.
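
To make the abstract's scheme concrete, below is a minimal sketch of KRR-based backward dynamic programming for a Bermudan put (a discretized American option) on simulated paths. The geometric Brownian motion dynamics, Gaussian kernel, ridge penalty, and subsample size are illustrative assumptions for this sketch, not the paper's actual configuration or results; at each time step a continuation value is regressed with kernel ridge regression on a Monte Carlo subsample, and the estimated value function is propagated backward from maturity.

```python
# Minimal sketch (assumptions: GBM dynamics, RBF kernel, fixed ridge penalty),
# not the paper's exact method or parameters.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

# Assumed risk-neutral GBM parameters for a put option
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
n_steps, n_paths, n_sub = 50, 20_000, 2_000   # n_sub: Monte Carlo subsample used by KRR
dt = T / n_steps
disc = np.exp(-r * dt)

# Simulate GBM paths: S has shape (n_paths, n_steps + 1)
increments = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
S = S0 * np.exp(np.concatenate([np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)], axis=1))

payoff = lambda s: np.maximum(K - s, 0.0)  # put payoff

# Backward induction: value at maturity equals the payoff
V = payoff(S[:, -1])
for t in range(n_steps - 1, 0, -1):
    X = S[:, t]
    Y = disc * V                      # discounted one-step-ahead value
    itm = payoff(X) > 0               # regress only on in-the-money paths
    idx = np.flatnonzero(itm)
    if idx.size == 0:
        V = disc * V
        continue
    sub = rng.choice(idx, size=min(n_sub, idx.size), replace=False)  # subsampling step
    krr = KernelRidge(alpha=1.0, kernel="rbf", gamma=1e-3)           # regression in an RKHS
    krr.fit(X[sub, None], Y[sub])
    cont = np.full(n_paths, np.inf)
    cont[itm] = krr.predict(X[itm, None])   # estimated continuation value
    exercise = payoff(X) >= cont            # exercise where immediate payoff dominates
    V = np.where(exercise, payoff(X), disc * V)

price = disc * V.mean()
print(f"Estimated Bermudan put price: {price:.3f}")
```

In this sketch the regression error made at each step feeds into the targets of the next (earlier) step, which is the backward error propagation the abstract refers to; the paper's analysis controls how these per-step errors accumulate from maturity to the initial stage.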
Primary Area: learning theory
Submission Number: 13220