Deterministic Value Iteration for Perpetual American Put Options

Jaegi Jeon

Published: 16 Dec 2025, Last Modified: 27 Jan 2026 | AIMS Mathematics | Everyone | CC BY 4.0

Abstract: We introduce a deterministic, policy-targeted Bellman value-iteration framework for computing the optimal exercise boundary of perpetual American put options. Our method replaces path sampling in the Bellman operator with Gauss–Hermite quadrature and employs shape-preserving interpolation for off-grid evaluations, eliminating sampling noise and reducing computational cost. Under the Black–Scholes (BS) model, our approach recovers the analytic boundary with a mean absolute percentage error below 1.5% in approximately 19–56 seconds. The resulting policy values, evaluated via Monte Carlo simulation, deviate from the analytic benchmark by less than 0.07%. For the Heston model, where no closed-form solution exists, our method produces boundaries that differ from a high-resolution finite-difference benchmark by 1–5%. Despite these boundary deviations, the expected payoffs of the resulting policies stay close to the benchmark, with relative policy value gaps well below 0.2%. Our method computes the Heston boundary in about 127–180 seconds, compared with the 2,103–3,119 seconds required by the finite-difference method. This work presents a practical alternative for optimal stopping problems that balances speed and accuracy, particularly when partial differential equation (PDE) solvers are cumbersome or Monte Carlo simulation is prohibitively expensive.
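To make the idea in the abstract concrete, the following is a minimal sketch of deterministic value iteration for the Black–Scholes case; it is not the author's implementation. It assumes a non-dividend-paying asset, for which the analytic perpetual put boundary is S* = K·γ/(1+γ) with γ = 2r/σ². All parameter values (K, r, σ, the time step, grid size, and node count), the function names, and the use of linear interpolation via `np.interp` as a simple stand-in for the shape-preserving interpolation mentioned above are illustrative assumptions.

```python
import numpy as np

def analytic_boundary(K, r, sigma):
    """Closed-form perpetual put boundary (Black-Scholes, no dividends)."""
    gamma = 2.0 * r / sigma**2
    return K * gamma / (1.0 + gamma)

def gauss_hermite_normal(n_nodes):
    """Nodes z_i and weights w_i such that E[g(Z)] ~= sum_i w_i g(z_i), Z ~ N(0,1)."""
    z, w = np.polynomial.hermite_e.hermegauss(n_nodes)  # weight function exp(-z^2/2)
    return z, w / np.sqrt(2.0 * np.pi)                  # normalize to a probability rule

def value_iteration_boundary(K=100.0, r=0.05, sigma=0.2, dt=0.05,
                             n_nodes=16, n_grid=400, tol=1e-6, max_iter=20000):
    """Estimate the exercise boundary by iterating the Bellman operator.

    Assumption: exercise is allowed only every dt, so the result approximates
    the continuous-time boundary rather than reproducing it exactly.
    """
    # Log-spaced price grid (hypothetical range chosen for illustration).
    S = np.exp(np.linspace(np.log(1.0), np.log(500.0), n_grid))
    payoff = np.maximum(K - S, 0.0)

    # One-step risk-neutral GBM growth factors at the quadrature nodes.
    z, w = gauss_hermite_normal(n_nodes)
    growth = np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
    S_next = S[:, None] * growth[None, :]               # shape (n_grid, n_nodes)
    disc = np.exp(-r * dt)

    def continuation(V):
        # Off-grid values by linear interpolation (stand-in for a
        # shape-preserving scheme); np.interp clamps outside the grid.
        V_next = np.interp(S_next.ravel(), S, V).reshape(S_next.shape)
        return disc * (V_next @ w)

    V = payoff.copy()
    for _ in range(max_iter):
        V_new = np.maximum(payoff, continuation(V))     # Bellman update
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Exercise region: in the money and payoff at least the continuation value.
    exercise = (payoff > 0.0) & (payoff >= continuation(V))
    return S[exercise].max()

if __name__ == "__main__":
    print("analytic boundary :", analytic_boundary(100.0, 0.05, 0.2))
    print("value iteration   :", value_iteration_boundary())
```

With the assumed parameters, γ = 2.5, so the analytic boundary is 100·2.5/3.5 ≈ 71.43. The value-iteration estimate should land near that figure, limited by the grid spacing and the discrete exercise dates. Because the expectation is a fixed weighted sum over quadrature nodes, repeated runs give identical results, which is the sense in which the scheme removes sampling noise.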