Abstract: We introduce a deterministic, policy-targeted Bellman value-iteration framework for
computing the optimal exercise boundary of perpetual American put options. Our method replaces
path sampling in the Bellman operator with Gauss–Hermite quadrature and employs shape-preserving
interpolation for off-grid evaluations, eliminating sampling noise and reducing computational cost.
Under the Black–Scholes (BS) model, our approach recovers the analytic boundary with a mean
absolute percentage error below 1.5% in approximately 19–56 seconds. The resulting policy values,
evaluated via Monte Carlo simulation, deviate from the analytic benchmark by less than 0.07%. For
the Heston model, where no closed-form solution exists, our method produces boundaries that differ
from a high-resolution finite-difference benchmark by 1–5%. Despite these boundary deviations, the
expected payoffs of the resulting policies agree closely, with relative policy value gaps well below
0.2%. Our method computes the boundary in about 127–180 seconds, more than ten times faster
than the 2,103–3,119 seconds required by the finite-difference method. This work presents a
practical and robust alternative for optimal stopping problems, offering a compelling balance of speed
and accuracy, particularly when partial differential equation (PDE) solvers are cumbersome or Monte
Carlo simulation is prohibitively expensive.
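
To make the two ingredients named above concrete, the following is a minimal sketch, not the implementation evaluated in this work. It assumes risk-neutral geometric Brownian motion with no dividends, a fixed time step `dt`, and a uniform price grid. The function names (`analytic_boundary`, `gh_expectation`, `bellman_backup`) are hypothetical, and `np.interp` (piecewise-linear) stands in for the shape-preserving interpolant, whose exact form is not specified here.

```python
import numpy as np


def analytic_boundary(K, r, sigma):
    """Closed-form exercise boundary of a perpetual American put under
    Black-Scholes without dividends: S* = K * g / (1 + g), g = 2r / sigma^2."""
    g = 2.0 * r / sigma**2
    return K * g / (1.0 + g)


def gh_expectation(f, s, r, sigma, dt, n_nodes=16):
    """Approximate E[f(S_{t+dt}) | S_t = s] with Gauss-Hermite quadrature
    in place of sampled paths (assumption: risk-neutral GBM dynamics)."""
    # hermgauss returns nodes/weights for the weight function exp(-x^2).
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    z = np.sqrt(2.0) * x  # change of variables to a standard normal
    s_next = s * np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
    return np.dot(w, f(s_next)) / np.sqrt(np.pi)


def bellman_backup(grid, values, K, r, sigma, dt):
    """One deterministic Bellman backup: max(exercise, discounted continuation).
    Off-grid values use linear interpolation (a simplification of the
    shape-preserving scheme); np.interp clamps outside the grid."""
    def interp(y):
        return np.interp(y, grid, values)

    cont = np.array(
        [np.exp(-r * dt) * gh_expectation(interp, s, r, sigma, dt) for s in grid]
    )
    exercise = np.maximum(K - grid, 0.0)
    return np.maximum(exercise, cont)
```

With the illustrative parameters K = 100, r = 0.05, and sigma = 0.2, `analytic_boundary` gives about 71.43. Repeating `bellman_backup` until the values stop changing, then reading off the largest grid price at which exercise is optimal, yields a numerical boundary that can be compared against that benchmark.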