Predictability Enables Parallelization of Nonlinear State Space Models

Published: 18 Sept 2025 · Last Modified: 05 Feb 2026 · NeurIPS 2025 poster · CC BY 4.0
Keywords: dynamical systems, optimization, largest Lyapunov exponent, contraction analysis, RNNs, Newton's method, parallel algorithms, scalability, numerical stability, state-space models, DEER
TL;DR: Nonlinear systems whose future behavior is not overly sensitive to small perturbations can be efficiently parallelized, whereas unpredictable dynamical systems cannot.
Abstract: The rise of parallel computing hardware has made it increasingly important to understand which nonlinear state space models can be efficiently parallelized. Recent advances like DEER and DeepPCR recast sequential evaluation as a parallelizable optimization problem, sometimes yielding dramatic speedups. However, the factors governing the difficulty of these optimization problems remained unclear, limiting broader adoption. In this work, we establish a precise relationship between a system's dynamics and the conditioning of its corresponding optimization problem, as measured by its Polyak-Łojasiewicz (PL) constant. We show that the predictability of a system, meaning the degree to which small perturbations in state fail to influence future behavior, as quantified by the largest Lyapunov exponent (LLE), governs the number of optimization steps required for evaluation. For predictable systems, the state trajectory can be computed in at worst $\mathcal{O}((\log T)^2)$ time, where $T$ is the sequence length: a major improvement over the conventional sequential approach. In contrast, chaotic or unpredictable systems exhibit poor conditioning, with the consequence that parallel evaluation converges too slowly to be useful. Importantly, our theoretical analysis shows that predictable systems always yield well-conditioned optimization problems, whereas unpredictable systems lead to severe conditioning degradation. We validate our claims through extensive experiments, providing practical guidance on when nonlinear dynamical systems can be efficiently parallelized. We highlight predictability as a key design principle for parallelizable models.
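To make the DEER-style recasting concrete, the following is a minimal sketch (not the paper's implementation) of evaluating a nonlinear recurrence $x_t = f(x_{t-1}, u_t)$ by Newton's method on the stacked residual $r_t = f(x_{t-1}, u_t) - x_t$. The dynamics function `f` here is a hypothetical contractive scalar system (negative LLE, hence "predictable"); each Newton step solves a block-bidiagonal linear system, i.e. a linear recurrence, which on parallel hardware would be an associative scan with $\mathcal{O}(\log T)$ depth. It is written sequentially below for clarity.

```python
import numpy as np

A = 0.5  # contraction factor; |df/dx| <= A < 1, so the LLE is negative

def f(x, u):
    # Hypothetical predictable (contractive) scalar dynamics.
    return A * np.tanh(x) + u

def sequential_solve(u, x0):
    # Conventional O(T) sequential rollout, for reference.
    x, out = x0, []
    for ut in u:
        x = f(x, ut)
        out.append(x)
    return np.array(out)

def deer_solve(u, x0, num_iters=20):
    # DEER-style evaluation: Newton's method on the residual
    # r_t = f(x_{t-1}, u_t) - x_t over the whole trajectory at once.
    T = len(u)
    x = np.zeros(T)  # initial guess for the full trajectory
    for _ in range(num_iters):
        prev = np.concatenate([[x0], x[:-1]])
        jac = A / np.cosh(prev) ** 2      # df/dx at the current iterate
        resid = f(prev, u) - x            # recurrence residual
        # Newton update dx solves the linear recurrence
        #   dx_t = jac_t * dx_{t-1} + resid_t,  dx_0's predecessor = 0,
        # which is parallelizable via an associative scan.
        dx = np.zeros(T)
        carry = 0.0
        for t in range(T):
            carry = jac[t] * carry + resid[t]
            dx[t] = carry
        x = x + dx
    return x
```

For a contractive system like this one, the iterates converge to the same trajectory the sequential rollout produces; for a chaotic `f`, the linearized systems become ill-conditioned and convergence stalls, matching the paper's claim.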
Primary Area: Optimization (e.g., convex and non-convex, stochastic, robust)
Submission Number: 24846