Keywords: generative models, autoregressive models, state space models, memory, generalization
Abstract: Autoregressive generative models must sometimes continue from histories containing bindings that are transformed by subsequent causal operations. In this setting, extrapolative success may reflect fixed-coordinate memorization, generic recurrent capacity, or an inductive bias for transported structure. We introduce SHiPPO (Sylvester HiPPO), a pathwise transported online-projection memory prior that lifts HiPPO-style coefficient memories to a moving channel frame. For any fixed or realized right-transport path, SHiPPO jointly transports the channel metric and approximation family, so the coefficient state is ordinary HiPPO in a tied moving frame and obeys Sylvester dynamics. To instantiate this prior in selective sequence layers, we derive a restricted group-local realization with controller-compatible right transport, exponential-adjusted updates, exact block-affine scan, and a collapse criterion for simultaneously reducible right-action families. On Transport-MQAR, a finite-field multi-query associative recall (MQAR) diagnostic for transported recall under length extrapolation, full-split SHiPPO improves coordinate-wise recovery over structural controls, while the Generic multi-input multi-output (MIMO) control remains competitive and is stronger on exact recovery. We therefore position SHiPPO as a structured memory prior and diagnostic object for studying when autoregressive models generalize by transporting memories rather than memorizing fixed-coordinate associations.
Submission Number: 214