Keywords: multivariate time series, deep & cross networks, linear attention
Abstract: Many multivariate forecasters model additive effects well but miss non-additive interactions among temporal bases, variables, and exogenous drivers, which harms long-horizon accuracy and attribution. We present the Time-series Interaction Machine (TIM), an all-MLP forecaster designed around the ANOVA/Hoeffding target, in which the regression function is decomposed into main effects and an orthogonal interaction component. TIM assigns the interaction component to a DCN-style cross stack that explicitly synthesizes bounded-degree polynomial crosses with controllable CP rank, while lightweight branches capture the main effects. Axis-wise linear self-attention (over time and over variables) transports information without increasing polynomial degree and keeps time and memory complexity linear. A decomposition regularizer encourages orthogonality and yields per-component attributions. We establish degree and rank guarantees, together with a risk identity showing that the additive error gap equals the energy of the interaction subspace. TIM achieves state-of-the-art accuracy on long-term forecasting benchmarks with clear cross-term interpretability.
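To make the "bounded-degree polynomial crosses" concrete, here is a minimal sketch of a generic DCN-style cross stack. It assumes the standard DCN-V2 update x_{l+1} = x_0 * (W_l x_l + b_l) + x_l; the function names `cross_layer` and `cross_stack` are hypothetical, and TIM's actual parametrization (including its CP-rank control) may differ.

```python
import numpy as np

def cross_layer(x0, xl, W, b):
    # Generic DCN-V2-style cross (an assumption, not necessarily TIM's layer):
    # the elementwise product with x0 raises the polynomial degree by at most one.
    return x0 * (xl @ W.T + b) + xl

def cross_stack(x0, weights, biases):
    # After L layers every output coordinate is a polynomial in x0
    # of degree at most L + 1.
    x = x0
    for W, b in zip(weights, biases):
        x = cross_layer(x0, x, W, b)
    return x

# One-dimensional toy input t = 2 with unit weights and zero biases:
# layer 1 gives t^2 + t, layer 2 gives t^3 + 2 t^2 + t (degree 3 = L + 1).
x0 = np.array([[2.0]])
Ws = [np.ones((1, 1)), np.ones((1, 1))]
bs = [np.zeros(1), np.zeros(1)]
out = cross_stack(x0, Ws, bs)
```

With zero weights and biases the stack reduces to the identity map, so depth only adds interaction terms on top of the input rather than replacing it.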
Primary Area: learning on time series and dynamical systems
Submission Number: 25513