A Set-Sequence Model for Time Series

Published: 07 Mar 2025, Last Modified: 07 Apr 2025, Venue: OpenReview Archive Direct Upload, Readers: Everyone, License: CC BY 4.0
Abstract: In many financial prediction problems, the time-series behavior of a unit of interest (a loan, bond, or share) depends on observed unit-level factors and macroeconomic variables as well as on latent cross-sectional features. For example, the behavior of a mortgage borrower depends on factors such as the borrower's credit score, market interest rates, and the behavior of other borrowers in the local neighborhood (e.g., contagion effects). Predicting the time-series behavior of a unit therefore requires estimating the latent cross-sectional features, augmenting the observed features with them, and then estimating outcomes given the augmented feature set. This is the approach we propose in our Set-Sequence model: the Set model first estimates a shared cross-sectional summary at each time, and the Sequence model then consumes the summary-augmented time series of each unit independently to predict its outcome. The entire model is learned jointly over arbitrary sets sampled during training. Our approach exploits the set nature of the cross-section, is computationally efficient because the set summaries are generated in time linear in the number of units, and eliminates the need to hand-craft cross-sectional features. It is flexible in that it allows the use of existing sequence models and a variable number of units at inference. On a synthetic task mimicking the complex real-world behavior of borrowers, the model achieves near-optimal performance, and the set-level summaries closely track the true joint effect (correlation of 95%). On a real-world mortgage dataset of over 5 million loan-month samples, our model outperforms baselines by 4 AUC points and shows a 67% correlation between the learned set representation and the lagged foreclosure rate, a known source of cross-sectional dependence. Code will be released.
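The data flow described in the abstract (a shared summary per time step, then an independent sequence pass per unit) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the architectures, so mean pooling is assumed here as a stand-in for the learned Set model and an exponentially decayed running sum as a stand-in for the Sequence model. All function names, the `panel` layout, and the `decay` parameter are hypothetical.

```python
from typing import List

# Assumed layout: panel[t][i] is the feature vector of unit i at time t.
Panel = List[List[List[float]]]


def set_summary(units_at_t: List[List[float]]) -> List[float]:
    """Mean-pool feature vectors across units (stand-in for the Set model).

    One pass over the units, and the result does not depend on their order.
    """
    n = len(units_at_t)
    dim = len(units_at_t[0])
    return [sum(u[k] for u in units_at_t) / n for k in range(dim)]


def augment(panel: Panel) -> Panel:
    """Append the shared time-t summary to every unit's features at time t."""
    out = []
    for units_at_t in panel:
        s = set_summary(units_at_t)
        out.append([u + s for u in units_at_t])
    return out


def sequence_score(series: List[List[float]], decay: float = 0.5) -> float:
    """Stand-in for the Sequence model: decayed running sum of the features."""
    h = 0.0
    for x_t in series:
        h = decay * h + sum(x_t)
    return h


def set_sequence(panel: Panel) -> List[float]:
    """Summarize once per time step, then score each unit independently."""
    aug = augment(panel)
    n_units = len(panel[0])
    return [
        sequence_score([aug[t][i] for t in range(len(aug))])
        for i in range(n_units)
    ]


# Toy panel: two time steps, two units, two features per unit.
panel = [
    [[1.0, 0.0], [3.0, 2.0]],  # t = 0
    [[2.0, 1.0], [0.0, 3.0]],  # t = 1
]
scores = set_sequence(panel)
```

Because the summary is computed once per time step and shared, the number of units can differ between calls without changing any of the functions, which mirrors the variable-set-size property claimed in the abstract. In the actual model both stages would be learned jointly rather than fixed as they are here.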