Learn Bullish Moves via EigenCluster Tokens

ICLR 2026 Conference Submission 677 Authors

01 Sept 2025 (modified: 23 Dec 2025) | ICLR 2026 Conference Submission | Readers: Everyone | CC BY 4.0
Keywords: Time Series Tokenization, Financial Transformer, Bullish Signal Prediction, Eigen-Cluster Analysis, Multi-scale Representation
TL;DR: A clustering-based tokenization method for financial time series that improves bullish signal prediction in Transformers.
Abstract: Conventional tokenization schemes, such as point-wise and patch-wise methods, are poorly suited to financial time series because they produce excessive token counts, sparse token distributions, and a heightened out-of-vocabulary risk, issues not explicitly addressed in prior work. This paper introduces a tokenization approach designed for financial time series. By clustering scalar projections of eigenvectors from multi-window Open-High-Low-Close (OHLC) price matrices, the method generates compact, semantically meaningful tokens that enable Transformer-based models to identify patterns preceding a next-day close price increase. Experiments on the S&P 500 and CSI 300 datasets show that the approach outperforms market baselines by 6-9% in precision, while reducing the vocabulary to 51-101 tokens and the sequence length by 75% relative to point-wise tokenization.
Primary Area: learning on time series and dynamical systems
Submission Number: 677
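
The abstract describes the tokenization only at a high level, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes that each window of OHLC rows is centered, that the leading eigenvector of its 4x4 Gram matrix is used, that the scalar is the projection of the window's last bar onto that eigenvector, and that a plain 1-D k-means does the clustering. The function names, the window sizes, the cluster count, and the synthetic price data are all hypothetical.

```python
import numpy as np

def window_projection(ohlc_window):
    """Project the last bar of a (w, 4) OHLC window onto the window's
    leading eigenvector (assumed reading of 'scalar projection')."""
    centered = ohlc_window - ohlc_window.mean(axis=0)
    gram = centered.T @ centered            # 4x4 symmetric matrix
    _, vecs = np.linalg.eigh(gram)          # eigenvalues in ascending order
    lead = vecs[:, -1]                      # eigenvector of the largest eigenvalue
    if lead.sum() < 0:                      # fix the sign ambiguity of eigenvectors
        lead = -lead
    return float(centered[-1] @ lead)

def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's algorithm on scalars, initialized at quantiles."""
    centers = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers

def tokenize_multi_window(ohlc, windows, k):
    """Return {window size: array of cluster ids}, one id per trailing window."""
    tokens = {}
    for w in windows:
        scalars = np.array([window_projection(ohlc[i - w:i])
                            for i in range(w, len(ohlc) + 1)])
        tokens[w], _ = kmeans_1d(scalars, k)
    return tokens

# Synthetic random-walk prices, for illustration only (not real market data).
rng = np.random.default_rng(0)
close = 100.0 + np.cumsum(rng.normal(0.0, 1.0, 200))
open_ = np.concatenate(([100.0], close[:-1]))
high = np.maximum(open_, close) + np.abs(rng.normal(0.0, 0.5, 200))
low = np.minimum(open_, close) - np.abs(rng.normal(0.0, 0.5, 200))
ohlc = np.column_stack([open_, high, low, close])

tokens = tokenize_multi_window(ohlc, windows=(5, 10), k=8)
```

Here `tokens[5]` and `tokens[10]` hold one cluster id per trailing window, so each trading day maps to a single discrete token per window size instead of four raw prices. How the paper combines window sizes into its 51-101 token vocabulary is not stated in the abstract and is not modeled here.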