Digger-Guider: High-Frequency Factor Extraction for Stock Trend Prediction

Published: 01 Jan 2024, Last Modified: 06 Dec 2024IEEE Trans. Knowl. Data Eng. 2024EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Recent years have witnessed increasing attention being paid to AI-based quantitative investment. Compared to traditional low-frequency data (e.g., daily, weekly), high-frequency data (e.g., minute-level) is often underutilized for low-frequency stock trend prediction, leaving the vast potential for improvement. However, valuable and noisy information coexist in high-frequency data. The learning process of high-frequency factor extractors can easily be overwhelmed by noise, leading to overfitting. Moreover, common techniques used to prevent overfitting often result in poor performance on this task since they usually roughly restrict the model’s capacity, making it challenging to model complex trading signals in high-frequency data. When designing high-frequency factor extractors, we face a tough dilemma. A high-capacity model may easily overfit to noise, while a simple but robust model may not capture complex high-frequency patterns. To address these problems, we propose maintaining model capacity while preventing overfitting by constructing two components that balance information and noise through interactions between them. Specifically, we propose a novel learning framework called Digger-Guider to extract informative stock representations from noisy high-frequency data. We develop a high-capacity model called Digger to extract local and detailed features from the high-frequency data, and we design a robust model called Guider to capture global tendency features and help the Digger overcome the noise. The Digger and Guider enhance each other through mutual distillation during training, serving as data-driven regularizations that work well on this task. Extensive experiments on real-world datasets demonstrate that our framework can produce powerful high-frequency stock factors that significantly improve stock trend prediction performance and our understanding of the finance market.
Loading