Abstract: Open-ended Event Forecasting (OEEF) is vital in various real-world applications. However, it faces challenges, including the limited availability of datasets that enhance LLMs' predictive capabilities and crude methods of organizing forecast-related information. In this work, we construct a large-scale dataset, NewsForest, that contains 12,406 prediction chains reflecting the drivers of event development. To extract information effectively from the prediction background, we propose a prediction method, ForestCast, which organizes all relevant news into a story tree and predicts each branch based on that tree. ForestCast has five main steps: (1) collecting and cleaning news, (2) clustering news into event nodes, (3) constructing the news story tree, (4) mining the semantic structure of the news story tree, and (5) predicting the next node and evaluating the quality of the predictions. Experiments demonstrate that the NewsForest dataset enhances the model’s ability to predict these structures and that the ForestCast method improves the accuracy and quality of predictions.
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: quantitative analyses of news and/or social media
Contribution Types: Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Keywords: Future prediction, News story tree, NewsForest dataset
Submission Number: 5468
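
Illustrative sketch (not the authors' implementation): the abstract describes organizing news into a story tree of event nodes and predicting each branch. The minimal Python example below shows one way such a tree could be represented and how its branches (root-to-leaf paths) could be enumerated as prediction contexts. The names EventNode, add_child, and branches, and the sample event summaries, are hypothetical and chosen only for illustration; the clustering, semantic-structure mining, and prediction steps are not shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventNode:
    """Hypothetical event node: a cluster of news articles about one event."""
    summary: str
    articles: List[str] = field(default_factory=list)
    children: List["EventNode"] = field(default_factory=list)

    def add_child(self, child: "EventNode") -> "EventNode":
        """Attach a follow-up event and return it so calls can be chained."""
        self.children.append(child)
        return child


def branches(root: EventNode) -> List[List[str]]:
    """Return every root-to-leaf path as a list of event summaries."""
    if not root.children:
        return [[root.summary]]
    paths = []
    for child in root.children:
        for path in branches(child):
            paths.append([root.summary] + path)
    return paths


# Build a tiny, made-up story tree with two branches.
root = EventNode("Storm forms offshore")
landfall = root.add_child(EventNode("Storm makes landfall"))
root.add_child(EventNode("Storm weakens at sea"))
landfall.add_child(EventNode("Evacuations ordered"))

for path in branches(root):
    print(" -> ".join(path))
```

Under this sketch, each printed path would be the context handed to a forecasting model when it is asked for the next node on that branch.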