A Multi-Signal Graph-Based Approach for Real-World Event Detection in News Articles

Published: 16 Mar 2026, Last Modified: 06 May 2026, Preprint, Readers: Everyone, License: CC BY 4.0
Abstract: Identifying relationships between news articles in order to cluster them into real-world events is a fundamental task for analyzing the news media ecosystem. Many existing approaches rely primarily on semantic similarity, which can lead to incorrect groupings when articles share similar topics but refer to different events. In this work, we propose a multi-signal graph-based pipeline that integrates several sources of information to better model relationships between news articles. Using a gold-standard dataset of worldwide news events, the proposed method extracts multiple similarity signals, including semantic representations and entity-based information, which are combined to compare articles and identify event-level relationships. Experimental evaluation demonstrates that the proposed pipeline significantly improves clustering performance compared to a semantic similarity baseline and traditional approaches to this task. The method achieves 94.7% homogeneity and 85.8% completeness while maintaining 89.4% article coverage in the final clusters. These results indicate that combining multiple signals enables more accurate identification of relationships between articles, leading to more reliable clustering of news into meaningful real-world events.
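As an illustration only, the idea of combining a semantic signal with an entity-based signal in an article graph can be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline evaluated in the paper: the weighted-sum combination, the weights `w_sem` and `w_ent`, the `threshold`, the Jaccard entity overlap, the connected-components clustering, and the toy `example` data are all hypothetical choices made for the illustration.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(a, b):
    """Jaccard overlap between two sets of named entities."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_articles(articles, w_sem=0.5, w_ent=0.5, threshold=0.6):
    """Link article pairs whose combined score passes the threshold,
    then return one cluster label per article (connected components).
    Weights and threshold are illustrative assumptions."""
    parent = list(range(len(articles)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(articles)), 2):
        a, b = articles[i], articles[j]
        score = (w_sem * cosine(a["embedding"], b["embedding"])
                 + w_ent * jaccard(a["entities"], b["entities"]))
        if score >= threshold:
            parent[find(i)] = find(j)  # add an edge: merge the two components

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(articles))]


# Toy data (hypothetical): articles 0 and 1 cover the same event; article 2
# is on a similar topic but names different entities; article 3 is unrelated.
example = [
    {"embedding": [1.0, 0.0], "entities": {"Paris", "Macron"}},
    {"embedding": [0.9, 0.1], "entities": {"Paris", "Macron", "Elysee"}},
    {"embedding": [0.95, 0.05], "entities": {"Tokyo", "Kishida"}},
    {"embedding": [0.0, 1.0], "entities": {"NASA"}},
]

multi_signal = cluster_articles(example)                          # [0, 0, 1, 2]
semantic_only = cluster_articles(example, w_sem=1.0, w_ent=0.0)   # [0, 0, 0, 1]
```

On this toy data, the semantic-only setting merges article 2 into the cluster of articles 0 and 1 because their embeddings are close, while the combined score keeps it separate because the entity sets do not overlap. That is the failure mode of semantic-only grouping described in the abstract, reproduced here at toy scale.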