TrendRep: A Long Context Embedding-Based Trend Representation for Weak Signal Detection

ACL ARR 2025 February Submission 2621 Authors

15 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract: Weak signal detection traditionally relies on counting-based representations of the data, which track the frequency of features such as keywords or topics over time. However, these methods struggle with adaptability and often fail to detect trends at an early stage. In this work, we propose TrendRep, a novel embedding-based trend representation that leverages long-context embeddings to encode richer semantics within time windows, providing a more robust and adaptable approach to weak signal detection. To evaluate TrendRep, we construct a new dataset and introduce a quantitative evaluation framework with defined ground truth and key performance metrics. Experimental results show that TrendRep outperforms conventional approaches, demonstrating the effectiveness of embedding-based representations and highlighting the potential of long-context embeddings for weak signal detection. (The implementation of TrendRep and the Trends2025 dataset are available at https://anonymous.4open.science/r/TrendRep-EE16. We will make the repository public upon acceptance.)
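For readers unfamiliar with the counting-based representations the abstract contrasts against, the following is a minimal sketch of that baseline idea only: keyword frequencies tallied per time window. It is not the authors' implementation and does not attempt to reproduce TrendRep; the function name `keyword_counts_by_window`, the yearly windows, and the toy documents are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def keyword_counts_by_window(docs, keywords):
    """Tally how often each keyword occurs in each time window.

    docs: iterable of (window, text) pairs; here the window is a year (assumption).
    keywords: lowercase single-token keywords to track (assumption).
    Returns a dict mapping window -> Counter of keyword frequencies.
    """
    counts = defaultdict(Counter)
    for window, text in docs:
        # Naive whitespace tokenization; a real pipeline would do more.
        tokens = text.lower().split()
        for kw in keywords:
            counts[window][kw] += tokens.count(kw)
    return dict(counts)


# Toy corpus (hypothetical data, for illustration only).
docs = [
    (2021, "graph neural networks for molecules"),
    (2022, "diffusion models for images"),
    (2022, "diffusion models for audio and diffusion for video"),
]

table = keyword_counts_by_window(docs, ["diffusion", "graph"])
for window in sorted(table):
    print(window, dict(table[window]))
```

A representation like this only sees the features it was told to count, which illustrates the adaptability limitation the abstract mentions: a trend expressed with vocabulary outside the keyword list leaves no trace in the table.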
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: document representation; weak signal detection; emerging trend detection
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: English
Submission Number: 2621