See it, Think it, Sorted: Multimodal Large Language Models are Few-shot Time Series Anomaly Analyzers

ACL ARR 2025 February Submission2781 Authors

15 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Time series anomaly detection (TSAD) has become increasingly important across diverse domains. While Large Language Models (LLMs) have demonstrated remarkable generalization and few-shot reasoning capabilities in time series tasks, they still fail to match the performance of task-specific methods because their textual tokenizers are inherently insensitive to numerical input. With the advancement of LLMs, Multimodal Large Language Models (MLLMs) have emerged as promising candidates for addressing TSAD. Leveraging their exceptional visual reasoning capabilities, MLLMs can analyze time series data by interpreting it in a visual modality, such as plotted graphs, mimicking the way humans perceive and understand visualized information. In this paper, we introduce TAMA, a novel framework that pioneers the integration of MLLMs' image-modality reasoning capabilities into TSAD. Experimental results demonstrate that TAMA's design significantly enhances MLLMs on the TSAD task, achieving state-of-the-art performance. Additionally, we contribute one of the first open-source datasets featuring both anomaly classification labels and contextual descriptions, thereby facilitating broader exploration and advancement in this critical field.
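The visual-modality idea at the heart of the abstract can be sketched minimally: render a numeric time-series window as a plot image, which is the form an MLLM would reason over. This is an illustrative assumption, not the paper's actual TAMA pipeline; the function name `series_to_png` and the injected point anomaly are hypothetical.

```python
# Hypothetical sketch (not the authors' TAMA code): convert a 1-D time
# series into a PNG plot, the visual input an MLLM could analyze.
import io
import math

import matplotlib
matplotlib.use("Agg")  # headless rendering, no display needed
import matplotlib.pyplot as plt


def series_to_png(values, width=6, height=2):
    """Plot a 1-D series and return the PNG bytes of the figure."""
    fig, ax = plt.subplots(figsize=(width, height))
    ax.plot(range(len(values)), values, linewidth=1)
    ax.set_xlabel("time step")
    ax.set_ylabel("value")
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight")
    plt.close(fig)
    return buf.getvalue()


# A smooth sine wave with one injected point anomaly at t=300 —
# easy to miss as raw numbers, visually obvious as a spike in the plot.
series = [math.sin(t / 20.0) for t in range(500)]
series[300] += 5.0
png = series_to_png(series)
```

The resulting image bytes would then be passed to an MLLM's image input alongside a textual prompt asking it to describe any anomalies it sees.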
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: Multimodal Large Language Model, NLP Application
Contribution Types: Position papers
Languages Studied: English
Submission Number: 2781