Keywords: AI safety, adversarial attack, large language model, time series forecasting
TL;DR: This study introduces a novel token disruption attack that exploits vulnerabilities in LLM-based time series forecasters by strategically manipulating the tokenization process.
Abstract: Although Large Language Models (LLMs) have demonstrated substantial potential as powerful zero-shot time series forecasters, recent evidence shows that even small adversarial perturbations can significantly degrade their performance under strict black-box settings. However, existing attacks typically rely on repeated queries to the target LLM forecaster, making them easy to detect as anomalous behavior in real-world scenarios. To overcome this limitation, we introduce the Token Disruption Attack (TDA), which generates perturbations by querying only the local tokenizer rather than the target model itself. We first formulate the attack as a non-convex optimization problem that maximizes the divergence between the tokenizer encodings of the clean and perturbed inputs, and then design a dynamic programming-based method to solve it efficiently. By injecting subtle perturbations into the raw time series, TDA induces substantial distortions during tokenization, which propagate through the model and ultimately result in severe forecasting errors. Extensive experiments on ten LLM-based and two non-LLM-based forecasters across six applications demonstrate that minor perturbations cause large downstream distortions, increasing forecasting errors by nearly 20%.
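The sketch below is only an illustration of the token-divergence idea described in the abstract, not the paper's method: it assumes the `transformers` library with the GPT-2 tokenizer as a stand-in for the target forecaster's tokenizer, a simple comma-separated serialization of the series, a positional token-mismatch count as the divergence measure, and a greedy per-step search in place of TDA's dynamic-programming solver.

```python
# Illustrative sketch (assumed setup, not the paper's TDA algorithm):
# show how a tiny numeric perturbation to a serialized time series can
# change its tokenization, and greedily pick perturbations that increase
# a simple token-level divergence from the clean encoding.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer


def serialize(series):
    """Serialize a numeric series as comma-separated text, a common LLM-forecasting format."""
    return ", ".join(f"{x:.2f}" for x in series)


def token_divergence(text_a, text_b):
    """Count positions where the two token-id sequences differ (padded to equal length)."""
    ids_a, ids_b = tokenizer.encode(text_a), tokenizer.encode(text_b)
    n = max(len(ids_a), len(ids_b))
    ids_a = ids_a + [-1] * (n - len(ids_a))
    ids_b = ids_b + [-2] * (n - len(ids_b))
    return sum(a != b for a, b in zip(ids_a, ids_b))


def greedy_perturb(series, eps=0.05):
    """For each time step, pick the small offset in {-eps, 0, +eps} that most
    increases divergence from the clean tokenization (greedy, not DP)."""
    clean_text = serialize(series)
    perturbed = list(series)
    for i in range(len(series)):
        best_delta, best_div = 0.0, -1
        for delta in (-eps, 0.0, eps):
            candidate = perturbed.copy()
            candidate[i] = series[i] + delta
            div = token_divergence(clean_text, serialize(candidate))
            if div > best_div:
                best_delta, best_div = delta, div
        perturbed[i] = series[i] + best_delta
    return perturbed


series = [1.23, 4.56, 7.89, 10.11]
adv = greedy_perturb(series)
print("clean:    ", serialize(series))
print("perturbed:", serialize(adv))
print("token divergence:", token_divergence(serialize(series), serialize(adv)))
```

Even bounded per-step offsets of 0.05 can shift digit boundaries and change several tokens in the encoded sequence, which is the kind of tokenization distortion the abstract describes propagating into forecasting errors.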
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 3946