Time Series Augmented Generation for Financial Applications

ACL ARR 2025 May Submission 581 Authors

14 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Financial analysis demands bridging natural language queries with complex quantitative time series (TS) computations. While Large Language Models (LLMs) excel at language, they often falter on precise numerical reasoning and on grounding in volatile financial data. We propose Time Series Augmented Generation (TSAG), an alternative to the conventional Retrieval-Augmented Generation (RAG) approach. TSAG relies on the tool-based infrastructure provided by the LangChain framework and uses an LLM agent to parse natural language queries, select and invoke the appropriate predefined time series analysis tools, and synthesize the tool outputs into coherent, accurate responses. We implement and evaluate TSAG, initially as a proof of concept (POC) on cryptocurrency data with robust, predefined tools for seasonality, volatility, price, and correlation analysis. We compare multiple LLM agents (Llama 3.x, Qwen2 variants, GPT-4o variants, DeepSeek-V3 API). We provide an evaluation benchmark consisting of a set of typical questions with expected answers and a framework for evaluating LLM performance against them. The framework reports Return Rate, Match Accuracy, LLM-assessed Accuracy, Hallucination Rate, and query latency measured as Seconds per Query (SPQ). Results show that TSAG, particularly with capable agents such as GPT-4o and Qwen2 (7B), achieves high accuracy and low hallucination rates, validating the tool-based approach to LLM integration for financial applications.
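
The parse, select, invoke, and synthesize loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use LangChain: the LLM agent is replaced by a keyword-routing stub, and every name here (`PRICES`, `TOOLS`, `stub_agent`, `answer`, the tool functions) as well as the toy price data is a hypothetical stand-in chosen for illustration. The point it shows is that the numeric result comes from a deterministic tool, while the agent only decides which tool to call and with which arguments.

```python
import math
from statistics import stdev

# Hypothetical toy price series (illustration only, not data from the paper).
PRICES = {
    "BTC": [100.0, 102.0, 101.0, 105.0, 107.0],
    "ETH": [50.0, 51.0, 50.5, 52.0, 53.5],
}


def log_returns(prices):
    """Log returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def price_tool(symbol):
    """Latest price in the series."""
    return PRICES[symbol][-1]


def volatility_tool(symbol):
    """Sample standard deviation of log returns."""
    return stdev(log_returns(PRICES[symbol]))


def correlation_tool(a, b):
    """Pearson correlation between the log returns of two assets."""
    ra, rb = log_returns(PRICES[a]), log_returns(PRICES[b])
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)


# Registry of predefined tools the agent may choose from.
TOOLS = {
    "price": price_tool,
    "volatility": volatility_tool,
    "correlation": correlation_tool,
}


def stub_agent(query):
    """Stand-in for the LLM agent: keyword routing to (tool_name, args).

    A real agent would produce this choice from the natural language query.
    """
    q = query.lower()
    symbols = [s for s in PRICES if s.lower() in q]
    if "correlat" in q:
        return "correlation", symbols[:2]
    if "volatil" in q:
        return "volatility", symbols[:1]
    return "price", symbols[:1]


def answer(query):
    """Select a tool, invoke it, and wrap the numeric output in text."""
    tool_name, args = stub_agent(query)
    value = TOOLS[tool_name](*args)
    return f"{tool_name}({', '.join(args)}) = {value:.4f}"


print(answer("What is the latest price of ETH?"))
print(answer("How volatile is BTC?"))
print(answer("How correlated are BTC and ETH?"))
```

In this sketch a query that names no known symbol would fail at the tool call; a fuller system would need to handle that case, for example by returning no answer, which is the kind of behavior a metric such as Return Rate would plausibly capture.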

Paper Type: Long

Research Area: Dialogue and Interactive Systems

Research Area Keywords: evaluation and metrics, applications, multi-modal dialogue systems, grounded dialog, financial/business NLP, conversational QA

Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models

Languages Studied: English

Submission Number: 581
