Abstract: Financial analysis demands bridging natural language queries with complex quantitative time series (TS) computations. While Large Language Models (LLMs) excel at language, they often falter on precise numerical reasoning and on grounding in volatile financial data. We propose Time Series Augmented Generation (TSAG), an alternative to the conventional Retrieval-Augmented Generation (RAG) approach. TSAG relies on the tool-based infrastructure provided by the LangChain framework and uses an LLM agent to parse natural language queries, select and invoke appropriate predefined time series analysis tools, and synthesize the tool outputs into coherent, accurate responses. We implement and evaluate TSAG, initially focusing on a proof-of-concept (POC) with cryptocurrency data and robust, predefined tools for seasonality, volatility, price, and correlation analysis. We compare multiple LLM agents (Llama 3.x, Qwen2 variants, GPT-4o variants, DeepSeek-V3 API). We provide an evaluation benchmark comprising a set of typical questions with expected answers and a framework for evaluating LLM performance against them. The framework reports Return Rate, Match Accuracy, LLM-accessed Accuracy, Hallucination Rate, and query latency measured as Seconds per Query (SPQ). Results demonstrate that TSAG, particularly with capable agents such as GPT-4o and Qwen2 (7B), achieves high accuracy and low hallucination rates, validating the tool-based LLM integration approach for financial applications.
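The parse, select, invoke, synthesize loop described in the abstract can be illustrated with a minimal sketch. It is not the submission's implementation: it does not use LangChain, the tool names, the `TOOLS` registry, and the `answer` helper are hypothetical, and a plain keyword match stands in for the LLM agent's tool choice.

```python
import math
from typing import Callable, Dict, List


def volatility(prices: List[float]) -> float:
    """Sample standard deviation of simple period-over-period returns."""
    returns = [(b - a) / a for a, b in zip(prices, prices[1:])]
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(var)


def latest_price(prices: List[float]) -> float:
    """Most recent observation in the series."""
    return prices[-1]


# Hypothetical registry of predefined analysis tools (names are assumptions).
TOOLS: Dict[str, Callable[[List[float]], float]] = {
    "volatility": volatility,
    "price": latest_price,
}


def select_tool(query: str) -> str:
    """Stand-in for the LLM agent: choose a tool by keyword match."""
    q = query.lower()
    for name in TOOLS:
        if name in q:
            return name
    raise ValueError("no matching tool for query")


def answer(query: str, prices: List[float]) -> str:
    """Select a tool, invoke it, and format the numeric result as text."""
    name = select_tool(query)
    value = TOOLS[name](prices)
    return f"{name} = {value:.4f}"
```

Because the number in the response comes from the tool rather than from free-form generation, the answer stays tied to the underlying series; in the sketch, `answer("What is the volatility of BTC?", prices)` routes to `volatility` and returns its computed value.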
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: evaluation and metrics, applications, multi-modal dialogue systems, grounded dialog, financial/business NLP, conversational QA
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Keywords: evaluation and metrics, applications, multi-modal dialogue systems, grounded dialog, financial/business NLP, conversational QA
Submission Number: 581