Investigating the Ability of Large Language Models to Explain Causal Relationships in Time Series Data

Elizabeth Healey; Isaac S. Kohane

Published: 10 Oct 2024, Last Modified: 07 Dec 2024. Venue: CaLM @NeurIPS 2024 (Poster). License: CC BY 4.0

Keywords: LLMs, time series

Abstract: Large language models (LLMs) can enhance how individuals interact with and process information from large amounts of data. In many settings, the ability to explain the causal reasons behind observations in data is important. In this work, we investigate the ability of LLMs to provide accurate explanations about causal relationships in time series data. We generated synthetic datasets based on three distinct directed acyclic graphs (DAGs) representing causal relationships between multiple time series variables, and we evaluated how state-of-the-art LLMs answer questions related to causal effects within the observed data. Initially, we used abstract variable names in the analysis and later assigned real-world meanings to these variables to align with the DAG structures. We tested how accurately the LLMs identified the variables that caused specific observations in an outcome variable and found shortcomings with state-of-the-art models. We highlight challenges and opportunities for research in this space.
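
As a rough illustration of the setup the abstract describes, the sketch below generates synthetic time series from a small DAG and derives the ground-truth causes of an outcome variable against which an LLM's answer could be scored. Everything here is an assumption made for illustration: the three-variable graph, the abstract variable names `A`, `B`, `Y`, the edge weights, the linear lag-1 dynamics, and the helper names `simulate` and `true_causes`. None of it is taken from the paper's actual graphs or data-generating process.

```python
import numpy as np

# Hypothetical DAG: A -> B -> Y and A -> Y.
# Each entry maps a variable to its parents and their (assumed) edge weights.
PARENTS = {
    "A": {},
    "B": {"A": 0.8},
    "Y": {"A": 0.5, "B": 1.2},
}


def simulate(parents, n_steps=200, noise_std=0.1, seed=0):
    """Simulate a linear lag-1 structural model over the given DAG.

    Each variable at time t is a weighted sum of its parents at time t-1
    plus Gaussian noise. Root variables are pure noise.
    """
    rng = np.random.default_rng(seed)
    series = {name: np.zeros(n_steps) for name in parents}
    for t in range(1, n_steps):
        for name, weights in parents.items():
            signal = sum(w * series[p][t - 1] for p, w in weights.items())
            series[name][t] = signal + rng.normal(0.0, noise_std)
    return series


def true_causes(parents, outcome):
    """Return every ancestor of `outcome` in the DAG (direct and indirect causes)."""
    causes, stack = set(), list(parents[outcome])
    while stack:
        node = stack.pop()
        if node not in causes:
            causes.add(node)
            stack.extend(parents[node])
    return causes


data = simulate(PARENTS)
ground_truth = true_causes(PARENTS, "Y")
```

In a setup like this, the simulated series would be serialized into a prompt, and the set of variables the model names as causes of `Y` would be compared with `ground_truth`.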

Submission Number: 44
