A Modular Multi-task Reasoning Framework Integrating Spatio-temporal Models and LLMs

Kethmi Hirushini Hettige; Jiahao Ji; Cheng Long; Shili Xiang; Gao Cong; Jingyuan Wang

19 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0

Keywords: Spatio-temporal Reasoning, Complex Query Decomposition, Modular LLM, Function Pool

TL;DR: This paper presents STReason, a modular framework combining LLMs and spatio-temporal models for multi-task inference. It generates interpretable outputs without fine-tuning and achieves SOTA on a new reasoning dataset with strong human evaluations.

Abstract: Spatio-temporal data mining plays a pivotal role in informed decision making across diverse domains. However, existing models are often restricted to narrow tasks and lack the capacity for multi-task inference and for complex long-form reasoning that requires generating in-depth, explanatory outputs. These limitations restrict their applicability to real-world, multi-faceted decision scenarios. In this work, we introduce STReason, a novel framework that integrates the reasoning strengths of large language models (LLMs) with the analytical capabilities of spatio-temporal models for multi-task inference and execution. Without requiring task-specific fine-tuning, STReason leverages in-context learning to decompose complex natural language queries into modular, interpretable programs, which are then systematically executed to generate both solutions and detailed rationales. To facilitate rigorous evaluation, we construct a new benchmark dataset and propose a unified evaluation framework with metrics specifically designed for long-form spatio-temporal reasoning. Experimental results show that STReason significantly outperforms advanced LLM baselines across all metrics, particularly excelling in complex, reasoning-intensive spatio-temporal scenarios. Human evaluations further validate STReason’s credibility and practical utility, demonstrating its potential to reduce expert workload and broaden its applicability to real-world spatio-temporal tasks. We believe STReason provides a promising direction for developing more capable and generalizable spatio-temporal reasoning systems. Our code is available at: https://anonymous.4open.science/r/STReason-B0B2/
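
The decompose-then-execute idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function pool, the step format, and every name below (`FUNCTION_POOL`, `run_program`, `PROGRAM`, the toy forecaster and detector) are hypothetical, and the program is written by hand where STReason would have an LLM produce it via in-context learning. The sketch only shows how a modular program over a pool of functions could be executed step by step while recording a trace that might serve as raw material for a rationale.

```python
from statistics import mean, pstdev


def moving_average_forecast(series, window):
    """Toy stand-in for a forecasting model: mean of the last `window` points."""
    return mean(series[-window:])


def detect_anomalies(series, threshold):
    """Toy stand-in for an anomaly detector: indices with |z-score| > threshold."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(series) if abs(x - mu) / sigma > threshold]


# Hypothetical function pool: module name -> callable.
FUNCTION_POOL = {
    "anomalies": detect_anomalies,
    "forecast": moving_average_forecast,
}


def run_program(program, inputs):
    """Execute steps in order; string args naming earlier variables are resolved."""
    env = dict(inputs)
    trace = []
    for step in program:
        func = FUNCTION_POOL[step["call"]]
        kwargs = {
            name: env[value] if isinstance(value, str) and value in env else value
            for name, value in step["args"].items()
        }
        env[step["out"]] = func(**kwargs)
        trace.append(f"{step['out']} = {step['call']}({step['args']}) -> {env[step['out']]!r}")
    return env, trace


# Hand-written program standing in for LLM output (assumed format).
PROGRAM = [
    {"out": "spikes", "call": "anomalies", "args": {"series": "series", "threshold": 2.0}},
    {"out": "next_value", "call": "forecast", "args": {"series": "series", "window": 2}},
]

SERIES = [10, 11, 9, 10, 50, 10, 11]

if __name__ == "__main__":
    env, trace = run_program(PROGRAM, {"series": SERIES})
    for line in trace:
        print(line)
```

Keeping each capability behind a named entry in the pool is what makes the executed program readable as a sequence of steps; swapping the toy functions for real spatio-temporal models would not change the executor.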

Supplementary Material: zip

Primary Area: interpretability and explainable AI

Submission Number: 18231
