Abstract: Temporal question answering (TQA) remains a persistent challenge for large language models (LLMs), particularly in retrieval-augmented generation (RAG) settings where retrieved content may be irrelevant, outdated, or temporally inconsistent. This is especially critical in applications such as clinical event ordering, policy tracking, and real-time decision-making, which require reliable temporal reasoning even under noisy or misleading context. To address this challenge, we introduce RASTeR: Robust, Agentic, and Structured Temporal Reasoning, an agentic prompting framework that separates context evaluation from answer generation. RASTeR first assesses the relevance and temporal coherence of retrieved context, then constructs a structured temporal knowledge graph (TKG) to facilitate reasoning. When inconsistencies are detected, RASTeR selectively corrects or discards context before generating an answer. Across multiple datasets and LLMs, RASTeR consistently improves robustness, defined here as the model's ability to generate correct predictions despite suboptimal context. We further validate our approach through a ``needle-in-the-haystack'' study, in which relevant context is buried among irrelevant distractors. Even with forty distractors, RASTeR achieves 75% accuracy, compared to the runner-up model, which reaches only 62%.
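For orientation, a minimal sketch of the pipeline the abstract describes (context evaluation, TKG construction, filtering, then answering). All function names, prompt wording, and the `call_llm` stub are hypothetical placeholders, not the authors' implementation:

```python
# Hypothetical sketch of a RASTeR-style pipeline as described in the abstract.
# Prompts, names, and the `call_llm` stub are illustrative assumptions only.
from dataclasses import dataclass


@dataclass
class TemporalFact:
    subject: str
    relation: str
    obj: str
    start: str          # e.g. "2019-01"
    end: str | None = None


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call; plug in any chat-completion client."""
    raise NotImplementedError("Provide an LLM backend here.")


def evaluate_context(question: str, passage: str) -> str:
    """Step 1: judge relevance and temporal coherence of a retrieved passage."""
    verdict = call_llm(
        f"Question: {question}\nPassage: {passage}\n"
        "Is this passage relevant and temporally consistent? "
        "Answer RELEVANT, OUTDATED, or IRRELEVANT."
    )
    return verdict.strip().upper()


def build_tkg(passages: list[str]) -> list[TemporalFact]:
    """Step 2: extract (subject, relation, object, time) tuples into a TKG."""
    facts: list[TemporalFact] = []
    for p in passages:
        raw = call_llm(
            f"Extract temporal facts from:\n{p}\n"
            "Return one 'subject | relation | object | start | end' per line."
        )
        for line in raw.splitlines():
            parts = [x.strip() for x in line.split("|")]
            if len(parts) == 5:
                facts.append(TemporalFact(*parts[:4], end=parts[4] or None))
    return facts


def answer(question: str, retrieved: list[str]) -> str:
    """Steps 3-4: keep only coherent context, then reason over the TKG."""
    kept = [p for p in retrieved if evaluate_context(question, p) == "RELEVANT"]
    tkg = build_tkg(kept)
    fact_block = "\n".join(
        f"{f.subject} {f.relation} {f.obj} ({f.start}-{f.end})" for f in tkg
    )
    return call_llm(
        f"Temporal facts:\n{fact_block}\n\nQuestion: {question}\n"
        "If the facts are insufficient, answer from parametric knowledge."
    )
```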
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: temporal question answering, retrieval-augmented generation, temporal robustness, temporal knowledge graph, structured reasoning, parametric knowledge, context evaluation, model robustness
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
Previous URL: https://openreview.net/forum?id=WaGrn3IF98
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: The paper has been substantially revised since the original submission last year, and we would prefer a fresh set of reviewers and area chair.
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: See Results section
B2 Discuss The License For Artifacts: N/A
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Yes, we describe all standard datasets in the Results section with their intended use.
B4 Data Contains Personally Identifying Info Or Offensive Content: No
B4 Elaboration: We used open-access datasets, described in the Results section.
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Yes, we provide basic stats for the datasets in the Appendix.
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: We discuss the sizes of all models in the Method and Results sections.
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: See Results and Appendix
C3 Descriptive Statistics: Yes
C3 Elaboration: See Appendix
C4 Parameters For Packages: N/A
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: No
E1 Elaboration: We used AI assistants only to check and correct writing mistakes.
Author Submission Checklist: yes
Submission Number: 1026