TIME: Temporally Intelligent Meta-reasoning Engine for Context Triggered Explicit Reasoning

ACL ARR 2026 January Submission 9243 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · License: CC BY 4.0
Keywords: Adaptive Reasoning, Reasoning Efficiency, Model Interpretability, Reasoning Control, Behavioral Alignment, Temporal Reasoning, Evaluation Benchmarks
Abstract: Reasoning-oriented large language models often expose explicit “thinking” as long, turn-global traces at the start of every response, either always on or toggled externally at inference time. While useful for arithmetic, programming, and problem solving, this design is costly, blurs claim-level auditability, and cannot re-trigger explicit reasoning once the model begins presenting. Dialogue models are also largely blind to temporal structure, treating replies after seconds and replies after weeks as equivalent unless the time is stated in the text. We introduce TIME, the Temporally Intelligent Meta-reasoning Engine, a behavioral alignment framework that treats explicit reasoning as a context-sensitive resource driven by discourse and temporal cues. TIME augments dialogue with optional ISO 8601 ⟨time⟩ tags, tick turns that represent silent gaps, and short ⟨think⟩ blocks that can appear anywhere in a reply. A four-phase curriculum, including a small, maximally diverse full-batch alignment step, trains Qwen3 dense models to invoke brief, in-place reasoning bursts and keep user-facing text compact. We evaluate with TimeBench, a temporally grounded dialogue benchmark probing chronology, commonsense under gaps and offsets, anomaly detection, and continuity. Across 4B to 32B scales, TIME improves TimeBench scores over base Qwen3 in both thinking and no-thinking modes while reducing reasoning tokens by about an order of magnitude.
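As a rough illustration of the dialogue format the abstract describes, the sketch below builds one temporally annotated exchange. The exact tag syntax, the tick-turn encoding (here a `gap_s` attribute), and the ⟨think⟩ placement rules are assumptions for illustration; the paper defines the real scheme.

```python
from datetime import datetime, timezone

def time_tag(dt):
    # Hypothetical <time> tag carrying an ISO 8601 timestamp for a turn.
    return f"<time>{dt.isoformat()}</time>"

def tick_turn(gap_seconds):
    # Hypothetical "tick" turn standing in for a silent gap between messages.
    return f"<tick gap_s={gap_seconds}/>"

# A temporally annotated exchange: the assistant emits a short, in-place
# <think> burst only where the week-long gap makes explicit reasoning useful,
# keeping the user-facing text compact.
t0 = datetime(2026, 1, 6, 9, 0, tzinfo=timezone.utc)
t1 = datetime(2026, 1, 13, 9, 0, tzinfo=timezone.utc)
dialogue = "\n".join([
    f"{time_tag(t0)} user: Remind me what we decided about the deadline?",
    tick_turn(int((t1 - t0).total_seconds())),
    f"{time_tag(t1)} user: Any update?",
    "assistant: <think>A week has passed since the last turn; "
    "re-check the chronology before answering.</think> "
    "The deadline we agreed on last week still stands.",
])
print(dialogue)
```

The point of the sketch is that the reasoning burst is conditioned on a temporal cue (the tick turn) rather than toggled globally for the whole response.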
Paper Type: Long
Research Area: Discourse, Pragmatics, and Reasoning
Research Area Keywords: dialogue, conversation, discourse-level inference, pragmatic inference and reasoning, inter-sentential reasoning, coherence, communication
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 9243