Your LLM Agents are Temporally Blind: The Misalignment Between Tool Use Decisions and Human Time Perception
Keywords: LLM/AI agents, Temporal Alignment, LLM Multi-turn Interaction
Abstract: Large language model (LLM) agents are increasingly used to interact with and execute tasks in dynamic environments. However, a critical yet overlooked limitation of these agents is that, by default, they assume a stationary context, failing to account for the real-world time that elapses between messages. We refer to this as "temporal blindness". This limitation hinders decisions about when to invoke tools, leading agents either to over-rely on stale context and skip needed tool calls, or to under-rely on it and redundantly repeat them.
To study this challenge, we constructed TicToc, a diverse dataset of multi-turn user–agent message trajectories across 76 scenarios, spanning dynamic environments with high, medium, and low time sensitivity.
We collected human preferences between "calling a tool" and "directly answering" on each sample, and evaluated how well LLM tool-calling decisions align with human preferences under varying amounts of elapsed time.
Our analysis reveals that existing models align poorly with human temporal perception: no model achieves a normalized alignment rate above 65%, even when given timestamp information.
We also show that naive, prompt-based alignment techniques have limited effectiveness for most models, whereas targeted post-training alignment is a viable way to align multi-turn LLM tool use with human temporal perception.
Our data and findings provide a first step toward understanding and mitigating temporal blindness, offering insights to foster the development of more time-aware and human-aligned agents.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: LLM/AI agents, alignment
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English
Submission Number: 7788