Tool Selection Bias Amplifies in Multi-turn User–Agent Interactions
Keywords: agentic AI, interpretability, bias, tool-augmented LLM
TL;DR: Tool Selection Bias Amplifies in Multi-turn User–Agent Interactions
Abstract: Large language models (LLMs) are increasingly deployed as agents that interact with external tools via APIs.
However, tool selection relies on natural language descriptions and is susceptible to bias from textual variations.
Prior work has studied tool selection bias in single-turn settings, whereas its behavior in multi-turn interactions remains underexplored.
In this work, we introduce TBMT, a novel benchmark for systematically evaluating tool selection bias in multi-turn settings. Using TBMT, we show that prior tool choices influence subsequent selections, leading to repeated reuse of the same tool.
Furthermore, this effect persists despite intervening user-agent interactions and can even lead to incorrect tool selections.
To address these problems, we propose TBFAIR, a robust bias mitigation method that identifies and perturbs the neurons contributing to biased behavior.
Our approach significantly reduces bias and improves fairness by modifying only a small fraction of neurons, while preserving valid tool-calling functionality.
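The abstract does not define how repeated reuse is quantified, so the following is only an illustrative sketch, not the TBMT metric. It assumes a conversation is recorded as the list of tools the agent selected at each turn, chosen from functionally equivalent candidates. The function names `reuse_rate` and `selection_shares` and the tool names are hypothetical.

```python
from collections import Counter


def reuse_rate(selections):
    """Fraction of turns after the first that repeat the previous turn's tool.

    Illustrative measure only; an assumption, not the benchmark's definition.
    """
    if len(selections) < 2:
        return 0.0
    repeats = sum(1 for prev, cur in zip(selections, selections[1:]) if prev == cur)
    return repeats / (len(selections) - 1)


def selection_shares(selections):
    """Share of all turns in which each tool was selected."""
    counts = Counter(selections)
    total = len(selections)
    return {tool: n / total for tool, n in counts.items()}


# Hypothetical conversation: the tool picked at each turn among equivalent tools.
turns = ["weather_a", "weather_a", "weather_a", "weather_b"]
print(reuse_rate(turns))
print(selection_shares(turns))
```

Under these assumptions, a reuse rate well above what an unbiased chooser would produce among equivalent tools, together with a skewed selection share, would be consistent with the carry-over effect the abstract describes.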
Track: Regular Paper (9 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 150