Keywords: Large Language Models; Deep Research; Isolation; Agent Systems
Abstract: Web search agent frameworks built on Large Language Models (LLMs) can draw on multi-source, real-time external information and show strong potential. However, on deep research tasks that require integrating large amounts of information, existing agent frameworks face two problems: they generate substantial noise during search and context management, which degrades the final answer, and they are inherently limited by relying solely on web browsing for information acquisition. To address the noise generated during search, we design a new web search filtering module that isolates irrelevant webpages at an early stage. To mitigate the contextual noise that affects the final answer, we introduce an isolation-based stepwise verification module, which reduces LLM contextual bias and yields more neutral and reliable outputs. Moreover, to expand information acquisition and address insufficient reasoning, we introduce a new agent that enables scalable information acquisition through APIs and programmatic automation while also supporting efficient deep reasoning. Experimental results show that, on the GAIA and WebWalkerQA benchmarks, the resulting system, IAgent, achieves state-of-the-art performance among open-source deep research agents. Furthermore, IAgent reduces token cost by 33% compared to our baseline.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 9168