AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes

ICLR 2026 Conference Submission 15179 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Agent Distillation, Training-free Capability Transfer, Model–Context–Protocols
Abstract: While knowledge distillation has become a mature field for compressing large language models (LLMs) into smaller ones by aligning their outputs or internal representations, the distillation of LLM-based agents, which involve planning, memory, and tool use, remains relatively underexplored. Existing agent distillation methods typically replay full teacher trajectories or imitate step-by-step teacher tool usage, but they often struggle to train student agents to dynamically plan and act in novel environments. We propose AgentDistill, a novel, training-free agent distillation framework that enables efficient and scalable knowledge transfer via direct reuse of Model–Context–Protocols (MCPs)—structured and reusable task-solving modules autonomously generated by teacher agents. Reusing these distilled MCPs enables student agents to generalize their capabilities across domains and solve new problems with minimal supervision or human intervention. Experiments on biomedical and mathematical benchmarks demonstrate that our distilled student agents built on small language models achieve performance comparable to advanced systems with strong LLMs such as OctoTools (GPT-4o), highlighting the effectiveness of our framework in building scalable and cost-efficient intelligent agents.
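
The abstract describes teacher agents distilling reusable MCP modules into a shared box that a small student agent can invoke without any fine-tuning. As a rough illustrative sketch only (the class names `MCP`, `MCPBox`, and `StudentAgent` and the keyword-matching logic are hypothetical stand-ins, not the paper's implementation), the reuse pattern might look like this:

```python
# Hypothetical sketch of training-free MCP reuse; not the paper's actual code.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MCP:
    """A structured, reusable task-solving module distilled from a teacher agent."""
    name: str
    description: str            # what kind of task this module solves
    run: Callable[[str], str]   # the reusable procedure itself


@dataclass
class MCPBox:
    """A collection of teacher-generated MCPs that a student agent reuses as-is."""
    modules: List[MCP] = field(default_factory=list)

    def add(self, mcp: MCP) -> None:
        self.modules.append(mcp)

    def select(self, task: str) -> MCP:
        # Naive keyword match; a real system would use an LLM or retriever here.
        for mcp in self.modules:
            if any(word in task.lower() for word in mcp.description.lower().split()):
                return mcp
        raise LookupError("no suitable MCP found for task")


class StudentAgent:
    """A small-model agent that solves tasks by reusing distilled MCPs,
    with no gradient updates and no trajectory replay."""

    def __init__(self, box: MCPBox):
        self.box = box

    def solve(self, task: str) -> str:
        mcp = self.box.select(task)
        return mcp.run(task)


if __name__ == "__main__":
    box = MCPBox()
    # In the framework described above, the teacher agent would generate this
    # module autonomously; here it is hard-coded for illustration.
    box.add(MCP(
        name="unit_conversion",
        description="convert grams milligrams dosage units",
        run=lambda task: "Parsed the quantity and applied a 1000 mg/g factor.",
    ))
    student = StudentAgent(box)
    print(student.solve("Convert a 0.5 g dosage to milligrams"))
```

The key design point the sketch tries to capture is that distillation happens at the level of reusable modules rather than model weights: the student never sees teacher trajectories, it only selects and executes MCPs from the box.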
Primary Area: generative models
Submission Number: 15179