Keywords: model merging, training-free merging, LLM agents, transfer learning, domain generalization, robustness, activation tracing, neuron transplantation, functional neurons, multi-task generalization, out-of-domain evaluation
Abstract: Interactive large language model agents have advanced rapidly, but most remain specialized to a single environment and fail to adapt robustly to other environments.
Model merging offers a training-free alternative by integrating multiple experts into a single model.
In this paper, we propose Agent-Role Merging (ARM), an activation-guided, role-conditioned neuron transplantation method for model merging in LLM agents.
ARM extends existing merging methods from static natural-language tasks to multi-turn agent scenarios and improves generalization across diverse interactive environments.
This is achieved with a three-step framework: 1) constructing merged backbones, 2) selecting neurons via role-conditioned activation analysis, and 3) transplanting neurons for fine-grained refinement.
Without gradient-based optimization, ARM improves cross-benchmark generalization while remaining efficient. Across diverse domains, the model obtained via ARM merging outperforms prior model-merging methods and domain-specific expert models, while demonstrating strong out-of-domain generalization.
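The three steps above can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: the weight matrices, probe inputs, ReLU activation, and top-k selection rule below are all assumptions standing in for the actual backbones, role-conditioned traces, and scoring used by ARM.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, k = 8, 16, 4

# Step 1: hypothetical expert MLP weights for two agent domains,
# merged into a single backbone by simple parameter averaging.
w_web = rng.normal(size=(d_hidden, d_in))
w_game = rng.normal(size=(d_hidden, d_in))
w_merged = 0.5 * (w_web + w_game)

def top_activated_neurons(weights, role_inputs, k):
    """Step 2: rank hidden neurons by mean activation magnitude
    on role-conditioned probe inputs (stand-in for activation tracing)."""
    acts = np.maximum(role_inputs @ weights.T, 0.0)  # ReLU activations, shape (n, d_hidden)
    scores = acts.mean(axis=0)                       # per-neuron mean activation
    return np.argsort(scores)[-k:]                   # indices of the k most active neurons

# Probe inputs conditioned on the "web" role (placeholder for traced agent states).
web_inputs = rng.normal(size=(32, d_in))
idx = top_activated_neurons(w_web, web_inputs, k)

# Step 3: transplant the selected functional neurons from the expert
# into the merged backbone, overwriting the averaged rows.
w_merged[idx] = w_web[idx]
```

No gradients are computed anywhere: the refinement is a pure copy of expert parameters guided by activation statistics, which is what makes this style of merging training-free.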
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: Model merging, Tool use, Interactive agents, Multi-turn decision making, Domain generalization, Robustness, Activation tracing
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 8141