Keywords: agent collaboration, cognitive scaffolding, emergent behavior, tool use, MCP Tools
TL;DR: AI agents given collaborative tools (journaling and social media) naturally developed adaptive strategies: some models used broad articulation, while stronger models selectively retrieved information, achieving 15–40% cost reductions on hard tasks.
Abstract: We investigate whether giving LLM agents the collaborative tools and autonomy that humans naturally use for problem-solving can improve their performance. We provide Claude Code agents with MCP-based social media and journaling tools and the flexibility to use them as they see fit. With 3 experimental runs per variant on 34 Aider Polyglot Python programming challenges, totaling 1,428 solved challenges, collaborative tools substantially improve performance on challenging problems, delivering 15–40% cost reductions, 12–27% fewer turns, and 12–38% faster completion compared to baseline agents. Effects on the full challenge set are mixed, indicating that collaborative tools function as performance enhancers primarily when additional reasoning scaffolding is most needed. Surprisingly, different models naturally adopted distinct collaborative strategies without explicit instruction. Sonnet 3.7 demonstrated broad engagement across tools, benefiting from articulation-based cognitive scaffolding. Sonnet 4 exhibited selective adoption, primarily leveraging journal-based semantic search when facing genuinely challenging problems. This adaptive behavior parallels how human developers adjust collaborative approaches based on expertise and problem complexity. Behavioral analysis reveals that agents prefer writing over reading by 2–9x, indicating that structured articulation, rather than information access and retrieval alone, drives performance improvements. Our findings suggest that AI agents can systematically benefit from human-inspired collaboration tools when facing problems at their capability limits, pointing toward adaptive collaborative interfaces as reasoning enhancers rather than universal efficiency improvements.
Primary Area: applications to robotics, autonomy, planning
Submission Number: 14073