Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Jiahao Qiu; Xuan Qi; Tongcheng Zhang; Xinzhe Juan; Jiacheng Guo; Yifu Lu; Yimin Wang; Zixin Yao; Qihan Ren; Dongrui Liu; Ling Yang; Yue Wu; Shilong Liu; Xun Jiang; Kaixuan Huang; Hongru Wang; Mengdi Wang

Published: 19 Sept 2025 (modified: 12 Feb 2026)

Venue: ICLR 2026 Conference Desk Rejected Submission

License: CC BY 4.0

Keywords: Generalist agent, Self-evolution

Abstract: Recent advances in large language models (LLMs) have enabled agents to autonomously perform complex, open-ended tasks. However, many existing frameworks depend heavily on manually predefined tools and workflows, which hinder their adaptability, scalability, and generalization across domains. In this work, we introduce $\textbf{Alita}$, a generalist agent designed on the principle that $\textit{simplicity is the ultimate sophistication}$. It enables scalable agentic reasoning through $\textit{minimal predefinition}$ and $\textit{maximal self-evolution}$. For $\textit{minimal predefinition}$, Alita is equipped with only one component for direct problem-solving, making it much simpler than previous approaches that rely heavily on elaborate, hand-crafted tools and workflows. This clean design enhances its potential to generalize to challenging questions without being limited by predefined tools. For $\textit{maximal self-evolution}$, we enable Alita's creativity by providing a suite of general-purpose components with which it autonomously constructs, refines, and reuses external capabilities by generating task-related model context protocols (MCPs) from open-source resources, which contributes to scalable agentic reasoning. Notably, Alita achieves 72.73\% pass@1 and 86.06\% pass@3 accuracy on the GAIA benchmark, ranking first among open-source frameworks at the time of writing, and 74.00\% and 52.00\% pass@1 on MathVista and PathVQA, respectively, outperforming many agent systems with far greater complexity. Our code is open-sourced.

Primary Area: generative models

Submission Number: 15521
