Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System

ICLR 2025 Conference Submission 13477 Authors

28 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: llm agent, multi-agent, inference scaling law
TL;DR: We present Optima, a framework for training LLM-based multi-agent systems that boosts communication efficiency and task effectiveness. Using iterative training, we achieve significant token reduction and performance gains across diverse tasks.
Abstract: Large Language Model (LLM)-based multi-agent systems (MAS) show remarkable potential in collaborative problem-solving, yet they still face critical challenges: low communication efficiency, poor scalability, and a lack of effective parameter-updating optimization methods for multi-agent collaboration. We present Optima, a novel framework that addresses these issues by significantly enhancing both communication efficiency and task effectiveness in LLM-based MAS through LLM training. At its core, Optima employs an iterative generate, rank, select, and train paradigm, incorporating a reward function that balances task performance, token efficiency, and communication readability. We explore several training algorithms, including Supervised Fine-Tuning, Direct Preference Optimization, and their hybrids, providing insights into their effectiveness-efficiency trade-offs for iterative LLM-based MAS training. Additionally, we integrate Monte Carlo Tree Search-inspired techniques for DPO data generation, conceptualizing conversation turns as tree nodes to explore diverse interaction trajectories. We evaluate Optima on common multi-agent tasks, including information-asymmetric question answering and complex reasoning. Our method demonstrates consistent and substantial improvements over single-agent baselines and vanilla MAS based on Llama 3 8B, achieving up to a 2.8x performance gain with less than 10% of the tokens on tasks requiring heavy multi-agent information exchange. Moreover, Optima's efficiency gains open new possibilities for leveraging inference compute more effectively, potentially leading to improved inference-time scaling laws. By addressing fundamental challenges in multi-agent collaboration and providing a novel optimization framework, Optima points toward scalable, efficient, and effective LLM-based MAS.
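To make the paradigm described in the abstract concrete, here is a minimal Python sketch of the generate, rank, select, and train loop with a reward balancing task performance, token efficiency, and readability. The weights (`lam_token`, `lam_read`), the normalizations, and the `sample`/`train` callables are illustrative assumptions, not the paper's exact formulation.

```python
"""Illustrative sketch of an Optima-style generate -> rank -> select -> train loop.

Assumptions (not from the paper): the reward weights, the token
normalization, and the sample/train callables are placeholders.
"""
from dataclasses import dataclass


@dataclass
class Trajectory:
    task_score: float   # task metric (e.g., exact match / F1), in [0, 1]
    num_tokens: int     # total tokens exchanged between agents
    lm_loss: float      # LM loss of the dialogue, used as a readability proxy


def reward(traj: Trajectory, max_tokens: int = 2048,
           lam_token: float = 0.5, lam_read: float = 0.1) -> float:
    """Balance task performance, token efficiency, and readability."""
    token_penalty = traj.num_tokens / max_tokens  # fewer tokens -> smaller penalty
    return traj.task_score - lam_token * token_penalty - lam_read * traj.lm_loss


def iterate(model, tasks, sample, train, k: int = 8, top_frac: float = 0.25):
    """One generate -> rank -> select -> train iteration."""
    pool = []
    for task in tasks:
        trajs = [sample(model, task) for _ in range(k)]  # generate k multi-agent dialogues
        trajs.sort(key=reward, reverse=True)             # rank by the balanced reward
        pool.extend(trajs[: max(1, int(k * top_frac))])  # select the top fraction
    return train(model, pool)                            # SFT / DPO / hybrid update
```

In this sketch, `train` stands in for one SFT, DPO, or hybrid update on the selected trajectories, and repeated calls to `iterate` correspond to the iterative training the abstract describes.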
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13477