LUMOS: Towards Language Agents that are Unified, Modular, and Open Source

23 Sept 2023 (modified: 11 Feb 2024), submitted to ICLR 2024
Supplementary Material: zip
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: language agent, interactive NLP, tool-augmented LLM
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: In this paper, we present LUMOS, **L**anguage agents with **U**nified formats, **M**odular design, and **O**pen **S**ource LLMs. LUMOS features a modular architecture consisting of planning, grounding, and execution modules built on open-source LLMs such as LLAMA-2. The planning module decomposes a task into a sequence of high-level subgoals; the grounding module then grounds these subgoals into a series of low-level actions, which the execution module carries out. To obtain high-quality annotations for training these modules, we leverage LLMs to convert ground-truth intermediate reasoning steps in existing benchmarks into a unified format usable within the LUMOS framework. LUMOS achieves competitive or superior performance compared to the state of the art on a variety of complex interactive tasks. We observe that: (1) LUMOS is competitive with LLM agents that are 2-4× larger on math tasks, and outperforms GPT-4/3.5-based agents on complex QA and web agent tasks; (2) LUMOS outperforms open-source agent baseline formulations, including chain-of-thought fine-tuning and unmodularized training; (3) LUMOS surpasses larger LLM-based agents on an unseen interactive task, WebShop, and achieves a 5-10 point reward improvement over domain-specific agents.
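
To make the planning/grounding/execution decomposition described in the abstract concrete, here is a minimal Python sketch of such a pipeline. The function names, prompts, and the `llm` callable are hypothetical illustrations, not the paper's actual interface or prompt format; the execution module is stubbed rather than wired to real tools.

```python
# Hypothetical sketch of a LUMOS-style modular agent loop; names and
# prompts are illustrative assumptions, not the paper's actual API.
from typing import Callable, List

LLM = Callable[[str], str]  # any text-in/text-out model, e.g. a LLAMA-2 wrapper


def plan(llm: LLM, task: str) -> List[str]:
    """Planning module: decompose a task into high-level subgoals."""
    response = llm(f"Decompose the task into subgoals, one per line:\n{task}")
    return [line.strip() for line in response.splitlines() if line.strip()]


def ground(llm: LLM, subgoal: str) -> List[str]:
    """Grounding module: map one subgoal to executable low-level actions."""
    response = llm(f"List the low-level actions, one per line, for:\n{subgoal}")
    return [line.strip() for line in response.splitlines() if line.strip()]


def execute(action: str) -> str:
    """Execution module: run an action with external tools (stubbed here)."""
    return f"<result of {action!r}>"


def run_agent(llm: LLM, task: str) -> List[str]:
    """Chain the three modules: plan, then ground and execute each subgoal."""
    results = []
    for subgoal in plan(llm, task):
        for action in ground(llm, subgoal):
            results.append(execute(action))
    return results
```

Given any `llm` callable, `run_agent(llm, "Answer a multi-hop question about X")` would return the executed action results in order, mirroring the subgoal-to-action flow the abstract describes.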
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8483