AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions

Ziming Li; Qianbo Zang; David Ma; Jiawei Guo; Tianyu Zheng; minghao liu; Xinyao Niu; Xiang Yue; Yue Wang; Jian Yang; Jiaheng Liu; Wanjun Zhong; Wangchunshu Zhou; Wenhao Huang; Ge Zhang

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions

Ziming Li, Qianbo Zang, David Ma, Jiawei Guo, Tianyu Zheng, minghao liu, Xinyao Niu, Xiang Yue, Yue Wang, Jian Yang, Jiaheng Liu, Wanjun Zhong, Wangchunshu Zhou, Wenhao Huang, Ge Zhang

28 Sept 2024 (modified: 05 Feb 2025)Submitted to ICLR 2025EveryoneRevisionsBibTeXCC BY 4.0

Keywords: large language models, language agents, multi-agent

TL;DR: We propose AutoKaggleMaster, a robust and user-friendly framework that solves Kaggle problems through a multi-agent collaborative system.

Abstract: Data science competitions on Kaggle, which represent real-world programming challenges, require sophisticated problem-solving approaches. While LLM-based agents demonstrate potential in various fields, their application to data science tasks often falls short due to difficulties in adapting to data changes in multi-stage reasoning and the need for precise reasoning. To address this, we propose AutoKaggle, a robust and user-centric framework that solves Kaggle problems through a collaborative multi-agent cooperative system. AutoKaggle implements an iterative development process that combines code interpretation, debugging, and comprehensive unit testing covering over 30 tests, ensuring code correctness and quality through LLM-based evaluation. It prioritizes user experience by generating detailed reports that elucidate feature engineering processes, data transformations, model selection criteria, and the reasoning behind each decision. It offers customizable workflows, allowing users to intervene and modify each stage of the process, thus combining the advantages of automated intelligence with human expertise. Additionally, we build a universal data science tool library, including carefully verified functions for data cleaning, feature engineering, and modeling, which form the foundation of this solution. We evaluate the framework on 8 carefully selected Kaggle competitions, achieve 83.8\% in average completion rate and 42.8\% average rank in Kaggle.

Primary Area: foundation or frontier models, including LLMs

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 14027

Loading