Keywords: AI agents, automation, computer vision
Abstract: Computer vision is a critical component of many real-world applications, including plant monitoring in agriculture and handwriting classification in digital systems. However, developing high-quality computer vision systems traditionally requires both machine learning (ML) expertise and domain-specific knowledge, making the process labor-intensive, costly, and inaccessible to many. To address these challenges, we introduce AutoModel, an LLM agent framework that autonomously builds and optimizes image classification models. By coordinating specialized LLM agents, AutoModel removes the need for ML practitioners or domain experts in model development, streamlining the process and democratizing image classification. We evaluate AutoModel on datasets of varying sizes and domains, including standard benchmarks and Kaggle competition datasets, and show that it consistently outperforms zero-shot LLM-generated pipelines and achieves human practitioner-level performance.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 4118