Designing Deep Learning Programs with Large Language Models

27 Sept 2024 (modified: 21 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Program Synthesis, Large Language Model Agents, Dataset, Benchmark
TL;DR: Introduces a new task of deep learning program design and proposes corresponding datasets and benchmarks.
Abstract: The process of solving tasks with deep neural architectures differs significantly from conventional programming due to its complexity and the specialized knowledge it requires. While code generation technologies have made substantial progress, applying them to deep learning programs calls for a distinct approach. Although previous research has shown that large language model agents perform well in areas such as data science, neural architecture search, and hyperparameter tuning, the task of proposing and refining deep neural architectures at a high level remains largely unexplored. Current methods for automating the synthesis of deep learning programs often rely on basic code templates or API calls, which restrict the solution space to predefined architectures. In this paper, we aim to bridge the gap between traditional code generation and deep learning program synthesis by introducing Deep Learning Program Design (DLPD), the task of designing an effective deep learning program for a given problem, together with appropriate architectures and techniques. We propose Deep Ones, a comprehensive solution for DLPD. Our solution includes a large-scale dataset and a lightweight benchmark specifically designed for DLPD. On our benchmark, Llama-3.1 8B, fine-tuned on our dataset, demonstrates better architecture-suggestion capability than GPT-4o and better performance than Claude-3.5-Sonnet, showing that Deep Ones effectively addresses the challenge of DLPD. Deep Ones will be publicly available, including the dataset, benchmark, code, and model weights.
Supplementary Material: zip
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9950