GenBen: A Generative Benchmark for LLM-Aided Design

Gwok-Waa Wan; Wang Yubo; SamZaak Wong; Jingyi Zhang; Mengnv Xing; Zhe Jiang; Nan Guan; Ying Wang; Ning Xu; Qiang Xu; Xi Wang

28 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: GenBen; Benchmark; LLM-Aided Design; LLM; Hardware Design

TL;DR: An Open Source Benchmark for LLM-Aided Hardware Design

Abstract: This paper introduces GenBen, a generative benchmark designed to evaluate the capabilities of large language models (LLMs) in hardware design. With the rapid advancement of LLM-aided design (LAD), it has become crucial to assess the effectiveness of these models in automating hardware design processes. Existing benchmarks primarily focus on hardware code generation and often neglect critical aspects such as Quality-of-Result (QoR) metrics, design diversity, modality, and test set contamination. GenBen is the first open-source, generative benchmark tailored for LAD that encompasses a range of tasks, from high-level architecture to low-level circuit optimization, and includes diverse, silicon-proven hardware designs. We have also designed a difficulty tiering mechanism to provide fine-grained insights into enhancements of LLM-aided designs. Through extensive evaluations of several state-of-the-art LLMs using GenBen, we reveal their strengths and weaknesses in hardware design automation. Our findings are based on 10,920 experiments and 2,160 hours of evaluation, underscoring the potential of this work to significantly advance the LAD research community. In addition, GenBen employs an end-to-end testing infrastructure to ensure consistent and reproducible results across different LLMs. The benchmark is available at https://anonymous.4open.science/r/GENBEN-2812.

Primary Area: datasets and benchmarks

Submission Number: 14237
