RTL-OPT: Rethinking the Generation of PPA-Optimized RTL Code and A New Benchmark

15 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: RTL optimization, LLM for hardware design, benchmark dataset, Electronic Design Automation, VLSI Design
Abstract: The rapid advancement of AI relies on the support of integrated circuits (ICs). Recently, large language models (LLMs) have been increasingly explored for the generation of IC designs, mostly in Register-Transfer Level (RTL) code formats such as Verilog or VHDL. However, most existing benchmarks focus primarily on the accuracy of RTL code generation rather than on the optimization of IC design quality in terms of power, performance, and area (PPA). This work critically examines RTL optimization benchmarks and highlights the challenges of assessing RTL code quality. Our findings show that optimization assessments are complex and that existing works yield misleading results, as the perceived superiority of RTL code often depends on the downstream synthesis tool and setup. To address these issues, we introduce RTL-OPT, a benchmark comprising 36 digital IC designs handcrafted by human designers. These designs incorporate diverse optimization patterns derived from proven industry-standard RTL practices. Such optimization opportunities are not exploited by automated downstream logic synthesis, which makes them meaningful RTL-level improvements. In addition, RTL-OPT covers a wide range of RTL implementation types, including combinational logic, pipelined datapaths, finite-state machines, and memory interfaces, making it sufficiently representative. For each design task, RTL-OPT provides a pair of RTL implementations: a carefully constructed suboptimal (i.e., to-be-optimized) version and an optimized version that serves as the golden reference. LLMs are expected to take the suboptimal RTL code as input and generate a more optimized version that leads to better final PPA quality. The golden references, used as a comparison baseline, reflect human-expert-level optimization. RTL-OPT further provides an integrated evaluation framework that automatically verifies functional correctness and quantifies the PPA improvements of LLM-optimized RTL code. This framework enables a standardized assessment of generative AI's ability to optimize hardware designs. RTL-OPT is available at https://anonymous.4open.science/r/RTL-OPT-20C5.
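To make the task concrete, below is a minimal hypothetical Verilog pair (not taken from RTL-OPT) in the spirit of the suboptimal/optimized designs described above: the suboptimal module describes two multipliers of which one result is discarded each cycle, while the optimized module multiplexes the operands first so only a single multiplier is inferred. Whether a given synthesis flow recovers this sharing automatically depends on the tool and its settings, which illustrates why RTL-level improvements must be evaluated against a concrete downstream flow.

```verilog
// Hypothetical illustration (not from the RTL-OPT benchmark) of an
// RTL-level resource-sharing optimization. Tool behavior varies: some
// synthesis setups recover this sharing automatically, others do not.

// Suboptimal: two 8x8 multipliers are described; one product is discarded.
module mul_unshared (
    input  wire        sel,
    input  wire [7:0]  a, b, c,
    output wire [15:0] y
);
    assign y = sel ? (a * c) : (b * c);
endmodule

// Optimized: multiplex the operands first, so a single multiplier suffices.
module mul_shared (
    input  wire        sel,
    input  wire [7:0]  a, b, c,
    output wire [15:0] y
);
    wire [7:0] operand = sel ? a : b;
    assign y = operand * c;
endmodule
```

The two modules are functionally equivalent, which is exactly the property an evaluation framework of this kind must verify before comparing PPA.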
Primary Area: datasets and benchmarks
Submission Number: 5946