Abstract: Lookup-table (LUT) mapping is used extensively in logic synthesis: it is an indispensable step in FPGA design, serves as a building block in high-effort synthesis flows, and provides an algorithmic framework for logic optimization. Hence, a fast mapping algorithm is vital to meeting the demand for synthesizing high-quality, large-scale modern VLSI designs. This article proposes two efficient GPU-parallel algorithms, namely LUT mapping and and-inverter graph (AIG) optimization using a precomputed database. Both rely on a common parallel mapping framework that consists of novel fine-grained parallel mapping passes with a high degree of parallelism. The mapping pass is enhanced by cut evaluation and memory management methods specifically tailored for GPUs, which enable fast mapping of large circuits with limited GPU memory. Parallel timing analysis passes and parallel cut expansion passes are also proposed for constructing a fully GPU-accelerated LUT mapping flow. The core of parallel AIG optimization is a plugin of the mapping framework, which contains a self-adaptive parallel candidate structure evaluation procedure with high time efficiency and low hardware resource usage. Experiments show that, on large benchmarks, GPU LUT mapping and AIG optimization achieve on average $34.6\times$ and $99.9\times$ speedup, respectively, with similar result quality, compared with the high-performance LUT mapper and the database-based AIG optimization algorithm implemented in ABC. When the two algorithms are combined with other GPU logic optimization algorithms, a GPU-based sequence targeting LUT network synthesis achieves $46.7\times$ speedup with 4.7% smaller area and 0.2% smaller delay over ABC.
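For readers unfamiliar with cut-based LUT mapping, the following is a minimal sequential Python sketch of the general idea behind a level-wise mapping pass. It is not the article's GPU implementation: the function name `map_luts`, the parameters `k` and `max_cuts`, and the tiny example circuit are all hypothetical, inverters are omitted, and only depth (then cut size) is used to rank cuts. The point it illustrates is that nodes on the same level depend only on lower levels, which is what makes a fine-grained, one-thread-per-node parallelization plausible.

```python
def map_luts(pis, ands, outputs, k=4, max_cuts=8):
    """Depth-oriented k-LUT mapping sketch (hypothetical helper).

    pis:     list of primary-input names
    ands:    dict node -> (fanin0, fanin1), assumed to be in topological order
    outputs: list of node names driving primary outputs
    Returns (cover, depth): cover maps each LUT root to its sorted leaves.
    """
    # Levelize the graph so that each level can be treated as one batch.
    level = {p: 0 for p in pis}
    for n, (a, b) in ands.items():
        level[n] = 1 + max(level[a], level[b])
    by_level = {}
    for n in ands:
        by_level.setdefault(level[n], []).append(n)

    cuts = {p: [frozenset([p])] for p in pis}  # each input has only its trivial cut
    depth = {p: 0 for p in pis}
    best = {}
    for lv in sorted(by_level):
        # Nodes in this batch read only data from lower levels, so a GPU
        # version could assign one thread (or thread group) per node.
        for n in by_level[lv]:
            a, b = ands[n]
            merged = {ca | cb for ca in cuts[a] for cb in cuts[b]
                      if len(ca | cb) <= k}
            ranked = sorted(
                merged,
                key=lambda c: (1 + max(depth[x] for x in c), len(c), sorted(c)))
            best[n] = ranked[0]
            depth[n] = 1 + max(depth[x] for x in ranked[0])
            # Keep a bounded cut set plus the trivial cut for use by fanouts.
            cuts[n] = ranked[:max_cuts - 1] + [frozenset([n])]

    # Derive the LUT cover by walking best cuts back from the outputs.
    cover, stack = {}, [o for o in outputs if o in ands]
    while stack:
        n = stack.pop()
        if n in cover:
            continue
        cover[n] = tuple(sorted(best[n]))
        stack.extend(x for x in best[n] if x in ands)
    return cover, depth


# Hypothetical example: n3 = (a AND b) AND (c AND d)
pis = ["a", "b", "c", "d"]
ands = {"n1": ("a", "b"), "n2": ("c", "d"), "n3": ("n1", "n2")}
cover4, depth4 = map_luts(pis, ands, ["n3"], k=4)
cover2, depth2 = map_luts(pis, ands, ["n3"], k=2)
```

With `k=4` the whole example fits in a single 4-input LUT rooted at `n3`; with `k=2` each AND node becomes its own LUT and the depth grows to 2. A production mapper would additionally handle area recovery, timing analysis, and the memory constraints that the abstract mentions.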
External IDs: dblp:journals/tcad/LiuSCLYY25