AscendKernelGen: LLM-Driven Kernel Generation for NPUs

ACL ARR 2026 January Submission 1454 Authors

30 Dec 2025 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: LLM, Kernel Generation, Neural Processing Units (NPUs)
Abstract: Neural Processing Units (NPUs) are critical for AI infrastructure, yet kernel development remains a bottleneck due to the complexity of vendor-specific Domain-Specific Languages (DSLs). While LLMs excel at general coding, they fail to meet the stringent constraints of NPU development, showing a near-zero success rate on complex kernels in our preliminary study. To address these challenges, we present AscendKernelGen, the first comprehensive framework for NPU kernel development. The framework consists of three interconnected components: (1) Ascend-CoT, the first dataset in the NPU kernel domain that incorporates chain-of-thought reasoning derived from real-world kernel implementations; (2) KernelGen-LM, a domain-adaptive model trained on this dataset with supervised fine-tuning and reinforcement learning; and (3) NPUKernelBench, the first benchmark platform designed to evaluate the compilation, correctness, and performance of generated NPU kernels. Experimental results demonstrate that our approach substantially narrows the gap in hardware-specific coding: compilation success on complex Level-2 kernels improves from 0\% to 95.5\% (Pass@10), with 64\% functional correctness. AscendKernelGen is available at https://anonymous.4open.science/r/NPUKernelBench-45C2/
Paper Type: Long
Research Area: Code Models
Research Area Keywords: code generation, code reasoning, evaluation of code models, compiler-assisted modeling, safety and reliability of code models
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: English, Chinese
Submission Number: 1454