Abstract: System software bridges hardware platforms and high-level applications. As new hardware platforms emerge, developers must customize code to support various system software, a process known as “retargeting”. This process is time-consuming and poorly automated. While large language models (LLMs) are proficient in general code generation tasks, their effectiveness in retargeting is limited by code complexity and abstract function descriptions. This paper presents TeSyn, a novel framework to enhance the code generation capabilities of LLMs for system software retargeting. TeSyn comprises three steps: target-specific value extraction, common code clustering, and template synthesis. To evaluate TeSyn's effectiveness, we introduce SysRetar, the first dataset for system software retargeting, covering four types of system software and 195 hardware platforms. In our experiments, we select five LLMs and fine-tune CodeLLaMA-7B-Instruct on SysRetar to create SysRetar-LLM. Results show that TeSyn significantly enhances retargeting performance across five LLMs. Furthermore, code generated by SysRetar-LLM requires substantially less modification than the manual retargeting approach (Fork-Flow), suggesting potential improvements in efficiency. Given these promising results, we outline future research directions for advancing retargeting through LLMs. The dataset and code are publicly available at https://huggingface.co/doczll05/SysRetar-LLM.