Abstract: In cybersecurity, analyzing malicious software is crucial for preventing network attacks. Malicious code is often distributed as stripped binaries to thwart analysis, which poses challenges for analysts. This study investigates inferring function names from stripped binaries to aid security researchers in analyzing malicious code. We propose PTGFI, a Prompt-based Two-stage Generative framework for Function name Inference. PTGFI recasts function name inference as a two-stage semantic generation problem. By capturing descriptions of assembly functions and introducing prompt learning, it infers function names effectively. In experiments, PTGFI outperforms the state-of-the-art model by 2.96% in precision. Ablation studies demonstrate the effectiveness of the advanced components within the framework, and case studies further validate the utility and reliability of the function names it generates.