Abstract: In recent years, network security threats have become a major factor hindering the development of the Internet. Among these threats, the advanced persistent threat (APT) is one of the most representative attacks and has brought unprecedented security challenges. APT attacks rely mainly on malicious code. At present, homology analysis of malicious code for APT mainly converts the malicious code into a grayscale image or into semantic fragments, which are then analyzed with pre-trained models such as neural networks. The effectiveness of such pre-training-based methods depends heavily on the training process of the model and on the form of the data set, which may lead to the malicious code being attributed to the wrong organization during a real-time APT attack. In this paper, we propose a homology analysis method for the malicious code of APT groups based on Asm2Vec. The basic function blocks are obtained by disassembling the malicious code and removing unimportant functions. The semantic representation model Asm2Vec is then used to analyze these function blocks and identify the likely APT group behind the targeted malware. The experimental results show that, for the Energetic Bear group, the proposed method achieves a classification accuracy of 91.30% and an F1-score of 95.46%.
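The attribution step described in the abstract can be illustrated with a minimal sketch. It assumes that each function has already been mapped to a fixed-length vector by an Asm2Vec-style model, and it compares a sample's functions with reference functions from known groups by cosine similarity. The function names, the 0.8 threshold, the matched-fraction score, and the toy vectors are all hypothetical choices made for illustration. They are not taken from the paper, whose actual scoring rule may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def attribute(sample_funcs, group_funcs, threshold=0.8):
    """Score each group by the fraction of sample functions that have at
    least one reference function with cosine similarity >= threshold.

    sample_funcs: list of function vectors from the unknown sample.
    group_funcs:  dict mapping a group label to its reference vectors.
    The threshold and the scoring rule are illustrative assumptions.
    """
    scores = {}
    for group, refs in group_funcs.items():
        matched = sum(
            1
            for f in sample_funcs
            if any(cosine(f, r) >= threshold for r in refs)
        )
        scores[group] = matched / len(sample_funcs) if sample_funcs else 0.0
    best = max(scores, key=scores.get) if scores else None
    return best, scores


# Toy 2-dimensional vectors standing in for learned function embeddings.
references = {
    "group_a": [[1.0, 0.0], [0.0, 1.0]],
    "group_b": [[-1.0, 0.0]],
}
sample = [[0.9, 0.1], [0.1, 0.9]]
best_group, group_scores = attribute(sample, references)
print(best_group, group_scores)
```

In practice the vectors would come from a trained embedding model applied to the disassembled function blocks that remain after filtering, and the threshold would need to be tuned on labeled samples.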