Deep Neural Networks for Malicious JavaScript Detection Using Bytecode Sequences

Muhammad Fakhrur Rozi, Sangwook Kim, Seiichi Ozawa

Published: 01 Jan 2020, Last Modified: 07 Jun 2025, Venue: IJCNN 2020, License: CC BY-SA 4.0

Abstract: JavaScript is a dynamic programming language that has been used in various cyberattacks on client-side web applications. Malicious behaviors, such as redirection and pop-up text or images, are deliberately injected into the output of web applications. Such attacks exploit vulnerabilities through a variety of methods, such as drive-by downloads and cross-site scripting. To protect users from these cyberattacks, we propose a deep neural network that detects malicious JavaScript code by examining its bytecode sequence. We use the V8 JavaScript compiler to generate a bytecode sequence, which corresponds to an abstract form of machine code. The benefit of the bytecode representation is that it lets us easily defeat complex JavaScript obfuscation. To identify the attacker's malicious intent, we adopt a deep pyramid convolutional neural network (DPCNN) combined with recurrent neural network models, which can handle long-range associations in a bytecode sequence. In our experiments, various recurrent networks are tested for encoding temporal features of code behavior, and our results show that the proposed approach detects malicious JavaScript with high accuracy.
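
To make the bytecode-sequence idea concrete, the following is a minimal sketch of how a textual bytecode listing might be turned into a fixed-length sequence of integer opcode IDs for a sequence model. It is not the authors' preprocessing. The listing format is an assumption loosely modeled on what V8 prints with its `--print-bytecode` flag, which differs between versions. The sample listing (including its hex bytes), the regular expression, and the helper names `extract_opcodes`, `build_vocab`, and `encode` are all hypothetical.

```python
import re

# Illustrative listing only (hypothetical): real V8 output varies by version,
# and the hex bytes here are placeholders, not actual opcode values.
SAMPLE_LISTING = """\
    0 : 0c 02             LdaSmi [2]
    2 : 26 fb             Star r0
    4 : 0c 03             LdaSmi [3]
    6 : 34 fb 00          Add r0, [0]
    9 : a9                Return
"""

# Assumed line shape: "<offset> : <hex bytes> <Mnemonic> <operands>"
LINE_RE = re.compile(r"^\s*\d+\s*:\s*(?:[0-9a-f]{2}\s+)+([A-Z][A-Za-z0-9.]*)")


def extract_opcodes(listing):
    """Return the opcode mnemonics in order, dropping offsets and operands."""
    ops = []
    for line in listing.splitlines():
        match = LINE_RE.match(line)
        if match:
            ops.append(match.group(1))
    return ops


def build_vocab(sequences):
    """Assign each distinct mnemonic an integer ID; 0 and 1 are reserved."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for seq in sequences:
        for op in seq:
            vocab.setdefault(op, len(vocab))
    return vocab


def encode(seq, vocab, max_len):
    """Map mnemonics to IDs, then truncate or right-pad to max_len."""
    ids = [vocab.get(op, vocab["<unk>"]) for op in seq[:max_len]]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


ops = extract_opcodes(SAMPLE_LISTING)
vocab = build_vocab([ops])
encoded = encode(ops, vocab, max_len=8)
```

Discarding operands and keeping only mnemonics is one possible design choice: it keeps the vocabulary small and makes the sequence insensitive to renamed variables or re-encoded string literals, at the cost of losing operand detail. The resulting ID sequence could then be fed to an embedding layer in front of a convolutional or recurrent model.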