Abstract: Domain names are an essential entry point for Internet access. Blocking the domain names of illegal web pages can effectively prevent the spread of illegal online content such as gambling and pornography. However, current illegal web page detection algorithms suffer from low efficiency and accuracy. On the one hand, detection that is not timely weakens security prevention. On the other hand, false positives and missed detections in domain name blocking have adverse consequences for Internet access. Therefore, this paper focuses on domain name resolution records to improve the accuracy and efficiency of illegal web page detection, so that domain name servers can block access to offending domain names in a timely manner and improve the security of Internet infrastructure. First, a data pre-processing procedure is established based on an empirical analysis of the world’s largest passive DNS database. Second, we propose a hybrid filtering algorithm for temporary random domain names based on LSTM-ParallelCNN. Third, an illegal web page detection algorithm based on combined features is proposed to balance detection speed and accuracy. Experiments show that our algorithms can improve the detection accuracy to 99.35% while reducing the overall detection time by about 80%.