Abstract: Botnets are among the most severe threats to cyber security. They typically use domain names generated by Domain Generation Algorithms (DGAs) to communicate with their Command and Control (C&C) infrastructure. DGA detection and classification therefore play an important role in helping cyber security researchers locate botnet C&C servers. However, many existing DGA detection models rely on a single-scale word embedding method, and very few are specifically designed to extract more effective features for DGA detection from multi-scale word embeddings. To address these problems, we first propose a hybrid word embedding method that combines character-level and bigram-level embeddings to make full use of the information in domain names. We then design a deep neural network that uses this hybrid embedding to distinguish DGA domains from known legitimate domains. Finally, we evaluate the hybrid embedding method and the proposed model on the ONIST dataset and compare them with several state-of-the-art DGA classification methods.
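To make the two embedding scales concrete, the sketch below shows how a domain name might be tokenized at the character level and at the bigram level, then mapped to fixed-length integer sequences that embedding layers could consume. It is only an illustration under stated assumptions: the function names, the padding and unknown-token indices, and the lowercasing step are hypothetical, and the abstract does not specify how the two embeddings are combined inside the network.

```python
from typing import Dict, Iterable, List

PAD_ID = 0  # assumed index reserved for padding
UNK_ID = 1  # assumed index reserved for tokens not seen when building the vocabulary


def char_tokens(domain: str) -> List[str]:
    """Split a domain name into single characters (character-level scale)."""
    return list(domain.lower())


def bigram_tokens(domain: str) -> List[str]:
    """Split a domain name into overlapping character pairs (bigram-level scale)."""
    d = domain.lower()
    return [d[i:i + 2] for i in range(len(d) - 1)]


def build_vocab(token_lists: Iterable[List[str]]) -> Dict[str, int]:
    """Assign an integer id to each distinct token, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 2  # ids 0 and 1 are reserved above
    return vocab


def encode(tokens: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map tokens to ids, truncate to max_len, and right-pad with PAD_ID."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))
```

For a domain such as "ab.com", `char_tokens` yields six single-character tokens while `bigram_tokens` yields five overlapping pairs, so each scale gets its own vocabulary and its own encoded sequence. A hybrid model would feed both sequences to separate embedding layers before merging them, but the merge strategy is left open here.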