Abstract: We propose a novel scheme for Android malware detection. The scheme has two extremely fast phases. First, term-frequency simhashing (tf-simhashing) extracts a fixed-size vector from each binary file. The hashing algorithm embeds the frequencies of byte n-grams into the output vector, which can be reshaped into an image representation. In the second phase, a convolutional extreme learning machine (CELM) learns to distinguish between hashes of malicious and clean files as a two-class classification task. This scalable scheme is extremely fast in both learning and prediction. The results show that tf-simhashing in an image-shaped representation, together with CELM, provides better performance than three non-parametric models and one state-of-the-art parametric model.
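To make the first phase concrete, the following is a minimal sketch of a term-frequency-weighted simhash over byte n-grams. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the n-gram length, the per-gram hash function, the output size, or any normalization, so the choices here (`n=2`, SHAKE-256, a 64-dimensional output reshaped to 8x8) and the function names `tf_simhash` and `to_image` are hypothetical.

```python
import hashlib
from collections import Counter

import numpy as np


def tf_simhash(data: bytes, n: int = 2, bits: int = 64) -> np.ndarray:
    """Return a length-`bits` real-valued vector summarizing byte n-gram counts.

    Assumed variant: each distinct n-gram is hashed to `bits` bits; every bit
    votes +tf (bit set) or -tf (bit clear), where tf is the n-gram's count.
    The votes are kept as real values instead of being thresholded to 0/1.
    """
    if bits % 8 != 0:
        raise ValueError("bits must be a multiple of 8")
    vec = np.zeros(bits, dtype=np.float64)
    if len(data) < n:
        return vec  # no n-grams to count
    # Term frequency of every overlapping byte n-gram in the file.
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    for gram, tf in counts.items():
        # SHAKE-256 is an assumption; it is used because it can emit
        # a digest of any requested length.
        digest = hashlib.shake_256(gram).digest(bits // 8)
        h_bits = np.unpackbits(np.frombuffer(digest, dtype=np.uint8))
        vec += tf * (2.0 * h_bits - 1.0)  # map {0, 1} to {-1, +1}
    return vec


def to_image(vec: np.ndarray, side: int) -> np.ndarray:
    """Reshape a length side*side vector into a square single-channel image."""
    if vec.size != side * side:
        raise ValueError("vector length must equal side * side")
    return vec.reshape(side, side)


if __name__ == "__main__":
    sample = b"\x7fELF" + bytes(range(256)) * 4  # stand-in for a binary file
    image = to_image(tf_simhash(sample, n=2, bits=64), side=8)
    print(image.shape)
```

Because the output length is fixed regardless of file size, every file maps to an image of the same shape, which is what lets a convolutional classifier such as the CELM described above consume the hashes directly.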