Abstract: The large volume of unwanted email (spam) traffic wastes network resources. We previously proposed SpaDeS, a method for detecting spammers at the source network that uses only network-layer metrics. Here we present an extension of SpaDeS that focuses on its diversity and its adaptability to new spammer behavior patterns. To that end, we propose a new active-learning-based strategy for selecting new, highly informative training samples, with the aim of reducing the loss of effectiveness over time. We applied the new method to a real data set. The results show that, despite some variation in performance, using active learning to better select the training set improves the classification of legitimate users by as much as 21%, with only a small performance loss (less than 3%) in spammer classification.
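To make the idea of selecting highly informative training samples concrete, the following is a minimal sketch of one common active-learning query strategy, uncertainty sampling. The abstract does not state which strategy SpaDeS uses, so the strategy itself is an assumption, and the names `select_informative`, `toy_model`, and the single-feature pool are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence


def select_informative(
    pool: Sequence[Sequence[float]],
    spam_probability: Callable[[Sequence[float]], float],
    budget: int,
) -> List[int]:
    """Return indices of the `budget` unlabeled samples the model is least sure about.

    Assumption: uncertainty sampling for a binary classifier, where a
    predicted spam probability close to 0.5 means high uncertainty.
    """
    # Score every unlabeled sample by its distance from the decision boundary.
    scored = [(abs(spam_probability(x) - 0.5), i) for i, x in enumerate(pool)]
    # Smallest distance first: these samples are the most informative to label.
    scored.sort()
    return [i for _, i in scored[:budget]]


def toy_model(features: Sequence[float]) -> float:
    # Hypothetical stand-in for a trained classifier: it treats a single
    # network-layer feature, already scaled to [0, 1], as the spam probability.
    return features[0]


# Hypothetical unlabeled pool of source hosts, one feature each.
pool = [[0.05], [0.48], [0.93], [0.55], [0.70]]
chosen = select_informative(pool, toy_model, budget=2)
print(chosen)
```

In such a scheme, the selected samples would be labeled and added to the training set before the classifier is retrained, so that labeling effort goes to the hosts whose behavior the current model cannot yet separate.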