Abstract: Extensive labeled training data for anomaly detection is enormously expensive and often unavailable in data-sensitive applications due to privacy constraints. We propose TransForest, a transductive forest for anomaly detection in the semi-supervised setting, where only a few labels are available. Guided by this limited label information, TransForest pushes classification boundaries toward sensitive areas where abnormal and normal points are located, increasing learning capacity. Empirically, given a small number of labeled points, TransForest is competitive with representative unsupervised and semi-supervised detectors. TransForest also offers a feature importance ranking that is consistent with the rankings provided by popular supervised forests on low-dimensional data sets. Our code is available at https://github.com/jzha968/transForest.