Faster Neural Net Inference via Forests of Sparse Oblique Decision Trees

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: neural network compression, tree-based compression, decision trees, decision forests
Abstract: It is widely established that large neural nets can be considerably compressed by techniques such as pruning, quantization or low-rank factorization. We show that neural nets can be further compressed by replacing layers of it with a special type of decision forest. This consists of sparse oblique trees, trained with the Tree Alternating Optimization (TAO) algorithm, using a teacher-student approach. We find we can replace the fully-connected and some convolutional layers of standard architectures with a decision forest containing very few, shallow trees so that the prediction accuracy is preserved or improved, but the number of parameters and especially the inference time is greatly reduced. For example, replacing last 7 layers of VGG16 with a single tree reduces the inference FLOPs by 7440$\times$ with a marginal increase in the test error, and a boosted ensemble of nine trees can match the network's performance while still reducing the FLOPs 6289$\times$. The idea is orthogonal to other compression approaches, which can also be used on other parts of the net not being replaced by a forest.
One-sentence Summary: Decision trees, if trained with a suitable optimization algorithm, can act as an efficient compression mechanism for neural networks.
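As a rough illustration of why such a forest is cheap at inference time, here is a minimal sketch (not the authors' implementation; all names are hypothetical, and neither TAO training nor the teacher-student distillation is shown) of prediction through a single sparse oblique tree. Each internal node applies one sparse linear test, so an input reaches a leaf via a handful of sparse dot products along one root-to-leaf path, rather than passing through every unit of the layers it replaces:

```python
# Minimal sketch of sparse oblique decision-tree inference (hypothetical names).
# Each internal node routes its input with a sparse linear test w.x + b >= 0;
# each leaf stores a prediction. Inference cost is one sparse dot product per
# level of the tree, which is the source of the large FLOP savings claimed above.
import numpy as np

class Node:
    def __init__(self, w=None, b=0.0, left=None, right=None, leaf_value=None):
        self.w = w                    # sparse weight vector of the oblique split (None at a leaf)
        self.b = b                    # bias of the split
        self.left = left              # child taken when w.x + b < 0
        self.right = right            # child taken when w.x + b >= 0
        self.leaf_value = leaf_value  # prediction stored at a leaf

def predict(node, x):
    """Route x from the root to a single leaf and return that leaf's prediction."""
    while node.leaf_value is None:
        node = node.right if float(node.w @ x) + node.b >= 0.0 else node.left
    return node.leaf_value

# Toy example: a depth-1 tree whose root tests a sparse direction in R^4
# (only 2 of 4 weights are nonzero, so the test costs 2 multiply-adds).
root = Node(w=np.array([0.0, 1.5, 0.0, -2.0]), b=0.1,
            left=Node(leaf_value=0), right=Node(leaf_value=1))
print(predict(root, np.array([0.3, 2.0, -1.0, 0.5])))  # -> 1
```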