Keywords: deep learning, tabular data, gradient boosting, Hopfield networks, associative memory
Abstract: While Deep Learning excels in structured data as encountered in vision and natural language processing, it has failed to meet expectations on tabular data. In the real world, however, most machine learning applications face tabular data with fewer than 10,000 samples. For tabular data, Support Vector Machines (SVMs), Random Forests, and Gradient Boosting are the best-performing techniques, with Gradient Boosting in the lead. Recently, there has been a surge of Deep Learning methods tailored to tabular data, yet these methods still underperform compared to Gradient Boosting. We propose "Hopular" for learning from tabular data with hundreds or thousands of samples. Hopular is a Deep Learning architecture in which each layer is equipped with continuous modern Hopfield networks. The modern Hopfield networks can store two types of data: (i) the whole training set and (ii) the feature embedding vectors of the actual input. The stored data allow the identification of feature-feature, feature-target, sample-sample, and sample-target dependencies. The stored training set makes it possible to find similarities across input vectors and targets, while the stored actual input makes it possible to determine dependencies between features and targets. Hopular's novelty is that the original training set and the original input are provided at each layer. Therefore, Hopular can improve its current prediction at every layer by re-accessing the original training set, much like standard iterative learning algorithms. In experiments on small-sized tabular datasets with fewer than 1,000 samples, Hopular surpasses Gradient Boosting, Random Forests, SVMs, and in particular several Deep Learning methods. In experiments on medium-sized tabular data with about 10,000 samples, Hopular outperforms XGBoost and a state-of-the-art Deep Learning method designed for tabular data. Although Hopular needs more training time than Gradient Boosting, Random Forests, and SVMs, it is a strong alternative to these methods on small-sized and medium-sized tabular datasets as it yields higher performance.
One-sentence Summary: A novel deep learning architecture based on continuous modern Hopfield networks is proposed for tackling small- and medium-sized tabular datasets.
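To make the retrieval mechanism described in the abstract concrete, below is a minimal PyTorch sketch of one continuous modern Hopfield retrieval step, which is equivalent to softmax attention over stored patterns, applied here to re-access a stored training set at every layer. The class name `HopfieldRetrieval`, the learned query/key projections, and the residual refinement loop are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HopfieldRetrieval(nn.Module):
    """Continuous modern Hopfield retrieval: a single update step equals
    softmax attention over the stored patterns, which act as both keys
    and values. (Hypothetical sketch, not the Hopular reference code.)"""

    def __init__(self, dim, beta=None):
        super().__init__()
        self.to_query = nn.Linear(dim, dim, bias=False)  # state -> query pattern
        self.to_key = nn.Linear(dim, dim, bias=False)    # memory -> key patterns
        self.beta = beta if beta is not None else dim ** -0.5  # inverse temperature

    def forward(self, state, memory):
        # state:  (batch, dim)       current sample representations (queries)
        # memory: (num_stored, dim)  stored patterns, e.g. the whole training set
        scores = self.beta * self.to_query(state) @ self.to_key(memory).T
        # Retrieved pattern = convex combination of the stored patterns.
        return F.softmax(scores, dim=-1) @ memory

# Toy refinement loop: every layer re-accesses the *same* stored training
# set, mirroring the abstract's iterative-update idea. A second Hopfield
# module storing the input's own feature embeddings (omitted here) would
# capture feature-feature and feature-target dependencies analogously.
train_memory = torch.randn(500, 64)    # embeddings of 500 training samples
x = torch.randn(8, 64)                 # embeddings of the current input batch
layers = [HopfieldRetrieval(dim=64) for _ in range(4)]
for layer in layers:
    x = x + layer(x, train_memory)     # residual Hopfield update per layer
```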
Community Implementations: [1 code implementation on CatalyzeX](https://www.catalyzex.com/paper/arxiv:2206.00664/code)