MultiHot Embedding: A Multiple Activation Embedding Model for Numerical Features in Deep Learning

Pengfei Zhang; Zhenliang Ma; Zhenlin Qin

20 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.

Supplementary Material: pdf

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Feature Representation, Numerical Feature, Embedding, Deep Learning

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

Abstract: Numerical feature learning has long been a challenging problem in deep learning: deep learning models exhibit sub-optimal performance on many numerical-feature-intensive learning tasks. This paper proposes a simple but effective method, MultiHot Embedding, for numerical feature representation in deep learning models. MultiHot Embedding discretizes numerical data into bins and extends one-hot embedding by allowing multiple neighboring bits to be activated. This neighbor activation mechanism lets MultiHot Embedding use small bin widths for discretization, which mitigates both the information loss caused by coarse bins and the inadequate training of individual bins. Experiments on six numerical feature learning tasks validate the effectiveness and generalization capability of the proposed method. Compared to the baseline models, the MultiHot Embedding model significantly improves prediction performance. In particular, it outperforms the state-of-the-art numerical feature representation model, which has a much more complex structure. Furthermore, a sensitivity analysis shows that MultiHot Embedding can handle small discretization widths, which effectively reduces the information loss incurred during discretization.
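
The mechanism described in the abstract (discretize a value into bins, then activate the bits neighboring the bin it falls in) can be illustrated with a minimal sketch. The abstract does not specify the binning scheme, the activation pattern, or how active bins are combined, so the sketch assumes equal-width bins, a symmetric window of `radius` bins on each side, and a sum over the embedding rows of the active bins. The names `multihot_encode`, `multihot_embed`, `radius`, and `table` are hypothetical and are not taken from the paper.

```python
import numpy as np


def multihot_encode(x, lo, hi, n_bins, radius):
    """Encode scalars as 0/1 vectors with a window of active neighbor bins.

    Assumptions (not from the paper): equal-width bins over [lo, hi] and a
    symmetric window of `radius` bins on each side of the center bin.
    """
    x = np.asarray(x, dtype=float)
    width = (hi - lo) / n_bins
    # Index of the bin each value falls in; clip so out-of-range values
    # (and x == hi) land in the first or last bin.
    idx = np.clip(np.floor((x - lo) / width).astype(int), 0, n_bins - 1)
    # Activate every bin whose index is within `radius` of the center bin.
    bins = np.arange(n_bins)
    active = np.abs(bins[None, :] - idx[:, None]) <= radius
    return active.astype(np.float32)


def multihot_embed(codes, table):
    """Combine the embedding rows of all active bins by summation (assumed)."""
    return codes @ table


# Usage: three values in [0, 1], 10 bins, one neighbor on each side.
codes = multihot_encode([0.05, 0.52, 0.98], lo=0.0, hi=1.0, n_bins=10, radius=1)
table = np.arange(20, dtype=np.float32).reshape(10, 2)  # toy embedding table
vectors = multihot_embed(codes, table)
print(codes)
print(vectors)
```

With `radius=0` the encoding reduces to ordinary one-hot encoding. With a larger radius, nearby values share active bins, so under these assumptions each embedding row receives gradient from a wider range of inputs even when the bins themselves are narrow.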

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 2325
