Hidden units for tabular data representing intervals

18 Sept 2025 (modified: 19 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Neural Network, Hidden units, Tabular data, Generalization
TL;DR: Non-affine hidden units that resemble intervals, tailored for learning on tabular data
Abstract: Tree-based boosting remains a strong baseline for tabular data, partly because standard neural units impose overly smooth inductive biases. We revisit exponential-centered units (ExU) through the lens of Lipschitz bounds and introduce Double-centered Units (DcU), which parameterize soft intervals via learnable left/right centers and preserve informative gradients outside the interval. Building on DcU, we propose the Soft Interval Neural Network (SINN), an encoder-MLP architecture with max pooling and interval sparsity regularization. Across 15 public datasets, SINN delivers accuracy competitive with or superior to XGBoost on classification, while regression performance is more mixed; we hypothesize that this gap reflects the implicit bias of neural networks. We further examine common generalization proxies (spectral/Lipschitz bounds, Hessian-based flatness, and dropout-based ensembling) and find that smoothness-oriented regularization is not consistently predictive of tabular performance. These results suggest that non-affine, interval-like representations provide a useful inductive bias for tabular classification and motivate theoretical analyses beyond affine assumptions.
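
The abstract does not give the exact DcU parameterization, but a minimal sketch consistent with its description (learnable left/right centers forming a soft interval, with gradients that stay informative outside the interval) might look as follows. The class name SoftIntervalUnit, the product-of-sigmoids form, the initialization, and the sharpness parameter beta are illustrative assumptions, not the paper's definition.

```python
import torch
import torch.nn as nn

class SoftIntervalUnit(nn.Module):
    """Illustrative soft-interval unit in the spirit of DcU.

    Each (input feature, unit) pair learns a left and a right center;
    the activation is a product of two sigmoids, approximating an
    indicator of [left, right] while keeping gradients non-zero outside
    the interval. The paper's exact parameterization may differ.
    """

    def __init__(self, in_features: int, out_features: int, beta: float = 4.0):
        super().__init__()
        # Initialize left centers below right centers so intervals start non-empty.
        self.left = nn.Parameter(torch.randn(in_features, out_features) - 1.0)
        self.right = nn.Parameter(torch.randn(in_features, out_features) + 1.0)
        self.beta = beta  # edge sharpness of the soft interval (assumed hyperparameter)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features) -> activations: (batch, in_features, out_features)
        x = x.unsqueeze(-1)
        act = torch.sigmoid(self.beta * (x - self.left)) * \
              torch.sigmoid(self.beta * (self.right - x))
        # Max pooling over input features, mirroring the pooling the abstract
        # attributes to SINN; other aggregations would also be plausible.
        return act.max(dim=1).values

# Quick shape check.
layer = SoftIntervalUnit(in_features=10, out_features=16)
out = layer(torch.randn(32, 10))
print(out.shape)  # torch.Size([32, 16])
```

Unlike a hard interval indicator, whose gradient is zero almost everywhere, the sigmoid tails here give the left/right centers a nonzero gradient even when the input falls outside the interval, which is the property the abstract highlights for DcU.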
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 10145