DRIFT: DATA REDUCTION VIA INFORMATIVE FEATURE TRANSFORMATION – GENERALIZATION BEGINS BEFORE DEEP LEARNING STARTS
Keywords: Feature representation, Neural network, Dimensionality reduction, Generalization gap.
Abstract: Despite the remarkable optimization power of modern deep neural networks, robust generalization
remains critically dependent on the quality of input representations. High-dimensional pixel data
is plagued by noise, redundancy, and spurious correlations that hinder stable learning and widen
the train-test generalization gap. We introduce DRIFT (Data Reduction via Informative Feature
Transformation), a lightweight, physics-informed preprocessing method that reinterprets images as
static displacement fields of a thin elastic plate under simply supported boundary conditions. By
projecting each image onto the analytically derived orthogonal basis of vibrational mode shapes
(low-frequency sinusoidal patterns governed by the biharmonic equation), DRIFT yields compact,
interpretable, and intrinsically smooth features that emphasize energetically dominant spatial deformations
while suppressing high-frequency noise. Extensive experiments on MNIST, CIFAR-100, and
CelebA demonstrate that DRIFT enables classifiers to achieve equal or superior test accuracy compared
to raw pixels, PCA, DCT, and convolutional autoencoders, while using dramatically fewer
features. DRIFT consistently exhibits smaller generalization gaps, smoother training trajectories,
and markedly reduced sensitivity to noise perturbations. These gains arise from the physical prior
of smoothness and boundary compatibility, which imposes an explicit inductive bias toward generalizable,
low-energy image structure. To our knowledge, DRIFT is the first method to successfully
leverage classical vibration mode analysis for machine learning feature extraction, opening a principled,
data-efficient avenue for physics-informed representation learning.
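The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the textbook mode shapes of a simply supported rectangular plate, sin(m*pi*x) * sin(n*pi*y), sampled at interior grid points so that the discrete basis is orthonormal. The function names (`sine_modes`, `drift_features`, `reconstruct`), the normalization, and the choice of keeping the lowest `n_modes` modes per axis are all assumptions made for illustration.

```python
import numpy as np

def sine_modes(n_points, n_modes):
    """Return an (n_modes, n_points) matrix whose rows are orthonormal
    1-D modes sin(m*pi*x), sampled at interior points x = k/(n_points+1).
    Assumed discretization; requires n_modes <= n_points."""
    x = np.arange(1, n_points + 1) / (n_points + 1)
    m = np.arange(1, n_modes + 1)
    return np.sqrt(2.0 / (n_points + 1)) * np.sin(np.pi * np.outer(m, x))

def drift_features(image, n_modes):
    """Project a 2-D image onto the lowest n_modes x n_modes plate modes
    and return the coefficients as a flat feature vector."""
    h, w = image.shape
    basis_y = sine_modes(h, n_modes)        # (n_modes, h)
    basis_x = sine_modes(w, n_modes)        # (n_modes, w)
    coeffs = basis_y @ image @ basis_x.T    # (n_modes, n_modes)
    return coeffs.ravel()

def reconstruct(features, shape, n_modes):
    """Map a feature vector back to a smooth image of the given shape."""
    h, w = shape
    basis_y = sine_modes(h, n_modes)
    basis_x = sine_modes(w, n_modes)
    return basis_y.T @ features.reshape(n_modes, n_modes) @ basis_x

# Example: a 28x28 image reduced from 784 pixels to 100 modal coefficients.
rng = np.random.default_rng(0)
image = rng.random((28, 28))
features = drift_features(image, n_modes=10)
smooth = reconstruct(features, image.shape, n_modes=10)
```

Because the separable sine modes are orthonormal under this sampling, keeping only the low-order coefficients acts as a low-pass projection, which is one way to read the abstract's claim about suppressing high-frequency noise. The number of retained modes is a free parameter here; the abstract does not state the value used in the experiments.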
Supplementary Material: pdf
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 543