FD-Loss: Supervised Feature Decorrelation as a Scale-Invariant Replacement for Random Dropout

Ashraf Hamid Mojumder; Noor Ahmad Faiz Khan; Khandaker Mohammad Mohi Uddin

Published: 14 Jun 2026, Last Modified: 21 Jun 2026

Venue: ICML 2026 Workshop MusIML Poster

License: CC BY 4.0

Keywords: Neural Network Regularization, Feature Decorrelation, Cross-Correlation Matrix, Dropout Techniques, Representation Learning

Abstract: Standard random dropout regularizes neural networks by stochastically deactivating units, yet it remains blind to representational redundancy: when two neurons converge on identical features, masking one of them generates no corrective gradient toward diversity. We propose Feature Decorrelation Loss (FD-Loss), a supervised regularization objective that explicitly penalizes the off-diagonal entries of the per-feature-normalized cross-correlation matrix of hidden activations. A mandatory per-feature ℓ2 normalization step bounds all correlation values to [−1, +1] and thereby resolves the gradient instability that caused prior covariance penalties (e.g., DeCov) to diverge on unscaled tabular data. Extensive evaluation across 20 datasets spanning tabular, image, and text domains shows that FD-Loss achieves a 65% win rate over dropout, with accuracy improvements of up to +5.35 pp on correlated tabular benchmarks and +4.12 pp on complex visual hierarchies, while incurring negligible computational overhead.
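The penalty described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `cross_correlation` and `fd_loss`, the batch-mean centering, the `eps` stabilizer, and the choice of a mean of squared off-diagonal entries are all assumptions made for illustration, since the abstract states only that off-diagonal entries of a per-feature ℓ2-normalized cross-correlation matrix are penalized.

```python
import numpy as np


def cross_correlation(h, eps=1e-8):
    """Per-feature-normalized cross-correlation of activations (hypothetical helper).

    h: array of shape (batch, features) holding hidden activations.
    Returns a (features, features) matrix with entries in [-1, +1].
    """
    h = np.asarray(h, dtype=np.float64)
    # Assumption: center each feature over the batch before normalizing.
    centered = h - h.mean(axis=0, keepdims=True)
    # Per-feature l2 normalization: every column is scaled to (almost) unit norm,
    # so the result does not depend on the original scale of each feature.
    norms = np.linalg.norm(centered, axis=0, keepdims=True)
    normalized = centered / (norms + eps)
    return normalized.T @ normalized


def fd_loss(h, eps=1e-8):
    """Mean squared off-diagonal correlation (assumed reduction, not from the paper)."""
    corr = cross_correlation(h, eps)
    d = corr.shape[0]
    # Zero the diagonal so only correlations between distinct features are penalized.
    off_diag = corr - np.diag(np.diag(corr))
    return float((off_diag ** 2).sum() / max(d * (d - 1), 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.normal(size=(64, 4))
    duplicated = np.stack([h[:, 0], h[:, 0]], axis=1)  # two identical features
    print("independent features:", fd_loss(h))
    print("duplicated features: ", fd_loss(duplicated))
    print("rescaled features:   ", fd_loss(h * np.array([1.0, 10.0, 1e3, 1e-2])))
```

Under these assumptions, two identical features give a penalty close to 1, and rescaling individual features leaves the penalty essentially unchanged, which is the scale invariance the title refers to. In training, such a term would typically be added to the task loss with a weighting coefficient; that coefficient and where in the network the penalty is applied are not specified in the abstract.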

Track: Track 2: ML Research by Muslim Authors

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Non Archival Confirmation: I understand that submissions to MusIML are non-archival and can be submitted to other venues.

Submission Number: 14
