LIT-LVM: Structured Regularization for Interaction Terms in Linear Predictors using Latent Variable Models
Abstract: Some of the simplest, yet most frequently used, predictors in statistics and machine learning are weighted linear combinations of features. Such linear predictors can model non-linear relationships by adding interaction terms corresponding to the products of all pairs of features. We consider the problem of accurately estimating the coefficients for these interaction terms. We hypothesize that the coefficients for different interaction terms have an approximate low-dimensional structure and represent each feature by a latent vector in a low-dimensional space. This low-dimensional representation can be viewed as a structured regularization approach that further mitigates overfitting in high-dimensional settings beyond standard regularizers such as the lasso and elastic net. We demonstrate that our approach, called LIT-LVM, achieves superior prediction accuracy compared to the elastic net, hierarchical lasso, and factorization machines on a wide variety of simulated and real data, particularly when the number of interaction terms is large relative to the number of samples. LIT-LVM also provides low-dimensional latent representations of features that are useful for visualizing and analyzing their relationships.
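To make the idea concrete, below is a minimal sketch (not the authors' released code) of a linear predictor with pairwise interaction terms whose coefficient matrix Theta is softly pulled toward a low-rank factorization V Vᵀ built from per-feature latent vectors, on top of a standard elastic-net penalty. The latent dimension k, the penalty weights, the initialization, and the exact form of the structured regularizer are illustrative assumptions rather than the paper's precise formulation; see the linked repository for the actual implementation.

```python
import torch

d, k = 20, 3                                     # number of features, latent dimension (assumed)
w = torch.zeros(d, requires_grad=True)           # main-effect coefficients
Theta = torch.zeros(d, d, requires_grad=True)    # pairwise interaction coefficients
V = (0.01 * torch.randn(d, k)).requires_grad_()  # one latent vector per feature (assumed init)

def predict(X):
    # Linear term plus sum_{i<j} Theta_ij * x_i * x_j,
    # using only the strict upper triangle of Theta.
    quad = torch.einsum('bi,ij,bj->b', X, torch.triu(Theta, diagonal=1), X)
    return X @ w + quad

def loss(X, y, lam_l1=1e-3, lam_l2=1e-3, lam_lvm=1e-2):
    mse = ((predict(X) - y) ** 2).mean()
    # Standard elastic-net penalty on main effects and interactions.
    enet = lam_l1 * (w.abs().sum() + Theta.abs().sum()) \
         + lam_l2 * (w.pow(2).sum() + Theta.pow(2).sum())
    # Structured regularizer: encourage the interaction coefficients to
    # stay close to the low-rank structure V V^T implied by the latent vectors.
    lvm = lam_lvm * (torch.triu(Theta - V @ V.T, diagonal=1) ** 2).sum()
    return mse + enet + lvm

# All three parameter groups can be fit jointly with any gradient-based optimizer:
opt = torch.optim.Adam([w, Theta, V], lr=1e-2)
```

Unlike factorization machines, which hard-code the interaction coefficients as inner products of latent vectors, a soft penalty of this kind lets Theta deviate from the low-rank structure where the data supports it, matching the abstract's framing of the latent representation as a regularizer rather than a constraint.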
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: 1. We have replaced the link to the anonymized code repository in Section 4 with a link to a de-anonymized and publicly accessible repository: https://github.com/MLNS-Lab/LIT-LVM
2. We have added discussion of the hierarchical lasso (Bien et al., 2013) in the introduction and related work sections. We also added experimental results using the hierarchical lasso on the OpenML datasets in Tables 1 and 2, along with discussion in Section 5.1.
3. We renamed Section 2.2.3 of the related work to "Factorization Machines (FMs) and Related Models" and added discussion of factorized structured regression (Rügamer et al., 2022) and additive higher-order factorization machines (Rügamer, 2024), explaining how they relate to our proposed work.
4. We provided more information on the baseline methods we use for comparison in Section 5.1, arranging the models in increasing order of flexibility, and expanded the discussion on hyperparameter tuning, which we moved to Section C.1.2 of the appendix.
Code: https://github.com/MLNS-Lab/LIT-LVM
Supplementary Material: zip
Assigned Action Editor: ~David_Rügamer1
Submission Number: 5087