Abstract: Interpretable models can have advantages over black-box models, and interpretability is essential for the application of machine learning in critical settings such as aviation or medicine. In this work, we introduce the LASSO-Clip-EN (LCEN) algorithm for nonlinear, interpretable feature selection and machine learning modeling. Tested on a wide variety of artificial and empirical datasets, LCEN creates sparse models that are frequently more accurate than those of other methods, including other sparse, nonlinear methods. LCEN is robust against many issues typically present in datasets and modeling, including noise, multicollinearity, and data scarcity. As a feature selection algorithm, LCEN matches or surpasses the thresholded elastic net while being, on average, 10.3-fold faster in our experiments. LCEN for feature selection can also rediscover multiple physical laws from empirical data. As a machine learning algorithm, when tested on processes with no known physical laws, LCEN achieves better results than many other dense and sparse methods, and is comparable to or better than ANNs on multiple datasets.
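The name LASSO-Clip-EN suggests a three-stage pipeline: a LASSO fit, a clipping step that discards features with small coefficients, and an elastic net refit on the survivors. The sketch below illustrates that general structure only; the function name `lcen_sketch`, the clipping threshold, and all hyperparameter values are illustrative assumptions, not the paper's actual algorithm or settings (which also include nonlinear feature expansion and cross-validated hyperparameter search).

```python
# Hedged sketch of a LASSO -> clip -> elastic net selection pipeline.
# All names, thresholds, and hyperparameters below are assumptions for
# illustration; they are not taken from the LCEN paper.
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

def lcen_sketch(X, y, clip_threshold=1e-2, lasso_alpha=0.1,
                en_alpha=0.1, en_l1_ratio=0.5):
    """Three-stage selection: LASSO fit, coefficient clipping,
    then an elastic net refit on the surviving features."""
    # Stage 1: LASSO produces a sparse initial coefficient vector.
    lasso = Lasso(alpha=lasso_alpha).fit(X, y)
    # Stage 2: clip -- drop features whose |coefficient| is below threshold.
    keep = np.abs(lasso.coef_) >= clip_threshold
    if not keep.any():
        return keep, None
    # Stage 3: refit an elastic net using only the retained features.
    en = ElasticNet(alpha=en_alpha, l1_ratio=en_l1_ratio).fit(X[:, keep], y)
    return keep, en

# Toy usage: y depends on only the first two of ten features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)
keep, model = lcen_sketch(X, y)
print(keep)  # boolean mask of selected features
```

On this toy problem the pipeline should retain the two informative features while clipping away most or all of the noise features, which is the kind of sparse selection behavior the abstract describes.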
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Stephen_Becker1
Submission Number: 5392