forester: A Novel Approach to Accessible and Interpretable AutoML for Tree-Based Modeling

Published: 15 Aug 2023, Last Modified: 15 Aug 2023 | AutoML 2023 (ABCD Track) | asistoworkshop | Readers: Everyone
TL;DR: An AutoML package in R for regression and binary classification on tabular data, focused on ease of use and interpretability of the results.
Abstract: Most AutoML solutions are developed in Python, yet a large share of data scientists work in R. The few AutoML solutions available in R have a high barrier to entry, which makes them inaccessible to many users. To fill this gap, we present the $\textit{forester}$ package, which is easy to use regardless of the user's proficiency in machine learning. $\textit{forester}$ is an open-source AutoML package implemented in R and designed for training high-quality tree-based models on tabular data. It supports regression and binary classification tasks. With a single line of code, the user can pass in an unprocessed dataset; the package reports potential issues with the data and handles feature engineering automatically. Hyperparameter tuning is performed with Bayesian optimization, which provides high-quality outcomes. The results are returned as a ranked list of models. Finally, $\textit{forester}$ generates an extensive training report that includes the ranked list, a comparison of the trained models, and explanations for the best one.
Keywords: machine learning, automated machine learning, tree-based models, automated reporting
Abcd Fit: Applications
Submission Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
CPU Hours: 0
GPU Hours: 0
TPU Hours: 0