AutoTable: Effective and Efficient Automated Feature Transformation for Tabular Data

Shanshan Huang, Junpeng Zhu, Fengyan Zhang, Peng Cai, Qiwen Dong

Published: 2024, Last Modified: 26 Jul 2025, WISE (1) 2024, CC BY-SA 4.0

Abstract: Mission-critical data are commonly organized as tables within relational databases, and feature transformation from these tabular data is a pivotal component of the machine learning pipeline for business intelligence. However, automating this process poses a significant challenge, primarily because the search space grows exponentially as the number of features (i.e., columns or dimensions of the table) and transform functions increases. This paper presents AutoTable, an effective and efficient feature transformation framework for tabular data using reinforcement learning. Specifically, AutoTable formulates the feature transformation problem as a search process carried out on a transformation tree, which offers a better-structured search space and facilitates fine-grained exploration, empowering domain experts to perform automated feature transformation (AFT) tasks with minimal statistical and machine learning expertise. To further improve search performance, we propose merging and lazy loading mechanisms. Experimental results demonstrate that AutoTable outperforms state-of-the-art approaches in terms of both efficiency and effectiveness.
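
To make the search-space growth described in the abstract concrete, the following is a minimal sketch of a transformation tree. It is an illustration only, not AutoTable's implementation. The abstract does not specify the node structure, the set of transform functions, or how the reinforcement learning agent, merging, and lazy loading work, so everything named here (`TRANSFORMS`, `Node`, `expand`, `count_nodes`) is a hypothetical assumption. Each node is assumed to hold one table state, and each edge applies one unary transform to one column to derive a new column.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical unary transform functions (assumption: the abstract does not
# list the functions AutoTable actually uses).
TRANSFORMS: Dict[str, Callable[[float], float]] = {
    "log1p": lambda x: math.log1p(abs(x)),
    "square": lambda x: x * x,
    "sqrt": lambda x: math.sqrt(abs(x)),
}


@dataclass
class Node:
    """One node of an illustrative transformation tree: a single table state."""
    columns: Dict[str, List[float]]
    children: List["Node"] = field(default_factory=list)


def expand(node: Node) -> List[Node]:
    """Build one child per (column, transform) pair that yields a new column."""
    children = []
    for name, values in node.columns.items():
        for t_name, fn in TRANSFORMS.items():
            new_name = f"{t_name}({name})"
            if new_name in node.columns:
                continue  # this derived column already exists in the table
            child_columns = dict(node.columns)
            child_columns[new_name] = [fn(v) for v in values]
            children.append(Node(child_columns))
    node.children = children
    return children


def count_nodes(root: Node, depth: int) -> int:
    """Exhaustively expand the tree to `depth` levels and count its nodes."""
    if depth == 0:
        return 1
    return 1 + sum(count_nodes(child, depth - 1) for child in expand(root))


if __name__ == "__main__":
    root = Node({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
    for d in range(4):
        print(d, count_nodes(root, d))
```

With two columns and three transforms, the root has 6 children and each of those has 8, so the tree holds 7 nodes at depth 1 and 55 at depth 2. Exhaustive expansion of this kind quickly becomes impractical, which is the motivation the abstract gives for a guided search. The sketch also creates the same table state along different paths (for example, deriving `log1p(a)` then `square(b)`, or the reverse order), and it makes no attempt to reproduce the paper's merging or lazy loading mechanisms.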