PromptFE: Automated Feature Engineering by Prompting

ICLR 2026 Conference Submission24755 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Automated Feature Engineering, Large Language Models
TL;DR: A novel AutoFE framework that leverages LLMs to automatically construct features in a compact string format and generate semantic explanations utilizing dataset descriptions and performance feedback.
Abstract: Automated feature engineering (AutoFE) liberates data scientists from the burden of manual feature construction. The semantic information of a dataset provides rich context for feature engineering, yet many existing AutoFE works underutilize it. We present PromptFE, a novel AutoFE framework that leverages large language models (LLMs) to automatically construct features in a compact string format and to generate semantic explanations based on dataset descriptions. By learning the performance of previously constructed features in context, the LLM iteratively improves its feature construction. Experiments on real-world datasets show that PromptFE outperforms state-of-the-art AutoFE methods. We also verify the impact of dataset semantic information and provide a comprehensive study of the LLM-based feature construction process.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 24755
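
The loop described in the abstract (features written as compact strings, scored, then fed back to the LLM as in-context examples) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the submission: the postfix (reverse Polish) string format, the operator table, the helper names `eval_feature`, `abs_corr`, and `build_prompt`, and the use of absolute Pearson correlation as a stand-in for a downstream model's validation score. No LLM is called; a fixed list of candidate strings plays that role.

```python
import math

# Hypothetical operator table; the submission's actual operator set is not stated here.
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if b != 0 else 0.0,  # arbitrary guard against division by zero
}
UNARY = {
    "log": lambda a: math.log(abs(a) + 1.0),
    "sqrt": lambda a: math.sqrt(abs(a)),
}


def eval_feature(expr, columns):
    """Evaluate a space-separated postfix feature string element-wise over columns."""
    n_rows = len(next(iter(columns.values())))
    stack = []
    for tok in expr.split():
        if tok in columns:
            stack.append(list(columns[tok]))
        elif tok in BINARY:
            right, left = stack.pop(), stack.pop()
            stack.append([BINARY[tok](l, r) for l, r in zip(left, right)])
        elif tok in UNARY:
            stack.append([UNARY[tok](v) for v in stack.pop()])
        else:
            stack.append([float(tok)] * n_rows)  # treat anything else as a numeric constant
    if len(stack) != 1:
        raise ValueError(f"malformed expression: {expr!r}")
    return stack[0]


def abs_corr(xs, ys):
    """Absolute Pearson correlation; a stand-in score, not the paper's metric."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def build_prompt(description, history, top_k=3):
    """Combine the dataset description with the best-scoring features so far."""
    ranked = sorted(history, key=lambda item: item[1], reverse=True)[:top_k]
    lines = [description, "Previously evaluated features (expression -> score):"]
    lines += [f"{expr} -> {score:.3f}" for expr, score in ranked]
    lines.append("Propose a new feature in the same format.")
    return "\n".join(lines)


# Toy data: the target equals weight / height^2, so the third candidate matches it.
columns = {"weight": [60.0, 80.0, 50.0, 90.0], "height": [2.0, 2.0, 1.0, 1.5]}
target = [15.0, 20.0, 50.0, 40.0]

history = []
for expr in ["weight", "weight height /", "weight height height * /"]:
    history.append((expr, abs_corr(eval_feature(expr, columns), target)))

prompt = build_prompt("Toy dataset: weight in kg, height in m.", history)
```

In a real system the resulting `prompt` would be sent to an LLM and its reply parsed into the next candidate string. The string format keeps each in-context example to a single short line, which is the practical appeal of a compact representation.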