Keywords: constraint solving, algorithm selection, LLM, combinatorial optimization, feature extraction
Abstract: Feature engineering remains a critical bottleneck in machine learning, often requiring significant manual effort and domain expertise. While end-to-end deep learning models can automate this process by learning latent representations, they do so at the cost of interpretability. We propose a gray-box paradigm for automated feature engineering that leverages Large Language Models for program synthesis. Our framework treats the LLM as a meta-learner that, given a high-level description of a constraint optimization problem, generates executable Python scripts that function as interpretable feature extractors. These scripts construct symbolic graph representations and compute structural properties, combining the generative power of LLMs with the transparency of classical features. We validate our approach on algorithm selection across 227 combinatorial problem classes. Our synthesized feature extractors achieve 58.8\% accuracy, significantly outperforming the 48.6\% achieved by human-engineered extractors, establishing program synthesis as an effective approach to automating the ML pipeline.
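Illustrative sketch (not taken from the submission): the abstract describes synthesized Python scripts that build symbolic graph representations of a constraint problem and compute structural properties. The example below shows what such an extractor could plausibly look like; the variable-constraint incidence graph, the specific features, and the function name `extract_features` are assumptions for illustration only.

```python
# Hypothetical example of an LLM-synthesized feature extractor.
# Assumes a constraint problem given as a list of variables and a list of
# constraint scopes (each scope = indices of the variables it touches).
import networkx as nx

def extract_features(variables, constraints):
    """Build a variable-constraint incidence graph and return structural features."""
    g = nx.Graph()
    g.add_nodes_from(f"v{i}" for i in range(len(variables)))
    for j, scope in enumerate(constraints):
        g.add_node(f"c{j}")
        g.add_edges_from((f"c{j}", f"v{i}") for i in scope)

    degrees = [d for _, d in g.degree()]
    return {
        "num_variables": len(variables),
        "num_constraints": len(constraints),
        "graph_density": nx.density(g),
        "mean_degree": sum(degrees) / len(degrees) if degrees else 0.0,
        "max_degree": max(degrees, default=0),
        "avg_clustering": nx.average_clustering(g),
    }
```

Such a script is transparent in the sense the abstract emphasizes: each returned feature is a named, human-readable structural property of the problem instance rather than a latent embedding.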
Supplementary Material: zip
Primary Area: optimization
Submission Number: 24962