PDFormer: Progressive Dual-Head Transformer for Behavioral Choice Prediction

ICLR 2026 Conference Submission 18435 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Behavioral choice prediction, Progressive Dual-Head Transformer, Urban Mobility, Tabular data

Abstract: Many applications require joint prediction of interdependent behavioral choices, yet existing models often treat each choice independently (e.g., through parallel prediction heads), overlooking the influence of one choice on the other. In this work, we propose the Progressive Dual-Head Transformer (PDFormer), a novel framework that performs two-step prediction: the model first estimates one choice and then conditions the second on this upstream estimate through an explicit head-to-head pathway. A shared encoder captures the common structure of the two prediction tasks, while the dual-head module explicitly reflects cross-choice dependence. A gated residual mechanism integrated into the embedding layer and the dual-head modules further improves training stability and prediction performance. Extensive experiments on an urban mobility behavioral choice dataset and a real-world manufacturing dataset demonstrate that PDFormer consistently outperforms state-of-the-art machine learning models, deep tabular models, and parallel-head Transformer variants across multiple metrics. Moreover, our ablation study confirms that both the proposed progressive dual-head design and the gated residual mechanism are key contributors to the observed gains across the prediction tasks.
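
The two-step prediction described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the shared Transformer encoder is omitted (its output `h` is taken as given), and the gate form, the tanh transform, the layer sizes, and all names (`progressive_forward`, `gated_residual`, `make_params`) are assumptions made only for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_residual(x, fx, gate_logits):
    """Blend transformed features fx with the input x, element by element.

    Assumed form: g * fx + (1 - g) * x with g = sigmoid(gate_logit).
    """
    return [sigmoid(g) * f + (1.0 - sigmoid(g)) * xi
            for xi, f, g in zip(x, fx, gate_logits)]


def make_params(d, k1, k2, seed=0):
    """Random weights for a toy model (hypothetical sizes, untrained)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]

    m = d + k1  # downstream head sees h plus the upstream probabilities
    return {
        "W1": mat(k1, d), "b1": [0.0] * k1,
        "Wf": mat(m, m), "bf": [0.0] * m,
        "Wg": mat(m, m), "bg": [0.0] * m,
        "W2": mat(k2, m), "b2": [0.0] * k2,
    }


def progressive_forward(h, params):
    """Two-step prediction from a shared representation h."""
    # Step 1: the upstream head estimates the first choice.
    p1 = softmax(linear(params["W1"], params["b1"], h))
    # Head-to-head pathway: concatenate h with the upstream estimate.
    z = h + p1
    # Gated residual block on the downstream features (assumed design).
    fz = [math.tanh(v) for v in linear(params["Wf"], params["bf"], z)]
    gate = linear(params["Wg"], params["bg"], z)
    z = gated_residual(z, fz, gate)
    # Step 2: the downstream head predicts the second choice.
    p2 = softmax(linear(params["W2"], params["b2"], z))
    return p1, p2


params = make_params(d=4, k1=3, k2=2)
p1, p2 = progressive_forward([0.2, -0.1, 0.4, 0.3], params)
print(p1, p2)
```

A parallel-head baseline, by contrast, would compute both outputs directly from `h`, so the second prediction would never see `p1`.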

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Submission Number: 18435