LLM-OAP: LLM-based Data Augmentation Framework for Enhancing Order Acceptance Prediction in Mobility-on-Demand Systems

ICLR 2026 Conference Submission17621 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Mobility, LLM, Data Augmentation

Abstract: In Mobility-on-Demand (MoD) systems, drivers' order acceptance behaviour directly influences matching and pricing, and thus overall system efficiency. Traditional discrete choice models rely on pre-specified utility functions and error structures, which introduces specification risk and limits their ability to capture complex nonlinear interactions and correlated choices, reducing their effectiveness for modelling driver decisions. Meanwhile, behavioural data often come from stated-preference (SP) surveys; these datasets are typically small and based on hypothetical responses, which can be subjective, limiting external validity and, in turn, predictive performance and generalisability. This paper proposes LLM-OAP, a novel framework that integrates large language model (LLM)-based data augmentation with machine learning (ML) to improve the estimation of drivers' order acceptance behaviour. The method uses an LLM to generate synthetic samples from the real SP data and applies a curation scheme to mitigate implausibility, reduce bias, and maintain diversity. The augmented dataset is then used to train ML models that are not bound to a fixed utility specification. Evaluations on two types of SP datasets (covering full- and limited-information settings) show that the framework significantly improves the performance of state-of-the-art ML models in order acceptance behaviour estimation while maintaining good generalisability and explainability.

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 17621
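
The abstract describes a curation step applied to LLM-generated samples before they are merged with the real SP data. The abstract does not specify how that step works, so the following is only a minimal sketch under stated assumptions, not the authors' method: the `Order` fields, the plausibility bounds, the per-class cap, and the function names are all hypothetical. It illustrates one simple reading of the three stated goals: a range check for implausibility, a per-label cap as a crude bias control, and exact-duplicate removal as a stand-in for diversity. The LLM call itself is omitted; `synthetic` stands in for its output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """One SP observation. All fields are hypothetical, for illustration only."""
    fare: float       # offered fare
    pickup_km: float  # distance from driver to pickup point
    accepted: bool    # driver's stated choice


def is_plausible(o: Order) -> bool:
    # Assumed bounds: positive fare, pickup distance within 0-50 km.
    return o.fare > 0 and 0 <= o.pickup_km <= 50


def curate(real, synthetic, max_ratio=1.0):
    """Keep plausible, non-duplicate synthetic rows, capped per class label.

    The cap limits synthetic rows of each label to max_ratio times the
    number of real rows with that label.
    """
    seen = set(real)
    cap = {
        label: int(max_ratio * sum(r.accepted == label for r in real))
        for label in (True, False)
    }
    kept = []
    for s in synthetic:
        if not is_plausible(s) or s in seen or cap[s.accepted] <= 0:
            continue
        seen.add(s)
        cap[s.accepted] -= 1
        kept.append(s)
    return kept


real = [Order(10.0, 2.0, True), Order(5.0, 8.0, False)]
# Stand-in for LLM output: one implausible row, one duplicate, one over the cap.
synthetic = [
    Order(12.0, 1.5, True),
    Order(-3.0, 1.0, True),   # implausible: negative fare
    Order(10.0, 2.0, True),   # duplicate of a real row
    Order(6.0, 7.0, False),
    Order(13.0, 1.0, True),   # exceeds the cap for accepted=True
]
kept = curate(real, synthetic)
augmented = list(real) + kept
```

In this toy run, two of the five synthetic rows survive curation, so `augmented` holds four rows that could then be passed to any downstream classifier.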
