Abstract: In this work, we develop machine learning algorithms for pricing used vehicles for B2B sales at Volkswagen Financial Services. The B2B pricing dataset is structured as tabular data; however, unlike commonly available tabular datasets, it is also time-indexed. To exploit the temporal component and improve prediction accuracy, we design a sequential network that processes sequences of vehicles. We transform the data from a tabular representation to a sequential representation by appending previously sold vehicles and their prices as additional features for the target vehicle to be priced. The sequential network, PriceNet, embeds each vehicle in the sequence through a dedicated embedding module and uses a series of convolutional layers that learn sequential features relating to price trends and seasonalities. We show that PriceNet can improve performance over the state-of-the-art deep-learning-based tabular baselines, Tab-Transformer and FT-Transformer. Additionally, this paper covers related aspects such as chronological validation strategies for tuning the hyperparameters of models for time-indexed data. Notably, we also observed that Gradient Boosted Decision Tree algorithms outperformed all the other models in prediction accuracy. For this class of models, we therefore also designed a new quantile level tuning approach that tunes the quantile level based on out-of-sample chronological validation data. By tuning the quantile level, we can probabilistically determine whether to overshoot or undershoot in the case of temporal covariate shifts for out-of-sample testing data observed after the validation split.
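To make the tabular-to-sequential transformation and the chronological validation split more concrete, the following is a minimal sketch in plain Python, not the implementation used in the paper. The record layout (`sold_on`, `features`, `price`), the function names `build_sequences` and `chronological_split`, and the fixed history length are all assumptions introduced here for illustration.

```python
from datetime import date


def build_sequences(sales, history_len):
    """Turn time-indexed tabular rows into sequence samples.

    Each sample pairs a target vehicle with the `history_len` vehicles
    sold immediately before it, including their prices (hypothetical layout).
    """
    # Sort by sale date so that "previous" means earlier in time.
    # Same-day sales keep their input order because sorted() is stable.
    ordered = sorted(sales, key=lambda s: s["sold_on"])
    samples = []
    for i in range(history_len, len(ordered)):
        history = ordered[i - history_len:i]
        target = ordered[i]
        samples.append({
            # Past vehicles contribute both features and observed prices.
            "history": [(h["features"], h["price"]) for h in history],
            # The target contributes features only; its price is the label.
            "target_features": target["features"],
            "target_price": target["price"],
        })
    return samples


def chronological_split(samples, n_val):
    """Hold out the `n_val` most recent samples for validation (no shuffling)."""
    if not 0 < n_val < len(samples):
        raise ValueError("n_val must be between 1 and len(samples) - 1")
    return samples[:-n_val], samples[-n_val:]


# Toy data, made up purely for illustration.
sales = [
    {"sold_on": date(2023, 3, 1), "features": {"mileage_km": 40000}, "price": 15000.0},
    {"sold_on": date(2023, 1, 1), "features": {"mileage_km": 55000}, "price": 14000.0},
    {"sold_on": date(2023, 2, 1), "features": {"mileage_km": 48000}, "price": 14500.0},
    {"sold_on": date(2023, 5, 1), "features": {"mileage_km": 30000}, "price": 16000.0},
    {"sold_on": date(2023, 4, 1), "features": {"mileage_km": 35000}, "price": 15500.0},
]
samples = build_sequences(sales, history_len=2)
train, val = chronological_split(samples, n_val=1)
```

Because samples are ordered by the sale date of the target vehicle, the validation set contains only vehicles sold after every training target, which mirrors the out-of-sample setting described above. In a real pipeline the history would feed the embedding and convolutional layers rather than remain a Python list.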