Dynamic Pricing with Monotonicity Constraint under Unknown Parametric Demand ModelDownload PDF

Published: 31 Oct 2022, Last Modified: 12 Oct 2022NeurIPS 2022 AcceptReaders: Everyone
Keywords: dynamic pricing, monotonicity constraint, multi-armed bandits, online learning, markdown pricing
TL;DR: We present optimal regret bounds for markdown pricing with unknown parametric demand functions, improving upon previous result without the parametric assumption
Abstract: We consider the Continuum Bandit problem where the goal is to find the optimal action under an unknown reward function, with an additional monotonicity constraint (or, "markdown" constraint) that requires that the action sequence be non-increasing. This problem faithfully models a natural single-product dynamic pricing problem, called "markdown pricing", where the objective is to adaptively reduce the price over a finite sales horizon to maximize expected revenues. Jia et al '21 and Chen '21 independently showed a tight $T^{3/4}$ regret bound over $T$ rounds under *minimal* assumptions of unimodality and Lipschitzness in the reward (or, "revenue") function. This bound shows that the demand learning in markdown pricing is harder than unconstrained (i.e., without the monotonicity constraint) pricing under unknown demand which suffers regret only of the order of $T^{2/3}$ under the same assumptions (Kleinberg '04). However, in practice the demand functions are usually assumed to have certain functional forms (e.g. linear or exponential), rendering the demand-learning easier and suggesting lower regret bounds. We investigate two fundamental questions, assuming the underlying demand curve comes from a given parametric family: (1) Can we improve the $T^{3/4}$ regret bound for markdown pricing, under extra assumptions on the functional forms of the demand functions? (2) Is markdown pricing still harder than unconstrained pricing, under these additional assumptions? To answer these, we introduce a concept called markdown dimension that measures the complexity of the parametric family and present tight regret bounds under this framework, thereby completely settling the aforementioned questions.
Supplementary Material: pdf
13 Replies