Logarithmic Regret in Feature-based Dynamic PricingDownload PDF

21 May 2021, 20:48 (modified: 24 Jan 2022, 18:08)NeurIPS 2021 SpotlightReaders: Everyone
Keywords: dynamic pricing, online learning, adversarial features, optimal regret, affine invariant, distribution-free.
TL;DR: We present algorithms that guarantees logarithmic (minimax) regrets in both stochastic and adversarial feature-based dynamic pricing problems with market noises.
Abstract: Feature-based dynamic pricing is an increasingly popular model of setting prices for highly differentiated products with applications in digital marketing, online sales, real estate and so on. The problem was formally studied as an online learning problem [Javanmard & Nazerzadeh, 2019] where a seller needs to propose prices on the fly for a sequence of $T$ products based on their features $x$ while having a small regret relative to the best ---"omniscient"--- pricing strategy she could have come up with in hindsight. We revisit this problem and provide two algorithms (EMLP and ONSP) for stochastic and adversarial feature settings, respectively, and prove the optimal $O(d\log{T})$ regret bounds for both. In comparison, the best existing results are $O\left(\min\left\{\frac{1}{\lambda_{\min}^2}\log{T}, \sqrt{T}\right\}\right)$ and $O(T^{2/3})$ respectively, with $\lambda_{\min}$ being the smallest eigenvalue of $\mathbb{E}[xx^T]$ that could be arbitrarily close to $0$. We also prove an $\Omega(\sqrt{T})$ information-theoretic lower bound for a slightly more general setting, which demonstrates that "knowing-the-demand-curve" leads to an exponential improvement in feature-based dynamic pricing.
Supplementary Material: pdf
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
Code: https://github.com/Xu-JY/log-regret-in-feature-based-dynamic-pricing
14 Replies