Stochastic Optimization Schemes for Performative Prediction with Nonconvex Loss

Qiang LI, Hoi To Wai

Published: 25 Sept 2024, Last Modified: 07 Oct 2024, OpenReview Archive Direct Upload, Everyone, CC BY 4.0

Abstract: This paper studies a risk minimization problem with a decision-dependent data distribution. The problem pertains to the performative prediction setting, where a trained model can affect the outcome that the model estimates. Such dependency creates a feedback loop that influences the stability of optimization algorithms such as stochastic gradient descent (SGD). We present the first study on performative prediction with smooth but possibly nonconvex loss. We analyze a greedy deployment scheme with SGD (SGD-GD), which in the literature is typically studied with strongly convex loss. We first define stationary performative stable (SPS) solutions by relaxing the popular performative stable condition. We then prove that SGD-GD converges to a biased SPS solution in expectation. We consider two sensitivity conditions on the distribution shifts: (i) the sensitivity is characterized by the Wasserstein-1 distance and the loss is Lipschitz w.r.t. data samples, or (ii) the sensitivity is characterized by the $\chi^2$-divergence and the loss is bounded. Under both conditions, the bias level is proportional to the stochastic gradient's variance and the sensitivity level. Our analysis is extended to a lazy deployment scheme, where models are deployed once per several SGD updates, and we show that it converges to a bias-free SPS solution. Numerical experiments corroborate our theories.
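The two deployment schemes named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy distribution map D(theta) = N(mu + eps * theta, sigma^2), the quadratic loss, the step size, and the function names `sample`, `grad_loss`, and `sgd_deploy` are all hypothetical. The quadratic loss is convex, unlike the paper's setting; it is used only because its performative stable point mu / (1 - eps) is known in closed form. Setting `updates_per_deployment=1` gives a greedy scheme (deploy after every SGD update), while a larger value gives a lazy scheme (deploy once per several updates).

```python
import numpy as np

# Hypothetical toy parameters (not from the paper).
MU, EPS, SIGMA = 1.0, 0.5, 0.1

def sample(theta_deployed, rng):
    """Draw one sample from the assumed shifted distribution D(theta_deployed)."""
    return MU + EPS * theta_deployed + SIGMA * rng.standard_normal()

def grad_loss(theta, z):
    """Gradient in theta of the toy loss 0.5 * (theta - z)**2."""
    return theta - z

def sgd_deploy(num_deployments, updates_per_deployment, step, seed=0):
    """SGD in which data are drawn from the distribution induced by the
    most recently deployed model."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(num_deployments):
        deployed = theta  # deploy the current iterate; data now follow D(deployed)
        for _ in range(updates_per_deployment):
            z = sample(deployed, rng)
            theta -= step * grad_loss(theta, z)
    return theta

# Closed-form performative stable point of this toy problem.
theta_ps = MU / (1.0 - EPS)

theta_greedy = sgd_deploy(num_deployments=5000, updates_per_deployment=1, step=0.01)
theta_lazy = sgd_deploy(num_deployments=500, updates_per_deployment=10, step=0.01)

print(f"stable point: {theta_ps:.3f}")
print(f"greedy deployment: {theta_greedy:.3f}")
print(f"lazy deployment:   {theta_lazy:.3f}")
```

Both runs use the same total number of SGD updates, so the only difference is how often the sampling distribution is refreshed. This toy problem is too simple to exhibit the bias behaviour analyzed in the paper; it is meant only to show the structure of the feedback loop.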