Stochastic Optimization Schemes for Performative Prediction with Nonconvex Loss

Published: 25 Sept 2024, Last Modified: 06 Jan 2025 | NeurIPS 2024 poster | License: CC BY 4.0
Keywords: Non-convex optimization, Performative prediction, Stochastic optimization algorithm
TL;DR: This paper studies performative prediction with smooth but possibly non-convex losses and analyzes a greedy deployment scheme with the stochastic gradient descent (SGD) algorithm.
Abstract: This paper studies a risk minimization problem with a decision-dependent data distribution. The problem pertains to the performative prediction setting, in which a trained model can affect the outcomes it estimates. Such dependency creates a feedback loop that influences the stability of optimization algorithms such as stochastic gradient descent (SGD). We present the first study of performative prediction with smooth but possibly non-convex losses. We analyze a greedy deployment scheme with SGD (SGD-GD); note that in the literature, SGD-GD is typically studied under strongly convex losses. We first define stationary performative stable (SPS) solutions by relaxing the popular performative stability condition. We then prove that SGD-GD converges in expectation to a biased SPS solution. We consider two sensitivity conditions on the distribution shift: (i) the sensitivity is characterized by the Wasserstein-1 distance and the loss is Lipschitz w.r.t. data samples, or (ii) the sensitivity is characterized by the total variation (TV) divergence and the loss is bounded. Under both conditions, the bias level is proportional to the stochastic gradient's variance and the sensitivity level. Our analysis extends to a lazy deployment scheme, where the model is deployed only once every several SGD updates, and we show that it converges to an SPS solution with reduced bias. Numerical experiments corroborate our theories.
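As background on the solution concept, the following is a hedged reconstruction based on standard definitions from the performative prediction literature; the paper's exact formulation of the SPS condition may differ in norms and constants, so treat this as an assumption rather than the authors' statement. Writing $D(\theta)$ for the data distribution induced by deploying model $\theta$ and $\ell(\theta; Z)$ for the loss:

```latex
% Decision-dependent (performative) risk of deploying theta:
\[
  \mathrm{PR}(\theta) \;=\; \mathbb{E}_{Z \sim D(\theta)}\big[\ell(\theta; Z)\big].
\]
% Performatively stable (PS) point: optimal against the distribution it induces,
\[
  \theta_{\mathrm{PS}} \;\in\; \arg\min_{\theta}\; \mathbb{E}_{Z \sim D(\theta_{\mathrm{PS}})}\big[\ell(\theta; Z)\big].
\]
% Stationary performative stable (SPS) point: assumed first-order relaxation of
% the PS condition, as suggested by the abstract,
\[
  \big\| \mathbb{E}_{Z \sim D(\theta_{\mathrm{SPS}})}\big[\nabla_\theta \ell(\theta_{\mathrm{SPS}}; Z)\big] \big\| \;=\; 0,
\]
% and a "biased" SPS solution satisfies this only up to a tolerance proportional
% to the stochastic gradient's variance and the sensitivity of D(.).
```

The greedy and lazy deployment schemes can likewise be illustrated with a minimal sketch, assuming a toy location-shift family for $D(\theta)$ and a placeholder quadratic per-sample loss; the sampler, loss, step size, and shift level `eps` below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def sample_from_induced_distribution(theta, eps=0.1, rng=None):
    """Toy decision-dependent data: a Gaussian whose mean shifts with theta.
    The shift level `eps` plays the role of the distribution's sensitivity."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(loc=eps * theta, scale=1.0)

def stochastic_gradient(theta, z):
    """Gradient of the placeholder per-sample loss l(theta; z) = 0.5*||theta - z||^2.
    (The paper allows smooth, possibly non-convex losses; this toy loss is convex.)"""
    return theta - z

def sgd_greedy_deploy(theta0, num_steps=1000, step_size=0.05, seed=0):
    """SGD-GD: deploy the current model at every step, so each sample is drawn
    from the distribution induced by the most recent iterate."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(num_steps):
        z = sample_from_induced_distribution(theta, rng=rng)    # data reacts to theta_t
        theta = theta - step_size * stochastic_gradient(theta, z)
    return theta

def sgd_lazy_deploy(theta0, num_steps=1000, step_size=0.05, deploy_every=10, seed=0):
    """Lazy deployment: the deployed model (hence the data distribution) is only
    refreshed once every `deploy_every` SGD updates."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    deployed = theta.copy()
    for t in range(num_steps):
        if t % deploy_every == 0:
            deployed = theta.copy()                              # (re)deploy the model
        z = sample_from_induced_distribution(deployed, rng=rng)  # data reacts to deployed model
        theta = theta - step_size * stochastic_gradient(theta, z)
    return theta

if __name__ == "__main__":
    print(sgd_greedy_deploy(np.ones(2)), sgd_lazy_deploy(np.ones(2)))
```

In this toy model the unique stable point is the origin, so both routines should return iterates near zero, up to noise that scales with the step size.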
Supplementary Material: zip
Primary Area: Optimization (convex and non-convex, discrete, stochastic, robust)
Submission Number: 8303