On the Computational Complexity of Private High-dimensional Model Selection

Published: 25 Sept 2024, Last Modified: 06 Nov 2024
Venue: NeurIPS 2024 poster
License: CC BY 4.0
Keywords: Best Subset Selection, Differential Privacy, Exponential Mechanism, Metropolis-Hastings, Model Consistency, Variable Selection.
TL;DR: This paper proposes a computationally efficient Metropolis-Hastings algorithm for private best subset selection in the high-dimensional sparse regression setup.
Abstract: We consider the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints. We propose a differentially private (DP) best subset selection method with strong statistical utility properties by adopting the well-known exponential mechanism for selecting the best model. To achieve computational expediency, we propose an efficient Metropolis-Hastings algorithm and, under certain regularity conditions, establish that it mixes to its stationary distribution in polynomial time. As a result, we also establish both approximate differential privacy and statistical utility for the estimates of the mixed Metropolis-Hastings chain. Finally, we perform illustrative experiments on simulated data showing that our algorithm can quickly identify active features under reasonable privacy budget constraints.
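Illustrative sketch (not taken from the paper): the abstract combines two standard ingredients, an exponential mechanism over candidate subsets and a Metropolis-Hastings chain that samples from it. The Python below shows one generic way these fit together, under several assumptions that the abstract does not confirm: the score is the negative residual sum of squares, the proposal swaps one selected feature for one unselected feature, and `sensitivity` is a user-supplied bound on how much the score can change when one record changes. The function names `rss` and `mh_subset_sampler` are hypothetical. The sketch makes no privacy claim on its own; any guarantee would depend on a valid sensitivity bound and on the chain having mixed.

```python
import numpy as np

def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit on the chosen columns."""
    Xs = X[:, sorted(subset)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return float(resid @ resid)

def mh_subset_sampler(X, y, s, eps, sensitivity, n_steps, rng):
    """Metropolis-Hastings over size-s subsets (hypothetical helper).

    Target: pi(subset) proportional to exp(-eps * RSS(subset) / (2 * sensitivity)),
    i.e. the exponential mechanism with score -RSS. `sensitivity` is an
    assumed, user-supplied bound; it is not derived here.
    """
    p = X.shape[1]
    current = set(rng.choice(p, size=s, replace=False).tolist())
    current_rss = rss(X, y, current)
    for _ in range(n_steps):
        inside = sorted(current)
        outside = [j for j in range(p) if j not in current]
        # Swap proposal: drop one selected feature, add one unselected feature.
        # Each neighbour is proposed with probability 1 / (s * (p - s)),
        # so the proposal is symmetric and cancels in the acceptance ratio.
        drop = inside[rng.integers(len(inside))]
        add = outside[rng.integers(len(outside))]
        proposal = (current - {drop}) | {add}
        prop_rss = rss(X, y, proposal)
        log_ratio = -eps * (prop_rss - current_rss) / (2.0 * sensitivity)
        # Accept with probability min(1, exp(log_ratio)).
        if np.log(rng.random()) < log_ratio:
            current, current_rss = proposal, prop_rss
    return sorted(current)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 20))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(200)
    print(mh_subset_sampler(X, y, s=2, eps=1.0, sensitivity=1.0,
                            n_steps=500, rng=rng))
```

Each step refits least squares on the proposed subset, so the per-step cost is modest when s is small; the number of steps required for the chain to approach its stationary distribution is what the paper's mixing-time analysis addresses.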
Primary Area: Privacy
Submission Number: 11529