Differentially Private Reward Estimation from Preference Based Feedback

Published: 29 Jun 2023, Last Modified: 04 Oct 2023, MFPL Poster
Keywords: differential privacy, estimation, preference feedback, reward learning
TL;DR: We present the first results on parameter estimation from preference-based feedback under label DP constraints
Abstract: Preference-based reinforcement learning (RL) has gained attention as a promising approach to align learning algorithms with human interests in various domains. Instead of relying on numerical rewards, preference-based RL uses feedback from human labelers in the form of pairwise or $K$-wise comparisons between actions. In this paper, we focus on reward learning in preference-based RL and address the problem of estimating the unknown parameters while protecting the labelers' privacy. We propose two estimators based on the Randomized Response strategy that ensure label differential privacy. The first estimator uses maximum likelihood estimation (MLE), while the second uses stochastic gradient descent (SGD). We show that both estimators achieve an estimation error of $\widetilde O(1/\sqrt{n})$ with $n$ samples. The additional cost of ensuring privacy for human labelers is proportional to $\frac{e^\epsilon + 1}{e^\epsilon - 1}$ in the best case, where $\epsilon > 0$ is the privacy parameter.
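Below is a minimal sketch (not the authors' code) of the kind of Randomized Response mechanism the abstract describes: each labeler's binary preference label is flipped with a fixed probability so the released label satisfies $\epsilon$-label differential privacy. The exact flipping probability $1/(e^\epsilon + 1)$ and the NumPy-based implementation are illustrative assumptions; the multiplicative cost factor $\frac{e^\epsilon + 1}{e^\epsilon - 1}$ is the one quoted in the abstract's error bound.

```python
# Illustrative sketch of Randomized Response for label differential privacy
# on pairwise preference labels. Assumptions (not from the paper): labels are
# binary {0, 1} and each is flipped independently with probability 1/(e^eps + 1).
import numpy as np

def randomized_response(labels: np.ndarray, epsilon: float, rng=None) -> np.ndarray:
    """Report each binary label truthfully with probability e^eps / (e^eps + 1),
    and flip it otherwise, which gives epsilon-label DP for each labeler."""
    rng = np.random.default_rng() if rng is None else rng
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    keep = rng.random(labels.shape) < p_keep
    return np.where(keep, labels, 1 - labels)

def privacy_cost_factor(epsilon: float) -> float:
    """The (e^eps + 1)/(e^eps - 1) factor that multiplies the non-private
    O(1/sqrt(n)) estimation error in the abstract's best-case bound."""
    return (np.exp(epsilon) + 1.0) / (np.exp(epsilon) - 1.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_labels = rng.integers(0, 2, size=1000)      # simulated pairwise-comparison labels
    private_labels = randomized_response(true_labels, epsilon=1.0, rng=rng)
    print("fraction of labels flipped:", np.mean(private_labels != true_labels))
    print("privacy cost factor at eps=1:", privacy_cost_factor(1.0))
```

A downstream MLE or SGD estimator would be fit on `private_labels`, with the flipping probability accounted for in the likelihood; the cost factor above quantifies how much the resulting estimation error inflates relative to the non-private case.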
Submission Number: 48