Effective truth discovery under local differential privacy by leveraging noise-aware probabilistic estimation and fusion
Abstract: Truth discovery is an effective way to eliminate data inconsistency by integrating different worker-provided
values. Although directly running non-private truth discovery approaches on the noisy values that
workers upload after adding Laplace noise to continuous inputs guarantees rigorous local
differential privacy (LDP), it may perform poorly because of the large amount of noise these values contain. First, the
injected noise for privacy protection, randomly sampled from a Laplace distribution, may be excessive
even with a large privacy budget, as this distribution is unbounded and its density drops sharply
along the x-axis. Second, built-in Gaussian noise usually exists within these uploaded noisy values as well,
which may also degrade the aggregated truths under LDP and makes the problem
investigated in this paper far more challenging. In this paper, we focus on obtaining accurate truths in
the above cases under rigorous LDP for continuous inputs, and present a novel solution, TESLA. The key
idea of this solution is to ensure that the noise injected for privacy protection and the inherent Gaussian noise
only weakly affect weight estimation and truth aggregation. In particular, we design a
runtime filtering mechanism (RFM) to obtain the supremum and infimum for the values after adding
Laplace noise by considering these two types of noise together. Moreover, we develop a probabilistic
fusion mechanism (PFM) to get the fused values by adaptively using the obtained supremum and
infimum. Furthermore, we devise a probabilistic weight mechanism (PWM) to obtain a more accurate
weight for each worker. Truth discovery can then be conducted based on the new weight of
each worker and the filtered values. We provide theoretical analyses of TESLA’s utility, privacy and
complexity. Experimental results demonstrate the effectiveness and efficiency of TESLA. We also extend
TESLA to typical mean estimation, standard deviation calculation, and various
machine learning tasks (e.g., logistic regression, support vector machines (SVMs), and neural networks),
where experimental results also demonstrate its superiority.
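
For orientation, the baseline pipeline that the abstract describes (workers add Laplace noise locally, then a non-private truth discovery method runs on the uploaded values) might look roughly like the sketch below. This is not TESLA and does not implement RFM, PFM, or PWM. The function names, the clipping range [lower, upper], and the CRH-style logarithmic weight update are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math
import random


def laplace_perturb(value, lower, upper, epsilon, rng):
    """Worker side (assumed): clip to [lower, upper], then add Laplace noise.

    With inputs bounded to [lower, upper], the sensitivity is (upper - lower),
    so a noise scale of (upper - lower) / epsilon gives epsilon-LDP.
    """
    clipped = min(max(value, lower), upper)
    scale = (upper - lower) / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped + noise


def baseline_truth_discovery(reports, iterations=10):
    """Server side (assumed): generic iterative truth discovery.

    reports[k][j] is the value uploaded by worker k for object j.
    Alternates between a weighted-average truth update and a
    CRH-style weight update. Returns (truths, weights).
    """
    n_workers = len(reports)
    n_objects = len(reports[0])
    weights = [1.0] * n_workers
    truths = [0.0] * n_objects
    for _ in range(iterations):
        # Truth update: weighted average of the uploaded values per object.
        total_weight = sum(weights)
        truths = [
            sum(weights[k] * reports[k][j] for k in range(n_workers)) / total_weight
            for j in range(n_objects)
        ]
        # Weight update: workers far from the current truths get lower weight.
        losses = [
            sum((reports[k][j] - truths[j]) ** 2 for j in range(n_objects)) + 1e-12
            for k in range(n_workers)
        ]
        total_loss = sum(losses)
        # The small constant keeps the weight sum strictly positive.
        weights = [math.log(total_loss / loss) + 1e-9 for loss in losses]
    return truths, weights
```

In this sketch each worker would call `laplace_perturb` on every value before uploading, and the server would pass the resulting matrix to `baseline_truth_discovery`. Because the Laplace noise enters both the truth update and the weight update directly, a small privacy budget (small `epsilon`) inflates every worker's loss, which is the degradation the abstract attributes to this naive combination.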