Abstract: The commoditization of private data has become an attractive research topic with the emergence of the Big Data era. In this paper, we study the trading of high-dimensional private data with differential privacy guarantees. We propose Cheap, a novel Correlated data trading framework for High-dimEnsionAl Private data. Cheap first models the correlations among high-dimensional user attributes and builds an initial attribute clustering scheme. Starting from this scheme, Cheap devises a novel data perturbation mechanism by solving the optimal attribute clustering (OAC) problem, in order to improve the utility of the traded data and to generate a privacy-preserving high-dimensional dataset whose joint distribution is close to that of the original one. Because the OAC problem is NP-hard, Cheap quantifies privacy loss based on a near-optimal attribute clustering scheme, and then compensates data owners by running an auction in a cost-effective way. We evaluate the performance of Cheap on the UserBehavior and Obesity datasets.
Our evaluation and analysis demonstrate that Cheap balances data utility and privacy protection well, and achieves all desired economic properties: budget balance, individual rationality, and truthfulness.
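The abstract does not specify the perturbation mechanism, so the following is only a minimal sketch of one standard way to release the joint distribution of a single attribute cluster under differential privacy: adding Laplace noise to the joint counts. It is not Cheap's actual mechanism. The function names, the record layout (a list of dicts), and the assumption that adding or removing one record changes exactly one cell by 1 (L1 sensitivity 1) are all hypothetical choices made for illustration.

```python
import random
from collections import Counter
from itertools import product


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_joint_histogram(records, cluster, domains, epsilon, rng=None):
    """Return Laplace-perturbed joint counts for the attributes in `cluster`.

    records: list of dicts mapping attribute name -> value (assumed layout)
    cluster: tuple of attribute names forming one cluster (hypothetical)
    domains: dict mapping attribute name -> list of possible values
    epsilon: privacy budget spent on this cluster, must be positive
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    # True joint counts over the cluster's attributes.
    counts = Counter(tuple(r[a] for a in cluster) for r in records)
    # Assumption: one record affects one cell by 1, so sensitivity is 1.
    scale = 1.0 / epsilon
    # Noise is added to every cell of the joint domain, including empty ones.
    return {
        cell: counts.get(cell, 0) + laplace_noise(scale, rng)
        for cell in product(*(domains[a] for a in cluster))
    }
```

The number of cells grows multiplicatively with the attributes placed in one cluster, and every cell receives noise. This suggests why the choice of clustering can affect the utility of the perturbed data, although the abstract does not state how Cheap scores candidate clusterings.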