Abstract: We propose a system for privacy-aware machine learning. The data provider encodes each record in a way that avoids revealing information about the record’s field values or about the ordering of values from different records. A data center stores the encoded records and uses them to classify queries that consist of encoded input field values. The encoding protects the data provider’s privacy from the data center and from any third party issuing unauthorized queries. However, the encoding makes regression-based and many tree-based classifiers impossible to implement. It does allow histogram-type classifiers that are based on category membership, and we present one such classification method that ensures data sufficiency on a per-classification basis.
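
As a rough illustration of the idea, and not the paper's actual construction, the sketch below assumes the encoding is a keyed hash (HMAC-SHA256) of each categorical field value and that "data sufficiency" means abstaining when fewer than `min_count` stored records match the query. The names `encode_value`, `encode_record`, `classify`, and `min_count`, the sample fields, and the choice to store labels in the clear are all hypothetical. A keyed hash hides values and their ordering, but because it is deterministic it still reveals which records share a value.

```python
import hashlib
import hmac
from collections import Counter


def encode_value(key: bytes, field: str, value: str) -> str:
    """Assumed encoding: keyed hash of (field, value); tokens carry no ordering."""
    msg = f"{field}={value}".encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def encode_record(key: bytes, record: dict) -> dict:
    """Encode every field of a record (or of a query) with the provider's key."""
    return {field: encode_value(key, field, value) for field, value in record.items()}


def classify(stored, query: dict, min_count: int):
    """Histogram-style vote over stored records that match every query token.

    Returns None (abstains) when fewer than min_count records match, which is
    this sketch's stand-in for a per-classification sufficiency check.
    """
    votes = Counter(
        label
        for encoded, label in stored
        if all(encoded.get(field) == token for field, token in query.items())
    )
    if sum(votes.values()) < min_count:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical data: the provider encodes records before handing them over.
key = b"provider-secret-key"
raw_records = [
    ({"age_band": "30-39", "region": "west"}, "approve"),
    ({"age_band": "30-39", "region": "west"}, "approve"),
    ({"age_band": "30-39", "region": "west"}, "deny"),
    ({"age_band": "40-49", "region": "east"}, "deny"),
]
stored = [(encode_record(key, record), label) for record, label in raw_records]

# An authorized query is encoded with the same key before it is submitted.
query = encode_record(key, {"age_band": "30-39", "region": "west"})
result = classify(stored, query, min_count=3)
```

In this sketch the data center only ever compares opaque tokens for equality, so it can count category membership but cannot fit a regression or choose threshold splits. A query encoded with a different key matches no stored records and so falls below the sufficiency threshold.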