Graph Anonymization Using Machine Learning

Maria Laura Maag, Ludovic Denoyer, Patrick Gallinari

2014 (modified: 17 Apr 2025), AINA 2014

Abstract: Data privacy is a major concern that must be addressed before releasing datasets to the public, or even to a partner company that would compute statistics or perform an in-depth analysis of the data. Privacy is ensured by anonymizing the data, as required by legislation. In this context, many different anonymization techniques have been proposed in the literature. These methods are usually specific to a particular de-anonymization procedure (or attack) that one wants to prevent, and to a particular known set of characteristics that have to be preserved after anonymization. They are difficult to use in a general context where attacks can be of different types and where the measures to preserve are not known to the anonymizer. This paper proposes a novel approach for automatically finding an anonymization procedure given a set of possible attacks and a set of measures to preserve. The approach is generic and based on machine learning techniques. It learns an anonymization function directly from a set of training data so as to optimize a trade-off between privacy risk and utility loss. The algorithm thus yields a good anonymization procedure for any kind of attack and any characteristic in a given set. Experiments on two datasets show the effectiveness and the generality of the approach.
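
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the model, the attacks, or the measures, so everything below is a hypothetical stand-in. The anonymization family (random edge deletion with probability `p`), the degree-based attack, the edge-count utility measure, the weight `lam`, and the grid search that replaces learning are all assumptions chosen for brevity.

```python
import random

def perturb(edges, p, rng):
    """Hypothetical anonymizer: drop each edge independently with probability p."""
    return [e for e in edges if rng.random() >= p]

def degrees(edges, n):
    """Degree of each of the n nodes in an undirected edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def privacy_risk(original, released, n):
    """Toy attack (assumption): the attacker knows a node's original degree and
    re-identifies it when it is the only released node with that degree.
    Returns the fraction of nodes re-identified."""
    d0, d1 = degrees(original, n), degrees(released, n)
    hits = 0
    for v in range(n):
        candidates = [u for u in range(n) if d1[u] == d0[v]]
        if candidates == [v]:
            hits += 1
    return hits / n

def utility_loss(original, released):
    """Toy measure (assumption): relative change in the number of edges."""
    return abs(len(original) - len(released)) / len(original)

def select_parameter(edges, n, candidates, lam, trials=20, seed=0):
    """Grid search standing in for learning: pick the p that minimizes
    lam * risk + (1 - lam) * utility loss, averaged over random trials."""
    rng = random.Random(seed)
    best_p, best_score = None, float("inf")
    for p in candidates:
        total = 0.0
        for _ in range(trials):
            released = perturb(edges, p, rng)
            total += (lam * privacy_risk(edges, released, n)
                      + (1 - lam) * utility_loss(edges, released))
        score = total / trials
        if score < best_score:
            best_p, best_score = p, score
    return best_p, best_score

if __name__ == "__main__":
    toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2)]  # illustrative 4-node graph
    for lam in (0.0, 0.5, 1.0):
        print(lam, select_parameter(toy_edges, 4, [0.0, 0.25, 0.5, 1.0], lam))
```

Varying `lam` between 0 and 1 moves the selected procedure from one that preserves utility only to one that minimizes risk only. A learned anonymization function, as the abstract proposes, would replace the one-parameter grid search with a richer model fitted on training graphs.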
