Efficient k-Anonymization through Constrained Collaborative Clustering

Sarah Zouinina, Nistor Grozavu, Younès Bennani, Abdelouahid Lyhyaoui, Nicoleta Rogovschi

Published: 2018, Last Modified: 15 May 2025, SSCI 2018, CC BY-SA 4.0

Abstract: The challenge in anonymization is to strike a balance between the amount of information omitted from a data set and the complete disclosure of individual identities. In this paper, we introduce a novel technique for anonymizing data using topological collaborative clustering and constrained clustering. The main idea is to produce anonymous data sets without extensive hand engineering. To do so, we use a clustering based on the Self-Organizing Map (SOM): instead of identifying only the best matching unit (BMU) of the input, we determine the linear mixture of the SOM reference vectors that best approximates the input vector. We then use a k-constrained SOM to produce a k-anonymous data set.
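
To make the k constraint concrete, the following is a minimal sketch and not the authors' method. It assumes the reference vectors (prototypes) have already been trained. It uses plain nearest-prototype (BMU) assignment in place of the linear mixture described above, and a simple merge of undersized groups in place of the k-constrained SOM. The function name `k_anonymize` and its parameters are hypothetical.

```python
from math import dist


def k_anonymize(records, prototypes, k):
    """Illustrative only: group records by nearest prototype, merge groups
    smaller than k, and release each record as its group's mean vector."""
    if len(records) < k:
        raise ValueError("need at least k records to reach k-anonymity")

    # Step 1: assign each record to its best matching unit (nearest prototype).
    groups = {}
    for i, record in enumerate(records):
        bmu = min(range(len(prototypes)), key=lambda j: dist(record, prototypes[j]))
        groups.setdefault(bmu, []).append(i)

    # Step 2 (assumed constraint handling): while the smallest group has fewer
    # than k members, merge it into the group with the closest prototype.
    while len(groups) > 1:
        smallest = min(groups, key=lambda g: len(groups[g]))
        if len(groups[smallest]) >= k:
            break
        nearest = min(
            (g for g in groups if g != smallest),
            key=lambda g: dist(prototypes[smallest], prototypes[g]),
        )
        groups[nearest].extend(groups.pop(smallest))

    # Step 3: replace every record by the mean of its group, so each released
    # vector is shared by at least k records.
    dim = len(records[0])
    released = [None] * len(records)
    for members in groups.values():
        centroid = tuple(
            sum(records[i][d] for i in members) / len(members) for d in range(dim)
        )
        for i in members:
            released[i] = centroid
    return released
```

With five two-dimensional records, three prototypes, and k = 2, a lone record near the third prototype would be merged into its closest neighboring group, so every released vector appears at least twice. The group mean is one simple generalization choice; the paper's own coding of the anonymized output may differ.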