Privacy Estimation on Twitter: Modelling the Effect of Latent Topics on Privacy by Integrating XGBoost, Topic and Generalized Additive Models

Published: 01 Jan 2022, Last Modified: 31 May 2024, Venue: SmartWorld/UIC/ScalCom/DigitalTwin/PriComp/Meta 2022, License: CC BY-SA 4.0
Abstract: Securing their users' privacy is a central duty of Online Social Networks (OSNs), but the complex non-linear effects of social media content on privacy are not well understood. We propose a novel framework that integrates XGBoost, Latent Dirichlet Allocation (LDA) topic models and Generalized Additive Models (GAMs) to perform statistical inference about the complex non-linear relationship between the topics and the privacy of tweets. First, XGBoost is used to predict the privacy of tweets. Then, the predictions are improved by analyzing the classified tweets with LDA topic models. Finally, we model the non-linear relationship between topics and privacy with GAMs, using (penalized) splines. Rather than being limited to predictive modeling, our approach enables us to model the non-linear relationship between latent topics and the privacy of tweets.
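As an illustration of the final step, a GAM with penalized splines can be written in its standard textbook form as follows. This is a sketch under assumptions that the abstract does not state: a binary privacy label $y_i$ for tweet $i$, a logit link, $K$ LDA topics with topic proportions $\theta_{ik}$, spline basis functions $b_{kj}$, penalty matrices $S_k$, and smoothing parameters $\lambda_k$. The paper's actual response variable, link function, and basis may differ.

```latex
% Assumed setup (not from the abstract): binary label y_i, logit link,
% theta_{ik} = proportion of topic k in tweet i, b_{kj} = spline basis functions.
\begin{align}
  \operatorname{logit} \Pr(y_i = 1)
    &= \beta_0 + \sum_{k=1}^{K} f_k(\theta_{ik}), \\
  f_k(x)
    &= \sum_{j=1}^{J} \beta_{kj}\, b_{kj}(x), \\
  \hat{\beta}
    &= \arg\max_{\beta} \Big\{ \ell(\beta)
       - \tfrac{1}{2} \sum_{k=1}^{K} \lambda_k\, \beta_k^{\top} S_k\, \beta_k \Big\}.
\end{align}
```

Here $\ell(\beta)$ is the log-likelihood and $\beta_k = (\beta_{k1}, \dots, \beta_{kJ})^{\top}$. Each smooth $f_k$ describes how the predicted privacy changes with the weight of topic $k$, and the penalty term controls how wiggly each $f_k$ is allowed to be, which is what makes the fitted topic effects interpretable rather than purely predictive.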