Keywords: Counterfactual Text Generation, Large Language Models, Social Media Dynamics, Causality
Abstract: We propose CausalT5, a generative framework for estimating causal effects in social media timelines. Given a user’s posting history, the model estimates how an event at time i (for example, engaging with a low-credibility news outlet) influences subsequent posts. Building on neural architectures for causal inference, CausalT5 departs from outcome-only prediction by generating counterfactual post-treatment messages. A T5 language model is trained with three objectives: conditional generation of observed posts, treatment assignment classification, and outcome prediction via a differentiable count of attributes in generated posts. We evaluate CausalT5 on semi-synthetic data with known effects, finding that (a) generated posts are linguistically plausible and consistent with real post-intervention behavior, (b) CausalT5 estimates average treatment effects as accurately as strong outcome-prediction baselines, and (c) it captures heterogeneous effects and remains robust under topical shifts. These results suggest that generative counterfactual modeling with CausalT5 is a promising tool for causal analysis of social media dynamics.
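The abstract's three-objective training setup and the "differentiable count of attributes" outcome head can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, loss weights, and the dictionary representation of token probabilities are all assumptions introduced here. The key idea shown is that an expected (soft) count of attribute tokens, computed from generation probabilities, is differentiable and can be combined with generation and treatment-classification losses.

```python
# Hypothetical sketch of a three-part objective in the spirit of CausalT5.
# All names and default weights are illustrative assumptions, not from the paper.

def soft_attribute_count(token_probs, attribute_ids):
    """Differentiable (expected) count of attribute tokens in a generated post.

    token_probs: per-position mappings token_id -> probability
    attribute_ids: set of token ids marking the attribute of interest
    The soft count is the total probability mass placed on attribute tokens,
    which is smooth in the model's output distribution (unlike a hard count).
    """
    return sum(sum(p.get(t, 0.0) for t in attribute_ids) for p in token_probs)

def combined_loss(gen_loss, treat_loss, outcome_loss,
                  w_gen=1.0, w_treat=0.5, w_outcome=0.5):
    """Weighted sum of the three training objectives:
    conditional generation, treatment classification, outcome prediction."""
    return w_gen * gen_loss + w_treat * treat_loss + w_outcome * outcome_loss
```

In a real system the soft count would be computed from the model's softmax outputs so gradients flow from the outcome loss back into the generator; the weights here are placeholders to be tuned.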
Paper Type: Long
Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good
Research Area Keywords: NLP tools for social analysis; quantitative analyses of news and/or social media; human behavior analysis
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 6453