Measuring Stereotypes using Entity-Centric Data

Published: 01 Jan 2023, Last Modified: 07 Jun 2024, Venue: CoRR 2023, License: CC BY-SA 4.0
Abstract: Social media users on sites like Twitter, Instagram, and TikTok use the profile description, or bio, field of user profiles to present themselves to the world. In contrast to the "offline" world, where social context often encourages us to adopt a single identity, the profile description is a free-text field in which users are encouraged to present the self using multiple, sometimes conflicting, social identities. While sociologists, social psychologists, sociolinguists, and increasingly computational social scientists have developed a large and growing array of methods to estimate the meaning of individual social identities, little work has attended to the ways in which social meanings emerge from the collections of social identities present in social media bios. The present work proposes and evaluates three novel, identity-based methods to measure the social dimensions of meaning expressed in Twitter bios. We show that these models outperform reasonable baselines with respect to 1) predicting which sets of identities are more likely to co-occur within a single biography and 2) quantifying perceptions of entire social media biographies along salient dimensions of social meaning on Twitter, in particular partisanship. We demonstrate the utility of our methods in a computational social science setting by using model outputs to better understand how self-presentation along dimensions of partisanship, religion, age, and gender is related to the sharing of URLs on Twitter from low- versus high-quality news sites.