Developing Age and Gender Predictive Lexica over Social Media

2014 (modified: 16 Jul 2019) | EMNLP 2014 | Readers: Everyone
Abstract: Demographic lexica have potential for widespread use in social science, economic, and business applications. We derive predictive lexica (words and weights) for age and gender using regression and classification models from word usage in Facebook, blog, and Twitter data with associated demographic labels. The lexica, made publicly available, achieved state-of-the-art accuracy in language-based age and gender prediction over Facebook and Twitter, and were evaluated for generalization across social media genres as well as in limited-message situations.
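
A lexicon of words and weights is commonly applied as a linear model over relative word frequencies. The sketch below illustrates that general idea only; it is not taken from the paper. The function name `apply_lexicon`, the toy words, their weights, and the intercept are all invented for illustration and are not values from the released lexica.

```python
from collections import Counter

# Hypothetical toy lexicon: words, weights, and intercept are made up
# for illustration and do not come from the published lexica.
TOY_AGE_LEXICON = {"homework": -2.0, "mortgage": 3.0, "lol": -1.0}
TOY_INTERCEPT = 25.0


def apply_lexicon(tokens, lexicon, intercept=0.0):
    """Score a token list as intercept + sum(weight * relative frequency)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        # No text to score: fall back to the intercept alone.
        return intercept
    score = intercept
    for word, weight in lexicon.items():
        # Counter returns 0 for words that never appear in the text.
        score += weight * counts[word] / total
    return score


tokens = "lol i forgot my homework lol".split()
estimate = apply_lexicon(tokens, TOY_AGE_LEXICON, TOY_INTERCEPT)
print(round(estimate, 3))
```

For a regression target such as age, the score can be read directly as the estimate; for a binary target such as gender, one would presumably threshold the score, which is likewise an assumption here.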