Districts by Demographics: Predicting U.S. House of Representative Elections using Machine Learning and Demographic Data

Published: 01 Jan 2020, Last Modified: 14 May 2024, Venue: ICMLA 2020, License: CC BY-SA 4.0
Abstract: Using a selection of machine learning methods and characteristic demographics of U.S. Congressional districts, we aim to determine whether the party outcome of a House election can be predicted from the demographics of the corresponding district. Using demographics from the Census Bureau's annual American Community Survey and training models on these features, we show that while it is possible to predict the party elected in a district with high accuracy, the strongest predictor remains the incumbent party. Further, we observe that certain demographic characteristics, such as race and education, have much stronger predictive power than age, income, and marital status. Nonetheless, even these weak predictors can enhance model accuracy when properly combined. We also consider model interpretability and informativeness, focusing on decision trees.
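As a rough illustration of the comparison the abstract describes, the sketch below fits a depth-1 decision tree (a "stump") on a single feature and reports its accuracy, once for a demographic feature and once for the incumbent party. It is not the authors' pipeline: the rows are hand-made toy values rather than American Community Survey or election data, the names `DISTRICTS`, `FEATURES`, `accuracy`, and `fit_stump` are hypothetical, and accuracy is measured on the same rows used for fitting, purely to keep the example short.

```python
# Toy, hand-made rows (NOT real ACS or election data); all names are hypothetical.
# Each row: (pct_college, incumbent_party, winner), with 1 = party A, 0 = party B.
DISTRICTS = [
    (18.0, 0, 0),
    (22.0, 1, 1),
    (25.0, 0, 0),
    (30.0, 1, 0),
    (36.0, 1, 1),
    (38.0, 1, 1),
    (41.0, 0, 0),
    (45.0, 1, 1),
]
FEATURES = ["pct_college", "incumbent_party"]


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def fit_stump(rows, feature_idx):
    """Fit a depth-1 decision tree on one feature.

    Returns (threshold, label_if_above, training_accuracy), or None if the
    feature takes only one distinct value and no split is possible.
    """
    labels = [r[-1] for r in rows]
    values = sorted({r[feature_idx] for r in rows})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        # Try both ways of assigning the two parties to the two sides.
        for above in (0, 1):
            preds = [above if r[feature_idx] > threshold else 1 - above
                     for r in rows]
            acc = accuracy(preds, labels)
            if best is None or acc > best[2]:
                best = (threshold, above, acc)
    return best


if __name__ == "__main__":
    for idx, name in enumerate(FEATURES):
        print(name, fit_stump(DISTRICTS, idx))
```

The toy rows were constructed so that the incumbent-party split scores higher than the education split, mirroring the direction of the abstract's finding; the specific numbers carry no empirical meaning. A fuller treatment would use a held-out test set and a multi-feature tree.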