Keywords: Rape Reporting, Criminology, Gaussian Process, Random Forest, Gradient Boosting, Spatial Modeling, Computational Social Science, Urban Informatics
TL;DR: We predict rape reporting delays from spatial, temporal and social features and, despite very noisy data, obtain some promising results.
Abstract: We present a novel approach to estimating the delay between the occurrence and the reporting of rape crimes. We explore spatial, temporal and social effects in sparse aggregated (area-level) and high-dimensional disaggregated (event-level) data for New York and Los Angeles. Focusing on inference, we apply Gradient Boosting and Random Forests to assess predictor importance, and Gaussian Processes to model spatial disparities in reporting times. Our results highlight differences and similarities between the two cities. We identify at-risk populations and communities that could be targeted with focused policies and interventions to support rape victims, apprehend perpetrators, and prevent future crimes.