Using GNNs to Model Biased Crowdsourced Data for Urban Applications

Sidhika Balachandar; Shuvom Sadhuka; Bonnie Berger; Emma Pierson; Nikhil Garg

27 Sept 2024 (modified: 15 Nov 2024), ICLR 2025 Conference Withdrawn Submission, CC BY 4.0

Keywords: Graph Neural Networks, Biased Outcome Data, Urban Planning

TL;DR: We propose a method to fit GNNs with both sparsely observed, unbiased data and densely observed, biased data. We apply our method to New York City 311 reporting.

Abstract: Graph neural networks (GNNs) are widely used to make predictions on graph-structured data in urban spatiotemporal forecasting applications, such as predicting infrastructure problems and weather events. In urban settings, nodes have a true latent state (e.g., street condition) that is sparsely observed (e.g., via government inspection ratings). We more frequently observe biased proxies for the latent state (e.g., crowdsourced reports) that correlate with resident demographics. We introduce a GNN-based model that uses both unbiased rating data and biased reporting data to predict the true latent state. We show that our approach can both recover the latent state at each node and quantify the reporting biases. We apply our model to a case study of urban incidents using reporting data from New York City 311 complaints across 141 complaint types and rating data from government inspections. We show (i) that our model's predictions correlate more strongly with ground-truth latent states than those of prior work, which trains models only on the biased reporting data, (ii) that our model's inferred reporting biases capture known demographic biases, and (iii) that our model's learned ratings capture correlations across locations and between complaint types. Our analysis demonstrates a widely applicable approach, particularly for urban crowdsourcing applications, for using GNNs and sparse ground-truth data to estimate latent states.
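
To make the data setting in the abstract concrete, the following is a minimal toy simulation, not the authors' model. It assumes a hypothetical binary demographic feature, a logistic reporting probability with made-up coefficients `alpha` and `beta`, and ratings observed for a random 5% of nodes. The graph structure and the GNN itself are omitted; the sketch only illustrates why raw report rates can differ between groups even when the latent state is distributed identically.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def simulate(n_nodes=2000, rating_frac=0.05, alpha=1.0, beta=1.5, seed=0):
    """Toy data-generating process (hypothetical; all parameters are assumptions)."""
    rng = random.Random(seed)
    nodes = []
    for _ in range(n_nodes):
        latent = rng.gauss(0.0, 1.0)   # true latent state, e.g. street condition
        demo = rng.choice([0, 1])      # hypothetical binary demographic feature
        # Reporting depends on the latent state AND on demographics (the bias).
        p_report = sigmoid(alpha * latent + beta * demo - 1.0)
        report = 1 if rng.random() < p_report else 0
        # An unbiased but noisy rating is observed only for a sparse random subset.
        rating = latent + rng.gauss(0.0, 0.1) if rng.random() < rating_frac else None
        nodes.append({"latent": latent, "demo": demo, "report": report, "rating": rating})
    return nodes


nodes = simulate()
report_rate = {g: mean(n["report"] for n in nodes if n["demo"] == g) for g in (0, 1)}
latent_mean = {g: mean(n["latent"] for n in nodes if n["demo"] == g) for g in (0, 1)}
n_rated = sum(n["rating"] is not None for n in nodes)

print("report rate by group:", report_rate)
print("mean latent state by group:", latent_mean)
print("nodes with a rating:", n_rated, "of", len(nodes))
```

In this toy setup, a model fit only to `report` would conflate the demographic effect with the latent state, while the sparse `rating` values are the only unbiased signal available to separate the two.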

Supplementary Material: pdf

Primary Area: alignment, fairness, safety, privacy, and societal considerations

Submission Number: 10602
