Automatically Inferring Data Quality for Spatiotemporal Forecasting

Sungyong Seo, Arash Mohegh, George Ban-Weiss, Yan Liu

Feb 15, 2018 (modified: Feb 27, 2018) ICLR 2018 Conference Blind Submission
  • Abstract: Spatiotemporal forecasting has become an increasingly important prediction task in machine learning and statistics because of its wide range of applications, such as climate modeling, traffic prediction, and video caching prediction. Although numerous studies have been conducted, most existing work assumes that data from different sources or locations are equally reliable. Because of cost, accessibility, or other factors, data quality inevitably varies, which introduces significant biases into the model and leads to unreliable predictions. The problem can be exacerbated in black-box prediction models such as deep neural networks. In this paper, we propose a novel solution that automatically infers the data quality levels of different sources from local variations of spatiotemporal signals, without explicit labels. Furthermore, we integrate the estimated data quality levels with graph convolutional networks to exploit their efficient structures. We evaluate the proposed method on temperature forecasting in Los Angeles.
  • TL;DR: We propose a method that infers the time-varying data quality level for spatiotemporal forecasting without explicitly assigned labels.
  • Keywords: spatiotemporal data, graph convolutional network, data quality
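
As a rough illustration of what "local variations of spatiotemporal signals" could mean in practice, the sketch below computes a per-node local variation on a sensor graph at a single time step: the weighted root-sum-of-squares difference between a node's reading and its neighbors' readings. This is a minimal sketch under stated assumptions, not necessarily the formulation used in the paper. The adjacency matrix `W`, the temperature readings `x`, and the mapping from variation to a quality score are all hypothetical and chosen only for illustration.

```python
import numpy as np

def local_variation(W, x):
    """Per-node local variation: sqrt(sum_j W[i, j] * (x[j] - x[i])**2).

    W: (N, N) non-negative adjacency matrix (assumed, for illustration).
    x: (N,) signal values at one time step, one per node.
    """
    diff = x[None, :] - x[:, None]  # diff[i, j] = x[j] - x[i]
    return np.sqrt((W * diff ** 2).sum(axis=1))

# Hypothetical fully connected graph of 4 temperature sensors.
W = np.ones((4, 4)) - np.eye(4)

# Three sensors agree; the fourth reads 6 degrees higher.
x = np.array([20.0, 20.0, 20.0, 26.0])

lv = local_variation(W, x)

# Hypothetical mapping (an assumption, not from the paper):
# higher local variation gives a lower quality score in (0, 1].
quality = 1.0 / (1.0 + lv)

print("local variation:", lv)
print("quality score:  ", quality)
```

In this toy setup the sensor that disagrees with all of its neighbors receives the largest local variation and therefore the lowest score, without any explicit quality label. Repeating the computation at each time step would give a time-varying score per sensor.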