Abstract: The rise of GPS-equipped mobile devices has led to the emergence of big trajectory data. The collected raw
data usually contain errors and anomalies caused by device failure, sensor error, and environmental
influence. Low-quality data fails to support application requirements, so raw data must be
comprehensively cleaned before use. Existing methods are suboptimal at detecting and repairing GPS data
errors. To solve this problem, we propose a framework called GPSClean that analyzes anomalous data and
provides effective methods to repair it. There are primarily four modules in GPSClean: (i) data preprocessing,
(ii) data filling, (iii) data repairing, and (iv) data conversion. For (i), we propose an approach named
MDSort (Maximum Disorder Sorting) to efficiently solve the issue of data disorder. For (ii), we propose a
method named NNF (Nearest Neighbor Filling) to fill missing data. For (iii), we design an approach named
RCSWS (Range Constraints and Sliding Window Statistics) to repair anomalies, and we further improve the
accuracy of data repair by making use of driving direction. We use 45 million real trajectory data to evaluate
our proposal in the prototype database system SECONDO. Experimental results show that the accuracy of
RCSWS is three times higher than an alternative method SCREEN and nearly an order of magnitude higher
than an alternative method EWMA.
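
The abstract names the repair idea (range constraints combined with sliding window statistics) without giving its details. The following is only a minimal illustrative sketch of that general idea, not the paper's RCSWS: it assumes planar coordinates in meters, uses a maximum-speed bound as the range constraint, and uses the median of neighboring points as the window statistic. The function name `repair_track` and the parameters `max_speed` and `window` are hypothetical.

```python
from statistics import median


def repair_track(points, max_speed, window=2):
    """Repair speed-violating points in a time-sorted track.

    points: list of (t, x, y) tuples, sorted by t (assumed planar meters).
    max_speed: assumed upper bound on plausible speed (range constraint).
    window: number of neighbors on each side used for the median repair.
    """
    repaired = list(points)
    for i in range(1, len(repaired)):
        t0, x0, y0 = repaired[i - 1]  # previous point, already checked/repaired
        t1, x1, y1 = repaired[i]
        dt = t1 - t0
        dist = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        # Range constraint: the implied speed must not exceed max_speed.
        if dt > 0 and dist / dt > max_speed:
            lo = max(0, i - window)
            hi = min(len(points), i + window + 1)
            # Earlier neighbors come from the repaired track,
            # later neighbors from the raw input.
            neighbors = [
                repaired[j] if j < i else points[j]
                for j in range(lo, hi)
                if j != i
            ]
            # Sliding window statistic: coordinate-wise median of the neighbors.
            repaired[i] = (
                t1,
                median(p[1] for p in neighbors),
                median(p[2] for p in neighbors),
            )
    return repaired


# Usage: the third point jumps 99 m in one second and is pulled back.
track = [(0, 0, 0), (1, 1, 0), (2, 100, 0), (3, 3, 0), (4, 4, 0)]
print(repair_track(track, max_speed=5.0))
```

A median is used here because it is insensitive to a single outlier inside the window; the statistic actually used by RCSWS, and how driving direction enters the repair, are not stated in the abstract.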