Approximation-Based Efficient Query Processing with the Earth Mover's Distance

Published: 2016, Last Modified: 01 Oct 2024, DASFAA (2) 2016, CC BY-SA 4.0
Abstract: The Earth Mover’s Distance (EMD) is an effective distance-based similarity measure that quantifies the dissimilarity between data objects as the minimum amount of work required to transform one signature into another. Although the EMD has been shown to reflect human perceptual similarity very well in prevalent applications and domains, its high computational time complexity hinders its application to large-scale datasets, where users are often more interested in receiving an answer within a short period of time than in obtaining an exact and complete query result set. To this end, we propose to improve the efficiency of query processing with the EMD on signature databases by utilizing signature compression approximations. We introduce an efficient signature compression algorithm to reduce query computation cost. Furthermore, we theoretically analyze the approximation-based EMD and its relationship to the original EMD. Moreover, our extensive experiments on four real-world datasets demonstrate the accuracy and efficiency of our approach.
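To make the "minimum amount of work" definition concrete, the following is a minimal sketch of the EMD for a special case only: one-dimensional signatures with equal total weight and ground distance |x - y|. It is not the compression algorithm or the approximation proposed in the paper. The function name `emd_1d` and the representation of a signature as a list of `(position, weight)` pairs are assumptions made for illustration. In this special case the optimal transport cost can be obtained by sweeping over the sorted positions and accumulating the mass that still has to be carried across each gap.

```python
def emd_1d(sig_p, sig_q):
    """EMD between two 1-D signatures (illustrative special case).

    Each signature is a list of (position, weight) pairs. This sketch
    assumes both signatures are non-empty, have positive total weight,
    and have equal total weight; the ground distance is |x - y|.
    """
    if not sig_p or not sig_q:
        raise ValueError("signatures must be non-empty")
    total_p = sum(w for _, w in sig_p)
    total_q = sum(w for _, w in sig_q)
    if total_p <= 0 or abs(total_p - total_q) > 1e-9:
        raise ValueError("this sketch assumes equal, positive total weight")

    # Signed mass events: +w where P supplies mass, -w where Q demands it.
    events = sorted([(x, w) for x, w in sig_p] + [(x, -w) for x, w in sig_q])

    work = 0.0      # accumulated transport cost
    carried = 0.0   # surplus (or deficit) of mass to the left of the sweep
    prev_x = events[0][0]
    for x, delta in events:
        # Everything still unmatched must cross the gap from prev_x to x.
        work += abs(carried) * (x - prev_x)
        carried += delta
        prev_x = x

    # Normalize by the total flow, which equals the total weight here.
    return work / total_p


p = [(0.0, 0.5), (2.0, 0.5)]
q = [(1.0, 1.0)]
print(emd_1d(p, q))
```

In this usage example, half of the mass moves from position 0 to 1 and the other half from position 2 to 1. General signatures with multi-dimensional representatives or unequal total weights require solving a transportation problem instead, which is the source of the computational cost the abstract refers to.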