MapDupReducer: detecting near duplicates over massive datasets

Published: 2010, Last Modified: 16 May 2023, SIGMOD Conference 2010, Readers: Everyone
Abstract: Near duplicate detection benefits many applications, e.g., on-line news selection over the Web by keyword search. The purpose of this demo is to show the design and implementation of MapDupReducer, a MapReduce-based system capable of efficiently detecting near duplicates over massive datasets.