Abstract: Highlights
• Traditional schemes overlook chunk context, which limits their effectiveness in resemblance detection.
• A chunk-context-aware approach is proposed to improve the accuracy and efficiency of resemblance detection.
• N-sub-chunk shingles and a BP neural network enable qualified chunk representation.
• The approach achieves up to 75.03% more data removal and 5.6x–86.7x faster detection than state-of-the-art work.