Reversible Sketch Based on the XOR-Based Hashing

Published: 01 Jan 2006, Last Modified: 31 Oct 2024, Venue: APSCC 2006, License: CC BY-SA 4.0
Abstract: We present a novel reversible sketch data structure with a sub-linear finding time proportional to the sketch length. We first introduce XOR-based hash functions over the Galois field GF({0,1}, ⊕, *), and define full-rank boolean matrices over GF({0,1}, ⊕, *) and the maximum dispersion among the hash functions. We then choose d non-singular boolean matrices at random to implement the random projection from the source address space {0,1}^n to the hash address space {0,1}^m, and use the inverse of one of these randomly chosen non-singular matrices to implement the reverse mapping. Based on the reversible sketch, we implement an algorithm that finds and estimates frequent items online with good accuracy. The estimation procedure uses a two-stage strategy consisting of an identification step and a verification step: the identification step generates the candidate frequent items, and the verification step then verifies these candidates. In experiments on a large amount of real Internet traffic data, the reversible sketch shows a large improvement in finding speed and some improvement in accuracy over current representative sketches such as the Count-Min sketch. Our preliminary results hint at the possibility of using the reversible sketch as a building block for network anomaly detection and distributed real-time traffic analysis.
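The XOR-based hashing and its reversal can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each matrix is a square n x n boolean matrix stored as a list of row bitmasks, that the m-bit hash is the low m bits of the matrix-vector product, and that a bucket is reversed by enumerating the n - m discarded bits; the abstract does not specify any of these details. All function names (`matvec`, `invert`, `random_nonsingular`, `xor_hash`, `preimages`) are hypothetical.

```python
import random

def matvec(rows, x):
    """Multiply a boolean matrix (list of row bitmasks) by bit-vector x over GF(2)."""
    y = 0
    for i, row in enumerate(rows):
        # Dot product over GF(2): AND the bits, then take the parity (XOR of all bits).
        bit = bin(row & x).count("1") & 1
        y |= bit << i
    return y

def invert(rows, n):
    """Gauss-Jordan inverse over GF(2); returns None if the matrix is singular."""
    a = list(rows)
    inv = [1 << i for i in range(n)]  # start from the identity matrix
    for col in range(n):
        pivot = next((r for r in range(col, n) if (a[r] >> col) & 1), None)
        if pivot is None:
            return None  # no pivot in this column: matrix is not full rank
        a[col], a[pivot] = a[pivot], a[col]
        inv[col], inv[pivot] = inv[pivot], inv[col]
        for r in range(n):
            if r != col and (a[r] >> col) & 1:
                a[r] ^= a[col]      # row addition over GF(2) is XOR
                inv[r] ^= inv[col]
    return inv

def random_nonsingular(n, rng):
    """Draw random n x n boolean matrices until one is invertible."""
    while True:
        rows = [rng.getrandbits(n) for _ in range(n)]
        inv = invert(rows, n)
        if inv is not None:
            return rows, inv

def xor_hash(rows, x, m):
    """Assumed truncation: keep the low m bits of A*x as the bucket index."""
    return matvec(rows, x) & ((1 << m) - 1)

def preimages(inv_rows, bucket, n, m):
    """Enumerate every n-bit key that maps to `bucket` (assumed reversal scheme)."""
    for hi in range(1 << (n - m)):
        yield matvec(inv_rows, (hi << m) | bucket)

rng = random.Random(0)
n, m = 16, 8
A, A_inv = random_nonsingular(n, rng)
key = 0xBEEF
bucket = xor_hash(A, key, m)
candidates = list(preimages(A_inv, bucket, n, m))
```

Under these assumptions, `candidates` holds the 2^(n-m) keys that share `key`'s bucket, and `key` itself is among them. This mirrors the identification step described in the abstract; a verification step would then check each candidate against the remaining hash functions.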