Keywords: perceptual hash; ResNet-18; feature binarization; fuzzy extractor; min-entropy; approximate nearest neighbor; HMAC; SHA-3; error-correcting codes; cryptographic tag
Abstract: We present a two-layer construction for image hashing.
First, a \emph{perceptual} binary code $c(x)$ is derived from a ResNet-18 embedding
(after global average pooling, $d=512$) via a linear projection and sign quantization;
optionally, a real-valued serialization of length $n=8ds$ bits is used.
The code $c(x)$ enables fast approximate nearest-neighbor search:
we empirically measure robustness to permissible transforms (low intra-BER),
separability of unrelated pairs (inter distances near $n/2$), bit balance and weak inter-bit correlations,
and we estimate a lower bound on the source min-entropy.
Second, $c(x)$ serves as a noisy source for a \emph{fuzzy extractor} producing a reproducible secret $R$ and public data $P$;
a cryptographic tag $T$ is then derived via KDF and HMAC/SHA-3.
This preserves similarity search over $c(x)$ while assigning cryptographic guarantees
(preimage, second-preimage, and collision resistance) to $T$, which reduce to the security of the underlying primitives
given sufficient post-publication min-entropy $H_\infty(C\,|\,P)$.
We discuss limitations of perceptual hashes (adversarial examples) and parameter selection
($n$, error-correction radius $t$, secret length $|R|$) driven by measured BER distributions and min-entropy estimates.
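
As an illustration of the two layers, the following minimal Python sketch walks through sign quantization of a linear projection, the bit error rate (BER) between two codes, and a code-offset fuzzy extractor followed by an HMAC/SHA-3 tag. It is a toy under stated assumptions, not the construction evaluated in the paper: the dimensions ($d=16$, $n=60$), the random projection, the 5-fold repetition code, the use of a plain SHA3-256 hash in place of a proper KDF or strong extractor, and all function names are hypothetical choices made for readability.

```python
import hashlib
import hmac
import random
import secrets

REP = 5  # assumed repetition factor: corrects up to 2 flipped bits per 5-bit block


def perceptual_code(embedding, projection):
    """Sign-quantize a linear projection: bit i is 1 iff <row_i, embedding> >= 0."""
    return [int(sum(w * e for w, e in zip(row, embedding)) >= 0) for row in projection]


def ber(a, b):
    """Bit error rate: normalized Hamming distance between equal-length codes."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def rep_encode(bits):
    """Repeat every key bit REP times."""
    return [b for b in bits for _ in range(REP)]


def rep_decode(bits):
    """Majority vote inside each block of REP bits."""
    blocks = [bits[i:i + REP] for i in range(0, len(bits), REP)]
    return [int(sum(block) > REP // 2) for block in blocks]


def derive_secret(key_bits):
    # Stand-in for the KDF / strong extractor (assumption): hash the key bits.
    return hashlib.sha3_256(bytes(key_bits)).digest()


def gen(code):
    """Enrollment: return (R, P). Assumes len(code) is a multiple of REP."""
    key_bits = [secrets.randbits(1) for _ in range(len(code) // REP)]
    helper = xor(code, rep_encode(key_bits))  # public data P (code offset)
    return derive_secret(key_bits), helper


def reproduce(noisy_code, helper):
    """Reproduction: recover R from a nearby code and the public data P."""
    return derive_secret(rep_decode(xor(noisy_code, helper)))


def tag(secret, context=b"image-tag"):
    """Cryptographic tag T: HMAC-SHA3-256 keyed with R over a fixed context label."""
    return hmac.new(secret, context, hashlib.sha3_256).hexdigest()


# Toy data: a random projection and embedding stand in for the ResNet-18 features.
rng = random.Random(0)
d, n = 16, 60
projection = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
embedding = [rng.gauss(0, 1) for _ in range(d)]
code = perceptual_code(embedding, projection)

# Simulate a permissible transform that flips two bits in different blocks.
noisy = list(code)
noisy[0] ^= 1
noisy[7] ^= 1

R, P = gen(code)
tags_match = tag(reproduce(noisy, P)) == tag(R)
```

In this sketch the tag is reproduced whenever each 5-bit block contains at most two errors, and three flips inside a single block change it. A repetition code is far from optimal and the helper data $P$ leaks information about $c(x)$, so realistic parameters would have to follow the measured BER distributions and the min-entropy estimates described above.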
Submission Number: 49