Hashing with Uncertainty Quantification via Sampling-based Hypothesis Testing

TMLR Paper2963 Authors

04 Jul 2024 (modified: 23 Sept 2024), Decision pending for TMLR, CC BY 4.0
Abstract: To quantify different types of uncertainty when deriving hash-codes for image retrieval, we develop a probabilistic hashing model (ProbHash). We then derive sampling-based hypothesis testing for hashing with uncertainty quantification (HashUQ) in ProbHash, which improves the granularity of hashing-based retrieval by prioritizing data with confident hash-codes. HashUQ can drastically improve retrieval performance without sacrificing computational efficiency. For efficient deployment of HashUQ in real-world applications, we discretize the quantified uncertainty to reduce the potential storage overhead. Experimental results show that HashUQ achieves state-of-the-art retrieval performance on three image datasets. We also provide ablation experiments, with performance comparisons, on model hyperparameters, different model components, and the effects of UQ.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
1. We reorganized Section 2.4 to present the two variant hashing model constructs of our method separately, for clearer presentation. We also added a paragraph discussing the connection to the variational information bottleneck.
2. We moved the literature review section to the end of the paper and added a paragraph discussing the differences and novelty of our work relative to existing works.
3. We moved the definition of mean average precision from the *Appendix* to Section 5.1 of the *Main Text* and added a section defining the Hadamard matrix in *Appendix* A.1.
4. We added new experimental results that empirically compare MC Dropout to Fully Factorized Gaussian in a newly added section, *Appendix* B.5, in support of our model choice.
5. We changed the notation to a double subscript to avoid clashes, as suggested by reviewer **Qmeu**.
6. We included histograms and Q-Q plots of the difference between paired samples of each Bernoulli success rate of hash vectors.
7. We included a comparison between the paired-sample t-test and a Shannon-entropy-based measure of uncertainty in *Appendix* C.1.
8. We added a discussion of the choice of likelihood function for the "center-targeted" construction in *Appendix* A.3.
Assigned Action Editor: ~Shiyu_Chang2
Submission Number: 2963