FlowHash: Accelerating Audio Search with Balanced Hashing via Normalizing Flow

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Audio fingerprinting, Indexing, Normalizing Flows, Information Retrieval, Self-supervised learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Nearest neighbor search on context representation vectors is challenging because of high dimensionality, scalability issues, and potential noise in query vectors. Our approach leverages normalizing flow within a self-supervised learning framework to tackle these challenges, specifically in the context of audio fingerprinting. Audio fingerprinting systems incorporate two key components: audio encoding and indexing. Existing systems treat these components independently, resulting in suboptimal performance. Our approach optimizes the interplay between these components, adapting the vectors to the indexing structure. Additionally, we distribute vectors in the latent $\mathbb{R}^K$ space using normalizing flow, resulting in balanced $K$-bit hash codes. This allows vectors to be indexed in a balanced hash table, where they are uniformly distributed across all $2^K$ possible hash buckets. This significantly accelerates retrieval, achieving speedups of up to 3$\times$ compared to Locality-Sensitive Hashing (LSH). We empirically demonstrate that our system is scalable, highly effective, and efficient in identifying short audio queries ($\leq$2s), particularly at high noise and reverberation levels.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4907
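
The balanced-hashing idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the flow architecture or the binarization rule, so the code below assumes (hypothetically) that the flow output behaves like an isotropic standard Gaussian in $\mathbb{R}^K$ and that each bit is the sign of one coordinate. Under those assumptions every coordinate is positive with probability 1/2 independently, so all $2^K$ buckets are about equally likely. A shifted Gaussian stands in for unbalanced raw embeddings. The names `hash_code`, `bucket_counts`, and `imbalance` are illustrative only.

```python
import random
from collections import Counter


def hash_code(vec):
    """Binarize a real vector into a K-bit integer, one sign bit per coordinate."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code


def bucket_counts(vectors, k):
    """Count how many vectors fall into each of the 2**k hash buckets."""
    counts = Counter(hash_code(v) for v in vectors)
    return [counts.get(b, 0) for b in range(2 ** k)]


def imbalance(counts):
    """Largest bucket size divided by the ideal (perfectly uniform) bucket size."""
    ideal = sum(counts) / len(counts)
    return max(counts) / ideal


K = 4        # code length (assumed small here so all 16 buckets are easy to inspect)
N = 16000    # number of synthetic vectors
rng = random.Random(0)

# Assumption: flow-transformed vectors look like samples from N(0, I).
balanced = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(N)]

# Assumption: raw embeddings have a non-zero mean, which skews the sign bits.
skewed = [[rng.gauss(1.0, 1.0) for _ in range(K)] for _ in range(N)]

balanced_ratio = imbalance(bucket_counts(balanced, K))
skewed_ratio = imbalance(bucket_counts(skewed, K))

print(f"balanced: largest bucket is {balanced_ratio:.2f}x the ideal size")
print(f"skewed:   largest bucket is {skewed_ratio:.2f}x the ideal size")
```

In this toy setting, the zero-mean vectors fill the buckets nearly evenly, while the shifted vectors concentrate in the all-ones bucket. Since a lookup has to scan the candidates in the query's bucket, the size of the largest bucket is a rough proxy for worst-case lookup cost, which is the intuition behind preferring balanced codes.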