Enabling Efficient and Privacy-Preserving Sequence Similarity Query on Encrypted Genomes

Published: 2025, Last Modified: 08 Jan 2026. Venue: IEEE Trans. Dependable Secur. Comput. 2025. License: CC BY-SA 4.0
Abstract: Over the past decades, Sequence Similarity Query (SSQ) has been widely used in genomic analysis. Several privacy-preserving SSQ schemes have been proposed to protect sensitive genomic data, but they struggle to balance security and efficiency. This paper proposes a Privacy-preserving Genomic SSQ (PGSSQ) scheme to address this trade-off. Specifically, we first design two fundamental privacy-preserving genomic matching methods, Edit-distance Threshold Match ($\mathsf{ETM}$) and Dual-Threshold Match ($\mathsf{DTM}$), which support approximate edit-distance-based threshold SSQ matches and range-constrained SSQ matches, respectively, on encrypted high-dimensional genomic sequences. Then, we present a genetic index structure called Genomic Evaluation Tree (GE-Tree), built on $\mathsf{ETM}$ and $\mathsf{DTM}$. GE-Tree enables dynamic pruning of query paths without disclosing any genomic information from encrypted nodes, thereby supporting privacy-preserving SSQ on encrypted genomic data with sublinear computational complexity. Security analysis proves that PGSSQ is secure under selective chosen-plaintext attacks. Experiments on a real-world dataset show that PGSSQ is efficient compared to state-of-the-art schemes.
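To make the matching predicates concrete, the following is a minimal plaintext sketch of an edit-distance threshold match. It is not the paper's $\mathsf{ETM}$ or $\mathsf{DTM}$ construction: it uses exact (not approximate) edit distance, involves no encryption, and the function names `edit_distance`, `threshold_match`, and `range_match` are hypothetical. Reading "range-constrained" as "distance within a lower and upper bound" is an assumption, since the abstract does not define it.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def threshold_match(query: str, candidate: str, tau: int) -> bool:
    """Plaintext analogue of a threshold match: distance at most tau."""
    return edit_distance(query, candidate) <= tau


def range_match(query: str, candidate: str, low: int, high: int) -> bool:
    """Assumed reading of a range-constrained match: low <= distance <= high."""
    return low <= edit_distance(query, candidate) <= high


if __name__ == "__main__":
    # Toy sequences, not taken from any dataset in the paper.
    database = ["ACGT", "AGGT", "TTTT"]
    print([s for s in database if threshold_match("ACGT", s, 1)])
```

A linear scan like the one in the usage lines compares the query against every sequence; the abstract's GE-Tree is described as avoiding this by pruning query paths, which is where the sublinear complexity claim comes from.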