A Locality-Sensitive-Hashing-Based Collaborative Recommendation Method for Responsible AI-Driven Recommender Systems

Wenmin Lin, Xinyi Zhou, Lu Sun, Lianyong Qi, Sang-Bing Tsai, Yihong Yang, Hanwen Liu, Huaizhen Kou, Lingzhen Kong

Published: 2025, Last Modified: 14 Jan 2026. IEEE Trans. Artif. Intell. 2025. License: CC BY-SA 4.0

Abstract: Although collaborative filtering (CF) is one of the most representative recommendation solutions, traditional CF models are typically limited in their ability to capture complex relationships between users and items in large-scale, sparse data. The rise of artificial intelligence (AI) provides powerful tools, such as deep neural networks, to overcome the data sparsity issue of typical CF models. Existing work on AI-driven recommender systems focuses on improving recommendation accuracy by using deep learning tools to extract rich user/item features from multisource data and build more accurate user preference models. How to make AI-driven recommender systems responsible remains a major challenge. On the one hand, AI-driven recommender systems need access to raw data such as user profiles to train recommendation models, which risks leaking users' private information. On the other hand, the ever-increasing volume of user/item interaction records makes it challenging to deliver recommendation results in real time. In view of these observations, we propose a collaborative recommendation method based on the locality-sensitive-hashing (LSH) technique, denoted ${\text{DisRec}_{\text{LSH}}}$, which aims to achieve both privacy protection and computational efficiency for AI-driven recommender systems. More specifically, we split ${\text{DisRec}_{\text{LSH}}}$ into three phases: 1) offline feature extraction: deep neural networks are applied to extract user/item features from multisource user/item data on each local dataset; 2) offline user index building: the LSH technique is adopted to map user features to hash codes so that similar users can be quickly recalled for a given target user; and 3) online top-N item calculation: given the similar users selected in phase 2, the top-N items are determined by ranking the predicted user–item rating scores. Finally, extensive experiments are conducted on three public datasets to evaluate the efficiency of our proposal.
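
To make phases 2 and 3 concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not name the hash family or the rating predictor, so two assumptions are made here: random-hyperplane (sign) hashing stands in for the LSH step, and the mean rating of the recalled neighbours stands in for the predicted user–item score. All function names and the toy feature vectors are hypothetical; in the paper the features come from deep neural networks (phase 1).

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes (assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(user_features, planes):
    """Phase 2 (offline): group users into buckets keyed by hash code."""
    index = defaultdict(set)
    for user, vec in user_features.items():
        index[hash_code(vec, planes)].add(user)
    return index


def similar_users(target, user_features, index, planes):
    """Recall the users that share the target user's bucket."""
    code = hash_code(user_features[target], planes)
    return index.get(code, set()) - {target}


def top_n(target, neighbours, ratings, n):
    """Phase 3 (online): rank unseen items by mean neighbour rating
    (an assumed stand-in for the paper's rating prediction)."""
    seen = ratings.get(target, {})
    totals, counts = defaultdict(float), defaultdict(int)
    for user in neighbours:
        for item, rating in ratings.get(user, {}).items():
            if item not in seen:
                totals[item] += rating
                counts[item] += 1
    scores = {item: totals[item] / counts[item] for item in totals}
    # Sort by descending score, breaking ties by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Toy data: bob points in the same direction as alice, carol in the opposite one.
user_features = {
    "alice": [1.0, 0.5],
    "bob": [2.0, 1.0],
    "carol": [-1.0, -0.5],
}
ratings = {
    "alice": {"i1": 5},
    "bob": {"i1": 4, "i2": 5, "i3": 3},
    "carol": {"i4": 5},
}

planes = make_hyperplanes(num_bits=8, dim=2)
index = build_index(user_features, planes)
neighbours = similar_users("alice", user_features, index, planes)
recommended = top_n("alice", neighbours, ratings, n=2)
```

Only hash codes are compared at lookup time, which is what lets such an index avoid exchanging raw user features and avoid a full pairwise similarity scan. In practice several hash tables are usually combined to trade recall against bucket size; that refinement is omitted here.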

External IDs: dblp:journals/tai/LinZSQTYLKK25