Abstract: Knowledge graphs are increasingly used to improve the accuracy and explainability of recommendations. Reinforcement learning agents that traverse a knowledge graph have been successfully applied to recommendation systems in the form of multi-hop relation reasoning. However, previous multi-hop methods rely on reinforcement learning with discrete actions, which makes the action-space design challenging and, because the actions are inconsistent, obscures their meaning. To address these issues, we propose Continuous-action Walking-tendency Interest-oriented Path Reasoning (CWIPR), a novel method that uses continuous actions produced by a reinforcement learning agent to predict the inference relation and the next entity. To enable the agent to interact with the knowledge graph through continuous actions, we first propose a graph search algorithm called the walking tendency algorithm. Moreover, we introduce an interest-oriented reward as an intrinsic reward that encourages the agent to balance exploring the most similar entities against exploring the correct recommendation type, yielding more precise recommendations. We extensively evaluate our method on three real-world Amazon datasets and obtain favorable performance compared with state-of-the-art methods.