Searching Society Over Large Heterogeneous Information Networks

Published: 2025, Last Modified: 15 Jan 2026, ICDE 2025, CC BY-SA 4.0
Abstract: Community search in heterogeneous information networks (HINs) has received great attention recently. It aims to group community members that are extensively connected via relationships derived from a meta-path, since no real relationships exist between members. However, in many applications, it is desirable that the derived relationships focus on certain query requirements and that the community members are seriously engaged. Moreover, to ensure sufficient flexibility, i.e., query requirements may be closely related or only loosely related, multiple communities may be needed to collectively cover all query requirements. To the best of our knowledge, no existing work provides such flexibility. In this paper, we propose a novel model called society. It first ensures that each derived relationship is related to the query requirements and that each community member is involved in at least k such derived relationships, i.e., the community members exhibit high homogeneous cohesiveness. Then, to ensure the serious engagement of each member, we propose a novel set of constraints called heterogeneous constraints, which ensure that each member seriously interacts with the heterogeneous vertices that constitute the derived relationships. Finally, the society model allows for finding a set of communities that collectively cover all requirements. The main challenge of searching for a society is to maintain the derived relationships efficiently and dynamically, since deleting a heterogeneous vertex that violates a heterogeneous constraint can induce dramatic changes in the derived relationships. We propose a novel unified peeling algorithm that controls the deletion of vertices violating homogeneous and heterogeneous cohesiveness and therefore provides opportunities for dynamically maintaining the derived relationships.
An effective dynamic data structure is then proposed to avoid re-computing the derived relationships. After that, batch update techniques are studied, which ensure that the time complexity of updating a batch is equivalent to that of a single update. Extensive experimental studies are conducted on real datasets to justify the effectiveness of the proposed model and the efficiency of the proposed techniques.
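
To make the peeling idea in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy author-paper network with the meta-path author-paper-author, treats a paper as related to the query when it shares a topic with the query, and reads the heterogeneous constraint as "each member has at least h query-related papers shared with another remaining member". The function name `peel_society_sketch`, the parameter `h`, and the data layout are all hypothetical. The sketch naively recomputes the derived relationships in every round, which is exactly the cost the proposed dynamic data structure is meant to avoid.

```python
from collections import defaultdict


def peel_society_sketch(paper_authors, paper_topics, query_topics, k, h):
    """Naive peeling over author-paper-author derived relationships.

    Assumptions (not from the paper): a paper is query-related if it shares
    a topic with query_topics; a member needs >= k derived neighbours and
    >= h query-related papers shared with another remaining member.
    """
    # Keep only papers related to at least one query requirement.
    papers = {
        p: set(authors)
        for p, authors in paper_authors.items()
        if paper_topics.get(p, set()) & query_topics
    }
    alive = set().union(*papers.values())

    changed = True
    while changed:
        changed = False
        # Recompute derived neighbours and shared-paper counts from scratch.
        neighbours = defaultdict(set)
        shared_papers = defaultdict(int)
        for authors in papers.values():
            live = authors & alive
            if len(live) < 2:
                continue  # this paper no longer induces any derived relationship
            for a in live:
                shared_papers[a] += 1
                neighbours[a] |= live - {a}
        # Peel every member that violates either constraint.
        for a in list(alive):
            if len(neighbours[a]) < k or shared_papers[a] < h:
                alive.discard(a)
                changed = True
    return alive


# Hypothetical toy data: papers -> authors, papers -> topics.
paper_authors = {
    "p1": {"a", "b", "c"},
    "p2": {"a", "b"},
    "p3": {"c", "d"},
    "p4": {"a", "b", "c"},
    "p5": {"d", "e"},
}
paper_topics = {
    "p1": {"db"},
    "p2": {"db"},
    "p3": {"ml"},
    "p4": {"graph"},
    "p5": {"db"},
}

result = peel_society_sketch(paper_authors, paper_topics, {"db", "graph"}, k=2, h=2)
print(sorted(result))  # ['a', 'b', 'c']
```

In this toy run, d and e are peeled because each has only one derived neighbour. Raising `h` to 3 removes c first, which then leaves a and b with a single neighbour each, so the whole group collapses. This illustrates how one deletion can cascade through the derived relationships.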