Abstract: Multi-view bipartite graph clustering (MVBGC) is an active line of work in unsupervised learning that addresses the limited scalability of traditional graph clustering. Despite improved performance, numerous variants still follow the conventional approach of plugging in additional modules, which yields increasingly intricate models and fails to reveal the inherent relationships among the variables. We make the first attempt to introduce probabilistic graphical models for the multi-view bipartite graph clustering task, reformulating it as a maximum likelihood estimation (MLE) problem. This setting uncovers the underlying probabilistic correlations among the commonality, view-specific variables, and noisy components. By pruning redundancy and disturbance, collectively referred to as noise, we prove that minimizing the total noise approximates the lower bound of the MLE for multi-view data observations. We further generalize the MLE setting with clustering-suited constraints, deriving a Generalized Probabilistic Graphical Modeling framework (GProM) that is interpretable, concise, and flexible. Extensive experiments verify the effectiveness of our framework. Furthermore, statistical significance analysis reveals the effectiveness of different distribution assumptions, providing valuable insights for model design.
External IDs: dblp:journals/pami/LiPYZLZLLTL25