Abstract: Graph clustering is an important unsupervised learning technique for partitioning graphs with attributes and detecting communities. However, current methods struggle to capture true community structures and intra-cluster relations accurately, to remain computationally efficient, and to identify smaller communities. We address these challenges by integrating coarsening and modularity maximization, effectively leveraging both adjacency and node features to enhance clustering accuracy. We propose a loss function incorporating log-determinant, smoothness, and modularity components, optimized via a block majorization-minimization technique, resulting in superior clustering outcomes. The method is theoretically consistent under the Degree-Corrected Stochastic Block Model (DC-SBM), ensuring asymptotic error-free performance and complete label recovery. Our provably convergent and time-efficient algorithm seamlessly integrates with Graph Neural Networks (GNNs) and Variational Graph AutoEncoders (VGAEs) to learn enhanced node features and deliver exceptional clustering performance. Extensive experiments on benchmark datasets demonstrate its superiority over existing state-of-the-art methods for both attributed and non-attributed graphs.
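The abstract describes a loss built from log-determinant, smoothness, and modularity components over a coarsening (soft cluster-assignment) matrix. The sketch below is purely illustrative of how such a three-term objective can be assembled, not the paper's exact formulation: the function name, the weights `alpha`/`beta`/`gamma`, and the rank-correcting `J` term inside the log-determinant are all assumptions.

```python
import numpy as np

def magc_style_loss(A, X, C, alpha=1.0, beta=1.0, gamma=1.0):
    """Illustrative combination of the three loss terms from the abstract:
    log-determinant, feature smoothness, and (negative) modularity.
    A: (n, n) adjacency, X: (n, f) node features, C: (n, k) soft assignments.
    The weights alpha/beta/gamma and the J regularizer are assumptions."""
    d = A.sum(axis=1)                      # degree vector
    two_m = d.sum()                        # 2 * number of edges
    L = np.diag(d) - A                     # combinatorial graph Laplacian

    # Laplacian of the coarsened graph induced by the assignment matrix C.
    L_c = C.T @ L @ C

    # Modularity: trace(C^T B C) / 2m, with modularity matrix B.
    B = A - np.outer(d, d) / two_m
    modularity = np.trace(C.T @ B @ C) / two_m

    # Smoothness of the coarsened features on the coarsened graph.
    X_c = C.T @ X
    smoothness = np.trace(X_c.T @ L_c @ X_c)

    # Log-det term; adding J = (1/k) 11^T is a common trick to handle the
    # Laplacian's zero eigenvalue before taking the determinant.
    k = C.shape[1]
    J = np.ones((k, k)) / k
    _, logdet = np.linalg.slogdet(L_c + J)

    # Minimizing this encourages connectivity (log-det), smooth features,
    # and high modularity (hence the minus sign on the modularity term).
    return -alpha * logdet + beta * smoothness - gamma * modularity
```

For a hard (0/1) assignment matrix this reduces to evaluating a fixed partition; in the soft case, `C` would be the variable being optimized (e.g., by the block majorization-minimization scheme the abstract mentions).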
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=wvySSGOLzf
Changes Since Last Submission: Based on the feedback received from the reviewers, we have made the following changes. We have uploaded a PDF with the changes highlighted in color (as the main submission PDF). A side-by-side diff PDF highlighting the changes from the last version is available at https://anonymous.4open.science/api/repo/MAGC-8880/file/TMLR%20Final%20Diff.pdf
## Major Changes
- Merged the Background and Related Works sections for conciseness and readability.
- Moved the detailed literature survey to Appendix B.
- Explained why the log-det term helps with inter-connectivity.
- Added a short proof deriving the majorized function (Equations 7-10).
- Added a proof sketch for the convergence analysis.
- Added reasons for choosing DC-SBM for the consistency analysis in Section 3.
- Moved the consistency proof to Appendix G and added a proof sketch in the main paper.
- Clarified the importance of the consistency analysis in the Introduction and Section 3.
- Added a datasets table in the main paper (Table 1).
- Added hyperparameter tuning details in Appendix K.1.
- Moved the figure showing the evolution of the various loss terms during training from the appendix to the main paper (Figure 3b).
- Added a hyperparameter sensitivity ablation study in Appendix K.2.
- Added a discussion on the visualization of latent spaces.
- Added a discussion on the performance of Q-GCN/VGAE/GMM-VGAE.
- Reordered the appendices based on first occurrence.
## Minor Changes
- Improved the captions of the figures and increased sizes.
- Clarified meaning of Q-MAGC in the Introduction.
- Fixed minor grammatical and typing errors in the paper.
---
### **Previous changelog**
We have made the following changes which incorporate Reviewer BtxT's feedback:
- Clarified the first reference of Q-MAGC.
- Defined the soft and hard versions of the cluster assignment matrix C more clearly.
- Corrected grammatical errors.
- Added the reason behind using DC-SBM in Appendix I.
Here is the [PDF diff (anonymized link)](https://anonymous.4open.science/api/repo/MAGC-8880/file/diffchecker%20rebuttal.pdf?v=afcb6801) for convenience.\
The old version is on the left and the new version is on the right.\
Text in red is removed, text in green is added, and text in blue is moved without edits.
Assigned Action Editor: ~Seungjin_Choi1
Submission Number: 3038