Abstract: In this paper, we introduce the concept of principal communities and propose a principal graph encoder embedding method that simultaneously detects these communities and produces a vertex embedding. Given a graph adjacency matrix with vertex labels, the method computes a sample community score for each community, ranks the communities by score to measure their importance, and estimates a set of principal communities. The method then produces a vertex embedding by retaining only the dimensions corresponding to these principal communities. Theoretically, we define the population version of the encoder embedding and the community score based on a random Bernoulli graph distribution. We prove that the population principal graph encoder embedding preserves the conditional density of the vertex labels and that the population community score successfully distinguishes the principal communities. We conduct a variety of simulations to demonstrate the finite-sample accuracy in detecting ground-truth principal communities, as well as the advantages in embedding visualization and subsequent vertex classification. The method is further applied to a set of real-world graphs, showcasing its numerical advantages, including robustness to label noise and computational scalability.
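The pipeline described in the abstract (embed, score each community, keep the top-scoring dimensions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the embedding uses the standard one-hot encoder form Z = A W with W[i, k] = 1/n_k when vertex i has label k, and the score is a simple stand-in (range of class-conditional means of a dimension divided by its overall standard deviation), not the community score defined in the paper. The function names and the `n_keep` parameter are hypothetical.

```python
import numpy as np

def simulate_graph(B, sizes, seed=0):
    """Sample an undirected Bernoulli graph from block probabilities B (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    y = np.repeat(np.arange(len(sizes)), sizes)   # vertex labels 0..K-1
    P = B[y][:, y]                                # per-edge connection probabilities
    U = rng.random(P.shape)
    A = np.triu((U < P).astype(float), k=1)       # sample upper triangle, no self-loops
    return A + A.T, y

def encoder_embedding(A, y, K):
    """One-hot encoder embedding Z = A @ W, with W[i, k] = 1/n_k if y[i] == k."""
    n = A.shape[0]
    counts = np.bincount(y, minlength=K)
    W = np.zeros((n, K))
    W[np.arange(n), y] = 1.0 / counts[y]
    return A @ W

def community_scores(Z, y, K):
    """Stand-in score (an assumption, not the paper's definition):
    range of class-conditional means per dimension over the overall std."""
    means = np.vstack([Z[y == c].mean(axis=0) for c in range(K)])
    spread = means.max(axis=0) - means.min(axis=0)
    return spread / (Z.std(axis=0) + 1e-12)

def principal_embedding(A, y, K, n_keep):
    """Keep only the n_keep highest-scoring dimensions (n_keep is a hypothetical parameter)."""
    Z = encoder_embedding(A, y, K)
    scores = community_scores(Z, y, K)
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])
    return Z[:, keep], keep, scores

# Three communities; the third connects to every vertex with the same probability,
# so its embedding dimension carries no label information.
B = np.array([[0.5, 0.1, 0.2],
              [0.1, 0.5, 0.2],
              [0.2, 0.2, 0.2]])
A, y = simulate_graph(B, [100, 100, 100], seed=0)
Z_principal, keep, scores = principal_embedding(A, y, K=3, n_keep=2)
print("kept dimensions:", keep)
print("scores:", np.round(scores, 2))
```

In this toy setting the third dimension has nearly identical means across all labels, so any score that measures label dependence should rank it last. How the paper thresholds the scores to choose the number of principal communities is not stated in the abstract, which is why the sketch takes `n_keep` as an input.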
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Updated based on reviewer suggestions:
1. Revised the abstract, introduction, and method sections to better highlight the main contributions of the proposed method.
2. Added Section 3.5 on the population community score to address the theoretical gap identified in the initial submission.
3. Introduced Section 5.3 on the noisy setting to more effectively illustrate the numerical advantages of the proposed method.
Assigned Action Editor: ~Varun_Kanade1
Submission Number: 2791