Propagate & Distill: Towards Effective Graph Learners Using Propagation-Embracing MLPs

Published: 18 Nov 2023, Last Modified: 29 Nov 2023, LoG 2023 Poster
Keywords: Graph neural network; Knowledge distillation; Propagation; Multilayer perceptron
TL;DR: Distillation from teacher GNNs to MLPs with propagation
Abstract: Recent studies have attempted to use multilayer perceptrons (MLPs) for semi-supervised node classification on graphs by training a student MLP via knowledge distillation from a teacher graph neural network (GNN). While prior work has focused mostly on training the student MLP to match the output probability distributions of the teacher and student models during distillation, it has not been systematically studied how to inject structural information into the student in an explicit and interpretable manner. Inspired by GNNs that separate feature transformation $T$ from propagation $\Pi$, we reframe the distillation process as making the student MLP learn both $T$ and $\Pi$. Although this can be achieved by applying the inverse propagation $\Pi^{-1}$ to the teacher's output before distillation, doing so incurs a high computational cost from large matrix multiplications during training. To address this problem, we propose Propagate \& Distill (P\&D), which instead propagates the output of the teacher before distillation; this step can be interpreted as an approximation of the inverse propagation $\Pi^{-1}$. We demonstrate that P\&D readily improves the performance of the student MLP.
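
The following is a minimal PyTorch-style sketch of the propagate-then-distill idea summarized in the abstract. It is an illustrative assumption of one possible realization, not the authors' implementation: the APPNP-style iterative propagation, the hyperparameters (alpha, num_steps, tau, lam), and the helper names propagate and pd_loss are all hypothetical.

import torch
import torch.nn.functional as F

def propagate(signal, A_hat, num_steps=10, alpha=0.1):
    # Iterative, personalized-PageRank-style propagation: applies an
    # approximation of the propagation operator Pi to a node-level signal.
    out = signal
    for _ in range(num_steps):
        out = (1 - alpha) * torch.sparse.mm(A_hat, out) + alpha * signal
    return out

def pd_loss(student_logits, teacher_logits, A_hat, labels, train_idx, tau=1.0, lam=1.0):
    # Propagate the teacher's soft predictions over the graph *before*
    # distillation, so that structural information is injected into the
    # target that the structure-agnostic student MLP imitates.
    target = propagate(F.softmax(teacher_logits / tau, dim=1), A_hat)
    target = target / target.sum(dim=1, keepdim=True)  # renormalize rows
    kd = F.kl_div(F.log_softmax(student_logits / tau, dim=1), target,
                  reduction='batchmean')
    ce = F.cross_entropy(student_logits[train_idx], labels[train_idx])
    return ce + lam * kd

Here A_hat stands for a sparse, symmetrically normalized adjacency matrix with self-loops. The student MLP sees only node features; the graph structure enters the training signal solely through the propagated distillation target.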
Submission Type: Extended abstract (max 4 main pages).
Submission Number: 62