TopoPool: An Adaptive Graph Pooling Layer for Extracting Molecular and Protein Substructures

Published: 25 Oct 2023, Last Modified: 10 Dec 2023, AI4D3 2023 Poster
Keywords: graph pooling, molecular representation learning, protein representation learning
TL;DR: We introduce the first learnable graph pooling layer that makes no assumptions about the number or size of the learned pools, a particularly desirable quality in molecular and protein representation learners.
Abstract: Within molecules and proteins, discrete substructures affect high-level properties and behavior in distinct ways. As such, explicitly locating and accounting for these substructures is a central problem when learning molecular or protein representations. Because molecules and proteins are typically represented as graphs, this task falls under the umbrella of graph pooling, or segmentation. Given the highly variable size, number, and topology of these substructures, an ideal pooling algorithm would adapt on a graph-by-graph basis and use local context to locate optimal pools. However, this poses a challenge where differentiability is concerned, and each of the learnable graph pooling methods proposed to date must make strong a priori assumptions regarding the number or size of the learned pools. Demand therefore remains for a graph pooling algorithm that can maintain differentiability while retaining adaptability in the size and number of learned pools. To meet this demand, we introduce the Topographical Pooling Layer (TopoPool): a differentiable, hierarchical graph pooling layer that learns an arbitrary number of variably sized pools without making any a priori assumptions about their number or size. Additionally, it naturally uncovers only connected substructures, increasing the interpretability of the learned pools and obviating the need for exogenous regularizers to enforce connectedness. We evaluate TopoPool on diverse molecular and protein property prediction tasks, where we achieve competitive performance against existing methods. Taken together, TopoPool represents a novel addition to the graph pooling toolbox, and is particularly relevant to areas like drug design, where locating and optimizing discrete, connected molecular substructures is of central importance.
Submission Number: 21
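
To make the connectedness property mentioned in the abstract concrete, the sketch below checks whether each pool in a hard node-to-pool assignment induces a connected subgraph. It is not the TopoPool algorithm, which the abstract does not specify; the function name `connected_pools`, the edge-list input, and the dictionary-based assignment are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def connected_pools(edges, assignment):
    """Report, per pool, whether its nodes induce a connected subgraph.

    edges:      iterable of undirected (u, v) pairs (hypothetical input format)
    assignment: dict mapping node -> pool id (a hard assignment is assumed)
    Returns a dict {pool_id: bool}.
    """
    # Keep only edges whose two endpoints belong to the same pool.
    adj = defaultdict(set)
    for u, v in edges:
        if assignment[u] == assignment[v]:
            adj[u].add(v)
            adj[v].add(u)

    # Group nodes by pool.
    members = defaultdict(set)
    for node, pool in assignment.items():
        members[pool].add(node)

    # Breadth-first search inside each pool; the pool is connected
    # exactly when the search reaches every member.
    result = {}
    for pool, nodes in members.items():
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        result[pool] = seen == nodes
    return result


# Toy example: a six-node chain 0-1-2-3-4-5.
chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
contiguous = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
split_ends = {0: 0, 1: 1, 2: 1, 3: 1, 4: 1, 5: 0}

print(connected_pools(chain, contiguous))  # both pools are connected
print(connected_pools(chain, split_ends))  # pool 0 (nodes 0 and 5) is not
```

Note that the two pools in the toy example need not have equal size, and nothing in the check fixes the number of pools in advance. A pooling method that yields only connected pools by construction, as the abstract describes, would pass this check without an added regularizer.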