Higher-Order Molecular Learning: The Cellular Transformer

Published: 06 Mar 2025, Last Modified: 26 Apr 2025 · GEM · CC BY 4.0
Track: Machine learning: computational method and/or computational results
Nature Biotechnology: No
Keywords: Topological Deep Learning, Attention, Higher Order, Molecule, Representation
TL;DR: We introduce the Cellular Transformer (CT) that extends transformers to cell complexes, incorporating higher-order attention for a more native modeling of molecular structures.
Abstract:

We present the Cellular Transformer (CT), a novel topological deep learning (TDL) framework that extends graph transformers to regular cell complexes (CCs), enabling improved modeling of higher-order molecular structures. Representing complex biomolecules effectively is notoriously challenging due to the delicate interplay between geometry (the physical conformation of molecules) and topology (their connectivity and higher-order relationships). Traditional graph-based models often struggle with these complexities, either ignoring higher-order topological features or addressing them in ad hoc ways. In this work, we introduce a principled cellular transformer mechanism that natively incorporates topological cues (e.g., higher-order bonds, loops, and fused rings). To complement this, we propose the notion of an augmented molecular cell complex, a novel and richer representation of molecules that leverages ring-level motifs and features. Our evaluations on the MoleculeNet benchmark and on graph datasets lifted into CCs reveal consistent performance gains over GNN- and transformer-based architectures. Notably, our approach achieves these gains without relying on graph rewiring, virtual nodes, or in-domain structural encodings, indicating the power of topologically informed attention to capture the subtle, global interactions vital to drug discovery and molecular property prediction.
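To make the idea of attention over a cell complex concrete, below is a minimal, illustrative sketch of one possible higher-order attention block, written only from the abstract's description. The class name `HigherOrderAttention`, the choice of incidence-masked attention between k-cells and (k+1)-cells, and the toy dimensions are all assumptions for illustration, not the authors' actual CT architecture.

```python
# Hypothetical sketch: attention between k-cells (e.g., edges) and (k+1)-cells
# (e.g., rings) of a molecular cell complex, masked by an incidence matrix so
# that a cell only attends to cells it is incident to. This is one assumed form
# of "topologically informed" attention, not the paper's exact mechanism.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HigherOrderAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.heads, self.dk = heads, dim // heads
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x_src, x_tgt, incidence):
        # x_src: (n_src, dim) features of k-cells (queries)
        # x_tgt: (n_tgt, dim) features of (k+1)-cells (keys/values)
        # incidence: (n_src, n_tgt) binary matrix; 1 where cells are incident
        n_src, n_tgt = x_src.size(0), x_tgt.size(0)
        q = self.q(x_src).view(n_src, self.heads, self.dk).transpose(0, 1)  # (h, n_src, dk)
        k = self.k(x_tgt).view(n_tgt, self.heads, self.dk).transpose(0, 1)  # (h, n_tgt, dk)
        v = self.v(x_tgt).view(n_tgt, self.heads, self.dk).transpose(0, 1)  # (h, n_tgt, dk)
        scores = q @ k.transpose(-2, -1) / self.dk ** 0.5                   # (h, n_src, n_tgt)
        scores = scores.masked_fill(incidence.unsqueeze(0) == 0, float("-inf"))
        attn = torch.nan_to_num(F.softmax(scores, dim=-1))  # rows with no incident cells -> 0
        out = (attn @ v).transpose(0, 1).reshape(n_src, -1)                 # (n_src, dim)
        return self.out(out)


# Toy usage: a complex with 5 edges (1-cells) and 2 rings (2-cells).
edges = torch.randn(5, 32)
rings = torch.randn(2, 32)
incidence = torch.tensor(
    [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]], dtype=torch.float
)
layer = HigherOrderAttention(dim=32, heads=4)
print(layer(edges, rings, incidence).shape)  # torch.Size([5, 32])
```

In this sketch, the incidence mask plays the role that the adjacency structure plays in a graph transformer, restricting attention to topologically related cells across dimensions; a full model would stack such blocks over several neighborhood structures (boundary, coboundary, adjacency) of the cell complex.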

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: Tolga Birdal
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other complementary reasons.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 116