JUST ADD STRUCTURE: PROTEIN LANGUAGE MODELS COMBINED WITH STRUCTURAL EQUIVARIANCE EXCEL AT PROTEIN TASKS
Keywords: sequence, structure, protein, egnn, equivariance, language models
TL;DR: We contend that prioritizing higher-fidelity representations and biology-aware architectures will yield significantly greater dividends in protein modeling than indiscriminate parameter scaling or elaborate adaptation of PLMs.
Abstract: Accurate in silico prediction of protein properties, functional fitness, and mutational effects remains a central challenge in protein engineering and therapeutic design. While Protein Language Models (PLMs) successfully capture rich evolutionary and functional constraints from sequence data, they only indirectly encode the spatial and geometric information that fundamentally governs protein function. Consequently, state-of-the-art approaches typically rely on extensive fine-tuning, ensembling, or the incorporation of handcrafted structural features to achieve competitive accuracy, making them computationally expensive and difficult to scale. In this work, we demonstrate that explicit geometric modeling can substitute for, and in most cases outperform, large-scale PLM fine-tuning, with much higher parameter efficiency. Our approach, ProtEGNN, pairs PLM residue representations with a lightweight E(3)-Equivariant Graph Neural Network, and competes with or achieves state-of-the-art performance across seven benchmarks in protein property, mutational effect, and function prediction, while needing 100–1000× fewer parameters than competing approaches. Notably, even when paired with the smallest readily available PLM, ESM2-T6 (8M parameters), ProtEGNN matches fine-tuned, sequence-only methods on mutational effect prediction, despite training orders of magnitude fewer parameters. Together, these results highlight geometric inductive bias as a powerful and scalable alternative to task-specific fine-tuning of large PLMs for protein modeling.
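To make the abstract's core idea concrete, below is a minimal, hypothetical sketch (not the authors' ProtEGNN code) of the general pattern it describes: a single E(3)-equivariant message-passing layer in the style of Satorras et al. (2021) that consumes frozen per-residue PLM embeddings as invariant node features and Cα coordinates as equivariant positions. The hidden width, the k-NN contact graph, and the use of ESM2-T6's 320-dim embeddings are all illustrative assumptions.

```python
# Hypothetical sketch of a PLM + E(3)-equivariant GNN pairing; layer design
# follows the generic EGNN recipe, not the paper's exact architecture.
import torch
import torch.nn as nn

class EGNNLayer(nn.Module):
    """One EGNN layer: invariant features h are updated from invariant
    quantities only; coordinates x are updated along relative vectors,
    so the layer is equivariant to rotations, reflections, and translations."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        # Edge MLP sees h_i, h_j, and the squared pairwise distance (all invariant).
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
        )
        # Scalar weight for the equivariant coordinate update.
        self.coord_mlp = nn.Sequential(
            nn.Linear(hidden, hidden), nn.SiLU(), nn.Linear(hidden, 1)
        )
        # Node MLP updates invariant features from aggregated messages.
        self.node_mlp = nn.Sequential(
            nn.Linear(dim + hidden, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def forward(self, h, x, edge_index):
        src, dst = edge_index                      # (E,) residue indices
        rel = x[src] - x[dst]                      # (E, 3) relative positions
        d2 = (rel ** 2).sum(-1, keepdim=True)      # (E, 1) invariant distances
        m = self.edge_mlp(torch.cat([h[src], h[dst], d2], dim=-1))
        # Equivariant coordinate update: scale relative vectors by a learned scalar.
        x = x + torch.zeros_like(x).index_add_(0, dst, rel * self.coord_mlp(m))
        # Invariant feature update: sum incoming messages per node, then residual MLP.
        agg = torch.zeros(h.size(0), m.size(-1), device=h.device).index_add_(0, dst, m)
        return h + self.node_mlp(torch.cat([h, agg], dim=-1)), x

# Usage with assumed shapes: frozen ESM2-T6 embeddings (320-dim) as node
# features, Cα coordinates from structure, edges from a contact graph.
N, D = 128, 320
h = torch.randn(N, D)                        # per-residue PLM embeddings
x = torch.randn(N, 3)                        # Cα coordinates
edge_index = torch.randint(0, N, (2, 512))   # e.g. k-NN contact graph
layer = EGNNLayer(dim=D)
h, x = layer(h, x, edge_index)
```

Because only the small GNN head is trained while the PLM stays frozen, the trainable parameter count is a few hundred thousand rather than the millions to billions involved in fine-tuning the PLM itself, which is the parameter-efficiency argument the abstract makes.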
Presenter: ~Qurat_ul_ain1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: No, the presenting author of this submission does not fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 108