Keywords: Protein, Graph Neural Network
Abstract: Graphs are widely used to represent proteins. However, proteins fold into complex three-dimensional conformations that graphs built only from sequence adjacency or distance cutoffs do not fully capture. In this paper, we find that a more faithful characterization comes from secondary structure elements, such as $\alpha$-helices and $\beta$-sheets, which reflect recurring local motifs and stabilizing hydrogen-bond patterns. To this end, we propose a graph neural network framework that augments node representations with the secondary structure assignment of each residue and introduces a novel edge-construction strategy based on hydrogen bonds weighted by their energetic strength. This formulation captures both local structural context and the long-range couplings essential to protein stability. On commonly used benchmarks, our model achieves the highest accuracy among the compared state-of-the-art methods while providing improved interpretability through biologically meaningful edges. These results highlight the promise of secondary-structure-aware, energy-weighted graphs as an effective inductive bias for protein representation learning.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 21924
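
The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a three-state secondary structure alphabet (`H`, `E`, `C`), a precomputed list of hydrogen bonds given as hypothetical `(donor, acceptor, energy)` tuples in kcal/mol, an energy cutoff of -0.5 kcal/mol borrowed from the DSSP convention, and an edge weight equal to the negated bond energy. The function names are illustrative only.

```python
import numpy as np

# Assumed 3-state alphabet: helix (H), strand (E), coil (C).
SS_CLASSES = "HEC"


def one_hot_secondary_structure(ss):
    """Return an (n_residues, 3) one-hot matrix for a 3-state SS string."""
    index = {label: i for i, label in enumerate(SS_CLASSES)}
    features = np.zeros((len(ss), len(SS_CLASSES)), dtype=np.float32)
    for pos, label in enumerate(ss):
        features[pos, index[label]] = 1.0
    return features


def hbond_edges(hbonds, cutoff=-0.5):
    """Build directed donor -> acceptor edges from hydrogen bonds.

    hbonds: iterable of (donor_idx, acceptor_idx, energy_kcal_per_mol).
    Bonds with energy >= cutoff are dropped (assumed threshold).
    The edge weight is -energy, so stronger bonds get larger weights
    (an assumed weighting, not taken from the paper).
    """
    kept = [(d, a, e) for d, a, e in hbonds if e < cutoff]
    edge_index = np.array(
        [[d for d, _, _ in kept], [a for _, a, _ in kept]], dtype=np.int64
    )
    edge_weight = np.array([-e for _, _, e in kept], dtype=np.float32)
    return edge_index, edge_weight


# Toy example: a six-residue fragment with one helical turn.
ss = "CHHHHC"
hbonds = [(4, 0, -2.1), (5, 1, -1.4), (3, 2, -0.2)]  # last bond is too weak
x = one_hot_secondary_structure(ss)
edge_index, edge_weight = hbond_edges(hbonds)
```

The resulting `x`, `edge_index`, and `edge_weight` arrays follow the node-feature and edge-list layout that common graph neural network libraries accept, so they could be passed to a message-passing layer that supports scalar edge weights.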