Abstract: Structural shifts pose a significant challenge for graph neural networks, as graph topology acts as a covariate that can vary across domains. Existing domain generalization methods rely on fixed structural augmentations or on training with globally perturbed graphs, mechanisms that do not pinpoint which specific edges encode domain-invariant information. We argue that domain-invariant structural information is not rigidly tied to a single topology but resides in the consensus across multiple graph structures derived from topology and feature similarity. To capture this, we first propose EdgeMask-DG, a novel min-max algorithm in which an edge masker learns to find worst-case continuous masks subject to a sparsity constraint, compelling a task GNN to perform effectively under these adversarial structural perturbations. Building upon this, we introduce EdgeMask-DG*, an extension that applies this adversarial masking principle to an enriched graph. This enriched graph combines the original topology with feature-derived edges, allowing the model to discover invariances even when the original topology is noisy or domain-specific. At equilibrium, the structural patterns that the task GNN relies upon are, by design, robust and generalizable. EdgeMask-DG* is the first method to systematically combine adaptive adversarial topology search with feature-enriched graphs. We provide a formal justification for our approach from a robust optimization perspective. We demonstrate that EdgeMask-DG* achieves new state-of-the-art performance on diverse graph domain generalization benchmarks, including citation networks, social networks, and temporal graphs. Notably, on the Cora OOD benchmark, EdgeMask-DG* lifts the worst-case domain accuracy to 78.0%, a +3.8 pp improvement over the prior state of the art (74.2%). The source code for our experiments is available at https://anonymous.4open.science/r/TMLR-EAEF/
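One way to read the min-max objective described in the abstract is sketched below. The notation is assumed here for exposition and is not taken from the paper: $f_\theta$ is the task GNN, $M_\phi$ is the edge masker producing a continuous mask with entries in $[0,1]$ and the same shape as the adjacency matrix $\mathbf{A}_s$ of source domain $s \in \mathcal{S}$, $\mathcal{L}$ is the task loss, and $\Omega(\cdot) \le \kappa$ is a placeholder for the sparsity constraint, whose exact form the abstract does not specify.

$$
\min_{\theta}\ \max_{\phi}\ \sum_{s \in \mathcal{S}} \mathcal{L}\Big(f_\theta\big(\mathbf{X}_s,\ \mathbf{A}_s \odot M_\phi(\mathbf{X}_s, \mathbf{A}_s)\big),\ \mathbf{Y}_s\Big) \quad \text{s.t.} \quad \Omega\big(M_\phi(\mathbf{X}_s, \mathbf{A}_s)\big) \le \kappa \ \ \text{for all } s \in \mathcal{S}
$$

Under this reading, the inner maximization searches for the mask that hurts the task GNN most within the sparsity budget, and the outer minimization trains the GNN to remain accurate under that mask.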
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Replaced the Section 4.4 core-hypothesis sentence with a version explicitly grounded in the Graph-DG assumption that $P(\mathbf{X})$ and $P(\mathbf{Y}\mid \mathbf{X})$ remain stable while $P(\mathbf{A}\mid S)$ varies, clarifying that invariance is defined purely over source domains and does not require target access.
Revised the Section 4.6 augment-then-prune rationale to clarify that the objective does not indiscriminately remove universally useful edges, explained the role of redundancy across domains, restated dependence on the Graph-DG assumption, and connected the mechanism to empirical pruning statistics and augmentation ablations.
Updated Algorithm 1 input (line 1) to include the kNN sample ratio $\gamma_{knn}$ alongside the spectral sample ratio $\gamma_{spec}$.
Modified Algorithm 1 preprocessing (line 3) to precompute both kNN edges and spectral edges for all source graphs.
Revised Algorithm 1 sampling steps (lines 8–9) to sample both kNN and spectral edges using $\gamma_{knn}$ and $\gamma_{spec}$, and to combine original, kNN, and spectral edges in the coalesced union graph.
Strengthened the scalability discussion in Section 4.6 by explicitly stating that the spectral variant is the least scalable component, is suited to moderate-scale settings, and that larger graphs require approximate or sparse enrichment schemes.
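The Algorithm 1 changes listed above (precompute kNN and spectral candidate edges, sample each with its own ratio, then coalesce them with the original edges) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the use of cosine similarity for kNN, uniform sampling without replacement, and the treatment of spectral edges as a precomputed input are all assumptions made for the example.

```python
import numpy as np


def knn_edges(x, k):
    """Directed kNN candidate edges, shape (2, n*k), from cosine similarity (assumed metric)."""
    x = x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)           # exclude self-loops from the neighbours
    nbrs = np.argsort(-sim, axis=1)[:, :k]   # k most similar nodes per row
    src = np.repeat(np.arange(x.shape[0]), k)
    return np.stack([src, nbrs.reshape(-1)])


def sample_edges(edges, ratio, rng):
    """Keep a uniformly sampled fraction `ratio` of the candidate edges."""
    m = edges.shape[1]
    keep = rng.choice(m, size=int(round(ratio * m)), replace=False)
    return edges[:, keep]


def enriched_graph(orig, knn, spec, gamma_knn, gamma_spec, rng):
    """Union of original, sampled kNN, and sampled spectral edges, with duplicates removed."""
    parts = [
        orig,
        sample_edges(knn, gamma_knn, rng),
        sample_edges(spec, gamma_spec, rng),
    ]
    # np.unique over columns plays the role of the "coalesce" step.
    return np.unique(np.concatenate(parts, axis=1), axis=1)


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))                  # toy node features
orig = np.array([[0, 1, 2], [1, 2, 3]])      # toy original edges
knn = knn_edges(x, k=2)                      # precomputed once per source graph
spec = np.array([[0, 4], [1, 5]])            # stand-in for precomputed spectral edges
union = enriched_graph(orig, knn, spec, gamma_knn=0.5, gamma_spec=1.0, rng=rng)
```

Precomputing the candidate sets once and resampling them at each step keeps the per-iteration cost to a sample and a deduplication, which is one plausible reason for separating preprocessing from sampling in the algorithm.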
Code: https://github.com/rbSparky/TMLR
Assigned Action Editor: ~Chao_Chen1
Submission Number: 5319