From Edge Detection to Regulatory Logic Discovery: Residual Set Models for Exact Regulator Recovery in Gene Regulatory Networks
Abstract: Single-cell gene regulatory network (GRN) inference is typically framed as pairwise link prediction, producing ranked regulator--target edges. However, transcriptional regulation is inherently combinatorial: targets are controlled by specific regulator teams, making edge-centric evaluation misaligned with biological mechanism.
We study GRN inference as Exact Regulator Set Recovery and introduce a two-stage Filter-and-Refine pipeline.
Stage-1 retrieves a high-recall candidate pool using a target-conditioned attention retriever, while Stage-2 selects the regulator team using Residual HOS2, which learns non-additive set interactions on top of a decomposable pairwise base.
Experiments on SERGIO DS3 (de-noised; $G{=}1200$, $C{=}2700$) show that the attention retriever substantially improves Recall@80 over a dot-product baseline:
$0.649 \rightarrow 0.895$ ($R{=}2$),
$0.718 \rightarrow 0.872$ ($R{=}3$), and
$0.842 \rightarrow 1.000$ ($R{=}4$).
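As a minimal sketch of how a Recall@K figure like those above might be computed: the abstract does not spell out the definition, so this assumes Recall@K is the fraction of a target's true regulators that appear among its top-K retrieved candidates, averaged over targets. A stricter set-level coverage variant is shown alongside it. The function names and the score-matrix layout are hypothetical.

```python
import numpy as np

def recall_at_k(scores, true_sets, k):
    """Mean fraction of true regulators found in each target's top-k candidates.

    scores:    (n_targets, n_genes) array of retrieval scores (assumed layout).
    true_sets: list of sets of regulator indices, one per target.
    """
    recalls = []
    for row, truth in zip(scores, true_sets):
        if not truth:
            continue  # targets without regulators contribute nothing
        top_k = set(np.argsort(-row)[:k].tolist())
        recalls.append(len(truth & top_k) / len(truth))
    return float(np.mean(recalls))

def coverage_at_k(scores, true_sets, k):
    """Fraction of targets whose entire true regulator set lies in the top-k pool."""
    hits = []
    for row, truth in zip(scores, true_sets):
        if not truth:
            continue
        top_k = set(np.argsort(-row)[:k].tolist())
        hits.append(truth <= top_k)  # subset test: every regulator retrieved
    return float(np.mean(hits))

# Toy example: 2 targets, 4 candidate genes, regulator sets of size 2.
toy_scores = np.array([[0.9, 0.1, 0.8, 0.2],
                       [0.1, 0.9, 0.2, 0.8]])
toy_truth = [{0, 2}, {0, 1}]
toy_recall = recall_at_k(toy_scores, toy_truth, k=2)
toy_coverage = coverage_at_k(toy_scores, toy_truth, k=2)
```

The distinction matters for exact set recovery: a target can only be recovered exactly if its whole regulator set is covered, so the set-level coverage rate bounds any downstream exact-recovery rate.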
With exact subset decoding under tractable caps ($K{=}80$, $M_R{=}\{80,80,55\}$),
Residual HOS2 improves unconditional exact recovery to
$0.281 \pm 0.061 / 0.342 \pm 0.059 / 0.193 \pm 0.110$ for $R{=}2/3/4$
(mean$\pm$std; 3 seeds),
outperforming decomposable PairS2.
Gains persist when conditioned on coverage, indicating that the improvement goes beyond what the retrieval ceiling alone explains. Oracle-injection stress tests further isolate the necessity of residual high-order modeling under strong confounding.
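To illustrate what exact subset decoding under a cap could look like, here is a minimal sketch. It is not the paper's implementation: it assumes the decomposable base sums a score over regulator pairs within a set and that the residual is an arbitrary set-level function added on top. `pair_score`, `residual`, and `decode_exact` are hypothetical names. If the caps $M_R{=}\{80,80,55\}$ correspond to $R{=}2,3,4$ in order (also an assumption), the enumeration sizes are $\binom{80}{2}=3160$, $\binom{80}{3}=82160$, and $\binom{55}{4}=341055$ subsets per target.

```python
from itertools import combinations
from math import comb

def decode_exact(candidates, pair_score, residual, r, m):
    """Score every size-r subset of the first m candidates and return the best.

    candidates: regulator indices ranked by the retriever (best first).
    pair_score: callable (a, b) -> float, the assumed decomposable base term.
    residual:   callable (subset tuple) -> float, the assumed non-additive term.
    """
    pool = candidates[:m]
    best_set, best_score = None, float("-inf")
    for subset in combinations(pool, r):
        base = sum(pair_score(a, b) for a, b in combinations(subset, 2))
        total = base + residual(subset)
        if total > best_score:
            best_set, best_score = subset, total
    return best_set, best_score

# Enumeration cost per target under the assumed cap-to-R mapping.
subset_counts = {2: comb(80, 2), 3: comb(80, 3), 4: comb(55, 4)}

# Toy case: the pairwise base prefers (0, 1); a set-level residual prefers (2, 3).
toy_pairs = {(0, 1): 1.0}
toy_pair_score = lambda a, b: toy_pairs.get((a, b), 0.0)
no_residual = lambda s: 0.0
toy_residual = lambda s: 2.0 if s == (2, 3) else 0.0

pair_only = decode_exact([0, 1, 2, 3], toy_pair_score, no_residual, r=2, m=4)
with_residual = decode_exact([0, 1, 2, 3], toy_pair_score, toy_residual, r=2, m=4)
```

In the toy case the pairwise-only decoder returns `(0, 1)`, while adding the residual term flips the choice to `(2, 3)`. This is the kind of set-level effect a purely decomposable scorer cannot express.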
Submission Number: 76