From Edge Detection to Regulatory Logic Discovery: Residual Set Models for Exact Regulator Recovery in Gene Regulatory Networks
Abstract: Single-cell gene regulatory network (GRN) inference is typically framed as pairwise link prediction, producing ranked regulator--target edges. However, transcriptional regulation is inherently combinatorial: targets are controlled by specific regulator teams, leaving edge-centric evaluation misaligned with the underlying biological mechanism.
We study GRN inference as Exact Regulator Set Recovery and introduce a two-stage Filter-and-Refine pipeline.
Stage-1 retrieves a high-recall candidate pool with a target-conditioned attention retriever, and Stage-2 selects the regulator team with Residual HOS2, a set scorer that learns non-additive set interactions on top of a decomposable pairwise base.
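To make the shape of the Stage-2 scorer concrete, below is a minimal sketch of a residual set scorer of this kind, assuming learned gene embeddings of dimension $d$; the module names, bilinear base, and mean-pooling choice are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a residual set scorer in the spirit of Residual HOS2
# (illustrative only; architecture details are assumptions, not the
# authors' implementation).
import torch
import torch.nn as nn

class ResidualSetScorer(nn.Module):
    def __init__(self, d: int):
        super().__init__()
        # Decomposable pairwise base: bilinear regulator-target compatibility,
        # which sums additively over the regulators in a candidate set.
        self.W = nn.Parameter(torch.randn(d, d) * d ** -0.5)
        # Residual head: scores the pooled regulator set jointly with the
        # target, capturing non-additive (team-level) interactions.
        self.residual = nn.Sequential(
            nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, 1)
        )

    def forward(self, target: torch.Tensor, regs: torch.Tensor) -> torch.Tensor:
        # target: (d,) target embedding; regs: (R, d) candidate regulator set.
        pairwise = (regs @ self.W @ target).sum()   # additive pairwise base
        pooled = regs.mean(dim=0)                   # permutation-invariant pooling
        res = self.residual(torch.cat([pooled, target])).squeeze()
        return pairwise + res                       # base + non-additive residual

# Usage: score one candidate team of R = 3 regulators for a target.
# scorer = ResidualSetScorer(d=64)
# s = scorer(torch.randn(64), torch.randn(3, 64))
```

The design point this sketch illustrates is that the pairwise term decomposes additively over regulators, while the residual head sees the whole set at once and can therefore express team-level effects that no sum of pairwise scores can.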
Experiments on SERGIO DS3 (de-noised; $G{=}1200$ genes, $C{=}2700$ cells) show that the attention retriever substantially improves Recall@80 over a dot-product baseline:
$0.649 \rightarrow 0.895$ ($R{=}2$),
$0.718 \rightarrow 0.872$ ($R{=}3$), and
$0.842 \rightarrow 1.000$ ($R{=}4$).
With exact subset decoding under tractable caps ($K{=}80$, $M_R{=}\{80,80,55\}$),
Residual HOS2 improves unconditional exact recovery to
$0.281 \pm 0.061 / 0.342 \pm 0.059 / 0.193 \pm 0.110$ for $R{=}2/3/4$
(mean$\pm$std; 3 seeds),
outperforming decomposable PairS2.
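For reference, exact subset decoding under these caps amounts to brute-force enumeration of all $R$-subsets of the capped Stage-1 candidate pool. The sketch below assumes a scalar set scorer such as the one above; the function name and signature are hypothetical.

```python
# Illustrative sketch of exact subset decoding under tractable caps
# (the cap values mirror the abstract; the interface is an assumption).
from itertools import combinations
from typing import Callable, Sequence, Tuple

def decode_exact(scorer: Callable, target_emb, pool: Sequence,
                 R: int, M: int) -> Tuple:
    """Score every R-subset of the top-M Stage-1 candidates and return
    the argmax set.

    With the caps from the abstract this stays tractable:
    C(80, 2) = 3,160, C(80, 3) = 82,160, C(55, 4) = 341,055 evaluations.
    """
    cands = pool[:M]                       # Stage-1 pool already capped at K = 80
    best_set, best_score = None, float("-inf")
    for subset in combinations(cands, R):  # exhaustive search, hence "exact"
        s = float(scorer(target_emb, subset))
        if s > best_score:
            best_set, best_score = subset, s
    return best_set
```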
Gains persist when conditioned on coverage, indicating improvements beyond the retrieval ceiling, and oracle-injection stress tests isolate the necessity of residual high-order modeling under strong confounding.
Track: Main track
Keywords: Gene Regulatory Networks, Gene Representation Learning, Computational Genetics
TLDR: The paper introduces a two-stage pipeline that reframes gene regulatory network inference as exact regulator set recovery by using a residual set scorer to capture non-additive interactions between regulators.
AI Policy Confirmation: I confirm that this submission clearly discloses the role of AI systems and human contributors and complies with the ICLR 2026 Policies on Large Language Model Usage and the ICLR Code of Ethics.
Submission Number: 76