From Edge Detection to Regulatory Logic Discovery: Residual Set Models for Exact Regulator Recovery in Gene Regulatory Networks

Published: 02 Mar 2026, Last Modified: 08 May 2026, MLGenX 2026 Poster, CC BY 4.0
Abstract: Single-cell gene regulatory network (GRN) inference is typically framed as pairwise link prediction, producing ranked regulator--target edges. However, transcriptional regulation is inherently combinatorial: targets are controlled by specific regulator teams, making edge-centric evaluation misaligned with biological mechanism. We study GRN inference as Exact Regulator Set Recovery and introduce a two-stage Filter-and-Refine pipeline. Stage-1 retrieves a high-recall candidate pool using a target-conditioned attention retriever, while Stage-2 selects the regulator team using Residual HOS2, which learns non-additive set interactions on top of a decomposable pairwise base. Experiments on SERGIO DS3 (de-noised; $G{=}1200$, $C{=}2700$) show that the attention retriever substantially improves Recall@80 over a dot-product baseline: $0.649 \rightarrow 0.895$ ($R{=}2$), $0.718 \rightarrow 0.872$ ($R{=}3$), and $0.842 \rightarrow 1.000$ ($R{=}4$). With exact subset decoding under tractable caps ($K{=}80$, $M_R{=}\{80,80,55\}$), Residual HOS2 improves unconditional exact recovery to $0.281 \pm 0.061 / 0.342 \pm 0.059 / 0.193 \pm 0.110$ for $R{=}2/3/4$ (mean$\pm$std; 3 seeds), outperforming decomposable PairS2. Gains persist when conditioned on coverage, indicating improvements beyond retrieval ceilings, and oracle injection stress tests isolate the necessity of residual high-order modeling under strong confounding.
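As a rough illustration of what "exact subset decoding under tractable caps" can look like, the sketch below enumerates every size-$R$ subset of the top-$M_R$ retrieved candidates and returns the highest-scoring one. The cap values come from the abstract; everything else is an assumption made for illustration and is not the authors' implementation: the function names `decode_exact`, `covered`, and `exact_recovery`, the choice of a base score that sums a `pair_score` over regulator pairs, and the optional `residual` callable standing in for a learned non-additive term.

```python
from itertools import combinations
from math import comb


def decode_exact(candidates, R, M, pair_score, residual=None):
    """Exhaustively score all size-R subsets of the top-M candidates.

    candidates : list of regulator ids, already ranked by a retriever
    pair_score : hypothetical callable (a, b) -> float, the decomposable base
    residual   : hypothetical callable (subset) -> float, a non-additive term
    """
    pool = candidates[:M]

    def score(subset):
        base = sum(pair_score(a, b) for a, b in combinations(subset, 2))
        extra = residual(subset) if residual is not None else 0.0
        return base + extra

    return max(combinations(pool, R), key=score)


def covered(candidates, truth, K):
    """True if every true regulator appears in the top-K candidate pool."""
    return set(truth) <= set(candidates[:K])


def exact_recovery(pred, truth):
    """True only if the predicted set equals the true regulator set."""
    return set(pred) == set(truth)


if __name__ == "__main__":
    # Number of subsets enumerated under the caps M_R = {80, 80, 55}.
    for R, M in [(2, 80), (3, 80), (4, 55)]:
        print(f"R={R}, M={M}: {comb(M, R)} subsets")
```

Under these caps the enumeration covers 3,160 subsets for $R{=}2$, 82,160 for $R{=}3$, and 341,055 for $R{=}4$; an uncapped pool of 80 at $R{=}4$ would instead require 1,581,580. In this sketch, exact recovery can only succeed when `covered` is true, which is one way to read the distinction between unconditional and coverage-conditioned results.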
Submission Number: 76