Keywords: neural program synthesis, weight-space learning, meta-learning, permutation-equivariant graph networks, zero-shot generalization
TL;DR: We show that a structure-aware hypernetwork can directly generate full neural-network weights—treating weights as a continuous program modality—and outperform baselines with strong zero-shot generalization across unseen tasks.
Abstract: We study the neural program synthesis of $\textit{parameterized}$ function families through the lens of meta-learning with hypernetworks. Given a user intent $U$, a meta-learner $M_{\phi}$ produces a full weight set $\hat{\theta}=M_{\phi}(U)$ for a target neural network with fixed architecture $S$, and the instantiated network $m_{S,\hat{\theta}}(X)\to Y$ realizes the behavior intended for $U$. Classical hypernetworks typically $\textit{ignore the target network’s structure}$ and emit a flat list of weights; as a consequence, they fail to account for $\textit{neuron-permutation symmetry}$: many distinct parameterizations of $S$ implement the same function, so equivalent solutions are treated as different targets, fragmenting supervision and hurting out-of-distribution generalization. To address this, we propose $\textit{Meta-GNN}$, a hypernetwork that constructs a $\textit{neural graph}$ from the target architecture $S$ and applies $\textbf{structure-aware}$ message passing with parameter-tied encoders and decoders. This design reduces the search space during learning by collapsing equivalence classes of target networks, without loss of expressivity. Empirically, across modular arithmetic ($\textit{AddMod}$-$p$), array operations ($\textit{SumFirst}$-$n$), and inverse-rule tasks from 1D-ARC, $\textit{Meta-GNN}$ substantially improves learning and $\textbf{out-of-distribution generalization}$ compared to classical hypernetworks and direct $(U,X)\to Y$ baselines. Mechanistic analyses reveal $\textit{what is learned}$: on $\textit{AddMod}$-$p$ the synthesized Transformers recover the canonical clock representation and admit a compact closed-form map $U\mapsto\theta$. These results demonstrate that structure-aware Meta-GNNs enable reliable generalization to $\textit{unseen program parameterizations}$, providing a critical advance for the nascent field of neural program synthesis.
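The abstract's setup $U \mapsto \hat{\theta}=M_{\phi}(U)$, followed by evaluating $m_{S,\hat{\theta}}(X)\to Y$, can be illustrated with a minimal sketch. The code below is an assumption-laden toy (PyTorch; names such as `MetaLearner`, `SHAPES`, and `instantiate_and_run` are hypothetical, not from the paper), and it shows only the shared hypernetwork interface that both the flat baseline and Meta-GNN expose, not the structure-aware message passing over a neural graph that the paper proposes.

```python
# Minimal sketch of the hypernetwork interface U -> theta_hat -> m_{S, theta_hat}(X).
# Assumptions: PyTorch; the target architecture S is a tiny 2-layer MLP; the
# "meta-learner" here is a plain MLP emitting a flat weight vector (the classical
# baseline), NOT the paper's graph-based Meta-GNN.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Fixed target architecture S: in_dim -> hidden -> out_dim.
IN_DIM, HIDDEN, OUT_DIM, U_DIM = 8, 16, 4, 5
SHAPES = [(HIDDEN, IN_DIM), (HIDDEN,), (OUT_DIM, HIDDEN), (OUT_DIM,)]
N_PARAMS = sum(torch.Size(s).numel() for s in SHAPES)

class MetaLearner(nn.Module):
    """Hypernetwork M_phi: user intent U -> flat weight vector theta_hat."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(U_DIM, 64), nn.ReLU(), nn.Linear(64, N_PARAMS))

    def forward(self, u):
        return self.net(u)  # shape: (N_PARAMS,)

def instantiate_and_run(theta_hat, x):
    """Split theta_hat into per-layer tensors of S and run m_{S, theta_hat}(x)."""
    chunks, offset = [], 0
    for shape in SHAPES:
        n = torch.Size(shape).numel()
        chunks.append(theta_hat[offset:offset + n].view(shape))
        offset += n
    w1, b1, w2, b2 = chunks
    h = F.relu(F.linear(x, w1, b1))
    return F.linear(h, w2, b2)  # predicted Y

# Usage: synthesize a target network from an intent U, then evaluate it on inputs X.
meta = MetaLearner()
u = torch.randn(U_DIM)            # user intent, e.g. an encoding of "AddMod-7"
theta_hat = meta(u)               # full weight set for architecture S
x = torch.randn(3, IN_DIM)        # batch of task inputs X
y_hat = instantiate_and_run(theta_hat, x)
print(y_hat.shape)                # torch.Size([3, 4])
```

In this framing, the paper's contribution is how $M_{\phi}$ is parameterized: instead of the flat decoder above, Meta-GNN builds a neural graph from $S$ and decodes weights with permutation-equivariant, parameter-tied message passing, so functionally equivalent weight settings are not treated as distinct targets.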
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 23148