Designing Continuous Conditioning for GANs from WAE Latent Structure

Published: 03 Mar 2026, Last Modified: 05 Mar 2026 · ICLR 2026 DeLTa Workshop Poster · CC BY 4.0
Keywords: generative modelling, conditional GANs, embedding methods, representation learning, latent space analysis, data-efficient methods, inverse problems, scientific ML
TL;DR: WAE diagnostics reveal continuous labels often map to near-linear, monotone latent directions. Guided by this, the paper adds FiLM to R3GAN, improving label fidelity and sample quality; non-monotone labels need nonlinear encoders.
Abstract: Fast conditional generative models are used as surrogates in scientific workflows (e.g., parameter sweeps and inner-loop inference), but conditioning GANs on *continuous* scalar labels remains challenging: the label space is infinite, quantitative fidelity matters, and many conditioning mechanisms rely on conditional normalization incompatible with normalization-free backbones like R3GAN. We ask which conditioning mechanism is *structurally* appropriate for continuous labels under one-pass sampling constraints in a normalization-free GAN. Our approach uses Wasserstein autoencoders (WAEs) as a diagnostic tool: by analyzing the label-conditional aggregated posterior across eight datasets spanning scientific domains, we find that in many continuous-label settings, label variation aligns with simple, low-complexity directions in latent space well captured by feature-wise affine shifts and scales. Guided by this empirical structure and by R3GAN's constraints, we propose a lightweight FiLM-style conditioning module that maps a normalized scalar label to per-channel scale and shift parameters and injects them inside R3GAN bottleneck blocks, preserving the normalization-free design while retaining single-network-evaluation sampling at inference. Across datasets, we link the latent label structure to practical conditioning choices and evaluate not only sample quality but also conditional fidelity and generalization to missing-label intervals. We show that FiLM-based modulation improves controllability and label interpolation compared with input concatenation, and that the same diagnostic predicts when simple affine conditioning is sufficient (e.g., label-interval holdout) and when additional embedding capacity is required under non-monotone label–attribute links.
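The abstract describes the conditioning module as a small network that maps a normalized scalar label to per-channel scale and shift parameters, applied as a feature-wise affine modulation. A minimal numpy sketch of that idea is below; the function names, MLP width, and zero-initialization of the final layer are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_film_params(num_channels, hidden=16, rng=rng):
    # Tiny MLP weights: scalar label -> 2*C FiLM parameters (gamma, beta).
    W1 = rng.standard_normal((1, hidden)) * 0.1
    b1 = np.zeros(hidden)
    # Zero-initialize the final layer so the modulation starts as the identity,
    # a common (assumed here) choice for stable training.
    W2 = np.zeros((hidden, 2 * num_channels))
    b2 = np.zeros(2 * num_channels)
    return W1, b1, W2, b2

def film_modulate(x, label, params):
    # x: feature map of shape (B, C, H, W); label: normalized scalar, shape (B, 1).
    W1, b1, W2, b2 = params
    h = np.tanh(label @ W1 + b1)          # (B, hidden)
    gb = h @ W2 + b2                      # (B, 2C)
    C = x.shape[1]
    gamma = gb[:, :C][:, :, None, None]   # per-channel scale, broadcast over H, W
    beta = gb[:, C:][:, :, None, None]    # per-channel shift
    # Feature-wise affine modulation; (1 + gamma) keeps identity at init.
    return x * (1.0 + gamma) + beta

# Usage: modulate a feature map as would happen inside a generator block.
params = make_film_params(num_channels=8)
x = rng.standard_normal((4, 8, 16, 16))
label = np.full((4, 1), 0.5)
out = film_modulate(x, label, params)
```

Because the modulation requires no normalization statistics, it is compatible with a normalization-free backbone, and at inference sampling still needs only a single forward pass through the generator.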
Submission Number: 131