When Does Observational Data Teach Latent Dynamics? Understanding Control Misalignment with Synthetic Tasks

Published: 02 Mar 2026, Last Modified: 23 Mar 2026
Venue: Sci4DL 2026
Readers: Everyone
License: CC BY 4.0
Keywords: generative models, alignment, control, inverse problems, evaluation, diffusion models
TL;DR: We provide a precise mechanistic account of why, and under what conditions, Control Misalignment arises in generative pipelines. Control Misalignment is a quiet failure mode in which a model appears to fit the data marginal but fails to align with salient latent controls.
Abstract: Deep generative models are increasingly embedded in applications such as robotics, simulation, and image, video, and audio synthesis. In these settings, data likelihoods may depend on hidden "control parameters" that are not directly observed at training time (e.g., speed, energy, transition rule). Although standard loss functions do not enforce distributional alignment over such hidden variables, practitioners often assume that models generate samples whose controls follow the prior implicit in the training data. We identify Control Misalignment (CM): model generations consistently violate distributional alignment with the control prior in patterned ways across tasks and architectures, posing significant safety and fairness concerns. We first catalogue the prevalence of CM in real-world vignettes: distribution shifts of movement speed in D4RL motion planning, of total energy in double-pendulum physical simulation, and of speaking rate in Tacotron2 speech synthesis. Next, we probe when and why such drifts emerge, testing for confounds on carefully constructed synthetic tasks with known controls and tunable chaoticity. Through this, we characterize the mechanism behind CM: error signatures in data space are transported through ill-conditioned or ambiguous recovery procedures into coherent control-space malapportionment. We verify this mechanistic interpretation by constructing a minimal toy system that reproduces the defining characteristics of CM. Finally, we use these findings to propose mitigations, analyzing which are theoretically sound or empirically effective and which are not. Overall, our work provides a precise mechanistic account of why, and under what conditions, Control Misalignment arises in generative pipelines.
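The mechanism named in the abstract (small data-space errors transported through an ill-conditioned recovery map into a coherent control-space shift) can be illustrated with a minimal sketch. This is not the toy system from the paper: the forward map `tanh`, the uniform control prior, the noise level `sigma`, and the function name `toy_control_shift` are all assumptions chosen only for illustration.

```python
import numpy as np

def toy_control_shift(n=100_000, sigma=0.01, seed=0):
    """Hypothetical illustration, not the paper's toy system.

    A hidden control c produces an observation x = tanh(c). The inverse map
    arctanh is ill-conditioned near x = 1, so a small, nearly unbiased error
    in data space becomes a systematic shift in the recovered control.
    """
    rng = np.random.default_rng(seed)

    # Assumed control prior: uniform over [0.5, 2.5].
    c = rng.uniform(0.5, 2.5, size=n)

    # Forward map from control to observation (assumed for this sketch).
    x = np.tanh(c)

    # "Generated" data: true observations plus small zero-mean noise,
    # clipped so that the inverse map stays finite.
    x_gen = np.clip(x + rng.normal(0.0, sigma, size=n), -1 + 1e-6, 1 - 1e-6)

    # Recover the control from the generated data via the inverse map.
    c_hat = np.arctanh(x_gen)

    return {
        "data_shift": float(np.mean(x_gen - x)),      # mean error in data space
        "control_shift": float(np.mean(c_hat - c)),   # mean error in control space
    }

if __name__ == "__main__":
    print(toy_control_shift())
```

In this sketch the data-space mean barely moves, while the recovered controls are pushed upward because `arctanh` is convex and steep near the saturated end of the range. The same qualitative pattern is what the abstract calls control-space malapportionment, although the paper's own construction may differ.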
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Style Files: I have used the style files.
Submission Number: 39