Keywords: Treatment effect estimation, Irrelevant variables, Variational autoencoder, Disentangled representations, Sparsity, Causal inference
TL;DR: A VAE-based method that learns compact, disentangled representations for accurate treatment effect estimation.
Abstract: Treatment effect estimation from imbalanced observational data is challenging, requiring balanced latent representations to reduce selection bias and enable accurate causal estimates. Many state-of-the-art methods employ VAEs with predetermined latent dimensionality, but this often causes under- or overfitting, as too little relevant or too much irrelevant information is encoded. As cross-validating latent dimensionality is impractical for complex models and high-dimensional data, automatic determination is needed. We address this by learning sparsity-inducing masks that sub-select dimensions for each task, using a differentiable $L_0$ objective to penalize active dimensions and a mutual exclusivity regularizer to prevent overlap, ensuring independent and disentangled representations. The conflicting goals of accuracy and sparsity are balanced via Generalized ELBO with Constrained Optimization (GECO), which optimizes sparsity only once prediction quality exceeds a threshold. Our method thus infers task-relevant latent factors, yields compact representations, and isolates irrelevant variables in challenging high-dimensional data. Experiments on real-world and synthetic datasets demonstrate improved predictive accuracy, compactness, and disentanglement compared to state-of-the-art baselines.
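The two mechanisms named in the abstract, a differentiable $L_0$ penalty on active dimensions and a GECO-style constrained update, can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes the common hard-concrete relaxation for differentiable $L_0$ gating, and the constants `BETA`, `GAMMA`, `ZETA`, the tolerance `kappa`, and all function names are hypothetical choices for illustration.

```python
import numpy as np

# Hedged sketch: hard-concrete gates for a differentiable L0 penalty,
# plus a GECO-style Lagrange-multiplier update. Constants and names are
# illustrative assumptions, not the paper's exact hyperparameters.
BETA, GAMMA, ZETA = 2.0 / 3.0, -0.1, 1.1  # common hard-concrete constants

def hard_concrete_gate(log_alpha, rng):
    """Sample a gate in [0, 1] per latent dimension (exact zeros possible)."""
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=log_alpha.shape)
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1.0 - u) + log_alpha) / BETA))
    s_bar = s * (ZETA - GAMMA) + GAMMA      # stretch beyond [0, 1]
    return np.clip(s_bar, 0.0, 1.0)         # hard clip -> sparse mask

def expected_l0(log_alpha):
    """Differentiable expected number of active dimensions (L0 surrogate)."""
    return 1.0 / (1.0 + np.exp(-(log_alpha - BETA * np.log(-GAMMA / ZETA))))

def geco_update(lmbda, recon_loss, kappa, step=0.1):
    """Grow lambda while prediction loss exceeds the tolerance kappa, so
    sparsity is only pursued once prediction quality is good enough."""
    return lmbda * np.exp(step * (recon_loss - kappa))

rng = np.random.default_rng(0)
log_alpha = np.array([3.0, -3.0, 0.0])      # per-dimension gate logits
z = hard_concrete_gate(log_alpha, rng)      # sampled sparsity mask
penalty = expected_l0(log_alpha).sum()      # differentiable L0 penalty
lmbda = geco_update(1.0, recon_loss=0.8, kappa=0.5)  # constraint violated
```

In a full model, `z` would multiply the latent code per task, and the masks of different tasks would additionally be pushed apart by the mutual exclusivity regularizer described above.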
Supplementary Material: zip
Primary Area: causal reasoning
Submission Number: 17925