Reduced-Rank Outcome Compression for Causal Policy Optimization

TMLR Paper6769 Authors

02 Dec 2025 (modified: 09 Jun 2026) · Decision pending for TMLR · CC BY 4.0
Abstract: Evaluating the causal impacts of possible interventions is crucial for informing decision-making, especially towards improving access to opportunity. If causal effects are heterogeneous and predictable from covariates, then personalized treatment decisions can improve individual outcomes and contribute to both efficiency and equity. In practice, however, causal researchers do not have a single outcome in mind a priori and often collect multiple outcomes of interest that are noisy estimates of the true target of interest. For example, in government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty. The ultimate goal is to learn an optimal treatment policy that in some sense maximizes multiple outcomes simultaneously. To address such issues, we present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning with multiple objectives. We learn a low-dimensional representation of the true outcome from the observed outcomes using reduced rank regression. We develop a suite of estimates that use the model to denoise observed outcomes, including commonly-used index weightings. These methods improve estimation error in policy evaluation and optimization, including on a case study of real-world cash transfer and social intervention data. Reducing the variance of noisy social outcomes can improve the performance of algorithmic allocations.
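The denoising idea in the abstract can be illustrated with a minimal sketch of classical reduced rank regression. This is not the paper's estimator: it assumes identity weighting, centered data with no intercept, and a known rank, and the function name `reduced_rank_denoise` and the simulated data are hypothetical. It fits ordinary least squares and then projects the coefficients onto the leading right singular vectors of the fitted outcomes.

```python
import numpy as np


def reduced_rank_denoise(X, Y, rank):
    """Classical reduced rank regression (identity weighting, illustrative only).

    Returns the denoised (fitted) outcomes and the rank-constrained coefficients.
    """
    # Step 1: unconstrained least-squares fit of all outcomes on the covariates.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_fit = X @ B_ols
    # Step 2: leading right singular vectors of the fitted outcomes.
    _, _, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    V_r = Vt[:rank].T
    # Step 3: project the coefficients onto the rank-`rank` outcome subspace.
    B_rr = B_ols @ V_r @ V_r.T
    return X @ B_rr, B_rr


# Simulated example (assumption): 8 noisy outcomes driven by a rank-2 signal.
rng = np.random.default_rng(0)
n, p, k, r = 500, 5, 8, 2
X = rng.normal(size=(n, p))
B_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, k))
signal = X @ B_true
Y = signal + rng.normal(size=(n, k))

Y_hat, B_rr = reduced_rank_denoise(X, Y, rank=r)
mse_raw = np.mean((Y - signal) ** 2)
mse_denoised = np.mean((Y_hat - signal) ** 2)
print(f"raw outcome MSE: {mse_raw:.3f}, denoised MSE: {mse_denoised:.3f}")
```

In this simulation the denoised outcomes sit closer to the noiseless signal than the raw outcomes do, which is the variance reduction the abstract refers to. In practice the rank would be chosen from data, for example by cross-validation.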
Submission Type: Long submission (more than 12 pages of main content)
Changes Since Last Submission: We greatly appreciate the review team's efforts. We have uploaded a revision that addresses the above reviewer comments. To facilitate comparison, relevant edits are highlighted in purple. We are happy to address the reviewer's remaining comments and have made the following minor adjustments to the presentation.
1. We have clarified the distinction between the RR-CV-$Y$ estimator in eq. 11 and the RR-CV-$\hat Y$ estimator in eq. 14, which uses the imputed outcomes, and we have stated more clearly the claim that RR-CV-$Y$ shares this relationship with standard DR guarantees, while RR-CV-$\hat Y$ does not. These adjustments are on pp. 8-9 (with minor adjustments elsewhere). Overall, we have adjusted our language to present both estimators: RR-CV-$Y$ enjoys stronger theoretical guarantees, while our experiments indicate that RR-CV-$\hat Y$ shrinks more aggressively and does well in tailored experiments.
2. Yes, the statement is a comparison of the MSE (bias^2 + variance) of IPW vs. denoised IPW. Although the original text indicated the MSE comparison inline, we have added a line giving the bias^2 + variance decomposition, which makes clearer which terms correspond to which. Indeed, the final line separates the prediction bias terms on the left-hand side from the outcome noise variance terms on the right-hand side, and these are different: whether the inequality holds ultimately depends on the data-generating process, so it is not uniformly valid. However, this explicitly highlights what needs to be established for denoised IPW to improve on IPW.
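For readers without the manuscript at hand, the generic form of the comparison described in item 2 can be written as follows. The notation is hypothetical and not taken from the paper: $\hat\theta_{\mathrm{IPW}}$ and $\hat\theta_{\mathrm{dn}}$ stand for the standard and denoised IPW estimates of a policy value $\theta$, and the sketch assumes $\hat\theta_{\mathrm{IPW}}$ is unbiased (for example, with known propensities).

```latex
\begin{align*}
\mathrm{MSE}(\hat\theta) &= \mathrm{Bias}(\hat\theta)^2 + \mathrm{Var}(\hat\theta), \\
\mathrm{MSE}(\hat\theta_{\mathrm{dn}}) < \mathrm{MSE}(\hat\theta_{\mathrm{IPW}})
&\iff \mathrm{Bias}(\hat\theta_{\mathrm{dn}})^2
 < \mathrm{Var}(\hat\theta_{\mathrm{IPW}}) - \mathrm{Var}(\hat\theta_{\mathrm{dn}}).
\end{align*}
```

The left-hand side collects the bias introduced by denoising and the right-hand side the variance it removes, so whether the inequality holds depends on the data-generating process.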
Assigned Action Editor: ~Bryon_Aragam1
Submission Number: 6769