A Free Lunch with Influence Functions? Improving Neural Network Estimates of the Average Treatment Effect with Concepts from Semiparametric Statistics
Abstract: Parameter estimation in empirical fields is usually undertaken using parametric models, and such models readily facilitate statistical inference. Unfortunately, they are unlikely to be sufficiently flexible to adequately model real-world phenomena, and may yield biased estimates. Conversely, non-parametric approaches are flexible but do not readily facilitate statistical inference and may still exhibit residual bias. Using causal inference (specifically, Average Treatment Effect estimation) as an application domain example, we explore the potential for Influence Functions (IFs) to (a) improve initial estimators without needing more data, (b) increase model robustness, and (c) facilitate statistical inference. We begin with a broad, tutorial-style introduction to IFs and causal inference, before proposing a neural network method 'MultiNet', which seeks the diversity of an ensemble using a single architecture. We also introduce variants on the IF update step which we call 'MultiStep', and provide a comprehensive evaluation of different approaches. The improvements are found to be dataset dependent, indicating an interaction between the methods used and the nature of the data generating process. Our experiments highlight the need for practitioners to check the consistency of their findings, potentially by undertaking multiple analyses with different combinations of estimators. This finding is especially relevant to practitioners working in the domain of causal inference, where the ground truth against which one could assess the performance of one's estimators is not available.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Many thanks to all reviewers for their continued engagement and feedback - it is very much appreciated. We have made substantial additions/changes to the manuscript in response to their constructive criticisms and suggestions, but of course remain open to any further suggestions and comments that the reviewers are willing to share.
The principal changes (most of which are highlighted in blue in the updated pdf) can be summarised as follows:
- New 'meta-analysis' of the results using the Shapley value approach to identify possible interactions between method choices (this comes in response to reviewer **DiMQ** who correctly noted that the marginalized results will make it difficult to identify interactions, as well as reviewer **eFLE** who made a good point regarding looking for patterns of performance).
- Additional results in appendix, including a subset of the full-factorial analysis (this comes in response to reviewer **DiMQ**, for the same reasons as above)
- Incorporation of causal inference topic to title and abstract (in response to reviewer **eFLE**)
- Softening of conclusions and notion of 'free lunch' throughout (in response to reviewers **eFLE** and **DiMQ**)
- Explanation for the assumption of normality at the end of the introduction to section 3.2 (in response to reviewer **eFLE**)
- Clarification of and motivation from the task of statistical inference / null hypothesis significance testing in section 3.2.4 (in response to reviewer **eFLE**)
- Additional broader impact section at the end (in response to reviewer **GzoT**)
- Minor clarifications / corrections / additional references (in response to minor comments / points from **all reviewers**)
- Description of the ideal 'quantile-probability' curve/result in the introduction to section 7.2 to aid interpretation of these results (in response to reviewer **DiMQ**)
- Discussion of other ensemble methods in Section 2 (in response to reviewer **DiMQ**)
- Discussion of our results in the context of those from Farrell et al. (2021) in Section 8 (in response to reviewer **DiMQ**)
- Clarification of the use of MSE in 6.3 (in response to reviewer **GzoT**)
- Clarification of the choice of candidate estimators in 6.3.1 (in response to reviewer **GzoT**)
- Correction of subscripts in Eq. 16 (in response to reviewer **GzoT**)
- Moved Figure 3 earlier to page 13, and referenced it in the last line of the introduction to section 4 (in response to reviewer **GzoT**)
- Stronger motivation for MultiNet in section 5 (in response to **all reviewers**)
- Clearer explanation of the submodel approach in section 4.1 (in response to reviewer **eFLE**)
- Clearer explanation of the relationship between MultiStep and the influence function objective in section 4.2 (in response to reviewer **eFLE**)
- Use of 'bias' vs. 'consistency' checked and improved throughout (in response to reviewer **eFLE**). Note that 'bias' is necessary terminology because we attempt to correct for bias in our finite samples, and this also affords us consistent estimators. The term 'bias' is sometimes overloaded, but we have defined consistency, e.g. at the end of page 14.
- We have checked the contributions summary of the paper in section 1 to ensure that all sections are motivated and 'advertised' up-front. To this end, we decided to keep the current structure, and have tried to make it clear that this paper provides (1) tutorial-style introductions to causal inference and influence functions, (2) two new proposals (MultiNet and MultiStep) motivated by these two topics and their requirements, and (3) an extensive evaluation of the methods, including Shapley value approaches, the quantile-probability summaries, and the raw results over a subset of the full-factorial design.
P.S. We noticed a small problem with the Shapley value code which we have fixed, and we also corrected some typos. The results are approximately the same. We therefore upload a second version accordingly (which is v3).
Assigned Action Editor: ~Pierre_Alquier1
Submission Number: 180