Abstract: We introduce a multimodal deep learning framework, Prescriptive Neural Networks (PNNs), that combines ideas from optimization and machine learning and is, to the best of our knowledge, the first prescriptive method to handle multimodal data. The PNN is a feedforward neural network trained on embeddings to output an outcome-optimizing prescription. On two real-world multimodal datasets, we demonstrate that PNNs prescribe treatments that greatly improve estimated outcomes: in transcatheter aortic valve replacement (TAVR) procedures, they reduce estimated postoperative complication rates by over 40\%, and in liver trauma injuries, they reduce estimated mortality rates by 25\%. On four real-world, unimodal tabular datasets, we demonstrate that PNNs outperform or perform comparably to other well-known, state-of-the-art prescriptive models. Importantly, on tabular datasets we also recover interpretability through knowledge distillation, fitting interpretable Optimal Classification Tree models to the PNN prescriptions as classification targets, which is critical for many real-world applications. Finally, we demonstrate that our multimodal PNN models achieve stability across randomized data splits comparable to that of other prescriptive methods and produce realistic prescriptions across the different datasets.
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=tPdIg0CK6B
Changes Since Last Submission: Dear TMLR and reviewers,
We are resubmitting our paper, Multimodal Deep Learning, which we previously submitted to TMLR. The previous submission went through rounds of revision with reviewer comments, to which we responded with appropriate edits, and ultimately received a rejection from the editor with a strong recommendation to resubmit.
In this resubmission, we have implemented changes based on the previous comments; our responses to those comments are inline below.
1. Reviewer 7LhU notes that the best-performing methods lack the statistical significance to be claimed as such (e.g., 1.83±0.23 is not significantly better than 1.73±0.21), a methodological issue that had been raised and was not yet resolved. Response: To the best of our knowledge, our paper is the first to propose a prescriptive approach on unstructured data that addresses all treatment scenarios (binary treatment, discrete multiple treatments, single continuous treatment, and multiple continuous treatments). On structured data, our claim is that our approach is comparable to existing state-of-the-art methods (Diabetes, Spleen, and REBOA datasets) or better with statistical significance (Groceries dataset). This claim is supported by our results in Section 3.6 and Appendix Section A.1.2, where we observe additional statistical significance of our method across different split ratios; for example, on the 60%-40% and 70%-30% splits, PNNs perform best with statistical significance on the Spleen dataset. The significance check we apply is sketched below.
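As a minimal sketch of one such check, assuming Welch's t-test over per-split scores (the score arrays here are hypothetical placeholders, not our reported results):

```python
# Minimal sketch of a significance check across randomized splits.
# The score arrays are hypothetical placeholders, not our reported results.
import numpy as np
from scipy import stats

pnn_scores = np.array([1.85, 1.79, 1.88, 1.81, 1.83])       # per-split scores, method A
baseline_scores = np.array([1.70, 1.76, 1.72, 1.74, 1.73])  # per-split scores, method B

# Welch's t-test: no equal-variance assumption between the two methods.
t_stat, p_value = stats.ttest_ind(pnn_scores, baseline_scores, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

# We only claim a method is "best with statistical significance" when
# p < 0.05; otherwise we report the methods as comparable.
```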
2. Reviewer 7LhU also points out that 50%-50% splits are uncommon; if the lack of data justifies such a dramatic split, the authors should at least perform bootstrapping to provide a more robust estimate (what if performance is very sensitive to the choice of split?), which is not currently captured anywhere. Response: 50%-50% splits are typical in the previous literature on prescriptive models ([1], [2]). We justify this split in Section 3.1, where we explain that a larger test set is needed for high-quality reward estimation and hence reliable performance reporting. For completeness, we have rerun all experiments with 60%-40%, 70%-30%, and 80%-20% splits and included the results in Appendix Sections A.2 and A.3; we note that, because these split ratios leave smaller test sets for reward estimation, the results should be interpreted cautiously. A sketch of such a bootstrap robustness check is given below.
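The following is a minimal sketch of the bootstrap estimate the reviewer suggests; `test_rewards` is a synthetic placeholder standing in for per-sample estimated rewards of the prescribed treatments on a held-out test set:

```python
# Sketch of a bootstrap estimate of test-set performance.
# `test_rewards` is placeholder data, not one of our datasets.
import numpy as np

rng = np.random.default_rng(0)
test_rewards = rng.normal(loc=1.8, scale=0.5, size=500)  # placeholder rewards

n_boot = 1000
boot_means = np.array([
    rng.choice(test_rewards, size=test_rewards.size, replace=True).mean()
    for _ in range(n_boot)
])

# Percentile confidence interval for the mean estimated reward.
lo, hi = np.quantile(boot_means, [0.025, 0.975])
print(f"mean = {test_rewards.mean():.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```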
3. Reviewer NX5V points out initial hesitations around the causal foundations and their proper exploration in the manuscript; although some viable modifications were proposed, the paper still required major revisions before acceptance. Response: We have made major revisions to address this comment. First and foremost, we now explicitly state all assumptions used (e.g., ignorability) and have taken steps to clean the datasets to ensure positivity: we trim points with extreme propensity scores to ensure adequate overlap between treatment groups, and we clip the remaining propensity scores where necessary before applying doubly robust estimation (a sketch of these steps is given below). We believe these modifications strengthen the positioning of the paper and offer a clear justification of our methods using fundamental concepts from the causal inference literature (positivity, doubly robust estimators).
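Below is a minimal, binary-treatment sketch of these steps; `X`, `T`, `Y`, the fitted models, and the trimming/clipping thresholds are illustrative placeholders rather than the exact choices in the paper:

```python
# Sketch of propensity trimming/clipping and a doubly robust (AIPW)
# reward estimate for a binary treatment. Data and thresholds are
# illustrative placeholders; cross-fitting is omitted for brevity.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
T = rng.binomial(1, 0.5, size=1000)                 # observed treatments
Y = X[:, 0] + T * X[:, 1] + rng.normal(size=1000)   # observed outcomes

# 1) Estimate propensity scores e(x) = P(T=1 | x).
e_hat = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]

# 2) Trim points with extreme propensities to enforce overlap (positivity) ...
keep = (e_hat > 0.05) & (e_hat < 0.95)
X, T, Y, e_hat = X[keep], T[keep], Y[keep], e_hat[keep]

# ... and clip what remains to stabilize the inverse-propensity weights.
e_hat = np.clip(e_hat, 0.05, 0.95)

# 3) Outcome models mu_t(x) for each treatment arm.
mu1_hat = RandomForestRegressor(random_state=0).fit(X[T == 1], Y[T == 1]).predict(X)
mu0_hat = RandomForestRegressor(random_state=0).fit(X[T == 0], Y[T == 0]).predict(X)

# 4) Doubly robust (AIPW) scores per arm, and the implied ATE estimate.
dr1 = mu1_hat + T * (Y - mu1_hat) / e_hat
dr0 = mu0_hat + (1 - T) * (Y - mu0_hat) / (1 - e_hat)
print(f"DR estimate of the ATE: {(dr1 - dr0).mean():.3f}")
```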
We have also experimented with training our models using different methods for treatment effect estimation, including TARNet [3] and Dragonnet [4] for unstructured datasets and Causal Forests [5] for structured datasets. These experiments demonstrate the versatility of our approach: the improvements and observations are consistent across the different training-set rewards. The results of these experiments are presented in Appendix Section A.2, and an illustrative sketch of one such alternative reward estimator is given below.
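As the paper does not tie the method to a particular implementation, the sketch below assumes the open-source `econml` package's causal forest, with placeholder data; the per-sample effect estimates it produces are the kind of quantity usable as training-set rewards:

```python
# Sketch: heterogeneous treatment effect estimation with a causal forest
# (econml's CausalForestDML) as an alternative reward estimator.
# X, T, Y are synthetic placeholders, not our datasets.
import numpy as np
from econml.dml import CausalForestDML

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
T = rng.binomial(1, 0.5, size=1000)
Y = X[:, 0] + T * (1 + X[:, 1]) + rng.normal(size=1000)

est = CausalForestDML(discrete_treatment=True, random_state=0)
est.fit(Y, T, X=X)

# Per-sample CATE estimates, usable as training-set rewards.
cate = est.effect(X)
print(cate[:5])
```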
[1] Maxime Amram, Jack Dunn, and Ying Daisy Zhuo. Optimal policy trees. Machine Learning, 111:2741–2768, 2022.
[2] Dimitris Bertsimas, Jack Dunn, and Nishanth Mundru. Optimal prescriptive trees. INFORMS Journal on Optimization, 1(2):164–183, 2019. URL https://jack.dunn.nz/papers/OptimalPrescriptiveTrees.pdf.
[3] Uri Shalit, Fredrik D. Johansson, and David Sontag. Estimating individual treatment effect: generalization bounds and algorithms, 2017. URL https://arxiv.org/abs/1606.03976.
[4] Claudia Shi, David M. Blei, and Victor Veitch. Adapting neural networks for the estimation of treatment effects, 2019. URL https://arxiv.org/abs/1906.02120.
[5] Stefan Wager and Susan Athey. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523):1228–1242, 2018.
Assigned Action Editor: ~Devendra_Singh_Dhami1
Submission Number: 5612