We have added a new section (Section 4.2.2) that includes the additional empirical comparisons requested by the reviewers. We have also added some clarifications and fixed several grammatical and typographical errors.
Second revision: we re-ran our experiments, this time tuning the hyperparameters of each non-BART-based method. We have updated the relevant figures in Section 4.2 and the tables in the Appendix. We also added a table showing the grid of hyperparameter values considered for each method.
Third revision: corrected an issue with pre-processing categorical predictors for competing oblique ensemble methods. Re-ran experiments and updated the empirical results and exposition.
Final revision/camera-ready upload: we have fixed the identified typos, updated the caption of Figure 4, addressed the capitalization in Table A2, and added a de-anonymized link to our code. Added the OpenReview link.