Semi-Supervised Learning of Tree-Based Models Using Uncertain Interpretation of Data

Jack Henry Good; Shyla Bisht; Kyle Miller; Artur Dubrawski

24 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission

Keywords: semi-supervised learning, decision tree, tree ensemble, random forest

Abstract: Semi-supervised learning (SSL) learns an estimator from labeled and unlabeled data. While diverse methods based on various assumptions have been developed for parametric models, SSL for tree-based models is largely limited to variants of self-training, for which decision trees are not well-suited. We introduce an intrinsic semi-supervised learning algorithm that achieves state-of-the-art performance for tree-based models. The algorithm first grows a tree to minimize a semi-supervised notion of impurity, then assigns leaf values using a leaf similarity graph to optimize either for smoothness or adversarial robustness of the estimator near the data. Our methods can be viewed as natural extensions of conventional tree induction methods emerging from an uncertain interpretation of model input, or alternatively as inductive tree-based approximations of well-established graph-based SSL algorithms.
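The second step described in the abstract, assigning leaf values over a leaf similarity graph so that the estimator is smooth, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the objective or how the graph is built. The sketch assumes a symmetric leaf similarity matrix is already given and uses a standard graph-regularized least-squares objective, as in classical graph-based SSL. The function name `smooth_leaf_values` and the parameters `weights`, `counts`, `means`, and `lam` are all hypothetical.

```python
def smooth_leaf_values(weights, counts, means, lam=1.0, sweeps=500):
    """Assign one value per leaf by Gauss-Seidel minimisation of

        sum_i counts[i] * (f[i] - means[i])**2
        + lam * sum_{i<j} weights[i][j] * (f[i] - f[j])**2

    weights: symmetric leaf-similarity matrix (assumed given, list of lists)
    counts:  number of labeled samples falling in each leaf
    means:   mean label of the labeled samples in each leaf (ignored if count is 0)
    lam:     strength of the smoothness penalty (assumed hyperparameter)
    """
    n = len(counts)
    # Start labeled leaves at their label mean and unlabeled leaves at zero.
    f = [means[i] if counts[i] > 0 else 0.0 for i in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            # Similarity-weighted sum of neighbouring leaf values.
            pull = sum(weights[i][j] * f[j] for j in range(n) if j != i)
            degree = sum(weights[i][j] for j in range(n) if j != i)
            denom = counts[i] + lam * degree
            if denom > 0:
                # Coordinate-wise minimiser of the objective above.
                f[i] = (counts[i] * means[i] + lam * pull) / denom
    return f


# Three leaves in a chain; only the two end leaves contain labeled data.
weights = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
counts = [1, 0, 1]
means = [0.0, 0.0, 1.0]
values = smooth_leaf_values(weights, counts, means)
print(values)  # approximately [0.25, 0.5, 0.75]
```

In this toy case the unlabeled middle leaf takes the average of its two neighbours, while the labeled leaves are pulled partway toward it. The iteration is only well-behaved when every connected component of the graph contains at least one labeled leaf.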

Primary Area: general machine learning (i.e., none of the above)

Submission Number: 8579
