Building Simple Models: A Case Study with Decision Trees

David D. Jensen, Tim Oates, Paul R. Cohen

1997 (modified: 21 Dec 2021) | IDA 1997 | Readers: Everyone

Abstract: Building correctly-sized models is a central challenge for induction algorithms. Many approaches to decision tree induction fail this challenge. Under a broad range of circumstances, these approaches exhibit a nearly linear relationship between training set size and tree size, even after accuracy has ceased to increase. These algorithms fail to adjust for the statistical effects of comparing multiple subtrees. Adjusting for these effects produces trees with little or no excess structure.
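
The multiple-comparison effect mentioned in the abstract can be illustrated with a small sketch. It assumes a standard Bonferroni correction and independent comparisons. The abstract does not say which adjustment the authors apply, so this is not necessarily their procedure. The function names (`familywise_error_rate`, `bonferroni_adjust`, `keep_subtree`) are hypothetical and chosen only for illustration.

```python
def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Chance that at least one of n independent, noise-only comparisons
    looks significant at level alpha (independence is an assumption)."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Bonferroni upper bound on the p-value of the best of n candidates."""
    return min(1.0, p_value * n_comparisons)


def keep_subtree(p_value: float, n_comparisons: int, alpha: float = 0.05) -> bool:
    """Keep a subtree only if its adjusted p-value is still at or below alpha."""
    return bonferroni_adjust(p_value, n_comparisons) <= alpha


if __name__ == "__main__":
    # With 20 candidates, pure noise passes an unadjusted 0.05 test most of the time.
    print(round(familywise_error_rate(0.05, 20), 3))  # 0.642
    # The same raw p-value survives alone but not as the best of 20 candidates.
    print(keep_subtree(0.01, 1))   # True
    print(keep_subtree(0.01, 20))  # False
```

Under these assumptions, a subtree selected as the best of many candidates needs much stronger evidence to be kept. This is one way an adjustment of this kind could remove structure that an unadjusted test would retain.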
