Abstract: Although the representation of decision trees is theoretically fully expressive, it has been observed that traditional decision trees suffer from the replication problem. This problem makes decision trees large and learnable only when sufficient training data are available. In this paper, we present a new representation model, conditional independence trees (CITrees), that tackles the replication problem from a probability perspective. We propose a novel algorithm for learning CITrees. Our experiments show that CITrees significantly outperform naive Bayes (Langley, Iba, & Thomas 1992), C4.5 (Quinlan 1993), TAN (Friedman, Geiger, & Goldszmidt 1997), and AODE (Webb, Boughton, & Wang 2005) in classification accuracy.