Abstract: The application of information entropy to decision tree algorithms has been shown to produce highly accurate classifiers. Information entropy is used to minimize the average length of the paths from each non-leaf node to its descendant leaf nodes in the decision tree. It therefore works well on data sets that cover all of the underlying rules, but it lacks predictive ability when the training data set does not cover them all. In this paper, we propose a novel indicator, information energy, for generating decision trees. Information energy describes the distance from the current state of a data set to its balanced state. Proper attribute selection can split a data set into a state of higher information energy and produce classification rules with predictive ability. A generator of random sample sets and rules is designed to provide synthetic samples for experimental verification. Experimental results show that information energy outperforms information entropy in both speed and accuracy when the training data set does not cover all of the underlying rules.
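To make the baseline concrete, the following is a minimal sketch of the standard entropy-based attribute selection (ID3-style information gain) that the abstract contrasts against; it does not implement the proposed information-energy indicator, whose definition is given in the body of the paper, and the toy data and attribute names are purely illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H = -sum(p_i * log2(p_i)) over class frequencies."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Entropy reduction obtained by partitioning rows on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    weighted = sum(len(part) / len(labels) * entropy(part)
                   for part in partitions.values())
    return entropy(labels) - weighted

# Hypothetical toy data set: attributes (outlook, windy) -> class label.
rows = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"), ("rain", "yes")]
labels = ["yes", "yes", "yes", "no"]
for i, name in enumerate(("outlook", "windy")):
    print(name, information_gain(rows, labels, i))
```

An entropy-driven tree builder greedily picks the attribute with the highest gain at each node, which is what yields the shortest average root-to-leaf paths mentioned above.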