Abstract: Random subspace decision forests are widely used machine learning methods across many application domains. Choosing the random subspace dimensionality $d_s$ in a decision forest affects both classification quality and efficiency, especially for high-dimensional data. To obtain effective and efficient decision forests that suit a broad range of classification tasks, this paper proposes a novel framework named Efficient Random Subspace decision forest (ERS). ERS is built on a Half-Range Discrete Uniform distribution-based Varied Dimensionality setting (HRDUVD) method for determining the random subspace dimensionality. Specifically, for each tree in the forest, the number of randomly selected features is drawn from a simple discrete uniform distribution over a specific range. The HRDUVD method removes the need to decide in advance which $d_s$ value is appropriate for a given dataset, while still achieving adequate classification performance with a relatively short running time. Setting $d_s$ with the discrete uniform distribution is therefore a highly useful strategy for the proposed ERS.
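The per-tree dimensionality draw described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the bounds of the "half range", so the default interval $[1, \lfloor d/2 \rfloor]$ used here is an assumption, and the function name `sample_subspaces` and its parameters are hypothetical.

```python
import random


def sample_subspaces(n_features, n_trees, low=None, high=None, seed=0):
    """Draw one random feature subspace per tree.

    For each tree, the subspace size d_s is drawn from a discrete uniform
    distribution on [low, high], then d_s distinct feature indices are
    sampled without replacement.
    """
    rng = random.Random(seed)
    # ASSUMPTION: "half range" is read here as [1, n_features // 2].
    # The abstract does not give the actual bounds; pass low/high to override.
    low = 1 if low is None else low
    high = max(low, n_features // 2) if high is None else high
    if not (1 <= low <= high <= n_features):
        raise ValueError("need 1 <= low <= high <= n_features")

    subspaces = []
    for _ in range(n_trees):
        d_s = rng.randint(low, high)  # inclusive on both ends
        features = sorted(rng.sample(range(n_features), d_s))
        subspaces.append(features)
    return subspaces


# Example: 5 trees over a 20-feature dataset; each tree gets its own d_s.
subspaces = sample_subspaces(n_features=20, n_trees=5)
for tree_id, features in enumerate(subspaces):
    print(tree_id, len(features), features)
```

Each returned index list could then be used to restrict the columns seen by one tree, so that no single $d_s$ has to be preset for the whole forest.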