Coupling learning for feature selection in categorical data

Feng Wang, Jiye Liang, Peng Song

Published: 01 Jan 2023, Last Modified: 25 Jul 2025, Int. J. Mach. Learn. Cybern. 2023, CC BY-SA 4.0

Abstract: Feature selection, a commonly used data preprocessing technique, aims to improve model performance and efficiency by removing redundant or irrelevant features. However, traditional feature selection approaches implicitly assume that data are independent and identically distributed (IID). To capture more complex and significant information, an effective feature selection method should consider the couplings (non-IIDness) within feature values and the relevance between features. Hence, drawing on rough set theory, this paper first introduces a new coupled similarity measure to discover value-to-feature-to-class coupling information, which can be used to compute object neighbors and update feature weights. Second, using mutual information, a new coupled relevance measure is defined to capture feature-to-feature coupling relationships. On this basis, an effective feature selection algorithm based on coupling learning is developed for categorical data. To evaluate the proposed algorithm, four common classifiers and 12 UCI data sets are employed in the experiments. The experimental results confirm the feasibility and effectiveness of the new algorithm.
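
The abstract builds its feature-to-feature relevance measure on mutual information. As a rough illustration only, the sketch below computes the plain empirical mutual information between two categorical features. It is not the paper's coupled relevance measure, whose definition is not given here. The function name `mutual_information` and the toy features `color`, `shape`, and `size` are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information I(X;Y), in bits, of two categorical sequences.

    This is standard mutual information, NOT the paper's coupled relevance measure.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n == 0:
        return 0.0
    count_x = Counter(x)           # marginal counts of X values
    count_y = Counter(y)           # marginal counts of Y values
    count_xy = Counter(zip(x, y))  # joint counts of (X, Y) value pairs
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a, b) * log2(p(a, b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi


# Hypothetical toy data: `shape` is fully determined by `color`,
# while `size` is statistically independent of `color`.
color = ["red", "red", "blue", "blue"]
shape = ["round", "round", "square", "square"]
size = ["small", "large", "small", "large"]

redundant = mutual_information(color, shape)    # high: the features share information
independent = mutual_information(color, size)   # zero: the features share nothing
```

In this toy setting, a high value between two features suggests one of them may be redundant given the other, which is the kind of feature-to-feature relationship a relevance measure is meant to expose.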