Abstract: Feature selection is an important data preprocessing step in artificial intelligence that aims to eliminate redundant features while retaining essential ones. Measuring feature significance and the relevance between features is a central challenge. Fuzzy information entropy, an extension of Shannon entropy, is widely used to quantify the information of fuzzy partitions. However, it has a notable limitation: fuzzy conditional entropy is not monotonic when measuring decision uncertainty during feature selection. We introduce a novel measure, macrogranular entropy (ME), and construct generalized forms such as conditional ME, joint ME, and mutual macrogranular information. The conditional ME exhibits monotonicity when measuring decision uncertainty. In addition, we propose two feature selection algorithms: one based on monotonic conditional ME (MCME) and the other based on the degree of symmetric association (ADSA). The ADSA and MCME algorithms are compared with eight other feature selection algorithms in a series of experiments, evaluating classification performance with SVM and NB classifiers under metrics including F1-score and recall. Across all four evaluation metrics, ADSA and MCME achieved the top two rankings, respectively. Specifically, on the NB and SVM classifiers, the ADSA algorithm improves average accuracy over the original feature set by 12.22% and 2.88%, while MCME improves accuracy by 10.07% and 1.01%, respectively. The experimental comparisons demonstrate that the ADSA algorithm effectively removes redundant information from the dataset during feature selection.
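To make the MCME-style selection strategy concrete, here is a minimal sketch of greedy forward feature selection driven by a monotonic conditional-entropy criterion. The paper's exact macrogranular entropy is not defined in the abstract, so this sketch substitutes a discrete Shannon conditional entropy over equal-width bins as a stand-in; the function names `conditional_entropy` and `greedy_forward_selection` are hypothetical, not the authors' implementation.

```python
import numpy as np

def conditional_entropy(X_sel, y, bins=5):
    """Discrete Shannon conditional entropy H(y | X_sel).

    Stand-in for the paper's conditional macrogranular entropy:
    features are discretized into equal-width bins so that the
    joint granules induced by the selected features can be counted.
    """
    # Discretize each selected feature into `bins` equal-width intervals.
    digitized = np.column_stack([
        np.digitize(col, np.linspace(col.min(), col.max(), bins + 1)[1:-1])
        for col in X_sel.T
    ])
    # Accumulate H(y | granule), weighted by granule probability.
    h, n = 0.0, len(y)
    granules, counts = np.unique(digitized, axis=0, return_counts=True)
    for granule, count in zip(granules, counts):
        mask = (digitized == granule).all(axis=1)
        _, label_counts = np.unique(y[mask], return_counts=True)
        p = label_counts / count
        h -= (count / n) * np.sum(p * np.log2(p))
    return h

def greedy_forward_selection(X, y, max_features=None):
    """Greedily add the feature that most reduces H(y | selected)."""
    selected, remaining = [], list(range(X.shape[1]))
    best_h = np.inf
    while remaining and (max_features is None or len(selected) < max_features):
        h, f = min((conditional_entropy(X[:, selected + [f]], y), f)
                   for f in remaining)
        if h >= best_h:  # no further reduction in uncertainty: stop
            break
        best_h = h
        selected.append(f)
        remaining.remove(f)
    return selected
```

A monotonic criterion matters here because the stopping rule (`h >= best_h`) is only sound if adding a feature can never spuriously increase the measured decision uncertainty, which is precisely the property the abstract claims for conditional ME and which plain fuzzy conditional entropy lacks.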