Multi-Label Feature Selection for High-Dimensional Biological Data via Global Relevance and Redundancy Optimization based on JS Divergence
TL;DR: We incorporate JS divergence into the GRRO algorithm to measure the correlations between labels.
Abstract: In recent years, multi-label feature selection has been widely used in fields such as bioinformatics, information retrieval, and multimedia annotation. As a data pre-processing step, it has proven effective for high-dimensional biological data. However, most previous multi-label feature selection methods are either adapted directly from traditional single-label methods or fail to make full use of label information. As a result, the selected feature subset contains features that are redundant or irrelevant to the labels. Moreover, most algorithms do not discretize continuous data sets, so they cannot effectively suppress the interference of abnormal data. To address these problems, we extend the GRRO (Global Relevance and Redundancy Optimization) algorithm in two ways: we introduce Jensen-Shannon (JS) divergence to measure the correlation between labels, and we add a discretization pre-processing step for continuous data sets. Experiments on ten typical high-dimensional biological data sets show that the resulting GRRO-JS algorithm outperforms both traditional multi-label feature selection methods and the original GRRO algorithm in accuracy and efficiency, which indicates its practical value.
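The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`js_divergence`, `label_distribution`, `equal_width_discretize`), the use of binary label columns, the base-2 logarithm, and equal-width binning are all assumptions made here for illustration, since the abstract does not specify how GRRO-JS builds the label distributions or which discretization scheme it uses.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()          # normalize so each input sums to 1
    q = q / q.sum()
    m = 0.5 * (p + q)        # mixture distribution

    def kl(a, b):
        # KL(a || b), skipping zero-probability entries of a.
        # b = m is strictly positive wherever a is, so the ratio is safe.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def label_distribution(y):
    """Empirical distribution [P(y=0), P(y=1)] of one binary label column (assumed encoding)."""
    counts = np.bincount(np.asarray(y, dtype=int), minlength=2)
    return counts / counts.sum()

def equal_width_discretize(x, n_bins=3):
    """Map a continuous feature to integer bins 0..n_bins-1 of equal width (assumed scheme)."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    return np.digitize(x, edges[1:-1])   # only interior edges act as cut points

# Hypothetical toy data: two label columns and one continuous feature.
label_a = [1, 1, 0, 0, 1, 0]
label_b = [1, 0, 0, 0, 1, 0]
feature = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

d = js_divergence(label_distribution(label_a), label_distribution(label_b))
bins = equal_width_discretize(feature, n_bins=3)
```

With a base-2 logarithm the divergence lies in [0, 1], where 0 means the two label distributions are identical; a smaller value could therefore be read as a stronger relationship between labels, although how GRRO-JS turns the divergence into a correlation term is not stated in the abstract.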
Submission Number: 134