Abstract: In the era of big data, datasets often contain many features that carry considerable uncertainty and ambiguity, which makes it difficult to identify the features that are valuable for downstream tasks. Traditional unsupervised feature selection methods struggle to handle uncertain or fuzzy information because they often treat information quality and information quantity separately, which leads to suboptimal feature selection. To address this limitation, we propose a novel information representation system that integrates fuzzy relations with information source values, providing a unified framework for quantifying both the quality and the quantity of information. Within this system, we introduce two feature selection criteria: the information evaluation score (IES), which assesses the quality and quantity of the information a feature carries, and the difference degree (DD), which measures the difference between selected and unselected features. Based on these criteria, we develop an unsupervised feature selection algorithm that accounts for the Information Quantity, Quality, and Difference Degree of features (I2QD). The I2QD algorithm selects features by balancing information quality, quantity, and difference, even in the presence of uncertainty. Experimental results support the effectiveness of the proposed I2QD algorithm and suggest that it is a promising approach to unsupervised feature selection.