Is Feature Extraction the most informative dimensionality reduction technique? Revisiting Unsupervised Feature Selection from a Dynamic Approach

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: dynamic feature selection, unsupervised learning, dimensionality reduction
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: This paper compares unsupervised feature extraction and unsupervised feature selection as dimensionality reduction techniques in the absence of labeled data. Unsupervised feature extraction transforms the input space into a lower-dimensional representation by constructing informative features that capture underlying patterns, which can improve model performance. Unsupervised feature selection, by contrast, chooses a subset of the original features according to predefined criteria, potentially overlooking important relationships and reducing the model's discriminative power. The state of the art suggests that feature extraction outperforms feature selection in both model accuracy and robustness: by leveraging the intrinsic structure of the data, unsupervised feature extraction provides richer representations that enhance a model's ability to discern complex patterns. This paper proposes to revisit feature selection from a dynamic perspective, in which the selected features depend on the specific input sample. Through empirical evaluations, we demonstrate that unsupervised feature selection outperforms feature extraction in both accuracy and data compression. These findings highlight the potential of unsupervised feature selection as a powerful approach to dimensionality reduction and improved model performance, particularly when labeled data is scarce or unavailable.
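To make the contrast with static feature selection concrete, below is a minimal, hypothetical sketch (not the paper's algorithm): a per-sample scoring function picks a different top-k feature subset for each input, whereas a static selector fixes one subset for the entire dataset. The scoring matrix `W` and the helper `dynamic_select` are illustrative assumptions, not details from the submission.

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamic_select(x, W, k):
    """Score each feature for this specific sample, then keep the top-k.

    x : (d,) input sample
    W : (d, d) scoring matrix (random here, purely illustrative)
    k : number of features to retain
    """
    scores = W @ x                  # per-sample feature relevance scores
    keep = np.argsort(scores)[-k:]  # indices of the k highest-scoring features
    mask = np.zeros_like(x)
    mask[keep] = 1.0
    return x * mask, keep           # masked sample plus selected indices

# Two different samples can end up with different feature subsets,
# which is the "dynamic" part; a static selector would return the
# same index set for every sample.
d, k = 10, 3
W = rng.standard_normal((d, d))
x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
_, idx1 = dynamic_select(x1, W, k)
_, idx2 = dynamic_select(x2, W, k)
print(sorted(idx1), sorted(idx2))  # subsets generally differ across samples
```

In a learned variant of this idea, `W` (or a small gating network in its place) would be trained with an unsupervised objective, so that each sample retains only the k features most relevant to it while the rest are compressed away.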
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7099