Feature Selection in the Presence of Monotone Batch Effects

ICML 2023 Workshop SCIS Submission 80 Authors

Published: 20 Jun 2023, Last Modified: 28 Jul 2023, SCIS 2023 Poster
Keywords: Monotone Batch Effects, Deep Neural Network, Maximum Mean Discrepancy, Lasso
Abstract: We study the problem of feature selection in the presence of monotone batch effects, where merging datasets collected with disparate technologies and in different environments distorts the underlying causal dependence among data features. We propose two novel algorithms for this task: 1) joint feature selection and batch effect correction, transforming the data batches using Generative Adversarial Networks (GANs); 2) transforming the data via a batch-invariant characteristic (i.e., feature rank) before merging the datasets. We assess the performance of standard feature selection methods used in conjunction with our batch effect removal methods. Our experiments on synthetic data show that the former method combined with Lasso improves the $F_1$ score significantly, even with few samples per dataset, and outperforms popular batch effect removal algorithms, including ComBat-seq, limma, and PCA. The ranking method is computationally more efficient, but its performance is worse because discarding the magnitudes of the data loses information.
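The rank-based approach described above rests on a simple observation: any strictly increasing (monotone) batch effect preserves the ordering of a feature's values, so within-batch feature ranks are batch-invariant. The sketch below is not the authors' implementation; it is a minimal NumPy illustration, with a made-up `rank_transform` helper, of why two batches related by different monotone distortions become identical after ranking.

```python
import numpy as np

def rank_transform(X):
    """Replace each feature (column) of X by its within-batch ranks.

    Ranks are invariant under any strictly monotone batch effect,
    since a monotone transform preserves the ordering of values.
    """
    # double argsort trick: first argsort gives the sort order,
    # scattering row indices through it gives 0-based ranks per column
    order = np.argsort(X, axis=0)
    ranks = np.empty_like(order)
    rows = np.arange(X.shape[0])
    for j in range(X.shape[1]):
        ranks[order[:, j], j] = rows
    # normalize to [0, 1] so batches of different sizes are comparable
    return ranks / max(X.shape[0] - 1, 1)

# two copies of the same underlying data, distorted by different
# monotone batch effects (hypothetical toy example)
rng = np.random.default_rng(0)
base = rng.normal(size=(5, 3))
batch_a = np.exp(base)        # monotone distortion 1
batch_b = 2.0 * base + 10.0   # monotone distortion 2

# after rank transformation the two batches coincide exactly,
# so they can be merged before running a feature selector such as Lasso
assert np.array_equal(rank_transform(batch_a), rank_transform(batch_b))
```

As the abstract notes, this invariance comes at a cost: mapping values to ranks discards their magnitudes, which explains the method's lower accuracy relative to the GAN-based correction.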
Submission Number: 80