Abstract: High-throughput screening techniques are commonly used in many fields of biology. However, it is well known that non-biological artifacts arising from variability in the technical execution of different experimental batches confound measurements from high-throughput screens. These batch effects obscure biological conclusions, and it is therefore necessary to account for them. While a number of correction techniques have been proposed, to our knowledge there is no publicly available biological dataset designed specifically for the systematic study of batch effect correction. To this end, we announce the release of RxRx1, a set of 125,514 high-resolution fluorescence microscopy images of human cells under 1,108 genetic perturbations in 51 experimental batches across four cell types. Visual inspection of the images by batch makes it clear that the set exhibits significant batch effects. In this paper we describe the image set in detail. We also propose a classification task designed to study batch effect correction on these images, and provide baseline results for the task. Our goal in releasing this image set is to encourage researchers across disciplines to develop effective methods for removing batch effects that generalize well to unseen experimental batches, and to share these methods with the scientific community.