Weakly Supervised Latent Variable Inference of Proximity Bias in CRISPR Gene Knockouts from Single-Cell Images

Published: 06 Mar 2025, Last Modified: 18 Apr 2025, ICLR 2025 Workshop LMRL, CC BY 4.0
Track: Full Paper Track
Keywords: CRISPR-Cas9, phenomics screening, functional gene perturbations, latent variable inference, single-cell image data
TL;DR: CRISPR-Cas9 induces off-target effects in perturbation screens, which bias gene embeddings toward correlation with their chromosome-arm neighbors. We remove the affected cells before aggregation using a latent variable inference model.
Abstract: High-throughput screening enables biologists to study cell perturbations by generating large, high-dimensional datasets, such as gene expression profiles and cell microscopy images. In CRISPR-Cas9 screens, gene knockout effects are typically represented using perturbation-specific conditional mean embeddings. These representations can be distorted by off-target effects, in which a knockout impacts not only the target gene but also neighboring genes on the same chromosome arm, introducing "proximity bias". To address this, we develop a discrete latent variable inference method that leverages correlations between neighboring perturbations as a weak supervision signal to detect single cells affected by off-target effects. Removing these cells reduces spurious correlations between adjacent gene embeddings, achieving comparable correction performance without relying on additional gene expression data. Moreover, we show that the identified cells exhibit chromosome-arm specificity, which reinforces the validity of our approach and its potential for scaling into a genome-wide proximity bias correction method.
Attendance: Kristina Ulicna
Submission Number: 18
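
To make the quantities in the abstract concrete, the sketch below shows one way to form per-perturbation mean embeddings and to measure how much more similar same-arm gene pairs are than different-arm pairs. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `gene_to_arm` mapping, and the within-arm minus between-arm cosine score are all hypothetical choices, and the latent variable inference step itself is not shown.

```python
import numpy as np


def mean_embeddings(cell_embeddings, gene_labels):
    """Average single-cell embeddings per knocked-out gene (hypothetical helper)."""
    labels = np.asarray(gene_labels)
    genes = sorted(set(gene_labels))
    means = np.stack([cell_embeddings[labels == g].mean(axis=0) for g in genes])
    return genes, means


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T


def proximity_bias_score(genes, means, gene_to_arm):
    """Mean same-arm similarity minus mean different-arm similarity.

    Assumed diagnostic: a clearly positive value suggests that genes on the
    same chromosome arm look more alike than genes on different arms.
    Requires at least one same-arm pair and one different-arm pair.
    """
    sim = cosine_similarity_matrix(means)
    arms = np.array([gene_to_arm[g] for g in genes])
    same_arm = arms[:, None] == arms[None, :]
    off_diagonal = ~np.eye(len(genes), dtype=bool)
    within = sim[same_arm & off_diagonal].mean()
    between = sim[~same_arm].mean()
    return within - between


# Toy data: two cells per gene, two genes per (made-up) chromosome arm.
cell_embeddings = np.array(
    [[1.0, 0.0], [1.0, 0.0], [1.0, 0.2], [1.0, 0.2],
     [0.0, 1.0], [0.0, 1.0], [0.2, 1.0], [0.2, 1.0]]
)
gene_labels = ["A", "A", "B", "B", "C", "C", "D", "D"]
gene_to_arm = {"A": "1p", "B": "1p", "C": "2q", "D": "2q"}

genes, means = mean_embeddings(cell_embeddings, gene_labels)
score = proximity_bias_score(genes, means, gene_to_arm)
```

Under this assumed score, removing cells flagged as off-target and recomputing the means would be expected to move the value toward zero. Real pipelines usually also normalize embeddings against control cells before aggregation, which this sketch omits.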