Keywords: Drug Discovery, Vision Agent, Automated Detection, Real-Time Phenotypic Screening, Prompt-Driven AI
Abstract: We present an agentic vision detection model for real-time identification of drug-cell interactions in microscopy data, aimed at accelerating drug discovery. Our approach uses a prompt-driven AI agent to detect and classify drug-induced phenotypic changes in cells without any task-specific training or fine-tuning. This zero-shot capability addresses a major limitation of state-of-the-art (SOTA) deep learning models such as YOLOv8/YOLOv12, SAM 2, Vision Transformers (ViTs), CLIP, and ConvNeXt, which typically require extensive labeled data and retraining for each new experiment. We evaluate our method on the BBBC021 and BBBC022 high-content imaging datasets and on a collection of live-cell videos sourced from YouTube. The model achieves accuracy comparable or superior to SOTA supervised models in drug mechanism-of-action recognition while running inference at dozens of frames per second. It also adapts more readily than conventional models, generalizing to new cell types and treatments with no additional data collection, and is robust to dataset shift. These results suggest a new paradigm for AI-driven phenotypic screening in drug discovery.
Submission Type: Original Work
Submission Number: 7