PUBHOMICS: A Multispecies Biological Dataset to Catalyze AI-Driven Toxicity Assessment for Environmental and Public Health

Published: 24 Sept 2025, Last Modified: 26 Dec 2025 | NeurIPS2025-AI4Science Poster | CC BY 4.0
Track: Track 2: Dataset Proposal Competition
Keywords: Transcriptomics, ML, AI, Chemical safety and Toxicology, Public and Environmental health
TL;DR: PUBHOMICS is a large-scale multispecies dataset designed to accelerate AI-driven toxicity assessment for public and environmental health.
Abstract: Environmental and public health remain underserved by the recent data revolution that enabled major AI advances in drug discovery. Existing toxicity datasets are biased toward drug-like molecules and are fragmented across repositories, limiting their use for machine learning and cross-species translation. We propose PUBHOMICS, a scalable, openly shareable dataset capturing transcriptional responses to environmentally relevant chemical perturbations across cell types, organs, and species. PUBHOMICS will expand chemical coverage to classes absent from existing resources, enable AI models to predict transcriptomic responses to novel exposures, and support mechanism-based toxicity prediction with cross-species translation for regulatory decision-making. By advancing exposomics toward causation and providing a foundation for New Approach Methodologies (NAMs), PUBHOMICS aims to accelerate regulatory adoption and enable “benign-by-design” strategies that bridge exposure science with systems biology.
Submission Number: 185