Generating and decoding methylated DNA with a Human Epigenetic Foundation Model

Published: 02 Mar 2026, Last Modified: 02 Mar 2026
Venue: MLGenX 2026 Poster
License: CC BY 4.0
Track: Main track
Keywords: AI, biology, genomics, methylation, epigenetics, pleiades, neuron, cell type, brain, transformer, language model, deconvolution, cell free dna
TL;DR: Epigenetic Foundation Model with Clinical and Biological Applications
Abstract: Gene expression in humans is regulated beyond the four-letter genetic code; cytosine methylation programs cell identity and regulates expression in response to environmental cues. We present Pleiades, a series of whole-epigenome foundation models (90M/600M/7B) trained on 1.9T tokens of methylated and unmethylated human DNA, establishing a new paradigm beyond the modeling of pure DNA sequences. Pleiades achieves state-of-the-art performance compared to leading DNA foundation models on human genomic annotation tasks, such as predicting histone modifications and gene regulatory elements; notably, we find that scaling model size yields consistent gains across all tasks, with the 7B model outperforming both smaller variants and DNA-only baselines. Finally, we show that Pleiades supports a number of cell-free DNA (cfDNA) tasks, opening the door to a new era of direct clinical application of biological foundation models via cfDNA.
AI Policy Confirmation: I confirm that this submission clearly discloses the role of AI systems and human contributors and complies with the ICLR 2026 Policies on Large Language Model Usage and the ICLR Code of Ethics.
Submission Number: 37