Genomic heterogeneity inflates the performance of variant pathogenicity predictions

Published: 02 Mar 2026, Last Modified: 13 Mar 2026
Venue: Gen² 2026 Poster
Readers: Everyone
License: CC BY 4.0
Track: Full / long paper (5-8 pages)
Keywords: genomic heterogeneity, variant pathogenicity prediction, ClinVar benchmark, DNA language models, protein models
TL;DR: Performance of AI models predicting genetic variant pathogenicity is inflated by heterogeneity across genomic contexts; we build a standardized benchmark revealing true model strengths by variant type.
Abstract: Recent studies have reported unprecedented accuracy in predicting pathogenic variants across the genome, including in noncoding regions, using large AI models trained on vast genomic data. We present a comprehensive evaluation of these frontier models, showing that reported performance is inflated by differences in the prevalence of pathogenic variants across genomic contexts. We identify the best-performing models for each variant type and establish a benchmark to guide future progress.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 61
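
Illustration: the inflation effect described in the abstract can be shown with a small synthetic example. The sketch below is not the authors' benchmark or method. The context names (`coding`, `noncoding`), the prevalences, and the scores are all made up for illustration. The hypothetical "model" assigns a score that depends only on the genomic context, so it cannot separate pathogenic from benign variants within any context, yet its pooled AUROC looks strong because pathogenic variants are more prevalent in the high-scoring context.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC as the probability that a random pathogenic variant
    outscores a random benign one; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Synthetic toy data (assumption): (context, label, score), label 1 = pathogenic.
# The score is constant within each context, i.e. it carries no
# within-context information about pathogenicity.
variants = (
    [("coding", 1, 0.8)] * 8        # 80% pathogenic in "coding"
    + [("coding", 0, 0.8)] * 2
    + [("noncoding", 1, 0.2)] * 2   # 20% pathogenic in "noncoding"
    + [("noncoding", 0, 0.2)] * 8
)

# Pooled evaluation: all contexts mixed together.
pooled = auroc(
    [y for _, y, _ in variants],
    [s for _, _, s in variants],
)

# Stratified evaluation: one AUROC per genomic context.
by_context = defaultdict(list)
for ctx, y, s in variants:
    by_context[ctx].append((y, s))

stratified = {
    ctx: auroc([y for y, _ in rows], [s for _, s in rows])
    for ctx, rows in by_context.items()
}

print(f"pooled AUROC: {pooled:.2f}")
for ctx, value in stratified.items():
    print(f"{ctx} AUROC: {value:.2f}")
```

On this toy data the pooled AUROC is 0.80 while the AUROC within each context is 0.50, which is chance level. Reporting metrics per variant type or context, as the submission proposes, removes this kind of between-context signal from the comparison.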