Genomic heterogeneity inflates the performance of variant pathogenicity predictions

Baiyu Lu; Xueshen Liu; Po-Yu Lin; Nadav Brandes

Published: 02 Mar 2026, Last Modified: 13 Mar 2026 | Venue: Gen² 2026 Poster | Readers: Everyone | License: CC BY 4.0

Track: Full / long paper (5-8 pages)

Keywords: genomic heterogeneity, variant pathogenicity prediction, ClinVar benchmark, DNA language models, protein models

TL;DR: Performance of AI models predicting genetic variant pathogenicity is inflated by heterogeneity across genomic contexts; we build a standardized benchmark revealing true model strengths by variant type.

Abstract: Recent studies have reported unprecedented accuracy in predicting pathogenic variants across the genome, including in noncoding regions, using large AI models trained on vast genomic data. We present a comprehensive evaluation of these frontier models, showing that their reported performance is inflated by differences in the prevalence of pathogenic variants across genomic contexts. We identify the best-performing models for each variant type and establish a benchmark to guide future progress.
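
As a rough illustration of the inflation effect described in the abstract, the following minimal sketch uses toy data that is entirely hypothetical and not taken from the paper. It assumes two genomic contexts with different pathogenic prevalence and a score that depends only on the context, so it cannot separate pathogenic from benign variants within either context. The context names, counts, and score values are assumptions chosen for easy arithmetic, and AUROC is used here only as an example metric.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (pathogenic, benign) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic time, fine for toy data.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (number pathogenic, number benign, constant score).
# Every variant in a context receives the same score, so the score carries
# no information about pathogenicity within that context.
contexts = {
    "context_a": (80, 20, 0.9),  # high prevalence of pathogenic variants
    "context_b": (20, 80, 0.1),  # low prevalence of pathogenic variants
}

pooled_labels, pooled_scores = [], []
per_context_auroc = {}
for name, (n_path, n_benign, score) in contexts.items():
    labels = [1] * n_path + [0] * n_benign
    scores = [score] * (n_path + n_benign)
    per_context_auroc[name] = auroc(labels, scores)
    pooled_labels += labels
    pooled_scores += scores

pooled_auroc = auroc(pooled_labels, pooled_scores)

print(per_context_auroc)  # chance-level AUROC within each context
print(pooled_auroc)       # well above chance once contexts are pooled
```

In this toy setup the pooled AUROC is far above chance even though the score is uninformative inside each context, which is why a per-context (per variant type) evaluation gives a less misleading picture than a single pooled number.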

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.

Submission Number: 61
