Gene-centric evaluation of causal variant prediction for DNA models

Published: 17 Jun 2024, Last Modified: 16 Jul 2024
Venue: ML4LMS Poster
License: CC BY 4.0
Keywords: genetic variants, deep learning, genomics, gene expression, eQTL
Abstract: DNA models hold significant potential for linking genetic variation to transcriptional regulation, which is crucial for understanding disease mechanisms at the genetic and molecular levels and for developing targeted therapies. Supervised approaches, such as Enformer and Basenji, have shown promising results in predicting causal variants. Recently, self-supervised models like Nucleotide Transformer and HyenaDNA have made remarkable advancements, with variant-centric benchmarks suggesting competitive performance on the variant effect prediction task. In this study, we propose to also evaluate models on gene-centric benchmarks, which are often of higher relevance to the genetics community for mapping causal variants to affected genes.
Poster: pdf
Submission Number: 127