Readers: Everyone (since 24 May 2024) · License: CC BY 4.0
DNA models hold significant potential for linking genetic variation to transcriptional regulation, which is crucial for understanding disease mechanisms at the genetic and molecular level and for developing targeted therapies. Supervised approaches, such as Enformer and Basenji, have shown promising results in predicting causal variants. Recently, self-supervised models such as Nucleotide Transformer and HyenaDNA have made remarkable advances, with variant-centric benchmarks suggesting competitive performance on the variant effect prediction task. In this study, we propose to also evaluate models on gene-centric benchmarks, which are often more relevant to the genetics community for mapping causal variants to affected genes.