Abstract: Transformer models trained on genomic sequence data achieve strong predictive performance, but interpreting their learned representations remains difficult. We analyze DNABERT using Layer-wise Relevance Propagation (LRP), attention-based scores, and gradient-based methods on enhancer detection and non-TATA promoter classification tasks. Attribution quality is assessed via mutagenesis experiments that perturb positions ranked as important by each method. We find that the impact of targeted mutations varies substantially across tasks. In enhancer detection, mutating high-scoring positions leads to marked performance degradation, indicating reliance on localized sequence features. In contrast, for promoter classification against structured genomic background regions, model performance degrades more slowly, consistent with more distributed representations. These findings suggest that attribution-based evaluations are sensitive to dataset construction and task semantics, and that conclusions about interpretability methods should be conditioned on the structure of the underlying prediction problem.
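The perturbation protocol described above (mutate the positions an attribution method ranks highest, then measure how much the model's score drops) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `toy_model` is a hypothetical stand-in for DNABERT, and the attribution scores are random placeholders for LRP, attention, or gradient scores.

```python
import numpy as np

rng = np.random.default_rng(0)
BASES = np.array(list("ACGT"))

def toy_model(seq):
    # Hypothetical stand-in for a trained classifier: scores a sequence
    # by the density of a fixed trinucleotide "motif".
    return sum(seq[i:i + 3] == "TAT" for i in range(len(seq) - 2)) / len(seq)

def mutate_top_k(seq, scores, k):
    # Replace the k highest-attribution positions with a random
    # different base, leaving the rest of the sequence intact.
    seq = list(seq)
    for i in np.argsort(scores)[::-1][:k]:
        alternatives = [b for b in BASES if b != seq[i]]
        seq[i] = rng.choice(alternatives)
    return "".join(seq)

seq = "".join(rng.choice(BASES, size=50))
scores = rng.random(50)  # placeholder attribution scores per position
baseline = toy_model(seq)
for k in (0, 5, 10):
    degraded = toy_model(mutate_top_k(seq, scores, k))
    print(f"k={k}: score {degraded:.3f} (baseline {baseline:.3f})")
```

Plotting model performance against `k` yields the degradation curves the abstract refers to: a steep drop suggests reliance on localized features (as in enhancer detection), while a shallow decline is consistent with more distributed representations (as in promoter classification).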