Keywords: genomic models, DNA, transcription factor binding
Abstract: Genomic deep learning has rapidly advanced the prediction of transcription factor (TF) binding and other genomic profiles directly from DNA sequence, yet the biological mechanisms captured by these models remain largely unexplored. In this work, we investigate how state-of-the-art genomic models encode the regulatory logic underlying TF binding. We first systematically analyze the model-derived patterns for each TF, revealing that models frequently rely on broad contextual co-associations of motifs to predict TF binding. To quantify how dispersed these associations are, we introduce the \textit{Jaccard Overlap Score (JOS)}, which distinguishes concentrated recognition of canonical motifs from more distributed binding signatures. Next, we investigate TF--TF cooperativity through \textit{in silico} knockout experiments, revealing pronounced self-dependence of key regulators and cell-type--specific cooperative grammars. Together, our results provide a mechanistic interpretation of genomic deep learning models, demonstrating both their ability to capture biologically meaningful combinatorial regulation and their reliance on contextual sequence features. Our code is publicly available at https://github.com/AlbertBay/EpiBinder.
Submission Number: 88
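The abstract names the Jaccard Overlap Score but does not define it. As a minimal sketch only, the snippet below shows the standard Jaccard index (intersection over union of two sets), which the name suggests JOS builds on. The function name, the use of sequence positions as set elements, and the toy data are all hypothetical illustrations and are not taken from the paper or its repository.

```python
def jaccard_index(a, b):
    """Standard Jaccard index |A ∩ B| / |A ∪ B| for two collections.

    Returns 0.0 when both collections are empty (a convention chosen here).
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical toy data: positions covered by a canonical motif,
# versus positions a model attributes importance to.
canonical_positions = set(range(100, 110))         # 10 positions
concentrated = set(range(100, 110)) | {112}        # tightly focused on the motif
dispersed = set(range(0, 200, 5))                  # spread across the sequence

high_overlap = jaccard_index(canonical_positions, concentrated)
low_overlap = jaccard_index(canonical_positions, dispersed)
print(f"concentrated: {high_overlap:.3f}, dispersed: {low_overlap:.3f}")
```

Under these assumptions, a high value corresponds to importance concentrated on the canonical motif and a low value to a more distributed signature. How the actual JOS is computed should be taken from the paper itself.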