Decoding Histone Modification Signatures of Non-Coding RNAs via Foundation Models

Published: 06 Oct 2025, Last Modified: 06 Oct 2025
Venue: NeurIPS 2025 2nd Workshop FM4LS Poster
License: CC BY 4.0
Keywords: Histone Modifications, ncRNA, Deep Learning, Transfer Learning
TL;DR: A sequence-only classifier predicts histone marks at ncRNA loci with high accuracy (AUROC up to 0.95), making the method a simple, efficient baseline.
Abstract: Histone modifications help regulate ncRNA genes, but measuring these interactions at scale is difficult. High-throughput experimental techniques such as ChIP-seq are costly and time-consuming, which limits their use for mapping histone modifications across diverse cell types and histone marks. We test whether sequence alone can predict histone–ncRNA regulation by training a single marker-conditioned classifier that makes predictions for 50 histone marks. We evaluate two inputs: (i) the spliced transcript RNA sequence and (ii) genomic context comprising the gene body plus up to 30 kb of upstream DNA. On a curated benchmark with Ensembl coordinates, the context-based model attains a micro-AUROC of 0.95. Because it relies on frozen pretrained encoders and uses no cell-type-specific tracks, the approach is simple and data-efficient, providing a practical baseline for studying ncRNA–histone modification interactions.
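The abstract reports a micro-AUROC without spelling out how it is computed. As a minimal sketch, assuming the standard micro-averaged definition (every locus and mark pair is pooled into one set of binary decisions, and a single AUROC is computed over that pool), the metric could be calculated as below. The function names and the toy data are hypothetical illustrations and are not taken from the authors' code.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    # Sort indices by score, then assign 1-based ranks, averaging within tie groups.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic for the positive class, normalised by the number of pos/neg pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def micro_auroc(label_matrix, score_matrix) -> float:
    """Flatten (locus x mark) matrices into one pool, then compute a single AUROC."""
    flat_labels = [y for row in label_matrix for y in row]
    flat_scores = [s for row in score_matrix for s in row]
    return auroc(flat_labels, flat_scores)


# Toy example (made-up numbers): 3 ncRNA loci x 2 histone marks.
toy_labels = [[1, 0], [0, 1], [1, 0]]
toy_scores = [[0.9, 0.2], [0.4, 0.7], [0.3, 0.5]]
print(micro_auroc(toy_labels, toy_scores))  # 7/9, about 0.778
```

Micro-averaging weights every locus and mark pair equally, so marks with many annotated loci influence the score more than rare ones. A macro-average (the mean of per-mark AUROCs) would weight each mark equally instead.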
Submission Number: 53