Abstract: Genomes fold into 3D organizational units that can influence critical biological functions. In particular, the organization of chromatin into A and B compartments segregates its active regions from inactive regions. Therefore, these compartments can shed light on cell type-specific activities of the genome. However, obtaining Hi-C data for all cell and tissue types of interest is prohibitively expensive, which has limited the widespread consideration of compartment status. We present a prediction tool called Compartment prediction using Recurrent Neural Network (CoRNN) that models the relationship between the compartmental organization of the genome and histone modification enrichment. In a cross-cell type setting, our model predicts A/B compartments with an average area under the ROC curve of 90.9%. Our cell type-specific compartment predictions show high overlap with known functional elements. We investigate our predictions by systematically removing combinations of histone marks and find that H3K27ac and H3K36me3 are the most predictive marks. We perform a detailed analysis of loci where compartment status cannot be accurately predicted from these marks. These regions represent chromatin with ambiguous compartmental status, likely due to variations in status within the population of cells. These ambiguous loci also show highly variable compartmental status between biological replicates of the same cell type, GM12878. Finally, we demonstrate the generalizability of our model by predicting compartments in independent tissue samples. CoRNN is available at: https://github.com/rsinghlab/CoRNN